Compare GPT-4o, Claude 3.5 Sonnet, Gemini, and 5 more models. Enter your token volume and request count — see exact monthly costs. Free, no signup, runs in your browser.
All prices in USD per 1 million tokens, pay-as-you-go, June 2026.
| Model | Input / 1M | Output / 1M | Context | Best for |
|---|---|---|---|---|
| GPT-4o mini | $0.15 | $0.60 | 128K | High-volume tasks |
| Gemini 1.5 Flash | $0.075 | $0.30 | 1M | Ultra-high volume |
| Claude 3.5 Haiku | $0.80 | $4.00 | 200K | Quality on a budget |
| GPT-4o | $2.50 | $10.00 | 128K | General flagship |
| Claude 3.5 Sonnet | $3.00 | $15.00 | 200K | Coding & reasoning |
| Gemini 1.5 Pro | $1.25 | $5.00 | 1M | Long context tasks |
| OpenAI o1 | $15.00 | $60.00 | 200K | Hard reasoning |
| Claude 3 Opus | $15.00 | $75.00 | 200K | Most capable Claude |
OpenAI, Anthropic, and Google all offer batch processing at a 50% discount for asynchronous workloads, with results returned within up to 24 hours. Batch requests run on the same models, so output quality is unchanged.
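As a rough illustration of the batch discount, the sketch below prices a single request at list price and at the 50% batch rate. The function name `request_cost` and the example token counts are assumptions made for illustration; the prices are the GPT-4o figures from the table above.

```python
BATCH_DISCOUNT = 0.50  # 50% off for async batch jobs, as stated above


def request_cost(input_tokens, output_tokens, input_price, output_price, batch=False):
    """Return the USD cost of one request; prices are USD per 1M tokens."""
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return cost * (1 - BATCH_DISCOUNT) if batch else cost


# Hypothetical request: 2,000 input tokens and 500 output tokens on GPT-4o
sync_cost = request_cost(2_000, 500, 2.50, 10.00)
batch_cost = request_cost(2_000, 500, 2.50, 10.00, batch=True)
print(f"sync ${sync_cost:.4f}  batch ${batch_cost:.4f}")
```

The discount applies to both input and output tokens, so the saving scales linearly with volume.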
Monthly cost for common production workloads, computed from the list prices above for a 30-day month. Token counts are input + output per request.
| Workload | Volume | GPT-4o mini | GPT-4o | Claude 3.5 Sonnet |
|---|---|---|---|---|
| Customer support bot | 10K req/day · 1K+300 tok | $99.00 | $1,650.00 | $2,250.00 |
| Document summarizer | 1K docs/day · 4K+800 tok | $32.40 | $540.00 | $720.00 |
| Code review assistant | 500 req/day · 3K+1K tok | $15.75 | $262.50 | $360.00 |
| RAG Q&A system | 5K req/day · 2K+500 tok | $90.00 | $1,500.00 | $2,025.00 |
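A monthly estimate of this kind comes down to requests per day × days × tokens per request × price per million tokens. The sketch below is one way to write that, assuming a 30-day month and the list prices from the pricing table. The `Pricing` class, the model keys, and the example workload (2,000 requests per day, 1,500 input and 400 output tokens) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# List prices taken from the pricing table above
PRICES = {
    "gpt-4o-mini": Pricing(0.15, 0.60),
    "gpt-4o": Pricing(2.50, 10.00),
    "claude-3.5-sonnet": Pricing(3.00, 15.00),
}


def monthly_cost(model, requests_per_day, input_tokens, output_tokens, days=30):
    """Estimate monthly USD cost for a steady workload (assumes a 30-day month)."""
    p = PRICES[model]
    requests = requests_per_day * days
    input_millions = requests * input_tokens / 1_000_000
    output_millions = requests * output_tokens / 1_000_000
    return input_millions * p.input_per_m + output_millions * p.output_per_m


# Hypothetical workload: 2,000 requests/day, 1,500 input + 400 output tokens each
for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 2_000, 1_500, 400):,.2f}")
```

Output tokens are several times more expensive than input tokens on every model listed, so workloads with long completions are more sensitive to the output price than the request count alone suggests.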
GPT-4o costs $2.50 per million input tokens and $10.00 per million output tokens (pay-as-you-go, June 2026).
Gemini 1.5 Flash, at $0.075 input and $0.30 output per million tokens, is the cheapest model in this comparison. GPT-4o mini ($0.15/$0.60) is the cheapest OpenAI option.
Claude 3.5 Sonnet costs $3.00 per million input tokens and $15.00 per million output tokens via the Anthropic API.
Use a cheaper model where quality allows: GPT-4o mini is roughly 17× cheaper than GPT-4o on both input and output. Enable the Batch API for 50% off asynchronous workloads. Cache repeated prompts. Trim system prompts. Set max_tokens explicitly to cap output length.
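To get a feel for how two of these levers compare, the sketch below prices a hypothetical workload three ways: as is, with a shorter system prompt, and on a cheaper model. The monthly volume, token counts, and the `cost` helper are assumptions for illustration; the prices are the GPT-4o and GPT-4o mini list prices quoted earlier.

```python
# (input, output) prices in USD per 1M tokens, from the pricing table
GPT_4O = (2.50, 10.00)
GPT_4O_MINI = (0.15, 0.60)


def cost(requests, input_tokens, output_tokens, prices):
    """Total USD cost for a number of identical requests."""
    input_price, output_price = prices
    per_request = input_tokens * input_price + output_tokens * output_price
    return requests * per_request / 1_000_000


requests = 100_000  # hypothetical monthly request count

# Hypothetical prompt: 1,200-token system prompt + 800-token user input, 500-token reply
baseline = cost(requests, 1_200 + 800, 500, GPT_4O)
trimmed = cost(requests, 400 + 800, 500, GPT_4O)        # system prompt cut to 400 tokens
switched = cost(requests, 1_200 + 800, 500, GPT_4O_MINI)  # same prompt, cheaper model

print(f"baseline ${baseline:,.2f}")
print(f"trimmed  ${trimmed:,.2f}")
print(f"switched ${switched:,.2f} ({baseline / switched:.1f}x cheaper)")
```

Under these assumptions, trimming the system prompt saves a fixed slice of the input cost, while switching models scales the whole bill by the price ratio (about 16.7× between GPT-4o and GPT-4o mini).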
A token is roughly 4 characters or 0.75 words. A 1,000-word document is approximately 1,333 tokens. Most LLM APIs charge separately for input (prompt) and output (completion) tokens.
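The rule of thumb above can be turned into a quick estimator. This is only a heuristic sketch: the function names are hypothetical, and real tokenizers vary by model and language, so a provider's own tokenizer should be used when exact counts matter.

```python
CHARS_PER_TOKEN = 4      # rule of thumb from the text above
WORDS_PER_TOKEN = 0.75   # rule of thumb from the text above


def estimate_tokens_from_chars(text: str) -> int:
    """Rough token estimate from character count."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Rough token estimate from whitespace-separated word count."""
    return round(len(text.split()) / WORDS_PER_TOKEN)


# A synthetic 1,000-word document
document = " ".join(["word"] * 1_000)
print(estimate_tokens_from_words(document))  # 1333, matching the figure above
print(estimate_tokens_from_chars(document))
```

The two heuristics rarely agree exactly, since average word length differs between texts; either is adequate for budgeting but not for enforcing context limits.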