Estimate Anthropic Claude 3.5 Sonnet, Haiku, and Opus API costs instantly — including prompt caching savings. Free, no signup, all calculations run in your browser.
Includes prompt caching savings estimate — Claude's biggest cost advantage
Claude's prompt caching stores repeated context (system prompts, RAG documents, few-shot examples) on Anthropic's servers. Cache hits are billed at 10% of normal input price. The calculator above models this in real time.
Anthropic API rates in USD per million tokens, June 2026.
| Model | Input / 1M | Cached Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Claude 3.5 HaikuCHEAPEST | $0.80 | $0.08 | $4.00 | 200K |
| Claude 3.5 Sonnet | $3.00 | $0.30 | $15.00 | 200K |
| Claude 3 Opus | $15.00 | $1.50 | $75.00 | 200K |
Anthropic's Message Batches API processes requests asynchronously (up to 24h) at 50% off standard pricing — same as OpenAI's Batch API. Any async workload should use this.
Claude 3.5 Haiku: $0.80/$4.00 per million tokens. Claude 3.5 Sonnet: $3.00/$15.00. Claude 3 Opus: $15.00/$75.00. Cached inputs are 90% cheaper. All prices are pay-as-you-go USD (June 2026).
Prompt caching stores repeated context server-side. Cache hits cost 10% of normal input price — $0.30/M instead of $3.00/M on Sonnet. For a 2,000-token system prompt on 100K requests/month: saves ~$540/month on Sonnet. Enable via cache_control: {type: "ephemeral"} in your API call.
On raw pricing, GPT-4o ($2.50/$10.00) is cheaper than Sonnet ($3.00/$15.00). But with prompt caching on large system prompts, Sonnet's effective input cost drops to $0.30/M — cheaper than GPT-4o. For workloads with significant repeated context, Claude wins on cost.
Claude 3.5 Sonnet for most tasks — best balance of quality and cost. Claude 3.5 Haiku for high-volume simple tasks (3.75× cheaper). Claude 3 Opus for maximum capability — use sparingly. With caching, Sonnet is often the cheapest for large-context workloads.