Estimate DeepSeek V3 and R1 API costs instantly. DeepSeek V3 at $0.27/1M input tokens is one of the cheapest frontier models — 9× less than GPT-4o. Free, no signup.
Includes context cache hit savings — enter your system prompt size to model real costs
DeepSeek API rates in USD per million tokens, June 2026. Accessed via api.deepseek.com.
| Model | Input / 1M | Cache Hit / 1M | Output / 1M | Context | Best for |
|---|---|---|---|---|---|
| DeepSeek-V3CHEAPEST | $0.27 | $0.07 | $1.10 | 64K | General tasks, coding, summarization |
| DeepSeek-R1 | $0.55 | $0.14 | $2.19 | 64K | Reasoning, math, complex analysis |
DeepSeek's context cache (cache hit) prices repeated tokens at ~74% below standard input price. For agentic workloads with large system prompts or RAG contexts, this compounds into massive savings at scale.
Cost comparison at 50,000 requests/month with 2,000 input + 500 output tokens.
| Model | Input / 1M | Output / 1M | Monthly cost | vs DeepSeek V3 |
|---|---|---|---|---|
| DeepSeek-V3CHEAPEST | $0.27 | $1.10 | $54.50 | — |
| DeepSeek-R1 | $0.55 | $2.19 | $109.75 | 2× more |
| Gemini 2.0 Flash | $0.10 | $0.40 | $20.00 | 0.37× less |
| GPT-4o mini | $0.15 | $0.60 | $30.00 | 0.55× less |
| Claude Haiku 4.5 | $0.80 | $4.00 | $180.00 | 3.3× more |
| GPT-4o | $2.50 | $10.00 | $625.00 | 11.5× more |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $875.00 | 16× more |
DeepSeek-V3 is highly capable but may underperform on nuanced instruction following, long-form creative tasks, and complex multi-step reasoning compared to GPT-4o or Claude Sonnet. For cost-sensitive, high-volume workloads (classification, extraction, summarization), V3 is an excellent choice.
DeepSeek-V3: $0.27/1M input, $1.10/1M output. Cache hits: $0.07/1M. DeepSeek-R1: $0.55/1M input, $2.19/1M output. Cache hits: $0.14/1M. All prices are pay-as-you-go USD via api.deepseek.com (June 2026).
Yes — DeepSeek-V3 is about 9× cheaper than GPT-4o on both input ($0.27 vs $2.50/1M) and output ($1.10 vs $10.00/1M). For 50,000 requests/month with 2K input + 500 output tokens: DeepSeek costs ~$55/month vs GPT-4o's ~$625/month. However, Gemini 2.0 Flash ($0.10/$0.40) is cheaper than V3 for simple tasks.
DeepSeek caches repeated context (system prompts, documents) server-side. Cache hits are billed at $0.07/1M for V3 — 74% off the standard $0.27/1M input rate. Enable by structuring your API calls to put cacheable content at the start of your context. This is especially effective for RAG pipelines where the same documents are retrieved repeatedly.
Yes. DeepSeek-V3 and R1 are open-weight models available on Hugging Face. You can self-host on GPU instances — a single H100 (~$3/hr on Lambda Labs) can serve V3 at roughly 15–20 tokens/second for low-concurrency workloads. At 50M+ tokens/month, self-hosting typically breaks even vs the API. Below that, the managed API is usually more cost-effective.