DeepSeek V3 vs R1 — which should I use?

Use DeepSeek-V3 for general tasks: coding, writing, summarization, classification. It is faster and cheaper ($0.27/$1.10 per 1M). Use DeepSeek-R1 for complex reasoning, math, and multi-step problem solving — it uses chain-of-thought reasoning and costs $0.55/$2.19 per 1M. R1 is comparable to o1-mini but at roughly 5× lower cost.

DeepSeek API Cost Calculator 2026 — DeepSeek V3 & R1 Pricing

Q: How much does the DeepSeek API cost?

DeepSeek API pricing (June 2026): DeepSeek-V3 costs $0.27/1M input and $1.10/1M output tokens. Cache hits are $0.07/1M input. DeepSeek-R1 (reasoning model) costs $0.55/1M input and $2.19/1M output. Both models are accessed via api.deepseek.com.

Q: Is DeepSeek cheaper than GPT-4o?

Yes, significantly. DeepSeek-V3 at $0.27/$1.10 per 1M tokens is about 9× cheaper on input and 9× cheaper on output compared to GPT-4o ($2.50/$10.00). For high-volume workloads like classification or extraction, DeepSeek can reduce costs by 85–90%.

Q: What is DeepSeek context caching?

DeepSeek caches frequently repeated context (system prompts, documents). Cache hits are billed at $0.07/1M for V3 and $0.14/1M for R1 — roughly 75% off normal input price. This makes DeepSeek extremely cost-effective for RAG and agentic workloads with large repeated context.

Key Takeaways

DeepSeek-V3 costs $0.27/1M input and $1.10/1M output — roughly 9× cheaper than GPT-4o on both input and output
DeepSeek-R1 (reasoning model) at $0.55/$2.19 per 1M is comparable to o1-mini but 5× cheaper
Context caching drops V3 input to $0.07/1M — making RAG and agentic workloads extremely cost-effective
DeepSeek models are open-weight — you can self-host to reduce costs further on high volume

🐋 DeepSeek API Cost Estimator

Includes context cache hit savings — enter your system prompt size to model real costs

Model

Input tokens / request

Output tokens / request

Requests / month

Cached tokens / request (system prompt / docs)

Period

Estimated cost

— per month

Input / output per 1M—

Uncached input cost—

Cache savings—

Output cost—

Total with caching—

2026 DeepSeek API Pricing

DeepSeek API rates in USD per million tokens, June 2026. Accessed via api.deepseek.com.

Model	Input / 1M	Cache Hit / 1M	Output / 1M	Context	Best for
DeepSeek-V3CHEAPEST	$0.27	$0.07	$1.10	64K	General tasks, coding, summarization
DeepSeek-R1	$0.55	$0.14	$2.19	64K	Reasoning, math, complex analysis

✦ Context Cache = 74% off input

DeepSeek's context cache (cache hit) prices repeated tokens at ~74% below standard input price. For agentic workloads with large system prompts or RAG contexts, this compounds into massive savings at scale.

DeepSeek vs GPT-4o vs Claude Sonnet

Cost comparison at 50,000 requests/month with 2,000 input + 500 output tokens.

Model	Input / 1M	Output / 1M	Monthly cost	vs DeepSeek V3
DeepSeek-V3CHEAPEST	$0.27	$1.10	$54.50	—
DeepSeek-R1	$0.55	$2.19	$109.75	2× more
Gemini 2.0 Flash	$0.10	$0.40	$20.00	0.37× less
GPT-4o mini	$0.15	$0.60	$30.00	0.55× less
Claude Haiku 4.5	$0.80	$4.00	$180.00	3.3× more
GPT-4o	$2.50	$10.00	$625.00	11.5× more
Claude Sonnet 4.6	$3.00	$15.00	$875.00	16× more

⚠ Note on quality

DeepSeek-V3 is highly capable but may underperform on nuanced instruction following, long-form creative tasks, and complex multi-step reasoning compared to GPT-4o or Claude Sonnet. For cost-sensitive, high-volume workloads (classification, extraction, summarization), V3 is an excellent choice.

DeepSeek V3 vs R1: Which to Use?

DeepSeek-V3 — $0.27/1M in

General coding and debugging
Text summarization and extraction
Classification at high volume
RAG retrieval and Q&A
Content generation and rewriting
Customer support automation

DeepSeek-R1 — $0.55/1M in

Complex mathematical reasoning
Multi-step logical inference
Algorithm design and analysis
Scientific problem solving
Financial modeling and analysis
Tasks that benefit from chain-of-thought

Frequently Asked Questions

How much does the DeepSeek API cost? +

DeepSeek-V3: $0.27/1M input, $1.10/1M output. Cache hits: $0.07/1M. DeepSeek-R1: $0.55/1M input, $2.19/1M output. Cache hits: $0.14/1M. All prices are pay-as-you-go USD via api.deepseek.com (June 2026).

Is DeepSeek cheaper than GPT-4o? +

Yes — DeepSeek-V3 is about 9× cheaper than GPT-4o on both input ($0.27 vs $2.50/1M) and output ($1.10 vs $10.00/1M). For 50,000 requests/month with 2K input + 500 output tokens: DeepSeek costs ~$55/month vs GPT-4o's ~$625/month. However, Gemini 2.0 Flash ($0.10/$0.40) is cheaper than V3 for simple tasks.

What is DeepSeek context caching? +

DeepSeek caches repeated context (system prompts, documents) server-side. Cache hits are billed at $0.07/1M for V3 — 74% off the standard $0.27/1M input rate. Enable by structuring your API calls to put cacheable content at the start of your context. This is especially effective for RAG pipelines where the same documents are retrieved repeatedly.

Can I self-host DeepSeek to reduce costs further? +

Yes. DeepSeek-V3 and R1 are open-weight models available on Hugging Face. You can self-host on GPU instances — a single H100 (~$3/hr on Lambda Labs) can serve V3 at roughly 15–20 tokens/second for low-concurrency workloads. At 50M+ tokens/month, self-hosting typically breaks even vs the API. Below that, the managed API is usually more cost-effective.

Compare Other LLM APIs

🤖

All LLM Cost Calculator

Compare DeepSeek, GPT, Claude, Gemini side by side

💬

ChatGPT API Calculator

GPT-4o, o1, GPT-4o mini pricing

🧠

Claude API Calculator

Sonnet 4.6, Haiku 4.5, Opus 4.8 pricing

✨

Gemini API Calculator

Gemini 2.0 Flash, 2.5 Pro cost estimation

📉

Reduce LLM API Costs

Caching, batching, model routing strategies

📖

LLM API Cost Guide 2026

Complete breakdown of every major model