
The Real Cost of Building an AI SaaS in 2026 — A Complete Breakdown

June 16, 2026 · 14 min read · APICalculators Blog

Everyone talks about AI SaaS potential. Few talk about what it actually costs to run one. This guide cuts through the hype with real monthly dollar figures across every major infrastructure layer — LLM APIs, vector databases, authentication, hosting, payment processing, and serverless functions — at three production scales: Startup (1,000 users), Growth (10,000 users), and Scale (100,000 users).

All prices are sourced from official provider documentation as of June 2026. We use a consistent AI SaaS model throughout: a product where users send 5 queries per active day, each requiring an LLM call (500 prompt tokens + 300 output tokens), vector similarity search over a growing knowledge base, and occasional background serverless processing. Revenue model assumes 3% conversion to paid plans at $29/month average.

TL;DR: A lean AI SaaS at 1,000 users costs $51–$143/month. At 10,000 users: $432–$1,313/month. At 100,000 users: $4,221–$14,488/month. The roughly 3x multiplier between budget and premium stacks comes almost entirely from three decisions: model choice, auth provider, and vector database deployment model.

1. LLM API Costs

LLM API spend is the largest and most variable line item in any AI SaaS. Between the two models compared here, the input-token price differs by 15x and the monthly bill for this workload by about 12.6x. At scale, this single decision determines whether your infrastructure is profitable or catastrophic.

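For our model, monthly queries = daily active users × 5 queries per day × 30 days, and each query uses 500 prompt tokens plus 300 output tokens. The table assumes 10% of registered users are active on any given day.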

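| Scale | Active Users/Day | Monthly Queries | GPT-5.4 nano ($0.20 in / $1.25 out per 1M tokens) | Claude Sonnet 4.6 ($3.00 in / $15.00 out per 1M tokens) |
|---|---|---|---|---|
| Startup (1K) | 100 | 15,000 | $7/mo | $90/mo |
| Growth (10K) | 1,000 | 150,000 | $71/mo | $900/mo |
| Scale (100K) | 10,000 | 1,500,000 | $713/mo | $9,000/mo |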

The math: GPT-5.4 nano at Scale = 1.5M queries × 500 prompt tokens / 1M × $0.20 + 1.5M × 300 output tokens / 1M × $1.25 = $150 + $562.50 = $712.50/month. Claude Sonnet 4.6 = $2,250 + $6,750 = $9,000/month.
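The same arithmetic can be written as a small function. This is a minimal sketch that uses only the workload assumptions and per-token prices quoted in this section; the function name and parameter names are illustrative, not part of any provider SDK.

```python
def monthly_llm_cost(active_users_per_day: int,
                     input_price_per_m: float,
                     output_price_per_m: float,
                     queries_per_day: int = 5,
                     days: int = 30,
                     prompt_tokens: int = 500,
                     output_tokens: int = 300) -> float:
    """Monthly LLM spend in dollars for the article's workload model."""
    queries = active_users_per_day * queries_per_day * days
    # Prices are quoted per 1M tokens, so convert token counts to millions.
    input_cost = queries * prompt_tokens / 1_000_000 * input_price_per_m
    output_cost = queries * output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Scale tier: 10,000 daily active users
nano_scale = monthly_llm_cost(10_000, 0.20, 1.25)
sonnet_scale = monthly_llm_cost(10_000, 3.00, 15.00)
print(f"nano: ${nano_scale:,.2f}  sonnet: ${sonnet_scale:,.2f}")
# nano: $712.50  sonnet: $9,000.00
```

Changing `prompt_tokens` or `output_tokens` shows how sensitive the bill is to output length, since output tokens are priced several times higher than input tokens for both models.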

The smart approach: Route 80% of queries to GPT-5.4 nano (simple retrieval, formatting, FAQ answers) and 20% to Sonnet 4.6 (complex reasoning, code generation, analysis). At Scale, this mix costs approximately $2,370/month (0.8 × $712.50 + 0.2 × $9,000), about 74% cheaper than all-Sonnet, with minimal quality impact on routine queries.
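To experiment with other routing splits, a blended cost can be sketched as a weighted sum of each model's all-in monthly cost. The sketch below assumes every query has the same token profile regardless of which model serves it, which is a simplification; the function and key names are illustrative.

```python
def blended_cost(shares: dict[str, float], all_in_cost: dict[str, float]) -> float:
    """Expected monthly cost when each model handles a fixed share of queries."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * all_in_cost[model] for model, share in shares.items())


# Monthly cost at Scale if 100% of queries went to one model (from the table above)
all_in = {"nano": 712.50, "sonnet": 9000.00}

mixed = blended_cost({"nano": 0.80, "sonnet": 0.20}, all_in)
savings_vs_sonnet = 1 - mixed / all_in["sonnet"]
print(f"${mixed:,.2f}/month, {savings_vs_sonnet:.0%} below all-Sonnet")
```

In practice the queries routed to the larger model tend to be longer, so a real blended figure would likely sit somewhat above this estimate.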

Use the LLM API cost calculator to model your specific token ratios and model mix.

2. Vector Database Costs

Vector databases store and query your AI's knowledge base. Cost grows with vector count (corpus size) and query volume (user activity). We estimate corpus size scaling with user count: 1K users → 1M vectors, 10K users → 5M vectors, 100K users → 50M vectors (at 1,536 dimensions, OpenAI text-embedding-3-small format).

| Scale | Vectors | Pinecone Serverless | Qdrant Self-Hosted (Hetzner) |
|---|---|---|---|
| Startup (1K) | 1M | ~$4/mo | $5/mo (CX22) |
| Growth (10K) | 5M | ~$15/mo | $9/mo (CX32) |
| Scale (100K) | 50M | ~$100/mo | $20/mo (CX42) |

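Pinecone Serverless charges $0.33/GB/month for storage (1M vectors at 1,536 dimensions ≈ 6GB, or $1.98) plus $0.16 per million read units. At low query volumes, Pinecone is cost-competitive. At Scale, storage for 50M vectors (roughly 300GB by the same arithmetic) accounts for most of the ~$100/month, with read charges for 1.5M queries/month on top. Self-hosted Qdrant on Hetzner's CX42 ($19.90/month, 8 vCPU, 16GB RAM) has zero per-query fees, but a raw float32 index of that size exceeds 16GB of RAM, so the 50M-vector figure assumes quantization and on-disk vector storage.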
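The storage side of that estimate is easy to reproduce. The following sketch assumes 4-byte float32 values and decimal gigabytes, and it ignores metadata, index overhead, and read units, so treat it as a lower bound rather than a quote; the function name is illustrative.

```python
def vector_storage(num_vectors: int,
                   dims: int = 1536,
                   bytes_per_value: int = 4,
                   price_per_gb_month: float = 0.33) -> tuple[float, float]:
    """Return (raw size in GB, monthly storage cost in dollars)."""
    size_gb = num_vectors * dims * bytes_per_value / 1e9
    return size_gb, size_gb * price_per_gb_month


for label, n in [("Startup", 1_000_000), ("Growth", 5_000_000), ("Scale", 50_000_000)]:
    size_gb, cost = vector_storage(n)
    print(f"{label:8s} {size_gb:7.1f} GB  ${cost:7.2f}/month storage")
```

Passing `bytes_per_value=1` approximates 8-bit scalar quantization and shows why quantization matters for fitting a large corpus on a small self-hosted node.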

Compare options in detail at the vector database cost calculator.

3. Authentication Costs

Authentication is the most common source of infrastructure cost surprise. Clerk's free tier is generous at 10,000 MAU — but the jump at scale is dramatic. Supabase Auth's free tier covers 50,000 MAU, and the Pro plan covers 100,000 MAU for a flat $25/month.

| Scale | MAU | Clerk | Supabase Auth |
|---|---|---|---|
| Startup (1K) | 1,000 | $0 (free <10K) | $0 (free <50K) |
| Growth (10K) | 10,000 | $25/mo (Pro, 10K included) | $0 (free <50K) |
| Scale (100K) | 100,000 | $1,825/mo ($25 + 90K × $0.02) | $25/mo (Pro) |

At 100,000 MAU, Supabase Auth is 73x cheaper than Clerk. The trade-off: Clerk provides polished pre-built UI components for React/Next.js that can save 2–3 days of development. If you are building with a custom UI or using Supabase for your database anyway, Supabase Auth is an obvious choice. If you need Clerk's component library, start on Clerk and plan a migration before 10K MAU to avoid the scaling cliff.
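To see where the cliff starts for your own growth curve, the two pricing models can be sketched as step functions. These functions encode only the tiers as modeled in the table above (free thresholds, $25 base, $0.02 per additional MAU); the function names are hypothetical and real plans may include limits not captured here.

```python
def clerk_monthly_cost(mau: int) -> float:
    """Clerk as modeled here: free below 10K MAU, then $25 Pro with
    10K MAU included plus $0.02 per additional MAU."""
    if mau < 10_000:
        return 0.0
    return 25.0 + max(0, mau - 10_000) * 0.02


def supabase_auth_monthly_cost(mau: int) -> float:
    """Supabase Auth as modeled here: free below 50K MAU, $25 Pro up to 100K."""
    if mau < 50_000:
        return 0.0
    if mau <= 100_000:
        return 25.0
    raise ValueError("this article does not model pricing above 100K MAU")


for mau in (1_000, 10_000, 25_000, 50_000, 100_000):
    print(f"{mau:>7,} MAU  Clerk ${clerk_monthly_cost(mau):>8,.2f}  "
          f"Supabase ${supabase_auth_monthly_cost(mau):>6,.2f}")
```

The intermediate rows (25K and 50K MAU) are the useful ones for planning, since they show how quickly the per-user charge accumulates between the table's Growth and Scale tiers.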

Model your MAU trajectory with the auth provider cost calculator.

4. Hosting Costs

For an AI SaaS serving HTTP requests, a VPS or container platform is standard. We compare Hetzner Cloud (European provider with strong price/performance) against AWS EC2 (the default enterprise choice). Note: if you use Vercel or similar platforms for serverless hosting, your hosting cost may be $0 at low scale — see Section 6.

| Scale | Hetzner Cloud | AWS EC2 (equiv.) |
|---|---|---|
| Startup (1K) | $4.50/mo (CX22, 2 vCPU, 4GB) | $15/mo (t3.small) |
| Growth (10K) | $8.80/mo (CX32, 4 vCPU, 8GB) | $30/mo (t3.medium) |
| Scale (100K) | $19.90/mo (CX42, 8 vCPU, 16GB) | $120/mo (t3.xlarge) |

Hetzner delivers 3–6x better price/performance than AWS EC2 for web serving workloads. The trade-off is geography (primary data centers in Germany and Finland, with US locations available) and the absence of managed services like RDS, ElastiCache, and EKS. For an early-stage AI SaaS where a single developer manages infrastructure, Hetzner is the default recommendation.

Compare providers side by side at the cloud VPS comparison calculator.

5. Payment Processing Costs

Payment processor fees scale directly with revenue. We calculate against our modeled revenue: 3% of users converting to $29/month paid plans. At Startup: 30 paying users × $29 = $870 MRR. At Growth: 300 paying × $29 = $8,700 MRR. At Scale: 3,000 paying × $29 = $87,000 MRR.

| Scale | Revenue (MRR) | Stripe (2.9% + $0.30) | Paddle (5.0% + $0.50) |
|---|---|---|---|
| Startup (1K) | $870 | $34/mo | $58/mo |
| Growth (10K) | $8,700 | $342/mo | $585/mo |
| Scale (100K) | $87,000 | $3,423/mo | $5,850/mo |

Stripe is cheaper in pure fee terms. Paddle's premium of 2.1 percentage points plus $0.20 per transaction buys you global VAT/GST compliance as a merchant of record, which is worth it if you have EU, Australian, or Canadian customers. At $87,000 MRR, the Paddle premium is $2,427/month ($5,850 − $3,423). A dedicated tax compliance solution (Stripe Tax + Avalara) for the same markets runs $500–$1,500/month, narrowing the gap. For a global consumer AI SaaS at Scale, the effective Paddle vs Stripe difference may be as little as $1,000/month in real total cost.
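The fee figures follow from the revenue model plus each processor's percentage and fixed charge. This sketch assumes one successful charge per paying user per month and ignores refunds, disputes, and currency conversion; the function names are illustrative.

```python
def paying_users(total_users: int, conversion_pct: int = 3) -> int:
    """Paying users under the article's 3% conversion assumption."""
    return total_users * conversion_pct // 100


def processor_fee(payers: int, price: float, pct: float, fixed: float) -> float:
    """Monthly fees: percentage of revenue plus a fixed charge per transaction."""
    revenue = payers * price
    return revenue * pct + payers * fixed


for users in (1_000, 10_000, 100_000):
    payers = paying_users(users)
    stripe = processor_fee(payers, 29.0, 0.029, 0.30)
    paddle = processor_fee(payers, 29.0, 0.050, 0.50)
    print(f"{users:>7,} users  MRR ${payers * 29:>6,}  "
          f"Stripe ${stripe:>8,.2f}  Paddle ${paddle:>8,.2f}  gap ${paddle - stripe:>8,.2f}")
```

Because the fixed charge is per transaction, the gap narrows in relative terms at higher price points and widens for low-priced plans.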

Run the numbers for your specific transaction count and geography at the payment processor fee calculator.

6. Serverless Function Costs

AI SaaS applications commonly use serverless functions for background jobs: embedding generation, webhook processing, scheduled re-indexing, and async LLM calls. We model 10 invocations per active user per day at 200ms average duration on 256MB memory.

| Scale | Monthly Invocations | AWS Lambda | Cloudflare Workers |
|---|---|---|---|
| Startup (1K) | 30K | <$1/mo (free tier) | $0 (free tier) |
| Growth (10K) | 300K | ~$1/mo | $0 (free tier) |
| Scale (100K) | 3M | ~$20/mo | $5/mo (Paid plan) |

Serverless costs are negligible relative to LLM and auth spend at every scale in this model. The main optimization is using Cloudflare Workers at Scale for workloads that fit within its CPU limits — at 3M invocations/month, you stay well within the Paid plan's 10M included requests at $5/month. AWS Lambda remains preferable for longer-running jobs (over 30 seconds) or those requiring more than 128MB memory.
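The workload behind these rows can be sized with a few lines of arithmetic. The sketch below derives monthly invocations and GB-seconds (the unit Lambda uses to meter compute) from the assumptions stated in this section; it deliberately stops short of pricing, and the 10M included-requests figure is the one quoted above.

```python
WORKERS_PAID_INCLUDED_REQUESTS = 10_000_000  # as quoted in this article


def serverless_workload(active_users_per_day: int,
                        invocations_per_user_day: int = 10,
                        days: int = 30,
                        duration_s: float = 0.2,
                        memory_gb: float = 0.25) -> tuple[int, float]:
    """Return (monthly invocations, monthly GB-seconds of compute)."""
    invocations = active_users_per_day * invocations_per_user_day * days
    gb_seconds = invocations * duration_s * memory_gb
    return invocations, gb_seconds


for label, dau in [("Startup", 100), ("Growth", 1_000), ("Scale", 10_000)]:
    invocations, gb_seconds = serverless_workload(dau)
    within_plan = invocations <= WORKERS_PAID_INCLUDED_REQUESTS
    print(f"{label:8s} {invocations:>9,} invocations  {gb_seconds:>9,.0f} GB-s  "
          f"fits Workers Paid allowance: {within_plan}")
```

Multiplying the two outputs by your provider's current per-request and per-GB-second rates gives a bill estimate for any duration and memory setting.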

Model your exact configuration at the serverless cost calculator.

7. Total Monthly Cost Summary

Two stacks: Budget uses GPT-5.4 nano, Qdrant self-hosted on Hetzner, Supabase Auth, Hetzner VPS, Stripe, and AWS Lambda. Premium uses Claude Sonnet 4.6, Pinecone Serverless, Clerk, AWS EC2, Stripe, and Lambda.

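| Component | Startup (1K) Budget | Startup (1K) Premium | Growth (10K) Budget | Growth (10K) Premium | Scale (100K) Budget | Scale (100K) Premium |
|---|---|---|---|---|---|---|
| LLM API | $7 | $90 | $71 | $900 | $713 | $9,000 |
| Vector DB | $5 | $4 | $9 | $15 | $20 | $100 |
| Auth | $0 | $0 | $0 | $25 | $25 | $1,825 |
| Hosting | $5 | $15 | $9 | $30 | $20 | $120 |
| Payments | $34 | $34 | $342 | $342 | $3,423 | $3,423 |
| Serverless | <$1 | <$1 | $1 | $1 | $20 | $20 |
| TOTAL / MONTH | $51 | $143 | $432 | $1,313 | $4,221 | $14,488 |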

Note: payments are not a pure cost; they are a percentage of revenue. At Scale, $3,423 in Stripe fees comes out of $87,000 MRR. Net infrastructure cost (excluding payment processing) is $798/month on the budget stack at 100K users, approximately 0.9% of MRR. That is an extremely healthy unit-economics profile.
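A stack total and its share of revenue can be checked by summing the component rows directly. The sketch below uses the Scale (100K) budget column and the $87,000 MRR from the revenue model; the dictionary keys are illustrative labels.

```python
# Scale (100K) budget stack, monthly dollars, taken from the component rows above
scale_budget = {
    "llm_api": 713,
    "vector_db": 20,
    "auth": 25,
    "hosting": 20,
    "payments": 3423,
    "serverless": 20,
}
mrr = 87_000

total = sum(scale_budget.values())
# Payment fees scale with revenue, so report infrastructure separately.
net_infra = total - scale_budget["payments"]
infra_share_of_mrr = net_infra / mrr

print(f"total: ${total:,}/month")
print(f"net infrastructure: ${net_infra:,}/month ({infra_share_of_mrr:.1%} of MRR)")
```

Swapping in another column's values gives the same breakdown for any of the six scenarios.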

8. How to Cut This Bill in Half

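The total cost difference between budget and premium stacks is driven by three decisions: model choice, auth provider, and vector database deployment. The first three optimizations below address those decisions; the fourth, prompt caching, applies to either stack. Ranked by impact: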

1. Model tiering: saves 60–90% on LLM costs. Audit your query types. Most AI SaaS workloads break down as: 70% retrieval augmentation (nano-class model sufficient), 20% summarization/formatting (mini-class model), 10% complex reasoning (frontier model warranted). Implement a classifier that routes to the cheapest model that can handle the task. This alone reduces LLM costs from $9,000/month to under $2,000/month at Scale.

2. Switch auth providers before you hit the wall: saves $1,800/month at Scale. The Clerk-to-Supabase migration takes 3–5 days of engineering. Do it before 10K MAU, not after 50K MAU when migration risk is higher and urgency is already creating pressure. Every month you wait past 10K MAU on Clerk costs $0.02 per additional user.

3. Self-host your vector database: saves $80–$200/month. Qdrant on a $20/month Hetzner CX42 handles 50M vectors. The operational overhead is one Docker Compose file and a weekly backup cron job. If your team already manages any containerized services, this is a 2-hour setup with an 80% cost reduction versus Pinecone at Scale ($20 versus ~$100/month).

4. Enable prompt caching for repeated context: saves 40–90% on cached tokens. If your system prompt is large and your app processes many sequential messages per user session, Anthropic's prompt caching (cache reads billed at a 90% discount after the first call) and OpenAI's analogous feature can dramatically reduce per-session LLM costs. Check each provider's minimum cacheable prompt length first: it has typically been around 1,024 tokens, so a 500-token prompt like the one in this model may not qualify. For chatbot-style applications with multi-turn conversations, this is often the highest-ROI LLM optimization after model tiering.
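The routing classifier from the first optimization can start as a few keyword rules before you invest in anything learned. The sketch below is a deliberately naive illustration: the tier names, keyword sets, and length threshold are all assumptions made up for this example, and a production router would more likely be a small model trained on labeled traffic.

```python
import re

# Hypothetical keyword sets; tune these against your own query logs.
REASONING_WORDS = {"why", "prove", "debug", "analyze", "compare", "refactor", "implement"}
SUMMARY_WORDS = {"summarize", "rewrite", "reformat", "translate", "shorten"}
LONG_QUERY_WORDS = 150  # assumed cutoff for "probably complex"


def pick_tier(query: str) -> str:
    """Return the cheapest tier ('nano', 'mini', 'frontier') likely to suffice."""
    words = re.findall(r"[a-z']+", query.lower())
    vocab = set(words)
    # Match whole words so "information" does not trigger on "format".
    if vocab & REASONING_WORDS or len(words) > LONG_QUERY_WORDS:
        return "frontier"
    if vocab & SUMMARY_WORDS:
        return "mini"
    return "nano"


for q in ("What is your refund policy?",
          "Summarize this support thread for me",
          "Debug this stack trace and explain the root cause"):
    print(f"{pick_tier(q):8s} <- {q}")
```

Logging the chosen tier alongside user feedback makes it possible to measure how often the cheap path was good enough, which is the number that justifies tightening or loosening the rules.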

Frequently Asked Questions

How much does it cost to build an AI SaaS at 1,000 users?

A lean AI SaaS at 1,000 users costs approximately $51–$143 per month for infrastructure. The range depends on model choice: GPT-5.4 nano ($0.20/1M input tokens) keeps LLM costs under $10/month at this scale, while Claude Sonnet 4.6 ($3.00/1M input tokens) pushes LLM costs to $90/month. Hosting (Hetzner $4.50), auth (Supabase Auth free under 50K MAU), and serverless (Lambda, effectively free at low volume) add minimal cost.

What is the biggest cost driver for AI SaaS applications?

LLM API costs are the largest infrastructure line item at every scale once payment fees are set aside, and on the premium stack they represent 62–69% of total monthly spend. The model you choose drives this line: GPT-5.4 nano at $0.20/1M input tokens versus Claude Sonnet 4.6 at $3.00/1M input tokens is a 15x difference in input price. At 100,000 users, the LLM line item alone ranges from $713/month to $9,000/month. Auth costs become the second-largest surprise at scale, specifically Clerk at $1,825/month for 100K MAU.

Pinecone vs self-hosted Qdrant — which is cheaper?

Self-hosted Qdrant on Hetzner is cheaper from Growth scale onward; at Startup scale, Pinecone Serverless (~$4/month) is marginally cheaper than a $5/month CX22. At 100K users (approximately 50M vectors), Pinecone Serverless costs around $100/month in storage and query charges. Self-hosted Qdrant on a Hetzner CX42 ($19.90/month) handles the same workload with no per-query fees. The trade-off is operational complexity: your team manages upgrades, backups, and scaling.

Should I use Clerk or Supabase Auth for my AI SaaS?

Supabase Auth wins on cost at scale. At 100,000 MAU, Clerk costs $1,825/month ($25 + 90K × $0.02). Supabase Auth Pro costs $25/month and includes up to 100,000 MAU — 73x cheaper. Choose Clerk only if you need its pre-built React/Next.js UI components and plan to stay under 10K MAU long-term.

Stripe vs Paddle for an AI SaaS — which should I choose?

For US-only or B2B sales, Stripe (2.9% + $0.30) is cheaper than Paddle (5.0% + $0.50). For global consumer sales where VAT/GST compliance matters, Paddle as a merchant of record handles international tax automatically, which is worth the 2.1-point premium when you factor in tax software and accountant time. At $10,000 MRR on $29 plans, Paddle costs roughly $280 more per month than Stripe.

How can I cut my AI SaaS infrastructure bill in half?

The four highest-ROI optimizations: (1) Route 80% of queries to nano/mini models — saves 60–90% on LLM costs. (2) Switch to Supabase Auth before hitting 10K MAU — saves $1,800/month at 100K users. (3) Self-host Qdrant on Hetzner instead of Pinecone — saves $80–$200/month. (4) Enable prompt caching for repeated system prompts — saves 40–90% on cached tokens. Combined, these reduce total infrastructure cost by 50–70%.