Skip to main content

AI SaaS unit economics

How to Estimate AI SaaS API Costs and Gross Margin

AI SaaS pricing needs a usage model, not a single model rate. Tie inference cost to active behavior, add variable infrastructure, and compare the result with revenue per paying account.

Formula-driven examplesSource-linked pricing snapshots

Build a per-active-user cost

Define requests per active user, input and output tokens per request, active days, and the mix of models or modalities. Segment power users because averages can hide a loss-making tail.

Apply the same assumptions to free trials, paid tiers, and internal usage. Rate limits and included credits can make variable cost more predictable.

gross margin = (revenue − variable API and infrastructure cost) ÷ revenue

Stress-test the assumptions

Run a base case, high-usage case, and provider-price-change case. Include retries, failed generations, background jobs, evaluation traffic, and customer support usage.

Worked example

GPT-4.1 mini: 1M input tokens + 1M output tokens

Using the versioned rates below, this example workload is estimated at $2.00. This isolates provider usage only and does not include taxes, regional premiums, retries, storage, network traffic, or unrelated infrastructure.

Current pricing references

These versioned records support the examples above. Check the date and provider source before using them in a production forecast.

Provider / modelInput or unitOutputStatusSource

OpenAI

GPT-4.1 mini

$0.40 per 1M tokens$1.60 / 1MVerifiedOpenAI model pricing

Checked Jun 21, 2026

Google Gemini

Gemini 2.5 Flash

$0.30 per 1M tokens$2.50 / 1MVerifiedGoogle Gemini API pricing

Checked Jun 21, 2026

Anthropic

Claude Sonnet 4.6

$3.00 per 1M tokens$15.00 / 1MVerifiedAnthropic Claude API pricing

Checked Jun 21, 2026

Frequently asked questions

Should AI API cost be part of cost of goods sold?

Usage-driven inference and related infrastructure are commonly treated as variable service-delivery costs for unit-economics analysis. Confirm accounting treatment with a qualified professional.

How do I control heavy-user cost?

Use transparent quotas, credits, rate limits, model routing, caching, and tier-specific limits informed by measured usage distributions.

Related calculators and guides