AI SaaS unit economics

How to Estimate AI SaaS API Costs and Gross Margin

AI SaaS pricing needs a usage model, not a single model rate. Tie inference cost to active behavior, add variable infrastructure, and compare the result with revenue per paying account.

Build a per-active-user cost

Define requests per active user, input and output tokens per request, active days, and the mix of models or modalities. Segment power users because averages can hide a loss-making tail.

Apply the same assumptions to free trials, paid tiers, and internal usage. Rate limits and included credits can make variable cost more predictable.
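A minimal sketch of the per-active-user calculation described above, assuming a single model and requests counted per active day. The `UsageProfile` fields, the two example segments, and the 90/10 user split are hypothetical illustrations. The rates are the GPT-4.1 mini figures listed under the pricing references on this page.

```python
from dataclasses import dataclass

# Rates in USD per 1M tokens (GPT-4.1 mini, from the pricing references).
INPUT_RATE_PER_M = 0.40
OUTPUT_RATE_PER_M = 1.60


@dataclass
class UsageProfile:
    """Hypothetical usage assumptions for one user segment."""
    requests_per_active_day: float
    active_days_per_month: float
    input_tokens_per_request: float
    output_tokens_per_request: float


def monthly_cost_per_user(profile: UsageProfile) -> float:
    """Monthly provider cost in USD for one active user in this segment."""
    requests = profile.requests_per_active_day * profile.active_days_per_month
    input_cost = requests * profile.input_tokens_per_request / 1_000_000 * INPUT_RATE_PER_M
    output_cost = requests * profile.output_tokens_per_request / 1_000_000 * OUTPUT_RATE_PER_M
    return input_cost + output_cost


# Assumed segments: most users are light, a small share are heavy.
typical = UsageProfile(20, 12, 1_500, 400)
power = UsageProfile(150, 25, 3_000, 800)

typical_cost = monthly_cost_per_user(typical)  # about $0.30
power_cost = monthly_cost_per_user(power)      # about $9.30

# A blended average (assumed 90% typical, 10% power) hides the expensive tail.
blended_cost = 0.9 * typical_cost + 0.1 * power_cost  # about $1.20

print(f"typical ${typical_cost:.2f}  power ${power_cost:.2f}  blended ${blended_cost:.2f}")
```

With these assumed numbers the blended figure looks cheap, while a power user costs roughly thirty times as much as a typical one. That gap is why the segments are priced separately here.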

gross margin = (revenue − variable API and infrastructure cost) ÷ revenue
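The formula can be sketched as a small function. The revenue and cost figures below are hypothetical placeholders for one paying account per month, not measured values.

```python
def gross_margin(revenue: float, api_cost: float, infra_cost: float) -> float:
    """Gross margin as a fraction of revenue: (revenue - variable cost) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive to compute a margin")
    return (revenue - api_cost - infra_cost) / revenue


# Hypothetical account: $29.00 plan, $9.30 API usage, $1.50 variable infrastructure.
margin = gross_margin(revenue=29.00, api_cost=9.30, infra_cost=1.50)
print(f"gross margin: {margin:.1%}")  # roughly 63%
```

The function rejects zero or negative revenue, so free-trial and internal usage are better tracked as a cost total than as a margin.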

Stress-test the assumptions

Run a base case, high-usage case, and provider-price-change case. Include retries, failed generations, background jobs, evaluation traffic, and customer support usage.
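One way to sketch the three cases is to scale a base cost by usage, price, and overhead multipliers. All numbers below are assumptions chosen for illustration: the $4.00 base cost, the 3x usage case, the 25% price increase, and the 15% overhead standing in for retries, failed generations, background jobs, evaluation traffic, and support usage.

```python
REVENUE_PER_ACCOUNT = 29.00  # hypothetical monthly revenue in USD
BASE_COST = 4.00             # hypothetical monthly variable cost in USD


def scenario_cost(base_cost: float,
                  usage_multiplier: float = 1.0,
                  price_multiplier: float = 1.0,
                  overhead_rate: float = 0.0) -> float:
    """Scale a base cost by usage, provider price, and non-billable-to-customer overhead."""
    return base_cost * usage_multiplier * price_multiplier * (1 + overhead_rate)


# Assumed scenarios; overhead is applied to every case.
scenarios = {
    "base": {"overhead_rate": 0.15},
    "high_usage": {"usage_multiplier": 3.0, "overhead_rate": 0.15},
    "price_increase": {"price_multiplier": 1.25, "overhead_rate": 0.15},
}

margins = {}
for name, params in scenarios.items():
    cost = scenario_cost(BASE_COST, **params)
    margins[name] = (REVENUE_PER_ACCOUNT - cost) / REVENUE_PER_ACCOUNT
    print(f"{name:15s} cost ${cost:6.2f}  margin {margins[name]:.1%}")
```

Under these assumptions the high-usage case erodes margin far more than the price change, which suggests usage limits matter more than provider choice for this hypothetical account.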

Worked example

GPT-4.1 mini: 1M input tokens + 1M output tokens

Using the versioned rates below, this example workload is estimated at $2.00: 1M input tokens at $0.40 per 1M plus 1M output tokens at $1.60 per 1M. The figure covers provider usage only and excludes taxes, regional premiums, retries, storage, network traffic, and unrelated infrastructure.
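The same arithmetic can be applied to every model in the pricing references. This is a sketch: the `RATES` dictionary and the `usage_cost` helper are illustrative names, and the rates are copied from the table on this page as checked on the date shown there.

```python
# USD per 1M tokens as (input rate, output rate), from the pricing references.
RATES = {
    "GPT-4.1 mini": (0.40, 1.60),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def usage_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Provider usage cost in USD for a given token volume on one model."""
    input_rate, output_rate = RATES[model]
    return input_tokens / 1_000_000 * input_rate + output_tokens / 1_000_000 * output_rate


# Workload from the worked example: 1M input tokens + 1M output tokens.
costs = {model: usage_cost(model, 1_000_000, 1_000_000) for model in RATES}

for model, cost in costs.items():
    print(f"{model:20s} ${cost:.2f}")
```

As with the worked example, these totals cover provider usage only.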

Current pricing references

These versioned records support the examples above. Check the date and provider source before using them in a production forecast.

Provider	Model	Input (per 1M tokens)	Output (per 1M tokens)	Status	Source	Checked
OpenAI	GPT-4.1 mini	$0.40	$1.60	Verified	OpenAI model pricing	Jun 21, 2026
Google Gemini	Gemini 2.5 Flash	$0.30	$2.50	Verified	Google Gemini API pricing	Jun 21, 2026
Anthropic	Claude Sonnet 4.6	$3.00	$15.00	Verified	Anthropic Claude API pricing	Jun 21, 2026

Frequently asked questions

Should AI API cost be part of cost of goods sold?

Usage-driven inference and related infrastructure are commonly treated as variable service-delivery costs for unit-economics analysis. Confirm accounting treatment with a qualified professional.

How do I control heavy-user cost?

Use transparent quotas, credits, rate limits, model routing, caching, and tier-specific limits informed by measured usage distributions.
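As one illustration of a credit-based limit, the sketch below tracks spend against a monthly allowance and refuses requests that would exceed it. The `CreditMeter` class and the dollar amounts are hypothetical; a production system would also need persistence, monthly resets, and concurrency control.

```python
from dataclasses import dataclass


@dataclass
class CreditMeter:
    """Hypothetical per-account meter for a monthly usage allowance in USD."""
    monthly_credit_usd: float
    spent_usd: float = 0.0

    def try_charge(self, cost_usd: float) -> bool:
        """Record the cost and return True if it fits the remaining credit."""
        if self.spent_usd + cost_usd > self.monthly_credit_usd:
            return False  # over quota: caller can block, queue, or route to a cheaper model
        self.spent_usd += cost_usd
        return True


meter = CreditMeter(monthly_credit_usd=1.00)
first = meter.try_charge(0.60)   # fits within the $1.00 allowance
second = meter.try_charge(0.60)  # would bring spend to $1.20, so it is refused
print(first, second, meter.spent_usd)
```

A refused charge leaves the recorded spend unchanged, so the caller decides what happens next.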

Related calculators and guides

Compare SaaS revenue with API cost

Model break-even and cost per account.

Set daily API limits

Translate budget into operational limits.

Related glossary terms

Input tokens

Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.

Output tokens

Output tokens are the tokenized units generated by a language model, including visible responses and any billable reasoning or thinking tokens defined by the provider.

Cost per request

Cost per request is the sum of all billable usage generated by one API call, commonly input token cost plus output token cost for a text model.
