AI Agency Client Project Cost Estimate

Agency estimates should separate one-time prototype traffic from recurring production usage and leave room for evaluation, monitoring, retries, and support.

Default workload assumptions

These values make the example reproducible. They are planning assumptions, not measured usage from your application.

Prototype requests: 1,000
Production requests per month: 20,000
Input tokens per request: 1,500
Output tokens per request: 500
Example model: Claude Sonnet 4.6
Maintenance reserve: 10% of production model cost
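
As a rough illustration, the assumptions above can be turned into token volumes for each phase. This is a minimal sketch; the `Workload` class and `token_volume` helper are hypothetical names introduced here, not part of any TokenMath tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Planning assumptions copied from the list above (hypothetical container)."""
    prototype_requests: int = 1_000
    production_requests_per_month: int = 20_000
    input_tokens_per_request: int = 1_500
    output_tokens_per_request: int = 500


def token_volume(requests: int, w: Workload) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) for the given number of requests."""
    return (
        requests * w.input_tokens_per_request,
        requests * w.output_tokens_per_request,
    )


w = Workload()
prototype_tokens = token_volume(w.prototype_requests, w)
production_tokens = token_volume(w.production_requests_per_month, w)

print(prototype_tokens)   # (1500000, 500000)
print(production_tokens)  # (30000000, 10000000)
```

Under these assumptions the prototype phase uses about 2 million tokens in total, and each production month uses about 40 million.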

Calculator-style cost example

Estimate

Prototype model usage: $12.00 (1,000 one-time prototype requests)
Production model usage: $240.00 (20,000 recurring requests per month)
Maintenance usage reserve: $24.00 (10% planning assumption; not a provider fee)

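Estimated monthly cost

$276.00 (first month, including the one-time prototype usage; later months come to $264.00 under the same assumptions)

Estimated yearly cost

$3,312.00 (12 × the $276.00 first-month figure, which counts the one-time prototype usage in every month; counting it once gives $3,180.00)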

Pricing source: Claude Sonnet 4.6, Anthropic Claude API pricing. Last verified Jun 21, 2026.

The maintenance reserve is a TokenMath planning assumption.
Agency fees, hosting, storage, and support labor are excluded.

Formula

first-month estimate = prototype usage + production usage + maintenance reserve
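
The formula can be checked with a short sketch that uses the line items from the estimate. The variable names are illustrative, and the `recurring_month` figure for later months is an assumption derived here by dropping the one-time prototype term.

```python
from decimal import Decimal

# Line items taken from the estimate on this page.
prototype_usage = Decimal("12.00")     # one-time
production_usage = Decimal("240.00")   # per month
reserve_rate = Decimal("0.10")         # planning assumption, not a provider fee

maintenance_reserve = production_usage * reserve_rate

# first-month estimate = prototype usage + production usage + maintenance reserve
first_month = prototype_usage + production_usage + maintenance_reserve

# Assumption: months after the first carry no prototype traffic.
recurring_month = production_usage + maintenance_reserve

print(f"{maintenance_reserve:.2f}")  # 24.00
print(f"{first_month:.2f}")          # 276.00
print(f"{recurring_month:.2f}")      # 264.00
```

`Decimal` is used instead of floats so that currency amounts add up exactly.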

Main cost drivers

Discovery and evaluation traffic before launch
Production request volume and response length
Retries, monitoring, and client support
Additional modalities and infrastructure

Ways to reduce cost

Price prototype and production phases separately
Use recorded evaluation prompts rather than ad hoc retries
Include usage limits and change-control assumptions
Report provider cost separately from agency fees
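
One way to express a usage limit is to turn a monthly provider budget into a request cap. The sketch below assumes the per-request cost implied by the estimate ($240.00 for 20,000 requests). The $300.00 budget and the `max_requests` helper are hypothetical, chosen only for illustration.

```python
from decimal import Decimal

# Per-request cost implied by the estimate: $240.00 / 20,000 requests.
cost_per_request = Decimal("240.00") / Decimal(20_000)   # 0.012
reserve_rate = Decimal("0.10")                           # maintenance reserve


def max_requests(monthly_budget: Decimal, cost: Decimal, reserve: Decimal) -> int:
    """Largest request count whose cost plus reserve fits within the budget."""
    if cost <= 0:
        raise ValueError("cost per request must be positive")
    loaded_cost = cost * (Decimal(1) + reserve)  # cost including the reserve
    return int(monthly_budget // loaded_cost)


# Hypothetical budget, not a figure from this page.
cap = max_requests(Decimal("300.00"), cost_per_request, reserve_rate)
print(cap)  # 22727
```

A cap like this can be written into the statement of work, so that traffic above it triggers a change-control conversation instead of an unplanned bill.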

Frequently asked questions

Is the maintenance reserve a provider fee?

No. It is a visible planning assumption for extra model usage during monitoring, prompt updates, and support.

Should agency fees be included in API cost?

No. Keep provider usage, infrastructure, and professional services as separate line items so clients can see what each one costs.

Related pricing pages

OpenAI API pricing

Review source-linked OpenAI model and service rates.

Gemini API pricing

Compare Flash, Pro, image, and embedding records.

Anthropic Claude pricing

Review Haiku, Sonnet, and Opus pricing snapshots.

Related glossary terms

Input tokens

Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.

Requests per day

Requests per day is the number of billable API calls made during a day. TokenMath commonly derives it from requests per active user multiplied by active users.
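
The derivation described above can be sketched as a single multiplication. The function name and the sample inputs (5 requests per user, 200 active users) are hypothetical, not values from this page.

```python
def requests_per_day(requests_per_active_user: int, active_users: int) -> int:
    """Billable API calls per day = requests per active user x active users."""
    if requests_per_active_user < 0 or active_users < 0:
        raise ValueError("inputs must be non-negative")
    return requests_per_active_user * active_users


# Hypothetical inputs for illustration only.
daily = requests_per_day(requests_per_active_user=5, active_users=200)
print(daily)  # 1000
```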

Cost per request

Cost per request is the sum of all billable usage generated by one API call, commonly input token cost plus output token cost for a text model.
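
A minimal sketch of that sum for a text model follows. The per-million-token rates are assumptions chosen because they reproduce the $12.00 per 1,000 requests shown in the estimate; they are not rates quoted on this page, so check the provider's current pricing before relying on them.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# Assumed rates in USD per million tokens (not quoted on this page).
INPUT_RATE = Decimal("3.00")
OUTPUT_RATE = Decimal("15.00")


def cost_per_request(input_tokens: int, output_tokens: int) -> Decimal:
    """Input token cost plus output token cost for one API call."""
    input_cost = Decimal(input_tokens) * INPUT_RATE / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_RATE / TOKENS_PER_MILLION
    return input_cost + output_cost


cost = cost_per_request(input_tokens=1_500, output_tokens=500)
print(f"{cost:.4f}")          # 0.0120
print(f"{cost * 1_000:.2f}")  # 12.00
```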
