Use-case cost estimate
AI Agency Client Project Cost Estimate
Agency estimates should separate one-time prototype traffic from recurring production usage and leave room for evaluation, monitoring, retries, and support.
Default workload assumptions
These values make the example reproducible. They are planning assumptions, not measured usage from your application.
- Prototype requests: 1,000
- Production requests per month: 20,000
- Input tokens per request: 1,500
- Output tokens per request: 500
- Example model: Claude Sonnet 4.6
- Maintenance reserve: 10% of production model cost
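The assumptions above translate directly into token volumes, which is useful to check before any price is applied. The following is a minimal sketch; the `Workload` class and its field names are hypothetical and only mirror the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Planning assumptions from the list above (not measured usage)."""
    prototype_requests: int = 1_000
    production_requests_per_month: int = 20_000
    input_tokens_per_request: int = 1_500
    output_tokens_per_request: int = 500

    def tokens(self, requests: int) -> tuple[int, int]:
        """Return (input_tokens, output_tokens) for a number of requests."""
        return (
            requests * self.input_tokens_per_request,
            requests * self.output_tokens_per_request,
        )


workload = Workload()
proto_in, proto_out = workload.tokens(workload.prototype_requests)
prod_in, prod_out = workload.tokens(workload.production_requests_per_month)

print(f"Prototype (one-time): {proto_in:,} input / {proto_out:,} output tokens")
print(f"Production (monthly): {prod_in:,} input / {prod_out:,} output tokens")
```

With the default values this gives 1.5 million input and 0.5 million output tokens for the prototype, and 30 million input and 10 million output tokens per production month.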
Calculator-style cost example
Estimate
- Prototype model usage: $12.00 (1,000 one-time prototype requests)
- Production model usage: $240.00 (20,000 recurring requests per month)
- Maintenance usage reserve: $24.00 (10% planning assumption; not a provider fee)
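Estimated monthly cost: $276.00 (the sum of the three line items above, including the one-time prototype usage)
Estimated yearly cost: $3,312.00 (12 × the monthly estimate)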
Pricing source: Anthropic Claude API pricing for Claude Sonnet 4.6, last verified Jun 21, 2026.
- The maintenance reserve is a TokenMath planning assumption.
- Agency fees, hosting, storage, and support labor are excluded.
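The estimate can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the per-million-token rates ($3.00 input, $15.00 output) are assumptions that do not appear on this page and were chosen because they reproduce its $12.00 and $240.00 line items. Confirm current rates against the provider's price list before quoting a client. The function name `model_cost` is hypothetical.

```python
from decimal import Decimal

# ASSUMED rates in USD per million tokens; not stated on this page.
INPUT_PRICE_PER_MTOK = Decimal("3.00")
OUTPUT_PRICE_PER_MTOK = Decimal("15.00")
MILLION = Decimal(1_000_000)


def model_cost(requests: int, input_tokens: int, output_tokens: int) -> Decimal:
    """Model usage cost in USD for a number of identical requests."""
    per_request = (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / MILLION
    return requests * per_request


prototype = model_cost(1_000, 1_500, 500)      # one-time prototype traffic
production = model_cost(20_000, 1_500, 500)    # recurring, per month
reserve = production * Decimal("0.10")         # planning assumption, not a fee

# The page's monthly total sums all three line items,
# including the one-time prototype line.
monthly = prototype + production + reserve
yearly = monthly * 12

for label, value in [
    ("Prototype model usage", prototype),
    ("Production model usage", production),
    ("Maintenance usage reserve", reserve),
    ("Estimated monthly cost", monthly),
    ("Estimated yearly cost", yearly),
]:
    print(f"{label:<28} ${value:,.2f}")
```

Run as written, the sketch prints $12.00, $240.00, $24.00, $276.00, and $3,312.00. Using `Decimal` rather than floats keeps the currency arithmetic exact.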
Formula
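One way to write the estimate as a formula, consistent with the figures in the calculator example, is shown below. The symbols are assumptions introduced for illustration: $T_{\text{in}}$ and $T_{\text{out}}$ are input and output tokens per request, $p_{\text{in}}$ and $p_{\text{out}}$ are the provider's prices per million tokens, $N_{\text{proto}}$ and $N_{\text{prod}}$ are the prototype and monthly production request counts, and $r$ is the maintenance reserve rate (0.10 here).

```latex
\begin{align*}
c_{\text{req}}     &= \frac{T_{\text{in}}\, p_{\text{in}} + T_{\text{out}}\, p_{\text{out}}}{10^{6}} \\
C_{\text{proto}}   &= N_{\text{proto}}\, c_{\text{req}} \\
C_{\text{prod}}    &= N_{\text{prod}}\, c_{\text{req}} \\
C_{\text{reserve}} &= r\, C_{\text{prod}} \\
C_{\text{month}}   &= C_{\text{proto}} + C_{\text{prod}} + C_{\text{reserve}} \\
C_{\text{year}}    &= 12\, C_{\text{month}}
\end{align*}
```

The page's figures imply $c_{\text{req}} = \$0.012$, since $12.00 / 1{,}000 = 240.00 / 20{,}000 = 0.012$.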
Main cost drivers
- Discovery and evaluation traffic before launch
- Production request volume and response length
- Retries, monitoring, and client support
- Additional modalities and infrastructure
Ways to reduce cost
- Price prototype and production phases separately
- Use recorded evaluation prompts rather than ad hoc retries
- Include usage limits and change-control assumptions
- Report provider cost separately from agency fees
Frequently asked questions
Is the maintenance reserve a provider fee?
No. It is a visible planning assumption for extra model usage during monitoring, prompt updates, and support.
Should agency fees be included in API cost?
No. Keep provider usage, infrastructure, and professional services as separate line items so clients can understand each cost.
Related glossary terms
Input tokens
Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.
Requests per day
Requests per day is the number of billable API calls made during a day. TokenMath commonly derives it from requests per active user multiplied by active users.
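The derivation described above can be sketched in a few lines. The user counts below are hypothetical, and the 30-day month is a planning assumption rather than anything the definition prescribes.

```python
def requests_per_day(active_users: int, requests_per_active_user: float) -> float:
    """Billable API calls per day, derived from user activity."""
    return active_users * requests_per_active_user


def requests_per_month(daily_requests: float, days_per_month: int = 30) -> float:
    """Scale a daily figure to a month; 30 days is a planning assumption."""
    return daily_requests * days_per_month


# Hypothetical inputs for illustration only.
daily = requests_per_day(active_users=250, requests_per_active_user=4)
monthly = requests_per_month(daily)
print(f"{daily:,.0f} requests/day, {monthly:,.0f} requests/month")
```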
Cost per request
Cost per request is the sum of all billable usage generated by one API call, commonly input token cost plus output token cost for a text model.
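As a rough illustration of this definition, the sketch below sums every billable component of one call. The function name `cost_per_request` and the per-token prices are assumptions for illustration; the prices are not stated on this page, so substitute the provider's current rates.

```python
from decimal import Decimal


def cost_per_request(usage):
    """Sum billable usage for one API call.

    `usage` is an iterable of (units, price_per_unit) pairs, for example
    (input_tokens, price_per_input_token).
    """
    return sum(
        (Decimal(units) * Decimal(price) for units, price in usage),
        Decimal(0),
    )


# Text model: input token cost plus output token cost.
# ASSUMED prices in USD per token, for illustration only.
text_call = [
    (1_500, "0.000003"),   # input tokens
    (500, "0.000015"),     # output tokens
]
per_request = cost_per_request(text_call)
print(f"Cost per request: ${per_request}")
```

Passing the components as a list makes it straightforward to add other billable items for a call without changing the function.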