AI API cost glossary

Output tokens

Output tokens are the tokenized units generated by a language model, including visible responses and any billable reasoning or thinking tokens defined by the provider.

Why it matters for API cost

Output rates are often higher than input rates, so response length can dominate an application's model bill.

Formula

output cost = (output tokens ÷ 1,000,000) × output rate per million tokens

Example

A 500-token response at $10 per million output tokens costs an estimated $0.005.
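
The same arithmetic can be sketched in Python. This is a minimal illustration rather than any provider's billing logic: the function name `output_cost` and its parameters are made up for this example, and `Decimal` is used only to avoid floating-point rounding noise in currency values.

```python
from decimal import Decimal


def output_cost(output_tokens: int, rate_per_million: Decimal) -> Decimal:
    """Estimate output cost as tokens / 1,000,000 * rate per million tokens.

    `output_cost` is a hypothetical helper for illustration only.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return Decimal(output_tokens) / Decimal(1_000_000) * rate_per_million


# The example above: 500 output tokens at $10 per million output tokens.
cost = output_cost(500, Decimal("10"))
print(f"Estimated output cost: ${cost}")
```

The result is numerically equal to 0.005 dollars, matching the example. Whether reasoning or thinking tokens count toward `output_tokens` depends on the provider's billing definition.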

Frequently asked questions

Does a maximum output limit guarantee that many tokens?

No. It is a cap, not a target. Responses can stop earlier, so cost estimates based on observed usage are more useful than assuming the maximum every time.
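
The gap between a cap-based estimate and a usage-based one can be sketched as follows. All numbers here are assumptions chosen for illustration: the 4,096-token cap and the list of observed token counts are hypothetical, and the $10 rate is reused from the example on this page.

```python
from decimal import Decimal

RATE_PER_MILLION = Decimal("10")  # dollars per million output tokens (from the example)
MAX_OUTPUT_TOKENS = 4096          # hypothetical cap set on each request
# Hypothetical output token counts observed across five past responses.
observed_output_tokens = [420, 510, 480, 590, 500]


def output_cost(output_tokens, rate_per_million: Decimal) -> Decimal:
    """Estimate output cost as tokens / 1,000,000 * rate per million tokens."""
    return Decimal(output_tokens) / Decimal(1_000_000) * rate_per_million


# Estimate that assumes every response reaches the cap.
cap_based = output_cost(MAX_OUTPUT_TOKENS, RATE_PER_MILLION)

# Estimate based on the average of observed responses.
average_tokens = Decimal(sum(observed_output_tokens)) / len(observed_output_tokens)
usage_based = output_cost(average_tokens, RATE_PER_MILLION)

print(f"Per-response estimate at the cap: ${cap_based}")
print(f"Per-response estimate from usage: ${usage_based}")
```

With these made-up figures, the cap-based estimate is roughly eight times the usage-based one, which is why budgeting from the maximum tends to overstate spend.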