AI API cost glossary

Tokens per minute

Tokens per minute is a throughput or rate-limit measure: the number of input or output tokens that an account or model can process in one minute.

Why it matters for API cost

A workload can fit its dollar budget but still exceed provider throughput limits during traffic spikes.

Formula

required tokens per minute = peak requests per minute × average tokens per request

Example

A peak of 20 requests per minute, each using 1,300 combined input and output tokens, requires 20 × 1,300 = 26,000 tokens per minute before retries.
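The same arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a provider API: the function name and the `retry_overhead` parameter are hypothetical, and the 10% retry figure is an assumed value chosen only to show how retries push the requirement above the baseline.

```python
def required_tokens_per_minute(peak_requests_per_minute: float,
                               avg_tokens_per_request: float,
                               retry_overhead: float = 0.0) -> float:
    """Estimate the tokens-per-minute throughput a workload needs at peak.

    retry_overhead is a hypothetical fraction of extra traffic caused by
    retries (0.10 means 10% more tokens); it is not part of the formula
    above and defaults to 0.0.
    """
    if min(peak_requests_per_minute, avg_tokens_per_request, retry_overhead) < 0:
        raise ValueError("inputs must be non-negative")
    baseline = peak_requests_per_minute * avg_tokens_per_request
    return baseline * (1.0 + retry_overhead)


# The example from the text: 20 peak requests per minute, 1,300 tokens each.
baseline = required_tokens_per_minute(20, 1300)
print(f"Before retries: {baseline:,.0f} tokens per minute")

# Assumed 10% retry overhead, for illustration only.
with_retries = required_tokens_per_minute(20, 1300, retry_overhead=0.10)
print(f"With 10% retries: {with_retries:,.0f} tokens per minute")
```

Comparing the result with the account's published tokens-per-minute limit shows whether peak traffic fits, independently of the dollar budget.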

Frequently asked questions

Is tokens per minute a price?

No. It is a throughput measure. Per-token prices and account throughput limits are related operational constraints, but they are not the same thing.