Use-case cost estimate

How Much Does a RAG App Cost?

A retrieval-augmented generation (RAG) budget has at least three layers: embedding the corpus, storing vectors, and generating answers. Vector query, reranking, hosting, and network charges can add to the total.

Default workload assumptions

These values make the example reproducible. They are planning assumptions, not measured usage from your application.

  • Documents: 10,000
  • Chunks per document: 5
  • Tokens per chunk: 250
  • Corpus refreshes per month: 1
  • RAG answers per month: 10,000
  • Vector dimensions: 1,536
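
As a rough sketch of what these assumptions imply, the snippet below derives the chunk count, the tokens embedded per month, and the raw vector footprint. It assumes 4-byte float32 components and no chunk overlap, metadata, replicas, or index overhead, none of which this page specifies, so its figures are a baseline rather than a reproduction of the estimate's line items.

```python
# Workload assumptions listed above.
DOCUMENTS = 10_000
CHUNKS_PER_DOCUMENT = 5
TOKENS_PER_CHUNK = 250
REFRESHES_PER_MONTH = 1
VECTOR_DIMENSIONS = 1_536

# Assumption (not stated on the page): each vector component is a 4-byte float32.
BYTES_PER_COMPONENT = 4
GIB = 1024 ** 3

chunks = DOCUMENTS * CHUNKS_PER_DOCUMENT
tokens_embedded_per_month = chunks * TOKENS_PER_CHUNK * REFRESHES_PER_MONTH
raw_vector_bytes = chunks * VECTOR_DIMENSIONS * BYTES_PER_COMPONENT
raw_vector_gib = raw_vector_bytes / GIB

print(f"chunks: {chunks:,}")
print(f"tokens embedded per month: {tokens_embedded_per_month:,}")
print(f"raw vector storage: {raw_vector_gib:.3f} GiB")
```

Under these assumptions the corpus yields 50,000 chunks and 12,500,000 embedded tokens per refresh. Overlap and index overhead would raise both the token count and the stored size.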

Calculator-style cost example

Estimate
  • Embedding refresh: $0.275 (one full corpus refresh per month)
  • Vector storage assumption: $0.116 ($0.25/GiB-month assumption; not a provider quote)
  • Generated answers: $12.80 (10,000 RAG answers per month)

Estimated monthly cost

$13.19

Estimated yearly cost

$158.29

  • text-embedding-3-small: OpenAI embedding model pricing, last verified Jun 21, 2026.
  • GPT-4.1 mini: OpenAI model pricing, last verified Jun 21, 2026.
  • Vector storage uses a $0.25 per GiB-month planning assumption.
  • Query, compute, reranking, hosting, and network costs are excluded.

Formula

monthly RAG estimate = embedding refresh + vector storage assumption + answer-generation cost
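
The formula can be checked against the rounded line items shown in the estimate. This is a minimal sketch: the function name `monthly_rag_estimate` is hypothetical, and the inputs are the displayed dollar figures rather than values recomputed from provider rates.

```python
def monthly_rag_estimate(embedding_refresh: float,
                         vector_storage: float,
                         answer_generation: float) -> float:
    """Sum the three line items covered by the formula (USD per month)."""
    return embedding_refresh + vector_storage + answer_generation


# Line items as displayed in the estimate on this page.
monthly = monthly_rag_estimate(
    embedding_refresh=0.275,
    vector_storage=0.116,
    answer_generation=12.80,
)
yearly = monthly * 12

print(f"monthly: ${monthly:.2f}")
print(f"yearly:  ${yearly:.2f}")
```

With these inputs the sketch prints $13.19 per month and $158.29 per year, matching the totals shown. The yearly figure is twelve times the unrounded monthly sum, which is why it differs by a cent from 12 × $13.19.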

Main cost drivers

  • Corpus size, chunk overlap, and refresh policy
  • Vector dimensions, metadata, replicas, and index overhead
  • Retrieved context tokens per answer
  • Query, reranking, and managed database charges

Ways to reduce cost

  • Embed only changed documents
  • Reduce redundant chunk overlap
  • Retrieve fewer high-quality passages
  • Tune dimensions, replicas, and metadata intentionally
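
One way to embed only changed documents is to keep a content hash per document and re-embed only when the hash differs. The sketch below is illustrative: the `stored_hashes` mapping, the `documents_to_embed` helper, and the sample documents are all hypothetical, and a real pipeline would persist the hashes alongside the vectors.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a stable SHA-256 digest of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def documents_to_embed(documents: dict[str, str],
                       stored_hashes: dict[str, str]) -> list[str]:
    """Return IDs of documents that are new or whose text has changed."""
    changed = []
    for doc_id, text in documents.items():
        if stored_hashes.get(doc_id) != content_hash(text):
            changed.append(doc_id)
    return changed


# Hypothetical example data: "a" is unchanged, "b" was edited, "c" is new.
stored_hashes = {
    "a": content_hash("refund policy v1"),
    "b": content_hash("shipping policy v1"),
}
documents = {
    "a": "refund policy v1",
    "b": "shipping policy v2",
    "c": "warranty policy v1",
}

print(documents_to_embed(documents, stored_hashes))
```

Here only "b" and "c" would be sent for embedding, so the refresh cost scales with the changed fraction of the corpus instead of its full size.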

Frequently asked questions

Is vector storage pricing a provider quote here?

No. The example uses a clearly labeled planning assumption of $0.25 per GiB-month so that the storage line can demonstrate sizing. Replace it with your vendor's rate.

Does the estimate include vector queries?

No. Query units, compute, reranking, and network costs vary by service and are not represented in the current pricing catalog, so the estimate excludes them.