Use Case

RAG chatbot cost calculator

Estimate spend for retrieval-heavy assistants that repeatedly pull large context windows.

Default workload

These presets are meant to shorten planning time, and you can edit every assumption in the calculator below.

Lowest-cost model

Gemini 2.5 Flash

Balanced model

Claude Sonnet 4.6

Default monthly active users (MAU)

12,000

Optimization ideas

These notes are static guidance tied to the selected use case.

  • Cache repeated context windows whenever the same documents are reused
  • Trim retrieval payloads before they bloat every request
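
To get a feel for how much caching can matter, here is a minimal Python sketch that blends the fresh and cached input prices by cache hit rate. It uses the $3.00 and $0.30 per 1M token prices listed for Claude Sonnet 4.6 on this page. The function name `blended_input_price` and the hit rates are hypothetical, and the sketch ignores any cache-write surcharge a provider may apply.

```python
def blended_input_price(base_price, cached_price, cache_hit_rate):
    """Average input price per 1M tokens for a given cache hit rate.

    cache_hit_rate is the fraction of input tokens served from cache (0.0 to 1.0).
    Assumption: cache writes cost nothing extra, which may not hold for every provider.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return cached_price * cache_hit_rate + base_price * (1.0 - cache_hit_rate)


# Prices from this page (USD per 1M input tokens); hit rates are illustrative only.
for hit_rate in (0.0, 0.5, 0.9):
    price = blended_input_price(3.00, 0.30, hit_rate)
    print(f"hit rate {hit_rate:.0%}: ${price:.2f} per 1M input tokens")
```

The higher the share of reused context, the closer the blended price moves toward the cached rate, which is why caching tends to pay off most for assistants that reuse the same documents.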

Calculator

Start from the preset, then tune it to your own workload.


Estimated monthly spend

$15,658

Claude Sonnet 4.6 · Anthropic

Cost per request

About $0.0145 (displayed as $0.01)

Cost per user / month

$1.30

Cost per day

$522

Annual run rate

$187,898
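
The derived figures above follow from the monthly estimate by simple division. The Python sketch below shows one way to reproduce them from the page's own numbers (1,080,000 requests per month and 12,000 MAU). The function name `derived_metrics` is hypothetical, and the 30-day month is an assumption chosen because it matches the displayed daily figure. The annual result can differ from the displayed value by a few dollars, presumably because the calculator multiplies an unrounded monthly total.

```python
def derived_metrics(monthly_spend, requests_per_month, monthly_active_users,
                    days_per_month=30):
    """Break a monthly spend estimate (USD) into the per-unit figures shown on the page.

    days_per_month=30 is an assumption, not a documented calculator setting.
    """
    return {
        "per_request": monthly_spend / requests_per_month,
        "per_user": monthly_spend / monthly_active_users,
        "per_day": monthly_spend / days_per_month,
        "annual": monthly_spend * 12,
    }


metrics = derived_metrics(15_658, 1_080_000, 12_000)
for name, value in metrics.items():
    print(f"{name}: ${value:,.4f}")
```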

Requests / month: 1,080,000

Effective input price: $3.00 / 1M tokens

Effective cached input price: $0.30 / 1M tokens

Effective output price: $15.00 / 1M tokens
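
As a rough illustration of how per-token prices turn into a monthly total, the Python sketch below multiplies token counts by the three effective prices listed above. The per-request token counts are hypothetical placeholders, not the calculator's actual preset, so the result is not expected to match the estimate shown on this page. The function names are likewise illustrative.

```python
# Effective prices from this page, in USD per 1M tokens.
PRICES = {"input": 3.00, "cached_input": 0.30, "output": 15.00}


def cost_per_request(fresh_input_tokens, cached_input_tokens, output_tokens,
                     prices=PRICES):
    """Cost in USD of one request, given its token counts."""
    total = (
        fresh_input_tokens * prices["input"]
        + cached_input_tokens * prices["cached_input"]
        + output_tokens * prices["output"]
    )
    return total / 1_000_000


def monthly_spend(requests_per_month, fresh_input_tokens, cached_input_tokens,
                  output_tokens, prices=PRICES):
    """Monthly cost in USD, assuming every request has the same token profile."""
    return requests_per_month * cost_per_request(
        fresh_input_tokens, cached_input_tokens, output_tokens, prices
    )


# Hypothetical token profile: 3,000 fresh input, 1,000 cached input, 300 output.
estimate = monthly_spend(1_080_000, 3_000, 1_000, 300)
print(f"Estimated monthly spend: ${estimate:,.0f}")
```

Swapping in your own measured token counts per request is the quickest way to check whether the preset is close to your workload.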

Need saved estimates or manual price alerts?

Email us if you want a saved estimate, a manual alert request, or to flag a pricing correction.

contact@modelcostwatch.com


Need a second model?

Open a dedicated comparison page or keep browsing the pricing directory before you lock in a model.