Use Case
RAG chatbot cost calculator
Estimate spend for retrieval-heavy assistants that repeatedly pull large context windows.
Default workload
These presets shorten planning time; every assumption can be edited in the calculator below.
Lowest-cost model: Gemini 2.5 Flash
Balanced model: Claude Sonnet 4.6
Default monthly active users (MAU): 12,000
Optimization ideas
These notes are static guidance tied to the selected use case.
- Cache repeated context windows whenever the same documents are reused
- Trim retrieval payloads before they bloat every request
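To get a rough sense of what these two ideas are worth, the sketch below blends the uncached and cached input prices listed for Claude Sonnet 4.6 in the calculator on this page ($3.00 and $0.30 per 1M tokens). The function name, the 4,000-token request size, and the 70% cached share are illustrative assumptions, not calculator defaults, and the sketch ignores any cache-write surcharge a provider may bill.

```python
def input_cost_per_request(input_tokens, cached_fraction,
                           input_price_per_m=3.00, cached_price_per_m=0.30):
    """USD input cost of one request when part of the prompt is read from cache.

    Prices are USD per 1M tokens; the defaults are the Claude Sonnet 4.6
    figures shown in the calculator. cached_fraction is the assumed share of
    input tokens billed at the cached rate.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached_tokens = input_tokens * cached_fraction
    fresh_tokens = input_tokens - cached_tokens
    total = fresh_tokens * input_price_per_m + cached_tokens * cached_price_per_m
    return total / 1_000_000


# Hypothetical request with 4,000 input tokens (not a calculator default).
no_cache = input_cost_per_request(4_000, 0.0)    # every token at the full rate
with_cache = input_cost_per_request(4_000, 0.7)  # 70% of tokens read from cache
trimmed = input_cost_per_request(2_000, 0.0)     # retrieval payload cut in half

print(f"no cache:   ${no_cache:.5f}")
print(f"70% cached: ${with_cache:.5f}")
print(f"trimmed:    ${trimmed:.5f}")
```

Under these assumptions, caching 70% of the input cuts input cost by roughly 63%, and halving the retrieval payload halves it. The two levers also stack, since trimming reduces the tokens that need caching in the first place.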
Calculator
Start from the preset, then tune it to your own workload.
Estimated monthly spend: $15,658 (Claude Sonnet 4.6 · Anthropic)
Cost per request: $0.01
Cost per user / month: $1.30
Cost per day: $522
Annual run rate: $187,898
Requests / month: 1,080,000
Effective input price: $3.00 / 1M tokens
Effective cached input price: $0.30 / 1M tokens
Effective output price: $15.00 / 1M tokens
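The figures above can be related with a few lines of arithmetic. The sketch below is one plausible reading of how such a calculator works, not its actual implementation: the function names, the 30-day month, and the per-request token counts are assumptions, and only the prices, request volume, MAU, and monthly spend come from this page.

```python
# USD per 1M tokens, as listed on this page for Claude Sonnet 4.6.
PRICE_PER_M = {"input": 3.00, "cached_input": 0.30, "output": 15.00}


def cost_per_request(input_tokens, cached_input_tokens, output_tokens,
                     prices=PRICE_PER_M):
    """USD cost of one request from its token counts (assumed pricing model)."""
    total = (input_tokens * prices["input"]
             + cached_input_tokens * prices["cached_input"]
             + output_tokens * prices["output"])
    return total / 1_000_000


def summarize(monthly_spend, requests_per_month, monthly_active_users,
              days_per_month=30):
    """Derive the per-unit figures from a monthly total (30-day month assumed)."""
    return {
        "per_request": monthly_spend / requests_per_month,
        "per_user": monthly_spend / monthly_active_users,
        "per_day": monthly_spend / days_per_month,
        "annual": monthly_spend * 12,
    }


# Hypothetical token mix per request; the page does not state these values.
request_cost = cost_per_request(input_tokens=3_500,
                                cached_input_tokens=2_000,
                                output_tokens=200)
estimated_monthly = request_cost * 1_080_000  # requests / month from the page

# Per-unit figures recomputed from the page's own monthly total.
page = summarize(15_658, requests_per_month=1_080_000,
                 monthly_active_users=12_000)

print(f"hypothetical cost per request: ${request_cost:.4f}")
print(f"hypothetical monthly spend:    ${estimated_monthly:,.0f}")
for name, value in page.items():
    print(f"{name}: ${value:,.4f}")
```

Recomputing from the page's monthly total gives about $0.0145 per request, which the summary displays rounded to $0.01, along with $1.30 per user and $522 per day. Twelve times $15,658 is $187,896, $2 below the displayed annual run rate, which suggests the page multiplies an unrounded monthly value. The 1,080,000 requests over 12,000 users also imply 90 requests per user per month.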
Need saved estimates or manual price alerts?
Email us if you want a saved estimate, a manual alert request, or to flag a pricing correction.
contact@modelcostwatch.com