OpenAI cost and lifecycle tracker

Track the OpenAI changes that can force a budget or migration call.

This page is built for the moments when a pricing row moves, a tool charge appears, or a shutdown date lands. It keeps the official OpenAI signals in one operating view so the next decision can close faster.

Last checked

March 12, 2026

Source set

4 official pages

Deprecation cluster

4 live briefs / 0 tracked risks

Current operating readout

Price, tool, and lifecycle signals that can change a real OpenAI decision.
Live
This page now separates model pricing by family so a reader can scan text, image, audio, and embedding costs without losing the lifecycle and tool-cost context.

Representative text row

gpt-5.4 standard short context: $2.50 input / $0.25 cached / $15.00 output per 1M tokens

This is a representative flagship row, not a stand-in for the whole OpenAI matrix.

Representative image row

GPT Image 1.5 low quality: $0.009 per 1024x1024 image

Image pricing is now shown as its own family instead of being buried under text-token tables.

Nearest tracked shutdown

Selected legacy GPT snapshots: March 26, 2026

Legacy GPT snapshot pins are the earliest OpenAI lifecycle date currently tracked on this page.

Risk watch

Read the live changes in one vertical pass before opening any deeper page.

This section is the reading spine of the page. Start here when the question is whether cost, migration timing, or replacement burden has already changed enough to alter the plan.

Open the linked brief only when the issue changes implementation, migration sequence, or cost shape. Otherwise, keep scanning vertically and stay in the main read.

Live change

Legacy GPT snapshots now have a fixed March deadline.

Selected legacy GPT snapshots and older preview aliases are scheduled for shutdown on March 26, 2026.

What changes now

Teams still pinned to dated GPT-4 snapshot names need a deliberate move to the gpt-5 or gpt-4.1 family before the cutoff.

Live change

Assistants API is now on a fixed shutdown path.

The deprecations page lists August 26, 2026 as the shutdown date and points new work toward the Responses API plus Conversations API.

What changes now

Any new OpenAI build should treat Assistants as a migration topic, not as a fresh platform choice.

Live change

Realtime beta has a published end date.

The beta Realtime interface is deprecated with a shutdown date of May 7, 2026.

What changes now

Teams still budgeting or integrating against beta-era docs should migrate and price the GA Realtime path instead.

Live change

Preview audio and realtime model names also have a cutoff.

OpenAI schedules May 7, 2026 as the shutdown date for the current gpt-4o preview audio and realtime model group.

What changes now

Teams still pinned to preview model names should move to the listed current realtime or audio families, and beta-interface users may need a separate interface migration as well.

Live change

File search cost is now a separate operating question.

Current OpenAI pricing splits file search into storage, Responses API tool calls, and model-token exposure rather than a single retrieval fee.

What changes now

A file-search estimate needs storage footprint, call volume, and chosen-model token pressure in the same read or it will miss the real bill.

Live change

Web search cost now depends on the search path, not just the model row.

Current OpenAI pricing separates standard web search, preview reasoning web search, and preview non-reasoning web search, and search content tokens are not billed the same way on each path.

What changes now

A web-search estimate now needs call volume and the exact search path before the model row can be trusted.

Key cost signals

Read the rows most likely to change an estimate before opening the full matrix.

Most decisions do not need every OpenAI row at once. Start with a small benchmark set across text, image, audio, and embeddings, then expand the full pricing matrix only when family-level verification is necessary.

Flagship text anchor

gpt-5.4 short: $2.50 input / $15.00 output per 1M tokens

Long context raises the same row to $5.00 input and $22.50 output, so context length alone can materially change the estimate.

Open GPT-5.4 pricing
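The short/long gap compounds across a whole workload. A minimal sketch of that effect, using the gpt-5.4 rates quoted above and an assumed illustrative volume of 10M input and 2M output tokens:

```python
# gpt-5.4 per-1M-token rates from the anchor row above.
RATES = {
    "short": {"input": 2.50, "output": 15.00},
    "long": {"input": 5.00, "output": 22.50},
}

def text_cost(input_tokens: int, output_tokens: int, tier: str) -> float:
    """Dollar cost of one token workload on a gpt-5.4 context tier."""
    r = RATES[tier]
    return input_tokens / 1e6 * r["input"] + output_tokens / 1e6 * r["output"]

# The same 10M-input / 2M-output workload on each tier.
short_cost = text_cost(10_000_000, 2_000_000, "short")  # 25 + 30 = $55
long_cost = text_cost(10_000_000, 2_000_000, "long")    # 50 + 45 = $95
```

On these assumed volumes, the long-context tier costs roughly 1.7x the short tier for identical token counts, which is why context length belongs in the estimate before model choice.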

Practical low-cost text row

gpt-5-mini: $0.25 input / $2.00 output per 1M tokens

This is the first row many teams should benchmark before paying for higher-end reasoning or larger context windows.

Open GPT-5 mini pricing

Image generation floor

GPT Image 1.5 low: $0.009 per 1024x1024 image

Image workloads should be estimated separately from text-token workloads rather than folded into the same mental model.

Realtime audio exposure

gpt-realtime: $40.00 input / $80.00 output per 1M audio tokens

Voice and realtime workloads leave the text-price range quickly, which is why audio needs its own benchmark row up front.

Embedding ingestion baseline

text-embedding-3-large: $0.13 standard / $0.065 batch per 1M tokens

Batch pricing materially changes large ingestion jobs, so the cheaper path may come from workflow design rather than model choice alone.
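To see how much the batch path matters at ingestion scale, a sketch using the text-embedding-3-large rates above and an assumed 500M-token corpus:

```python
STANDARD_RATE = 0.13  # $ per 1M tokens, text-embedding-3-large standard
BATCH_RATE = 0.065    # $ per 1M tokens, text-embedding-3-large batch

def embedding_cost(tokens: int, rate_per_million: float) -> float:
    """Dollar cost of embedding a token corpus at one listed rate."""
    return tokens / 1e6 * rate_per_million

corpus_tokens = 500_000_000  # illustrative 500M-token corpus
standard = embedding_cost(corpus_tokens, STANDARD_RATE)  # ~$65
batch = embedding_cost(corpus_tokens, BATCH_RATE)        # ~$32.50
```

Routing the same corpus through the batch path halves the bill, so the workflow decision can outweigh the model decision here.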

Full pricing matrix

Expand the model family you need to verify. Keep the rest collapsed.

The complete matrix stays available for verification, but it no longer needs to dominate the page before a decision is framed.

Text and reasoning models

Core GPT-5, o-series, codex, search, and computer-use rows that show up in most direct OpenAI comparisons.


Standard pricing

Model | Context | Input | Cached input | Output ($ per 1M tokens)
gpt-5.4 | Short | $2.50 | $0.25 | $15.00
gpt-5.4 | Long | $5.00 | $0.50 | $22.50
gpt-5.4-pro | Short | $30.00 | N/A | $180.00
gpt-5.4-pro | Long | $60.00 | N/A | $270.00
gpt-5.2 | Standard | $1.25 | $0.13 | $10.00
gpt-5.1 | Standard | $1.25 | $0.13 | $10.00
gpt-5 | Standard | $1.25 | $0.13 | $10.00
gpt-5-mini | Standard | $0.25 | $0.03 | $2.00
gpt-5-nano | Standard | $0.05 | $0.01 | $0.40
gpt-5.3-chat-latest | Standard | $1.50 | $0.15 | $12.00
gpt-5.2-chat-latest | Standard | $1.50 | $0.15 | $12.00
gpt-5.1-chat-latest | Standard | $1.50 | $0.15 | $12.00
gpt-5-chat-latest | Standard | $1.50 | $0.15 | $12.00
gpt-5.3-codex | Standard | $1.50 | $0.15 | $6.00
gpt-5.2-codex | Standard | $1.50 | $0.15 | $6.00
gpt-5.1-codex-max | Standard | $20.00 | N/A | $100.00
gpt-5.1-codex | Standard | $1.50 | $0.15 | $6.00
gpt-5-codex | Standard | $1.50 | $0.15 | $6.00
gpt-5.2-pro | Standard | $15.00 | N/A | $120.00
gpt-5-pro | Standard | $15.00 | N/A | $120.00
gpt-4.1 | Standard | $2.00 | $0.50 | $8.00
gpt-4.1-mini | Standard | $0.40 | $0.10 | $1.60
gpt-4.1-nano | Standard | $0.10 | $0.03 | $0.40
gpt-4o | Standard | $2.50 | $1.25 | $10.00
gpt-4o-2024-05-13 | Standard | $5.00 | N/A | $15.00
gpt-4o-mini | Standard | $0.15 | $0.08 | $0.60
o1 | Standard | $15.00 | $7.50 | $60.00
o1-pro | Standard | $150.00 | N/A | $600.00
o3-pro | Standard | $20.00 | N/A | $80.00
o3 | Standard | $2.00 | $0.50 | $8.00
o3-deep-research | Standard | $10.00 | $2.50 | $40.00
o4-mini | Standard | $1.10 | $0.28 | $4.40
o4-mini-deep-research | Standard | $2.00 | $0.50 | $8.00
o3-mini | Standard | $1.10 | $0.55 | $4.40
o1-mini | Standard | $1.10 | $0.55 | $4.40
gpt-5.1-codex-mini | Standard | $0.40 | $0.04 | $1.50
codex-mini-latest | Standard | $1.50 | $0.15 | $6.00
gpt-5-search-api | Standard | $2.50 | $0.25 | $15.00
gpt-4o-mini-search-preview | Standard | $0.15 | $0.08 | $0.60
gpt-4o-search-preview | Standard | $2.50 | N/A | $10.00
computer-use-preview | Standard | $3.00 | N/A | $12.00

This table covers the text-token rows currently listed for text and reasoning models. It excludes image, audio, and embedding families on purpose.

Batch pricing

Model | Context | Input | Cached input | Output ($ per 1M tokens)
gpt-5.4 | Short only | $1.25 | $0.13 | $7.50
gpt-5.4-pro | Short only | $15.00 | N/A | $90.00
gpt-5.2 | Standard | $0.63 | $0.06 | $5.00
gpt-5.1 | Standard | $0.63 | $0.06 | $5.00
gpt-5 | Standard | $0.63 | $0.06 | $5.00
gpt-5-mini | Standard | $0.13 | $0.01 | $1.00
gpt-5-nano | Standard | $0.03 | $0.00 | $0.20
o3 | Standard | $1.00 | $0.25 | $4.00
o4-mini | Standard | $0.55 | $0.14 | $2.20

OpenAI currently publishes a narrower batch matrix than the standard one. Models without a listed batch row are omitted here rather than padded with guessed rates.

Image models

Image-family pricing is split between image-token billing and per-image generation pricing.


Image token pricing

Mode | Model | Input | Cached input | Output ($ per 1M image tokens)
Batch | gpt-image-1.5 | $4.00 | $1.00 | $16.00
Batch | chatgpt-image-latest | $4.00 | $1.00 | $16.00
Batch | gpt-image-1 | $2.50 | $0.63 | $10.00
Batch | gpt-image-1-mini | $0.25 | $0.06 | $1.00
Standard | gpt-image-1.5 | $8.00 | $2.00 | $32.00
Standard | chatgpt-image-latest | $8.00 | $2.00 | $32.00
Standard | gpt-image-1 | $5.00 | $1.25 | $20.00
Standard | gpt-image-1-mini | $0.50 | $0.13 | $2.00
Standard | gpt-realtime | $5.00 | $0.50 | N/A
Standard | gpt-realtime-1.5 | $5.00 | $0.50 | N/A
Standard | gpt-realtime-mini | $0.60 | $0.06 | N/A

Realtime rows are included here because OpenAI lists image-token pricing for them in the image-token section.

Image generation pricing

Model | Quality or mode | Base listed size | Largest listed size
GPT Image 1.5 | Low | $0.009 at 1024x1024 | $0.013 at 1024x1536 or 1536x1024
GPT Image 1.5 | Medium | $0.035 at 1024x1024 | $0.053 at 1024x1536 or 1536x1024
GPT Image 1.5 | High | $0.14 at 1024x1024 | $0.21 at 1024x1536 or 1536x1024
GPT Image latest | Low | $0.008 at 1024x1024 | $0.012 at 1024x1536 or 1536x1024
GPT Image latest | Medium | $0.032 at 1024x1024 | $0.048 at 1024x1536 or 1536x1024
GPT Image latest | High | $0.13 at 1024x1024 | $0.19 at 1024x1536 or 1536x1024
GPT Image 1 | Low | $0.011 at 1024x1024 | $0.016 at 1024x1536 or 1536x1024
GPT Image 1 | Medium | $0.042 at 1024x1024 | $0.063 at 1024x1536 or 1536x1024
GPT Image 1 | High | $0.17 at 1024x1024 | $0.25 at 1024x1536 or 1536x1024
GPT Image 1 Mini | Low | $0.009 at 1024x1024 | $0.013 at 1024x1536 or 1536x1024
GPT Image 1 Mini | Medium | $0.035 at 1024x1024 | $0.053 at 1024x1536 or 1536x1024
GPT Image 1 Mini | High | $0.14 at 1024x1024 | $0.21 at 1024x1536 or 1536x1024
DALL-E 3 | Standard | $0.04 at 1024x1024 | $0.08 at 1024x1792 or 1792x1024
DALL-E 3 | HD | $0.08 at 1024x1024 | $0.12 at 1024x1792 or 1792x1024
DALL-E 2 | Standard | $0.016 at 256x256 | $0.020 at 1024x1024

DALL-E sizes differ from the GPT Image family, so each row shows the base and largest listed sizes rather than forcing a single shared size grid.

Audio models

Voice workflows need both audio-token pricing and speech or transcription pricing in view.


Audio token pricing

Model | Input | Cached input | Output ($ per 1M audio tokens)
gpt-realtime | $40.00 | $2.50 | $80.00
gpt-realtime-1.5 | $40.00 | $2.50 | $80.00
gpt-realtime-mini | $10.00 | $0.30 | $20.00
gpt-4o-realtime-preview | $100.00 | $20.00 | $200.00
gpt-4o-mini-realtime-preview | $10.00 | $0.30 | $20.00
gpt-audio | $40.00 | $2.50 | $80.00
gpt-audio-1.5 | $40.00 | $2.50 | $80.00
gpt-audio-mini | $10.00 | $0.30 | $20.00
gpt-4o-audio-preview | $100.00 | N/A | $200.00
gpt-4o-mini-audio-preview | $10.00 | N/A | $20.00

OpenAI lists some preview rows without cached-input pricing. Those cells are left as N/A rather than inferred.

Speech and transcription pricing

Model | Text input | Text output | Audio input | Audio output | Estimate or note ($ per 1M tokens unless noted)
gpt-4o-mini-tts | $0.60 | N/A | N/A | $12.00 | $0.015 per minute
gpt-4o-transcribe | $2.50 | $10.00 | $6.00 | N/A | $0.006 per minute
gpt-4o-transcribe-diarize | $2.50 | $10.00 | $6.00 | N/A | $0.006 per minute
gpt-4o-mini-transcribe | $1.25 | $5.00 | $3.00 | N/A | $0.003 per minute
Whisper | N/A | N/A | N/A | N/A | $0.006 per minute transcription
TTS | N/A | N/A | N/A | N/A | $15.00 per 1M characters
TTS HD | N/A | N/A | N/A | N/A | $30.00 per 1M characters

This table keeps the official speech and transcription billing lines together even though some are per token, per minute, or per character.
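For the per-minute lines, a monthly transcription bill reduces to minutes times rate. A sketch using the per-minute estimates listed above and an assumed 50,000-minute monthly volume:

```python
# Per-minute transcription estimates from the table above.
PER_MINUTE = {
    "gpt-4o-transcribe": 0.006,
    "gpt-4o-mini-transcribe": 0.003,
    "whisper": 0.006,
}

def transcription_cost(minutes: float, model: str) -> float:
    """Dollar cost of a transcription volume on one per-minute line."""
    return minutes * PER_MINUTE[model]

monthly_minutes = 50_000  # illustrative volume
full = transcription_cost(monthly_minutes, "gpt-4o-transcribe")       # ~$300
mini = transcription_cost(monthly_minutes, "gpt-4o-mini-transcribe")  # ~$150
```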

Embedding models

Embedding pricing is simpler than generation pricing, but the batch discount still matters in large-volume ingestion.


Embedding pricing

Model | Standard | Batch
text-embedding-3-large | $0.13 per 1M tokens | $0.065 per 1M tokens
text-embedding-3-small | $0.02 per 1M tokens | $0.01 per 1M tokens
text-embedding-ada-002 | $0.10 per 1M tokens | N/A

OpenAI currently lists batch pricing for the text-embedding-3 family but not for text-embedding-ada-002.

Decision constraints

Read the hidden pressure before trusting the cheapest visible row.

Limits, hosted tools, and shutdown timing are what usually turn a cheap-looking option into the wrong production path. Scan the pressure points first, then expand the evidence tables only when a decision needs proof.

This section now works as a second reading spine. It is for deciding whether the estimate still holds under context limits, tool billing, and lifecycle risk.

Context fit

The cheaper row can still fail on usable context or tool coverage.

gpt-5.4 keeps a 1,048,576-token context window and the widest built-in tool set in the current summary rows, while gpt-5-mini narrows context to 400,000 tokens.

Open limits brief

Hosted tools

OpenAI-hosted tools create separate bill lines before token spend is finished.

Web search, file search storage, file search tool calls, and code interpreter runtime each add non-trivial cost pressure outside the base model row.

Open container pricing

Shelf life

A low current price is not a win if the path is already on a shutdown clock.

Assistants, Realtime beta, preview audio and realtime aliases, and legacy GPT snapshots all need lifecycle checks before they are treated as stable choices.

Constraint evidence

Expand only the evidence table that answers the current risk.

This keeps the page readable when the answer is obvious, while leaving the source-shaped tables in place for a harder review.

Limits and context coverage

Model availability and usable limits belong next to price because they determine whether the cheaper row is actually usable.

Model | Context window | Max output | Built-in tools
gpt-5.4 | 1,048,576 tokens | 128,000 tokens | Functions, web search, file search, skills, image generation, code interpreter, MCP
gpt-5-mini | 400,000 tokens | 128,000 tokens | Functions, web search, file search, MCP
gpt-4.1 | 1,047,576 tokens | 32,768 tokens | Functions, fine-tuning, web search, file search, image generation

A lower-cost row is not the better choice if it misses the context size or tool path the app actually needs.
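That fit check can be expressed directly: filter the summary rows by required context and output before comparing price. A sketch over the three rows above, with an assumed workload of 600K context tokens and 40K output tokens:

```python
# (context window tokens, max output tokens) from the limits table above.
LIMITS = {
    "gpt-5.4": (1_048_576, 128_000),
    "gpt-5-mini": (400_000, 128_000),
    "gpt-4.1": (1_047_576, 32_768),
}

def models_that_fit(needed_context: int, needed_output: int) -> list[str]:
    """Return the models whose published limits cover the workload."""
    return [
        model
        for model, (ctx, out) in LIMITS.items()
        if ctx >= needed_context and out >= needed_output
    ]

# 600K-token context rules out gpt-5-mini; a 40K output cap rules out gpt-4.1.
fits = models_that_fit(600_000, 40_000)  # ["gpt-5.4"]
```

Price comparison only starts after this filter: a row that fails it has no price worth comparing.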

Tool-cost coverage

Hosted tool charges materially change the OpenAI bill and still need to stay separate from the model-family pricing matrix.

Tool | Current cost | Notes
Web search | $10 per 1K calls | Search content tokens are billed at the model's token rates for all models.
File search storage | $0.10 per GB per day | The first 1 GB is free.
File search tool calls | $2.50 per 1K calls | Applies to the Responses API path.
Code interpreter container | $0.03 per container | Starting March 31, 2026, usage shifts to $0.03 per container per 20 minutes.

If a workflow uses OpenAI-hosted tools, the budgeting question is tokens plus model-family rows plus tool rows plus runtime behavior.
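A sketch of that hosted-tool line, built from the rates in the table above with assumed illustrative volumes (token spend and container runtime tracked separately):

```python
def hosted_tool_cost(
    web_search_calls: int,
    file_search_calls: int,
    storage_gb: float,
    days: int = 30,
) -> float:
    """Monthly hosted-tool dollars from the listed web/file-search rates."""
    web = web_search_calls / 1_000 * 10.00        # $10 per 1K calls
    calls = file_search_calls / 1_000 * 2.50      # $2.50 per 1K calls
    billable_gb = max(storage_gb - 1.0, 0.0)      # first 1 GB is free
    storage = billable_gb * 0.10 * days           # $0.10 per GB per day
    return web + calls + storage

# 5K web searches, 40K file-search calls, 30 GB stored for 30 days:
monthly = hosted_tool_cost(5_000, 40_000, 30.0)  # 50 + 100 + 87 ~= $237
```

On these assumed volumes, the tool line alone exceeds many model-token lines, which is the point of keeping it out of the model row.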

Deprecation coverage

Shutdown dates and replacement paths belong in the same view as price whenever the integration has a shelf life.

System | Shutdown date | Replacement | Notes
Selected legacy GPT snapshots | March 26, 2026 | gpt-5 or gpt-4.1 family | OpenAI places legacy GPT-4 snapshot IDs and older preview aliases on a March 26, 2026 shutdown path, with replacement guidance pointing to the gpt-5 or gpt-4.1 family.
gpt-4o preview audio and realtime models | May 7, 2026 | gpt-realtime-1.5, gpt-realtime-mini, gpt-audio-1.5, or gpt-audio-mini | OpenAI has also dated the shutdown of several gpt-4o preview realtime and audio models, with replacement paths now pointing to the current realtime and audio families.
Realtime API beta | May 7, 2026 | Realtime API | OpenAI marks the beta Realtime interface deprecated and routes teams to the GA Realtime API before the shutdown date.
Assistants API | August 26, 2026 | Responses API plus Conversations API | OpenAI marks the Assistants API deprecated and directs new work toward the Responses API plus the Conversations API.

A lower current price does not reduce risk if the chosen API path or snapshot is already on a shutdown clock.

Sources of record

Verify the exact row only after the main read has identified the question.

This list stays narrow on purpose. It sits after cost and constraint review so the docs support the decision instead of replacing the page flow.

Pricing

API pricing

Use this when a hosted tool charge, model row, or token rule needs confirmation against the source of record.

Open official page
Models

Models guide

Use this when a cheap-looking model still needs validation against support, availability, or family fit.

Open official page
Deprecations

Deprecations

Use this when the pricing question is no longer enough because the current path may already be on a shutdown clock.

Open official page
Changelog

Developer changelog

Use this when you need to confirm whether the change is newly published and still active in the current API surface.

Open official page

Workload compare preview

A worked example should tell a team what matters before a calculator exists.

This sample compare uses one realistic OpenAI workload to show where model price helps, where hosted tools dominate, and what should be checked before switching downmarket.

Worked example

Keep this section as a worked example. It should teach the cost shape of a real decision without pretending a live calculator is already shipped.

Monthly workload

20M input tokens and 4M output tokens on a current Responses API path.

Hosted retrieval

40,000 file search tool calls per month with an average 30 GB vector-store footprint.

Compared options

gpt-5.4 short context versus gpt-5-mini on the same workload.

Lifecycle posture

No dated shutdown pressure in this sample, so the call is cost and support fit rather than migration timing.

Model option

gpt-5.4

~$297 per month

Model spend

~$110 model spend plus the same file-search envelope.

Tool cost exposure

~$187 in hosted retrieval cost, driven by 40,000 tool calls and 30 GB average storage.

Constraint risk

Lower support risk in this sample because the current summary rows keep the widest context window and tool coverage.

Recommended next check

Use this path only if the larger context window or broader built-in tool set is actually required by the app.

Model option

gpt-5-mini

~$200 per month

Model spend

~$13 model spend plus the same file-search envelope.

Tool cost exposure

~$187 in hosted retrieval cost, which means the model swap saves less than the retrieval layer costs.

Constraint risk

Higher fit risk because the current summary rows narrow context to 400,000 tokens even though file search support remains in place.

Recommended next check

Choose this row only after confirming the smaller context window and lighter tool envelope still support the actual workload.

Estimated monthly cost

The swap from gpt-5.4 to gpt-5-mini saves about $97 per month in this sample, but most of the bill still comes from file search rather than model tokens.

Tool cost exposure

Hosted retrieval is the first optimization target because it stays roughly $187 per month on both options before token differences are even considered.

Recommended next check

Validate whether the workload truly needs gpt-5.4-level context and tool breadth; if not, the cheaper model can work, but the larger savings may come from reducing retrieval calls or storage footprint.

This is a sample compare, not a live calculator. It combines the current page's published model rows with the listed file-search storage and tool-call charges to show how a real estimate can tilt.
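The arithmetic behind the sample compare can be reproduced from the published rows. A sketch using the page's own figures, assuming a 30-day storage month:

```python
def model_cost(input_m: float, output_m: float, in_rate: float, out_rate: float) -> float:
    """Token spend: millions of tokens times the $/1M rates."""
    return input_m * in_rate + output_m * out_rate

def file_search_cost(tool_calls: int, storage_gb: float, days: int = 30) -> float:
    """Listed file-search charges: $2.50/1K calls, $0.10/GB/day after 1 free GB."""
    calls = tool_calls / 1_000 * 2.50
    storage = max(storage_gb - 1.0, 0.0) * 0.10 * days
    return calls + storage

retrieval = file_search_cost(40_000, 30.0)              # 100 + 87 ~= $187

gpt_5_4 = model_cost(20, 4, 2.50, 15.00) + retrieval    # 110 + 187 ~= $297
gpt_5_mini = model_cost(20, 4, 0.25, 2.00) + retrieval  # 13 + 187 ~= $200
savings = gpt_5_4 - gpt_5_mini                          # ~= $97
```

The retrieval term is identical on both rows, which is why the model swap caps out at about $97 while the $187 retrieval envelope stays the larger target.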

Next move

Use the current OpenAI readout to close the next decision, not to start another round of doc hunting.

These checks are meant to turn the current facts into a concrete budget, comparison, or migration call.

Step 1

Start with the rows you would actually pay.

If the workflow uses web search, file search, or code interpreter, add those charges before comparing OpenAI to another provider.

Step 2

Check model limits before choosing by price.

Context window, output cap, and tool support can invalidate a low-cost option before it reaches production.

Step 3

Treat shutdown dates as part of the selection process.

Before committing to an API path, confirm whether it is current, deprecated, or already on a migration timeline.

Popular decisions

These pages answer the queries most likely to become their own search session: a direct model compare, a cheapest-fit use case, and a transparent pricing calculator.

Model compare

GPT-5.4 vs GPT-5 mini

Use the side-by-side brief when the decision depends on price, context, output caps, and tool support together.

Open GPT-5.4 vs GPT-5 mini

Use case

Cheapest OpenAI model for extraction

Open the extraction brief when the real question is cheapest viable structured extraction, not cheapest generic chat row.

Open extraction recommendation

Calculator

OpenAI API pricing calculator

Run a transparent estimate with token, search, file, and container inputs when the budget now needs a line-item number.

Open pricing calculator

OpenAI work queue

Next pages to add to the OpenAI coverage.

Each one is chosen for a specific pricing, tool-cost, fit, or migration question that still needs its own page.

The OpenAI work queue is clear for now. New entries will be added when another cost, fit, or lifecycle question needs dedicated coverage.

Continue the site

Keep moving through the decision from here.

Use the groups below to move laterally through the decision, not back out into another doc hunt.

Related pages

Stay in the same decision neighborhood instead of backing out to search.

Pricing / Costs

Model pricing, hosted-tool costs, and fit constraints that materially change the operating estimate.

Open page

Deprecations / Migrations

Shutdown dates, migration paths, and replacement decisions grouped into live provider clusters.

Open page

Comparisons

Side-by-side model comparisons and scenario recommendation pages for cost-sensitive decisions.

Open page

Compare pages

Open the pages that turn this topic into a side-by-side decision.

GPT-5.4 vs GPT-5 mini

Side-by-side comparison of GPT-5.4 and GPT-5 mini across price, fit, and tool pressure.

Open page

Cheapest OpenAI model for extraction

Scenario recommendation page for choosing the cheapest workable OpenAI extraction model.

Open page

Replacement pages

Use the likely substitutes, migration targets, or fallback choices as the next click.

GPT-5.4 context and tool support

Limits brief for GPT-5.4 versus GPT-5 mini context windows, output caps, and tool support.

Open page

OpenAI API pricing calculator

Interactive calculator for model tokens, hosted tools, and runtime in one estimate.

Open page

Source category pages

Trace the source families behind this page instead of opening random docs in isolation.

Pricing sources

Official pricing pages used to support model, tool-cost, and calculator estimates.

Open page

Model sources

Official model pages used for context windows, output caps, and built-in tool coverage.

Open page

Deprecation and migration sources

Official shutdown, migration, and replacement references behind lifecycle pages.

Open page

Changelog sources

Official changelog sources used to confirm whether a pricing or lifecycle change is current.

Open page