LLM API pricing, side by side

Token prices for six providers, taken from their official pricing pages and nothing else. Standard tier, USD per 1 million tokens, with the caching and long-context caveats that quietly change your bill.

Runs entirely in your browser. Nothing you enter leaves this page.

Every number verified against the linked source on 2026-07-25. The spread is real: DeepSeek V4 Flash costs $0.42 per combined 1M in + 1M out, GPT-5.5 Pro costs $210.00.

Anthropic

official pricing page

Anthropic model prices, USD per million tokens
Model	Input /1M	Cached /1M	Output /1M	1M in + 1M out	Context
Claude Fable 5	$10.00	$1.00	$50.00	$60.00	1M
Claude Mythos 5	$10.00	$1.00	$50.00	$60.00	1M
Claude Opus 4.8	$5.00	$0.50	$25.00	$30.00	1M
Claude Opus 5	$5.00	$0.50	$25.00	$30.00	1M
Claude Sonnet 4.6	$3.00	$0.30	$15.00	$18.00	1M
Claude Sonnet 5	$2.00	$0.20	$10.00	$12.00	1M
Claude Haiku 4.5	$1.00	$0.10	$5.00	$6.00	-

Cached input is the cache read price (0.1x input). Cache writes cost extra: 1.25x input for 5-minute TTL, 2x for 1-hour TTL.
Batch API: 50 percent off input and output.
Claude Sonnet 5 is introductory-priced at $2 in / $10 out through 2026-08-31; standard pricing rises to $3 / $15 on 2026-09-01.
Fable 5, Mythos 5, Opus 4.8, Sonnet 5, and Sonnet 4.6 include the full 1M-token context window at standard pricing.

OpenAI

official pricing page

OpenAI model prices, USD per million tokens
Model	Input /1M	Cached /1M	Output /1M	1M in + 1M out	Context
GPT-5.5 Protiered	$30.00	-	$180.00	$210.00	-
GPT-5.5tiered	$5.00	$0.50	$30.00	$35.00	-
GPT-5.4tiered	$2.50	$0.25	$15.00	$17.50	-
GPT-5.3-Codex	$1.75	$0.175	$14.00	$15.75	-
GPT-5.4 mini	$0.75	$0.075	$4.50	$5.25	-
GPT-5.4 nano	$0.20	$0.02	$1.25	$1.45	-

GPT-5.5, GPT-5.5 Pro, and GPT-5.4 have a separate long-context tier (roughly 2x input, 1.5x output). The threshold is not stated on the pricing page.
Batch and Flex: 50 percent off. Priority tier costs roughly 2x standard.

Google

official pricing page

Google model prices, USD per million tokens
Model	Input /1M	Cached /1M	Output /1M	1M in + 1M out	Context
Gemini 3.1 Pro (preview)tiered	$2.00	$0.20	$12.00	$14.00	-
Gemini 3.5 Flash	$1.50	$0.15	$9.00	$10.50	-
Gemini 3.6 Flash	$1.50	$0.15	$7.50	$9.00	-
Gemini 2.5 Protiered	$1.25	$0.125	$10.00	$11.25	-
Gemini 3.5 Flash-Lite	$0.30	$0.03	$2.50	$2.80	-

Gemini 3.1 Pro and 2.5 Pro charge higher rates for prompts above 200K tokens (3.1 Pro: $4.00 in / $18.00 out).
Flash models charge more for audio input than text input. Rates shown are text/image/video.
Context caching storage costs $1.00 per 1M tokens per hour on top of cached-input rates.

DeepSeek

official pricing page

DeepSeek model prices, USD per million tokens
Model	Input /1M	Cached /1M	Output /1M	1M in + 1M out	Context
DeepSeek V4 Pro	$0.44	$0.0036	$0.87	$1.31	1M
DeepSeek V4 Flash	$0.14	$0.0028	$0.28	$0.42	1M

Cached input is the cache-hit price; cache misses pay the normal input rate.
deepseek-chat and deepseek-reasoner aliases deprecate 2026-07-24.

Mistral

official pricing page

Mistral model prices, USD per million tokens
Model	Input /1M	Cached /1M	Output /1M	1M in + 1M out	Context
Magistral Medium	$2.00	-	$5.00	$7.00	-
Mistral Medium 3.5	$1.50	-	$7.50	$9.00	-
Mistral Large 3	$0.50	-	$1.50	$2.00	-
Codestral	$0.30	-	$0.90	$1.20	-
Mistral Small 4	$0.15	-	$0.60	$0.75	-

Mistral Large 3 is priced below the newer Mistral Medium 3.5 on the official page (verified twice on 2026-06-10).
Batch API: 50 percent off.

xAI

official pricing page

xAI model prices, USD per million tokens
Model	Input /1M	Cached /1M	Output /1M	1M in + 1M out	Context
Grok 4.5tiered	$2.00	$0.30	$6.00	$8.00	500K
Grok 4.20 (reasoning)	$1.25	$0.20	$2.50	$3.75	1M
Grok 4.3	$1.25	$0.20	$2.50	$3.75	1M
Grok Build 0.1	$1.00	$0.20	$2.00	$3.00	256K

Batch API: 20 to 50 percent off standard rates, varies per model.
Grok 4.5 has a long-context tier: rates roughly double above 128K tokens ($4 in / $12 out).

How to read this table honestly

Cached input is not one thing. Anthropic and DeepSeek list a cache-read price; Anthropic also bills cache writes (1.25x input), Google bills cache storage per hour. Two providers with the same "cached" number can produce different bills.
"Tiered" means the listed price is the floor. OpenAI's GPT-5.5/5.4 and Google's Pro models charge roughly double for long-context requests, which is exactly what agent workflows produce.
Batch discounts are large. Anthropic, OpenAI, and Mistral all offer 50 percent off for async batch workloads; xAI 20 to 50 percent.
Excluded providers are excluded for a reason. Meta Llama API, Cohere, and Amazon Nova render their prices client-side, so they could not be verified from source. No number on this page is quoted from a third-party blog.

LLM API pricing, side by side

Anthropic

OpenAI

Google

DeepSeek

Mistral

xAI

How to read this table honestly

More free tools

GitHub AI Credits Calculator

Anthropic API cost analyzer

How much are you spending on AI tools?

AI model API prices over time

Codex Credits Calculator

Claude Code spend analyzer

GitHub Copilot bill migrator

Copilot usage CSV analyzer