Claude Sonnet 5 pricing: cost per 1M tokens and cost scenarios

Claude Sonnet 5 is Anthropic's second Mythos-class model, and the first to land at a mid-tier price: $4/1M input, $20/1M output — squarely between Sonnet 4.6 and Opus 4.8. It scores 89.4% on SWE-bench Verified, edging out Opus 4.8's 88.6%, while carrying a 200K max output limit more than three times Sonnet 4.6's.

By the benchr team · Figures verified against official sources, July 1, 2026

Input / 1M: $4.00 (Anthropic, July 2026)
Output / 1M: $20.00 (Anthropic)
SWE-bench Verified: 89.4%
Context: 1,000,000 tokens, 200K max output

Head-to-head: for the case-by-case breakdown of what the extra dollar buys over Sonnet 4.6, and where Opus 4.8 still wins, see the sections below. The launch coverage has the fuller capability picture.

Pricing breakdown

claude-sonnet-5 — official Anthropic pricing, July 1, 2026
Tier               Rate / 1M tokens
Standard input     $4.00
Standard output    $20.00
Cached input       $0.40
Batch (50% off)    $2.00 input / $10.00 output

Limit              Value
Context window     1,000,000 tokens
Max output         200,000 tokens

A new price tier, not a new number after "Sonnet"

Claude Sonnet 5 is Anthropic's second model built on the Mythos-class architecture it introduced with Claude Fable 5 on June 9. Rather than call this "Sonnet 4.7" or "4.8", Anthropic rolled the new architecture down to a mid-tier price point: $4/1M input and $20/1M output, landing between Sonnet 4.6's $3/$15 and Opus 4.8's $5/$25. Cached input runs $0.40/1M, and the batch API cuts both rates in half to $2/$10 — the same 50% batch convention used across the rest of the Claude line.

Beating last month's flagship on one number

Claude Sonnet 5 scores 89.4% on SWE-bench Verified, edging out Claude Opus 4.8's 88.6% by 0.8 points: a mid-tier model outscoring last month's flagship on a benchmark widely used as a proxy for real-world coding-agent performance. It also posts 71.8% on SWE-bench Pro, 85.6% on Terminal-Bench 2.1, 96.0% on HumanEval, and 93.5% on MATH. The one place Opus 4.8 keeps its lead is GPQA Diamond: 93.6% against Sonnet 5's 92.0%. If your workload is dominated by graduate-level science reasoning rather than coding, that 1.6-point gap is the reason to stay on Opus.

The rest of the scorecard: LMSYS Arena 1435, MMLU 93.8%, ARC-AGI-2 20.0, and 42.5% on Humanity's Last Exam without tools — a strong general profile for a model priced below the flagship.

200K max output: the practical headline feature

Claude Sonnet 5 supports 200,000 output tokens per response — more than three times Sonnet 4.6's 64,000-token ceiling. For workloads that generate long documents, large diffs, or multi-file code changes in a single call, this removes a real constraint: tasks that used to require splitting a response across multiple Sonnet 4.6 calls now fit in one Sonnet 5 call, with the full 1,000,000-token context available to read from.
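To make the output-ceiling difference concrete, here is a minimal sketch that counts how many responses a long generation needs under each cap. The 150,000-token job is a hypothetical example, and the calculation deliberately ignores the extra input tokens a split job would spend re-sending context between calls.

```python
import math

# Max output tokens per response, as quoted in this article.
MAX_OUTPUT = {"Sonnet 4.6": 64_000, "Sonnet 5": 200_000}

def calls_needed(output_tokens: int, model: str) -> int:
    """Minimum number of responses needed to emit output_tokens."""
    return math.ceil(output_tokens / MAX_OUTPUT[model])

# Hypothetical job: a 150,000-token multi-file diff.
for model in MAX_OUTPUT:
    print(model, calls_needed(150_000, model))
```

Under these assumptions the job takes three Sonnet 4.6 responses and one Sonnet 5 response.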

Adaptive thinking, and the same safety fallback as Fable 5

Like Fable 5, Claude Sonnet 5 inherits the Mythos-class architecture's adaptive thinking: the model decides its own reasoning depth per request, and there's no manual extended-thinking toggle to configure. It also inherits the same safety-classifier behavior — offensive-cyber requests, most biology and chemistry questions, and distillation attempts are automatically routed to Claude Opus 4.8 instead of being answered directly. Budget at the Sonnet 5 rate; treat anything that falls back to Opus 4.8 pricing as the exception, not the rule.
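If you want to see how much a small fallback share moves the budget, a rough blended-rate sketch follows. It assumes fallback requests are billed at Opus 4.8's $5/$25 list rates and that the fallback share is measured by token volume; the 2% figure is purely illustrative and not a published number.

```python
# USD per 1M tokens (input, output), from the rates quoted in this article.
SONNET_5 = (4.00, 20.00)
OPUS_4_8 = (5.00, 25.00)

def blended_rates(fallback_share: float) -> tuple[float, float]:
    """Effective (input, output) rates when a share of tokens falls back to Opus 4.8.

    Assumption: fallback traffic is billed at Opus 4.8 list prices.
    """
    keep = 1.0 - fallback_share
    blended_in = keep * SONNET_5[0] + fallback_share * OPUS_4_8[0]
    blended_out = keep * SONNET_5[1] + fallback_share * OPUS_4_8[1]
    return blended_in, blended_out

# Hypothetical 2% fallback share.
rate_in, rate_out = blended_rates(0.02)
print(f"${rate_in:.2f} / ${rate_out:.2f} per 1M tokens")
```

At a 2% share the effective rates come out to roughly $4.02 input and $20.10 output, which is why budgeting at the Sonnet 5 rate is a reasonable default.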

Cost scenarios

A 1M-token session (800K input, 200K output — a single call that uses the full max-output ceiling): 800,000/1M × $4 + 200,000/1M × $20 = $3.20 + $4.00 = $7.20. Route the same session through the batch API and it drops to $1.60 + $2.00 = $3.60, exactly half.
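The same arithmetic as a small helper, in case you want to plug in your own token counts. This is a sketch built only on the list prices quoted in this article; the function and constant names are illustrative.

```python
# USD per 1M tokens, from the pricing table in this article.
INPUT_RATE = 4.00
OUTPUT_RATE = 20.00
BATCH_DISCOUNT = 0.50  # the batch API halves both rates

def session_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """USD cost of one session at Sonnet 5 list prices, without caching."""
    cost = (input_tokens / 1_000_000 * INPUT_RATE
            + output_tokens / 1_000_000 * OUTPUT_RATE)
    return cost * (1 - BATCH_DISCOUNT) if batch else cost

print(f"standard: ${session_cost(800_000, 200_000):.2f}")             # standard: $7.20
print(f"batch:    ${session_cost(800_000, 200_000, batch=True):.2f}")  # batch:    $3.60
```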

Add caching: if a stable system prompt or repo snapshot covers 720K of that 800K input (90% cache hit), the input leg becomes 720,000/1M × $0.40 + 80,000/1M × $4 = $0.288 + $0.32 = $0.608, for a session total of $0.608 + $4.00 = $4.61 — a 36% reduction from the uncached $7.20.
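A sketch of the caching calculation, parameterized by cache-hit ratio. It assumes every cached token is billed at the $0.40/1M cached-input rate and ignores any cache-write surcharge, which this article does not quote; the helper name is illustrative.

```python
# USD per 1M tokens, from the pricing table in this article.
INPUT_RATE = 4.00
CACHED_INPUT_RATE = 0.40
OUTPUT_RATE = 20.00

def cached_session_cost(input_tokens: int, output_tokens: int,
                        cache_hit_ratio: float) -> float:
    """USD cost of one session when a fraction of the input is served from cache.

    Assumption: cached tokens cost CACHED_INPUT_RATE, with no write premium.
    """
    cached = input_tokens * cache_hit_ratio
    fresh = input_tokens - cached
    input_cost = cached / 1_000_000 * CACHED_INPUT_RATE + fresh / 1_000_000 * INPUT_RATE
    output_cost = output_tokens / 1_000_000 * OUTPUT_RATE
    return input_cost + output_cost

uncached = cached_session_cost(800_000, 200_000, 0.0)
cached = cached_session_cost(800_000, 200_000, 0.9)
print(f"${cached:.2f} vs ${uncached:.2f} ({1 - cached / uncached:.0%} saved)")
```

Because output tokens are never cached, the saving is capped by the input share of the bill: even a 100% hit rate leaves the $4.00 output leg untouched.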

A typical day of usage: a coding-agent team running 25 of those 1M-token sessions a day pays 25 × $7.20 = $180/day at standard rates, or 25 × $3.60 = $90/day on batch. At a smaller, monthly scale (20M input plus 5M output tokens, the same volume used to compare Sonnet 4.6 and Opus 4.8), Sonnet 5 costs 20 × $4 + 5 × $20 = $80 + $100 = $180/month. Sonnet 4.6 at that volume is $60 + $75 = $135/month; Opus 4.8 is $100 + $125 = $225/month. Three tiers, evenly spaced in price: each step up adds $45/month at this volume.
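The three-tier comparison can be reproduced for any monthly volume with a short lookup. This is a sketch using only the list prices quoted in this article; the dictionary and function names are illustrative.

```python
# USD per 1M tokens (input, output), from the rates quoted in this article.
RATES = {
    "Sonnet 4.6": (3.00, 15.00),
    "Sonnet 5": (4.00, 20.00),
    "Opus 4.8": (5.00, 25.00),
}

def monthly_cost(model: str, input_millions: float, output_millions: float) -> float:
    """USD per month at standard (non-batch, uncached) rates."""
    rate_in, rate_out = RATES[model]
    return input_millions * rate_in + output_millions * rate_out

# 20M input + 5M output tokens per month, the volume used above.
for model in RATES:
    print(f"{model}: ${monthly_cost(model, 20, 5):.0f}/month")
```

Swap in your own input and output volumes to see how the spacing changes; output-heavy workloads widen the dollar gap between tiers because the output rates differ by $5/1M per step.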

Use-case fit

Best for: Sonnet 4.6 users who want a real capability step without paying Opus pricing; coding-agent pipelines that benefit from the 200K max output ceiling; teams whose workload is coding-heavy, where Sonnet 5's SWE-bench Verified score now edges out Opus 4.8.

Skip if: your workload leans on graduate-level science reasoning, where Opus 4.8 still leads GPQA Diamond by 1.6 points; you need the absolute reasoning ceiling regardless of price; or your traffic is high-volume and routine, where Sonnet 4.6 remains the cheaper default.

Decision checklist

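Coming from Sonnet 4.6: run your hardest coding-task eval on both. If Sonnet 5's SWE-bench-level gains show up in your results, the 33% price increase ($3/$15 to $4/$20) is easy to justify. If your responses regularly hit Sonnet 4.6's 64,000-token output cap, the 200K ceiling may justify the increase on its own.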

Comparing against Opus 4.8: if your tasks are coding-dominant, Sonnet 5 is both cheaper and higher-scoring on SWE-bench Verified. If they're reasoning-heavy in the GPQA sense, Opus 4.8's 93.6% still buys something Sonnet 5 doesn't match.

Frequently asked

How does Claude Sonnet 5 pricing compare to Sonnet 4.6 and Opus 4.8?

Claude Sonnet 5 sits exactly between them: $4/1M input and $20/1M output, versus Sonnet 4.6's $3/$15 and Opus 4.8's $5/$25. At 20M input plus 5M output tokens a month, that's $135 on Sonnet 4.6, $180 on Sonnet 5, and $225 on Opus 4.8, an even $45 step between tiers.

Does Claude Sonnet 5 actually beat Opus 4.8 on benchmarks?

On one important one, yes. Claude Sonnet 5 scores 89.4% on SWE-bench Verified against Opus 4.8's 88.6% — a mid-tier model beating last month's flagship on coding. But Opus 4.8 still leads on GPQA Diamond, 93.6% versus Sonnet 5's 92.0%, so for the hardest reasoning-heavy work Opus 4.8 remains the pick.

What is adaptive thinking and why is there no extended-thinking toggle?

Claude Sonnet 5 inherits the Mythos-class architecture introduced with Claude Fable 5: it decides its own reasoning depth per request instead of exposing a manual extended-thinking switch. It also inherits the same safety-classifier behavior — offensive-cyber, most biology and chemistry, and distillation requests fall back to Claude Opus 4.8 automatically.

Changelog

  • Published at launch. Pricing, benchmarks, context, and max-output figures verified against Anthropic's announcement and the official models documentation.

Sources

  • Anthropic API pricing — anthropic.com/pricing (verified July 1, 2026)
  • SWE-bench Verified leaderboard — swebench.com (verified July 1, 2026)
  • benchr models.json — verified July 1, 2026