Which AI model is the best overall in 2026?

By capability, Claude Fable 5 and Claude Opus 4.8 lead. Switch the rating to its Value lens — which rewards low price — and cheaper frontier-open models like DeepSeek V4-Pro rise to the top. All rankings are computed from data and are never paid placements.

Which AI model is cheapest?

DeepSeek V4-Flash costs $0.14 per million input tokens and $0.28 per million output tokens — the cheapest commercially available API in this set. Llama 4 Scout and Maverick are free under Meta's community license if you self-host.
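
As a rough illustration of what those per-million rates mean in practice, here is a minimal sketch of the cost arithmetic. The function name `apiCost` and the token volumes are hypothetical; only the two DeepSeek V4-Flash prices come from the answer above.

```javascript
// Cost in USD for a token volume at per-million-token prices.
function apiCost(inputTokens, outputTokens, inputPerMillion, outputPerMillion) {
  return (inputTokens / 1e6) * inputPerMillion + (outputTokens / 1e6) * outputPerMillion;
}

// Hypothetical monthly usage: 10M input tokens and 2M output tokens
// at the DeepSeek V4-Flash rates quoted above ($0.14 in, $0.28 out).
const monthlyCost = apiCost(10e6, 2e6, 0.14, 0.28);
console.log(monthlyCost.toFixed(2)); // "1.96"
```

Self-hosted models priced as Free have no per-token API charge, so this sketch would return 0 for them; hosting costs are outside what it models.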

How often are model rankings updated?

benchr updates the data when new models ship or prices change, typically same-day for major frontier releases. Every entry in models.json carries an 'updated' timestamp. The goal is to be the fastest neutral reference, not a slow quarterly report.

Are these rankings paid or sponsored?

No. benchr rankings are computed entirely from data — capability scores, benchmark results, and pricing. No provider has paid for placement. The methodology is documented on this page and the formula runs in open JavaScript.

Rankings · Updated June 2026

AI model rankings

Name: benchr AI Model Rankings
Creator: benchr
License: https://creativecommons.org/licenses/by/4.0/

26 models across the frontier, mid, small, and open-weight tiers. Ranked by capability by default; flip the “Rank by” toggle to a Value lens that factors in price. Capability is benchr's synthesis of coding, reasoning, and writing, not an independent lab score. Sort any column, filter by type, license, or provider country, and open any model for the full review.

Data from models.json · Rankings computed from data, never paid placements

#	Model	Provider	Tier	benchr Rating	SWE-bench %	Input $/1M	Output $/1M	Context	Tok/s	Released
1	Claude Fable 5	Anthropic	frontier	9.8	91.0%	$10.00	$50.00	1M	58 tok/s	Jun 2026
2	Claude Sonnet 5	Anthropic	frontier	9.5	89.4%	$2.00	$10.00	1M	75 tok/s	Jul 2026
3	Claude Opus 4.8	Anthropic	frontier	9.5	88.6%	$5.00	$25.00	1M	68 tok/s	May 2026
4	Claude Opus 4.7	Anthropic	frontier	9.5	87.6%	$5.00	$25.00	1M	65 tok/s	Apr 2026
5	GPT-5.6	OpenAI	frontier	9.4	89.8%	$5.00	$30.00	1.1M	85 tok/s	Jul 2026
6	GPT-5.5	OpenAI	frontier	9.3	84.0%	$5.00	$30.00	1.1M	82 tok/s	Apr 2026
7	DeepSeek V4-Pro	DeepSeek	frontier open	9.1	80.6%	$0.435	$0.870	1M	95 tok/s	Apr 2026
8	Grok 4.5	xAI	frontier	9.1	84.0%	$2.00	$6.00	500K	135 tok/s	Jul 2026
9	GPT-5.4	OpenAI	frontier	9.0	80.0%	$2.50	$15.00	1M	84 tok/s	Mar 2026
10	Claude Sonnet 4.6	Anthropic	mid	8.8	79.6%	$3.00	$15.00	1M	95 tok/s	Feb 2026
11	GPT-5	OpenAI	frontier	8.8	74.9%	$1.25	$10.00	400K	90 tok/s	Aug 2025
12	GLM-5.2	Z.AI	frontier open	8.8	83.0%	$1.40	$4.40	1M	95 tok/s	Jun 2026
13	MiniMax M3	MiniMax	frontier open	8.7	82.0%	$0.300	$1.20	1M	110 tok/s	Jun 2026
14	Gemini 3.5 Flash	Google	mid	8.6	80.6%	$1.50	$9.00	1.0M	289 tok/s	May 2026
15	Gemini 3.1 Pro	Google	frontier	8.6	80.6%	$2.00	$12.00	1M	108 tok/s	Feb 2026
16	Mistral Medium 3.5	Mistral	frontier open	8.6	77.6%	$1.50	$7.50	256K	105 tok/s	Apr 2026
17	Qwen3.6-27B	Alibaba	open	8.6	77.2%	Free	Free	262K	105 tok/s	Apr 2026
18	DeepSeek V4-Flash	DeepSeek	open	8.5	79.0%	$0.140	$0.280	1M	135 tok/s	Apr 2026
19	Grok 4.3	xAI	frontier	8.2	68.0%	$1.25	$2.50	1M	120 tok/s	Apr 2026
20	Llama 4 Maverick	Meta	frontier open	8.0	66.0%	Free	Free	1M	120 tok/s	Apr 2025
21	Kimi K2.6	Moonshot AI	frontier open	7.8	80.2%	$0.950	$4.00	262K	100 tok/s	Apr 2026
22	Mistral Large 3	Mistral	frontier open	7.8	62.0%	$0.500	$1.50	256K	115 tok/s	Dec 2025
23	Claude Haiku 4.5	Anthropic	small	7.6	73.3%	$1.00	$5.00	200K	145 tok/s	Oct 2025
24	Phi-4	Microsoft	small open	7.4	30.0%	Free	Free	16K	220 tok/s	Dec 2024
25	GPT-5 Mini	OpenAI	small	7.3	48.0%	$0.250	$2.00	400K	160 tok/s	Aug 2025
26	Llama 4 Scout	Meta	open	7.3	56.0%	Free	Free	10M	180 tok/s	Apr 2025

How the benchr Rating works

The benchr Rating is a single number, and the Rank by toggle decides what it means. By default it is pure capability: what the model can do, independent of price. Switch to Value and price is folded in, rewarding cheap and free APIs. It is not a poll, and no provider has paid for its position. The capability score is benchr's own editorial read, so treat it as an opinion with its math shown, not a lab measurement.

Quality — the default

Pure capability, the same composite the rest of the site uses. It is built from benchmark results and the public record, not from marketing materials, and the inputs are inspectable in models.json.

capability = (coding × 0.40) + (reasoning × 0.40) + (writing × 0.20)
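
The weighting above can be sketched in a few lines of JavaScript. This is an illustration, not the code in assets/js/models.js: the function name `capabilityScore`, the object shape, and the assumption that each sub-score is on a 0–100 scale are ours, and the example sub-scores are made up rather than taken from models.json.

```javascript
// Weights copied from the capability formula; they sum to 1.0.
const CAPABILITY_WEIGHTS = { coding: 0.40, reasoning: 0.40, writing: 0.20 };

// Weighted sum of the three sub-scores (assumed to be on a 0-100 scale).
function capabilityScore({ coding, reasoning, writing }) {
  return (
    coding * CAPABILITY_WEIGHTS.coding +
    reasoning * CAPABILITY_WEIGHTS.reasoning +
    writing * CAPABILITY_WEIGHTS.writing
  );
}

// Hypothetical sub-scores: 36 + 34 + 16 = 86.
const exampleCapability = capabilityScore({ coding: 90, reasoning: 85, writing: 80 });
console.log(exampleCapability.toFixed(1)); // "86.0"
```

Because coding and reasoning each carry twice the weight of writing, a 10-point gain in either moves the composite by 4 points, while the same gain in writing moves it by 2.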

Value — the optional lens

Capability blended with price efficiency, for "most capability per dollar." Price is the blended API rate (the average of input and output price per million tokens). Free and self-hosted models score 100 on price; paid models score 100 at a blended rate of $0.50 or less, falling linearly to 0 at $30.00 or more.

blended = (input_per_million + output_per_million) / 2
price_score = max(0, min(100, 100 × (1 − max(0, blended − 0.50) / 29.50)))
value_score = round(capability × 0.65 + price_score × 0.35)

Both run in assets/js/models.js, so you can read and verify them yourself. Each produces a 0–100 value shown on a 0–10 scale, and price is shown directly in the Input/Output columns regardless of mode. For verified official pricing and benchmark figures, see model-figures.json.
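
For readers who want to check a Value score by hand, here is a minimal sketch of the three Value formulas in JavaScript. It is not copied from assets/js/models.js: the function names are hypothetical, the `toDisplay` step (divide by 10, one decimal) is our assumption about how the 0–100 value maps to the 0–10 scale, and the capability of 91 for DeepSeek V4-Pro is inferred from its 9.1 rating rather than read from models.json.

```javascript
// Blended API rate: average of input and output price per million tokens.
function blendedPrice(inputPerMillion, outputPerMillion) {
  return (inputPerMillion + outputPerMillion) / 2;
}

// 100 at a blended rate of $0.50 or less (free counts as 0), 0 at $30.00 or more.
function priceScore(blended) {
  const overFloor = Math.max(0, blended - 0.50);
  return Math.max(0, Math.min(100, 100 * (1 - overFloor / 29.50)));
}

// 65% capability (0-100) plus 35% price score, rounded to an integer.
function valueScore(capability, inputPerMillion, outputPerMillion) {
  const price = priceScore(blendedPrice(inputPerMillion, outputPerMillion));
  return Math.round(capability * 0.65 + price * 0.35);
}

// Assumed display step: show the 0-100 value on a 0-10 scale.
function toDisplay(score) {
  return (score / 10).toFixed(1);
}

// DeepSeek V4-Pro: prices from the table, capability 91 is an assumption.
// blended = 0.6525, price_score is about 99.48, value = round(59.15 + 34.82) = 94.
const deepseekValue = valueScore(91, 0.435, 0.870);
console.log(deepseekValue, toDisplay(deepseekValue)); // 94 "9.4"
```

Under these assumptions a cheap model gains a few points in Value mode, while a model priced at a blended $30.00 or more receives no price credit and keeps only 65% of its capability score.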

Methodology as of June 1, 2026. The formula may be revised as the model landscape changes; check the changelog for updates.

Frequently asked questions

What is the benchr Rating?

By default it's pure capability (coding, reasoning, writing). A “Rank by” toggle switches it to a Value lens that also folds in price efficiency (most capability per dollar). Both run in open JavaScript you can read in assets/js/models.js; price is shown in its own columns too.

Which AI model is best in 2026?

It depends on your budget and task. For no-budget-limit capability: Claude Fable 5 or Claude Opus 4.8. For the best capability-per-dollar: DeepSeek V4-Pro or Gemini 3.5 Flash. For free self-hosted: Llama 4 Maverick. Use the model recommender to get a personalized pick.

Are rankings ever paid or sponsored?

No. Rankings are computed purely from data in models.json. No provider has paid for placement. See editorial standards for the full policy.

How often is the data updated?

The goal is same-day updates when major models ship or providers change prices. The updated field in models.json shows the last data refresh. Spot an error? File a correction.

Why are benchmark scores labeled as "editorial estimates"?

Many benchmark figures aren't comparable across providers — test sets differ, conditions differ, and some numbers are self-reported. benchr's capability scores are built from available benchmark data but treated as estimates rather than certified figures. The methodology page explains the process. For verified official figures, see model-figures.json.

Other tools

Charts → Intelligence-vs-price quadrant and weighted benchmark explorer
Cost calculator → Enter your token usage, get monthly cost ranked cheapest-first
Model recommender → Answer three questions, get your best-fit pick with a reason
Side-by-side compare → Pick up to five models and compare every dimension
Pricing Index → View complete model pricing and cost optimization guide
Cheapest Leaderboard → Compare cheapest AI models by token cost
Coding Leaderboard → Rank coding models by SWE-bench Verified
Reasoning Leaderboard → Rank models by GPQA and reasoning capabilities
Context Window Leaderboard → Rank models by context token sizes