Review·May 2026

DeepSeek-V4, reviewed

An MIT-licensed model that codes like a paid one and costs nothing to download. The real decision is how you run it.

By the benchr team · Reviewed May 30, 2026 · Figures verified against official sources, May 30, 2026

benchr rating: 4.5 / 5

Weights license: $0 (MIT open weights, free to download and self-host)
SWE-bench Verified: 80.6% (V4-Pro, DeepSeek-reported, not yet reproduced)
Context window: 1M (1,000,000 tokens, 384K max output)
V4-Flash output per 1M tokens: $0.28 (hosted API, versus dollars per 1M for closed frontier models)

Here's the pitch in one line: you can download DeepSeek-V4 for free, point it at your repository, and get coding output that DeepSeek says tops the open-source field. The weights ship under the MIT license. There's no per-seat fee, no API key required to start, no terms that say you can't use it commercially. For a model competing with closed frontier coders that bill by the token, "free to download" is the headline, and it's true.

DeepSeek announced V4 as a preview on April 24, 2026, per its official launch note, in two open-weight sizes: DeepSeek-V4-Pro, the large flagship, and DeepSeek-V4-Flash, the smaller default. Both are mixture-of-experts models with a 1M-token context and 384K max output. Both are on Hugging Face under MIT. So the interesting question stopped being "is it good" and became "what does it cost you to run," because that answer changes completely depending on which of three paths you pick.

This review treats that as the decision. Self-host the open weights, call DeepSeek's hosted API, or keep paying a closed frontier model. The facts below come from DeepSeek's official API docs, pricing page, and model cards, with one flag you should read first: V4 is a preview release, and the coding score everyone quotes is DeepSeek's own number, not yet independently reproduced.

Which version this is

The current DeepSeek to care about is V4, not the V3.x line. V3.1 shipped in August 2025 and V3.2 in December 2025; both are superseded by the V4 preview in the official API changelog. When you provision the API, the model IDs are deepseek-v4-pro and deepseek-v4-flash. The older deepseek-chat and deepseek-reasoner names still work but now route to V4-Flash's non-thinking and thinking modes, and they're slated for deprecation, so write new integrations against the V4 names.
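The routing described above can be written down as a small lookup, which is handy when auditing an older integration. This is a minimal sketch: the model IDs and the legacy-to-V4 mapping come from the paragraph above, while the function name `resolve_model` and its return shape are hypothetical and not part of any DeepSeek SDK.

```python
from typing import Optional, Tuple

# Current V4 model IDs, as named in the review.
V4_MODELS = {"deepseek-v4-pro", "deepseek-v4-flash"}

# Legacy names and the V4-Flash mode each one now routes to (per the review).
LEGACY_ROUTES = {
    "deepseek-chat": ("deepseek-v4-flash", "non-thinking"),
    "deepseek-reasoner": ("deepseek-v4-flash", "thinking"),
}


def resolve_model(name: str) -> Tuple[str, Optional[str], bool]:
    """Return (v4_model_id, mode, is_legacy) for a configured model name.

    Hypothetical helper: `mode` is None for names that are already V4 IDs,
    because the review does not say how a mode is selected on those IDs.
    """
    if name in V4_MODELS:
        return name, None, False
    if name in LEGACY_ROUTES:
        model, mode = LEGACY_ROUTES[name]
        return model, mode, True
    raise ValueError(f"unknown DeepSeek model name: {name!r}")
```

A config check can flag every name where `is_legacy` is true, since those are the names slated for deprecation.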

One honest caveat before the benchmarks: V4 is labeled a preview. Whether a non-preview stable build exists yet isn't confirmed in the official sources. If you're shipping to production, treat the preview tag as a reason to pin your version and watch the changelog, not as a blocker.

Codes like a paid model, on DeepSeek's own numbers

The headline coding result is 80.6% on SWE-bench Verified for DeepSeek-V4-Pro, which DeepSeek presents as open-source state of the art on agentic coding. That's the number doing the heavy lifting in the "competes with paid frontier" claim, and it's worth being precise about where it comes from. It's a vendor-reported figure, drawn from DeepSeek's technical report and model card. As of late May 2026 it had not been independently reproduced on a public leaderboard. So it's a strong signal, not a settled fact.

What's not in dispute is the shape of the lineup. V4-Pro is the large model (announced at 1.6T total parameters, around 49B active) for the hardest work; V4-Flash is the smaller default (announced at 284B total, around 13B active) for volume. Both carry the 1M-token context, which matters for coding because a million tokens is enough to hold a real codebase in the prompt instead of stitching it together with retrieval. benchr's survey of the open-weight tier right now covers how V4 stacks against the other free-to-download options, and the Qwen review is the natural cross-shop if you're choosing between open coding models.
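Whether your own codebase actually fits in a 1M-token prompt is easy to estimate before you try it. The sketch below walks a directory and converts characters to tokens with a rough four-characters-per-token heuristic. That ratio is an assumption, not DeepSeek's tokenizer, and the function names and the extension list are hypothetical, so treat the result as a ballpark.

```python
import os

CONTEXT_TOKENS = 1_000_000  # context window quoted in the review
CHARS_PER_TOKEN = 4         # ASSUMPTION: rough heuristic, not the real tokenizer

# Hypothetical default set of source-file extensions to count.
SOURCE_EXTENSIONS = (".py", ".js", ".ts", ".go", ".rs", ".java", ".md")


def estimate_repo_tokens(root, extensions=SOURCE_EXTENSIONS):
    """Estimate the token count of matching text files under `root`."""
    total_chars = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git by pruning them in place.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    total_chars += len(fh.read())
            except OSError:
                continue  # unreadable file: leave it out of the estimate
    return total_chars // CHARS_PER_TOKEN


def fits_in_context(root, reserved_tokens=0):
    """True if the estimate plus any tokens you reserve fits in the window."""
    return estimate_repo_tokens(root) + reserved_tokens <= CONTEXT_TOKENS
```

`reserved_tokens` is there for instructions and the expected reply; how the 384K output limit counts against the window is not stated in the sources, so it is left to the caller.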

Three cost paths, one model

This is the decision. The same model, run three ways, has wildly different economics. Below is what each path costs and what you trade for it.

Three ways to run DeepSeek-V4, May 2026. API rates from the official pricing page.
Path	What you pay	What you trade
Self-host (open weights)	$0 license + your hardware	A multi-GPU server and the ops to keep it running; pays off only at high, steady volume or when data can't leave your network
Hosted DeepSeek API	V4-Flash $0.14 in / $0.28 out; V4-Pro $0.435 in / $0.87 out (per 1M)	No control over the infra; data goes to DeepSeek; V4-Pro rate is currently promotional (ends May 31 UTC), though DeepSeek's changelog says the post-promo official price will be the same level
Closed frontier model	Dollars per 1M output, often $15–$25	The most money, for independently tested accuracy and vendor support you don't have to reproduce yourself
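The hosted-API row reduces to simple arithmetic once you know your token volume. The sketch below uses the rates from the table (input rates are the cache-miss prices); the function name `hosted_cost_usd` and the example workload are hypothetical, chosen only to illustrate the calculation.

```python
# USD per 1M tokens as (input, output), from the table above.
RATES_PER_1M = {
    "deepseek-v4-flash": (0.14, 0.28),
    "deepseek-v4-pro": (0.435, 0.87),
}


def hosted_cost_usd(model, input_tokens, output_tokens):
    """Hosted API cost in USD for a given token volume, rounded to cents."""
    in_rate, out_rate = RATES_PER_1M[model]
    cost = (input_tokens / 1_000_000) * in_rate
    cost += (output_tokens / 1_000_000) * out_rate
    return round(cost, 2)


if __name__ == "__main__":
    # Hypothetical monthly workload: 500M input tokens, 100M output tokens.
    workload = {"input_tokens": 500_000_000, "output_tokens": 100_000_000}
    for model in RATES_PER_1M:
        print(model, hosted_cost_usd(model, **workload))
```

For that hypothetical workload the sketch gives about $98 a month on V4-Flash and about $304.50 on V4-Pro, before any cache-hit savings.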

Look at the middle row first, because it's where most people should land. V4-Flash bills $0.28 per million output tokens on the hosted API. A closed frontier coder like Claude Opus 4.8 bills $25 per million output. Same kind of task, and the gap is not a few percent.

89×: roughly how much cheaper DeepSeek-V4-Flash's hosted output ($0.28/1M) is than a closed frontier model billing $25/1M. The price gap is the whole story.

Even the flagship narrows the gap without closing it: V4-Pro at $0.87 per million output is still roughly 29 times cheaper than a $25 closed model. One thing to understand about the V4-Pro rates: they reflect a 75% promotional discount scheduled to end at 15:59 UTC on May 31, 2026. DeepSeek's own changelog says that after the promotion closes, pricing will be "officially adjusted to 1/4 of the original price," which is exactly the promotional rate in effect today. The promo price becomes the official price; these rates are not scheduled to rise. Still verify the live number on the pricing page before you build a budget on it, in case anything changes.
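The two multiples and the promotion arithmetic can be checked in a few lines. This is a sketch using only the figures quoted in this review; the helper names are hypothetical, and the "original" V4-Pro price is implied by the 75% discount rather than quoted from DeepSeek.

```python
import math

CLOSED_OUT = 25.00  # USD per 1M output, closed frontier example in the review
FLASH_OUT = 0.28    # USD per 1M output, V4-Flash hosted
PRO_OUT = 0.87      # USD per 1M output, V4-Pro hosted (promotional)


def times_cheaper(reference_rate, rate):
    """How many times cheaper `rate` is than `reference_rate`."""
    return reference_rate / rate


def implied_original(promo_rate, discount=0.75):
    """Pre-discount price implied by a promo rate (derived, not quoted)."""
    return promo_rate / (1 - discount)


flash_multiple = times_cheaper(CLOSED_OUT, FLASH_OUT)  # about 89.3
pro_multiple = times_cheaper(CLOSED_OUT, PRO_OUT)      # about 28.7

# "1/4 of the original price" applied to the implied original
# lands back on the promotional rate.
post_promo = implied_original(PRO_OUT) / 4
assert math.isclose(post_promo, PRO_OUT)
```

A 75% discount and "1/4 of the original price" are the same number, which is why the rate does not move when the promotion ends.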

When self-hosting makes sense

"Free to download" is real, but free weights are not the same as free to run. Self-hosting V4-Pro means standing up a multi-GPU box big enough to hold a 1.6T-parameter mixture-of-experts model, plus the engineering to keep it serving. At low or bursty volume, the hosted API at cents per million tokens costs less than your electricity alone, before you count the hardware. benchr's guide to running models on your own machine walks through where that line sits for open weights this size, and the short version is that it sits a lot higher than people expect.

Self-hosting wins in two cases. The first is data that legally can't leave your network, where the hosted API is off the table regardless of price. The second is very high, very steady volume, where amortizing fixed hardware beats per-token billing. Outside those, the MIT license is best understood as insurance: it means you can take the weights and run, so you're never locked in, even if you never exercise the option.
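The volume case can be framed as a break-even: the monthly token count at which a fixed self-hosting bill equals what the hosted API would charge. The sketch below assumes a single blended API rate built from the V4-Pro prices in this review; the function name, the output share, and any monthly server cost you plug in are hypothetical, since the review gives no hardware figures.

```python
def breakeven_tokens_per_month(monthly_fixed_usd, in_rate, out_rate, output_share):
    """Monthly tokens at which hosted API spend equals a fixed self-host cost.

    monthly_fixed_usd: ASSUMED all-in monthly cost (amortized hardware,
                       power, ops). Supply your own figure.
    in_rate, out_rate: hosted API prices in USD per 1M tokens.
    output_share:      fraction of all tokens that are output (0 to 1).
    """
    blended_per_1m = in_rate * (1 - output_share) + out_rate * output_share
    return monthly_fixed_usd / blended_per_1m * 1_000_000


if __name__ == "__main__":
    # Hypothetical: $20,000/month all-in server cost, 20% of tokens are output.
    tokens = breakeven_tokens_per_month(20_000, 0.435, 0.87, 0.20)
    print(f"break-even: {tokens / 1e9:.1f}B tokens per month")
```

With those hypothetical inputs the break-even lands around 38 billion tokens a month. Below that volume the API is cheaper, which is the sense in which self-hosting pays off only at very high, steady usage.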

When to keep paying for closed frontier

The case for a paid closed model is narrower than it was a year ago, but it isn't gone. You're paying for accuracy that's been independently tested rather than vendor-reported, for support when something breaks, and for the smaller variance that comes with a heavily-exercised production model. If a missed bug on your codebase is expensive, that premium can be worth it, and the 80.6% being DeepSeek's own number is exactly the kind of thing that argues for caution on the most critical work.

For sizing that decision by workload rather than sticker price, benchr's price-per-use-case breakdown is the right tool: it maps which model wins once you account for how much you call it, which is where an 89× per-token gap either dominates or barely registers.

The verdict

DeepSeek-V4 is the strongest argument yet that a free, open-weight model can sit in the same coding conversation as the paid frontier. The 4.5 here reflects that: the value is genuine, the price gap is enormous, and the one thing holding it back from a higher mark is that the headline coding score is still DeepSeek's own number and the flagship pricing is mid-promotion.

Go with the hosted API if you're most people: it's cents per million tokens, drop-in compatible with OpenAI and Anthropic interfaces, and removes the operations burden entirely. Self-host the open weights when data can't leave your network or when volume is high enough to amortize a GPU server. Stick with a closed frontier model when you need accuracy that's been tested by someone other than the vendor, and a single vendor-reported benchmark figure isn't enough to bet a production codebase on. The decision is no longer about whether an open model can code. It's about which of three bills you'd rather pay.

Frequently asked

Is DeepSeek-V4 free?

The weights are free. DeepSeek-V4-Pro and DeepSeek-V4-Flash are published under the MIT license on Hugging Face, so you can download them and self-host at no licensing cost. The free web and app chat at chat.deepseek.com also lets you use V4 without paying. What is not free is the hosted API, billed per token, and the hardware you need to run the open weights yourself.

How good is DeepSeek-V4 at coding?

DeepSeek reports DeepSeek-V4-Pro scores 80.6% on SWE-bench Verified, which it presents as open-source state of the art on agentic coding. Treat that as a vendor-reported figure: it comes from DeepSeek's own technical report and model card and had not been independently reproduced as of late May 2026. It puts V4-Pro in the conversation with closed frontier coding models, but you should benchmark it on your own repository before committing.

What does the DeepSeek API cost?

On the official pricing page, V4-Flash is the default at $0.14 per million input tokens on a cache miss and $0.28 per million output. V4-Pro, the flagship, is $0.435 per million input on a cache miss and $0.87 per million output. These V4-Pro rates reflect a 75% promotional discount ending at 15:59 UTC on May 31, 2026. DeepSeek's changelog states that after the promotion, pricing will be "officially adjusted to 1/4 of the original price" — the same level as today's rate, so the promo price becomes the official price. Still confirm the live rate on the pricing page before you build a budget on it.

What is the difference between DeepSeek-V4-Pro and V4-Flash?

Both are open-weight mixture-of-experts models with a 1M-token context and 384K max output. V4-Pro is the large flagship (announced at 1.6T total parameters, around 49B active) aimed at the hardest coding and reasoning work. V4-Flash is the smaller, cheaper default (announced at 284B total, around 13B active) for high-volume work where latency and price matter more than topping a benchmark. The legacy deepseek-chat and deepseek-reasoner API names now map to V4-Flash's non-thinking and thinking modes.

Should I self-host DeepSeek-V4 or use the API?

Use the hosted API unless you have a specific reason not to. Self-hosting a 1.6T-parameter model means a multi-GPU server and the engineering to keep it running, which only pays off at very high, steady volume or when data cannot leave your network. For everyone else, the API costs cents per million tokens and removes the operations burden. Reach for the open weights when control or privacy is the requirement, not raw price.

Changelog

May 30, 2026 — Originally published and corrected same day. Version, pricing, context window, and vendor-reported coding score verified against DeepSeek's official API changelog (api-docs.deepseek.com/updates), pricing page, launch note (news260424), and the deepseek-ai Hugging Face model cards. The 80.6% SWE-bench Verified figure is labeled DeepSeek-reported and not yet independently reproduced. A prior version incorrectly stated V4-Pro prices would "rise" after May 31; corrected to reflect DeepSeek's changelog language: after the 75% promotional discount ends at 15:59 UTC May 31, pricing is "officially adjusted to 1/4 of the original price" — the same rate as today's promotional level, not a price increase.

References

DeepSeek, "API updates / changelog," api-docs.deepseek.com/updates, accessed May 2026.
DeepSeek, "Models & pricing," api-docs.deepseek.com/quick_start/pricing, accessed May 2026.
DeepSeek, "DeepSeek-V4 announcement," api-docs.deepseek.com/news/news260424, April 24, 2026.
DeepSeek, home, deepseek.com, accessed May 2026.
DeepSeek, "DeepSeek-V4-Pro model card," huggingface.co/deepseek-ai/DeepSeek-V4-Pro, accessed May 2026.
DeepSeek, "DeepSeek-V4-Flash model card," huggingface.co/deepseek-ai/DeepSeek-V4-Flash, accessed May 2026.