Three weeks ago, Anthropic shipped its strongest model and then watched the U.S. government switch it off. Now OpenAI has done the inverse: it has shipped its frontier series with the gate built in from the start. On June 26, OpenAI previewed GPT-5.6 — Sol, Terra, and Luna — not to the public, not even to the usual API waitlist, but to roughly 20 trusted partners that the U.S. government has signed off on. If the Fable 5 saga was a recall, this is a release that began life on a leash. The two events rhyme, and they rhyme for the same reason: a June 2 Executive Order that now sits over every frontier launch in the United States.
Three models, and what each is for
GPT-5.6 is a series, not a single model. OpenAI split it three ways by cost and capability:
- Sol — the flagship. This is the model OpenAI is making frontier claims about, and the one the government review is built around.
- Terra — the balanced tier. OpenAI describes it as roughly GPT-5.5-class capability at about half the cost, which is the headline efficiency story of the release.
- Luna — the cheapest and fastest. The high-volume option for work that doesn't need the flagship.
The naming is a deliberate break from OpenAI's "Mini / Nano" suffixes. Sol, Terra, and Luna read as a tiered family rather than shrunk-down variants — a structural echo of how Anthropic ranks Opus, Sonnet, and Haiku.
Two new reasoning modes: "max" and "ultra"
Alongside the three models, GPT-5.6 introduces two reasoning modes that change how a model works rather than which model you pick. Max dials up reasoning effort — deeper, longer deliberation on a single line of work. Ultra goes wider instead of deeper: it spins up subagents to parallelize a complex task across multiple workers, which is the same architectural idea behind OpenAI's recent agentic Codex push.
What OpenAI has not said is what either mode does to your bill or your latency. Deeper reasoning and parallel subagents both tend to burn more output tokens, sometimes far more, but OpenAI has published no per-mode pricing and no token multiplier. Treat the cost of "max" and "ultra" as unknown until OpenAI documents it.
The pricing — announced, not yet official
Here is what OpenAI stated in its preview post and help center, per million tokens. We are reproducing it because it's the company's own number — but the single most important fact on this page is that, as of June 28, 2026, none of it appears on OpenAI's official pricing page. That page is what API tooling reads, and it still lists the previous generation. Until GPT-5.6 lands there, these are quotes from a blog post, not a contract.
| Model | Input ($/1M tokens) | Output ($/1M tokens) | Cached input, implied ($/1M tokens) |
|---|---|---|---|
| Sol (flagship) | $5.00 | $30.00 | ~$0.50 |
| Terra (balanced) | $2.50 | $15.00 | ~$0.25 |
| Luna (cheapest) | $1.00 | $6.00 | ~$0.10 |
The cached-input column is implied, not published: OpenAI says cache reads get the standard 90% cached-input discount, which would put Sol at roughly $0.50, Terra at $0.25, and Luna at $0.10 — but no explicit cached dollar figure has been released. Cache writes bill 1.25x the uncached input rate.
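The caching arithmetic above can be sketched in a few lines. This is a minimal illustration, not OpenAI's billing logic: the function names are hypothetical, the rates are the announced (not yet official) figures from the table, and it assumes every input token falls into exactly one of three buckets (uncached, cache read, cache write).

```python
# Announced, not yet official, GPT-5.6 prices in USD per million tokens,
# copied from the table above as (input, output).
ANNOUNCED = {
    "sol": (5.00, 30.00),
    "terra": (2.50, 15.00),
    "luna": (1.00, 6.00),
}

CACHE_READ_DISCOUNT = 0.90     # stated: cache reads get a 90% discount
CACHE_WRITE_MULTIPLIER = 1.25  # stated: cache writes bill 1.25x the input rate


def implied_cached_rate(input_rate: float) -> float:
    """Cached-input price implied by the 90% discount (USD per 1M tokens)."""
    return round(input_rate * (1 - CACHE_READ_DISCOUNT), 4)


def request_cost(tier: str, uncached_in: int, cache_read: int,
                 cache_write: int, output: int) -> float:
    """Estimated USD cost of one request.

    Assumption: the three input buckets are disjoint token counts.
    """
    input_rate, output_rate = ANNOUNCED[tier]
    total = (
        uncached_in * input_rate
        + cache_read * input_rate * (1 - CACHE_READ_DISCOUNT)
        + cache_write * input_rate * CACHE_WRITE_MULTIPLIER
        + output * output_rate
    )
    return total / 1_000_000


if __name__ == "__main__":
    for tier, (input_rate, _) in ANNOUNCED.items():
        print(tier, implied_cached_rate(input_rate))
    # Terra request: 2,000 fresh input tokens, 8,000 cache hits, 1,000 output.
    print(round(request_cost("terra", 2_000, 8_000, 0, 1_000), 6))
```

On these assumptions, a Terra request with 2,000 uncached input tokens, 8,000 cache-read tokens, and 1,000 output tokens comes to about $0.022, and most of that is output.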
The sticker prices are most legible against the rest of the field, so here is where each tier lands. Sol's $5 / $30 is the identical sticker to GPT-5.5 ($5 / $30) — OpenAI is holding the flagship line on price while claiming to move it on capability. Terra's $2.50 / $15 matches GPT-5.4, which is what makes the "GPT-5.5-class at half the cost" pitch interesting if it holds up. Luna's $1 / $6 threads between GPT-5 ($1.25 / $10) and GPT-5 Mini ($0.25 / $2). For external anchors: Claude Opus 4.8 is $5 / $25, Gemini 3.1 Pro is $2 / $12, and DeepSeek V4-Pro is $0.435 / $0.87. The full breakdown, with the caching math and these caveats restated, lives on the GPT-5.6 pricing page.
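To see how those sticker prices compare on an actual bill, here is a rough sketch that prices one workload against every model named above. The workload size (10M input tokens, 2M output tokens) is a made-up example, the helper names are hypothetical, and caching is ignored. The GPT-5.6 rows remain announced figures rather than official ones.

```python
# USD per million tokens as (input, output), as quoted in the paragraph above.
# GPT-5.6 rows are announced figures, not yet on the official pricing page.
PRICES = {
    "GPT-5.6 Sol": (5.00, 30.00),
    "GPT-5.6 Terra": (2.50, 15.00),
    "GPT-5.6 Luna": (1.00, 6.00),
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.4": (2.50, 15.00),
    "GPT-5": (1.25, 10.00),
    "GPT-5 Mini": (0.25, 2.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Gemini 3.1 Pro": (2.00, 12.00),
    "DeepSeek V4-Pro": (0.435, 0.87),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of a workload at list price, ignoring caching."""
    input_rate, output_rate = PRICES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """All models sorted from cheapest to most expensive for the workload."""
    costs = {m: workload_cost(m, input_tokens, output_tokens) for m in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical monthly workload: 10M input tokens, 2M output tokens.
    for model, cost in rank_by_cost(10_000_000, 2_000_000):
        print(f"{model:<18} ${cost:,.2f}")
```

For that example workload, Terra comes out at half of GPT-5.5's cost, which is just the "half the cost" pitch in dollar terms. Whether the capability matches is the part the price list cannot tell you.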
Who can actually use it: the ~20-partner gate
This is the part that separates GPT-5.6 from a normal OpenAI launch. The preview is not open. OpenAI is releasing it to roughly 20 trusted partners that the U.S. government has approved, reachable through the OpenAI API and Codex. General availability is promised "in the coming weeks," with ChatGPT access arriving later still. If you are not one of those vetted partners, you cannot run Sol, Terra, or Luna today — not on the API, not in ChatGPT, not anywhere.
OpenAI is explicit about why. It says the restriction is in place "at the request of the U.S. government," citing national security, and ties it to the June 2, 2026 Executive Order on "covered frontier models." Under that order, a model that meets a classified cyber-capability threshold can be designated a "covered frontier model," which triggers a pre-release government review of up to 30 days. GPT-5.6 Sol went through that process. OpenAI also went on record saying this kind of gating should not become the default for frontier releases, a notable thing to say while complying with it.
The benchmarks OpenAI published
At the preview, the numbers were a black box — but OpenAI's GPT-5.6 preview system card publishes the scorecards, and they back the headline claim. On Terminal-Bench 2.1, the agentic command-line benchmark, Sol sets a new state of the art — and in ultra mode it pulls clear of the field. Here is the full ranking, reproduced from OpenAI's own charts.
Terminal-Bench 2.1
Agentic command line · higher is better
Source: OpenAI GPT-5.6 preview system card. Purple = GPT-5.6 family; Sol's ultra mode runs subagents in parallel.
The other two benchmarks are efficiency frontiers — they plot score against the output tokens spent, so a curve that sits higher and further left is doing more with less. On both, Sol leads, and the three GPT-5.6 tiers fan out by how much reasoning each is willing to spend.
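The "higher and further left" reading can be made concrete with a small sketch of how an efficiency frontier is extracted from (output tokens, score) points. The function name and the sample points in the usage note are illustrative placeholders, not values from OpenAI's system card.

```python
def efficiency_frontier(points: list[tuple[int, float]]) -> list[tuple[int, float]]:
    """Return the runs that form the efficiency frontier.

    Each point is (output_tokens, score). A run is kept only if it scores
    strictly higher than every run that spent the same or fewer tokens.
    """
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; on a token tie, try the
    # higher score first so the weaker run is discarded.
    for tokens, score in sorted(points, key=lambda p: (p[0], -p[1])):
        if score > best_score:
            frontier.append((tokens, score))
            best_score = score
    return frontier
```

Given made-up runs such as `[(1000, 10.0), (2000, 9.0), (2000, 15.0), (4000, 15.0), (8000, 20.0)]`, the function keeps `(1000, 10.0)`, `(2000, 15.0)`, and `(8000, 20.0)`. The other two runs spend at least as many tokens without scoring higher, so they fall below the curve.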
GeneBench v1
Biology · score vs output tokens
Peaks: Sol ~30.7% · Terra ~28.3% · GPT-5.5 ~23% · Luna ~14.5%. Curves reproduced from OpenAI's preview system card; the point is the shape: Sol reaches the top scores spending the fewest output tokens.
ExploitGym
Cyber · intended exploits vs output tokens (6h limit)
Peaks (6-hour limit): Sol ~33.7% · Terra ~23.3% · GPT-5.5 ~15.2% · Luna ~12.4% · GPT-5.4 ~7%. Dashed = the 6-hour-budget frontier; reproduced from OpenAI's preview system card.
So the SOTA claim holds up: Sol leads Terminal-Bench 2.1 outright, and ultra mode pushes its score to 91.9%. Read the field carefully, though: Terra lands at 84.3%, exactly tying Claude Fable 5 and barely ahead of GPT-5.5's 83.4%, while Luna's 82.5% slots just below GPT-5.5. The series wins at the top; in the middle, the field is bunched within a few points.
What's still missing matters as much as the scores. Even with the benchmarks out, OpenAI has not published the context window, the maximum output length, or the exact API model IDs for any of the three models. Third-party blogs floating "gpt-5.6-sol / terra / luna" identifiers and a 1.5M-token context are unconfirmed — we will not state them as fact, and neither should anyone planning around this release. For a series being pitched on efficiency and agentic work, a missing context window is not a footnote; it's a core spec you can't yet plan against — which is also why benchr holds GPT-5.6 out of the interactive ranking tool until SWE-bench Verified and a context window are official.
What it means for you
For almost everyone reading this, the practical answer today is: you can't use GPT-5.6, so don't rebuild around it yet. The preview is a ~20-partner, government-approved program; "in the coming weeks" is the only timeline, and frontier timelines have slipped before. Even when GA arrives, the export-control backdrop means non-U.S. developers should not assume automatic access — the Fable 5 episode showed how fast a government order can carve out foreign nationals.
If you're an existing OpenAI shop, the useful read is the price structure, not the access. Terra at GPT-5.4 pricing with claimed GPT-5.5-class capability is the tier to watch, because if it ships at that price it's a genuine cost cut on capable inference. But wait for two things before you budget: the prices appearing on the official pricing page, and a published context window. Until both exist, GPT-5.6 is an announcement, not a tool you can deploy.