Anthropic invalid_request_error: meaning, cause, and fix

The 400 that changed meaning in 2026. Your JSON is probably fine; the parameters and tricks that worked on older Claude models are what's failing.

By the benchr team · · Verified against Anthropic's API error documentation, June 12, 2026

AnthropicHTTP 400severity: mediumrequest format

Three new ways to fail in 2026

1. Sampling parameters on Opus 4.7 and later

Anthropic deprecated temperature, top_p, and top_k on Claude Opus 4.7 and everything after it, Opus 4.8 included. Set any of them to a non-default value and the call fails with a 400 instead of the field being quietly ignored. The fix is removal, not tuning: drop the parameters and steer variability through the prompt.

2. Prefilled assistant messages

Ending the conversation with a partial assistant turn used to be the go-to trick for forcing output shape. Current models reject it. Claude Fable 5, Claude Mythos 5, Claude Mythos Preview, Claude Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6 all return a 400 with this exact message:

"Prefilling assistant messages is not supported for this model."

Anthropic's docs name three replacements: structured outputs, system-prompt instructions, or output_config.format. If your code starts replies with { to coax out JSON, that's the line to delete.

3. Edited thinking blocks

Extended thinking comes with a strict round-trip rule. If thinking or redacted_thinking blocks in the latest assistant message were edited, reordered, filtered, or reconstructed, the API returns a 400. The message starts with the offending block's position, such as messages.1.content.0, and then states the rule:

`thinking` or `redacted_thinking` blocks in the latest assistant message
cannot be modified. These blocks must remain as they were in the
original response.

With tool use, every thinking block must be passed back exactly as received, including empty ones. History-trimming middleware that strips "useless" blocks to save tokens is the usual culprit here, and it breaks quietly until the first tool call.

The old-fashioned causes

Once the three modern traps are cleared, what's left is the original meaning of the error: malformed JSON, a missing required field like model or max_tokens, a wrong type, or a broken message structure. Whatever the trigger, the response rides the same envelope, and the request_id is your ticket number if you end up writing to support:

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "..."
  },
  "request_id": "req_..."
}

The two-line fix

For the most common 2026 case, the entire repair is deleting fields. Before and after for an Opus 4.8 call:

# BEFORE: returns 400 on claude-opus-4-8
{
  "model": "claude-opus-4-8",
  "max_tokens": 1024,
  "temperature": 0.7,
  "top_p": 0.9,
  "messages": [{"role": "user", "content": "Summarize this contract."}]
}

# AFTER: same call, sampling parameters removed
{
  "model": "claude-opus-4-8",
  "max_tokens": 1024,
  "messages": [{"role": "user", "content": "Summarize this contract."}]
}

If the old temperature: 0.2 was there for consistency, say so in the prompt instead — "give the single most likely reading, don't brainstorm alternatives" — and reach for structured outputs when a parser consumes the result.

If you're migrating off Opus 4 or 4.1

The sampling-parameter trap is the number-one post-migration 400. Opus 4 retired June 15, 2026, Opus 4.1 follows August 5, and nearly every integration written for them sets temperature somewhere, because 2025-era guides said to. When you flip the model ID to claude-opus-4-8, sweep the request builder in the same commit. The Opus 4 and 4.1 retirement guide walks the full timeline, and the Opus 4.8 pricing page covers the much smaller bill waiting on the other side.

Frequently asked

Why does temperature break Opus 4.8 when GPT accepts it?

Provider divergence. OpenAI still honors sampling parameters; Anthropic deprecated temperature, top_p, and top_k on Opus 4.7 and later, and any non-default value returns a 400 by design. Same field name, different contract, so build requests per provider.

Can I still get consistent output without temperature?

Yes. Ask the prompt for the single most likely answer and forbid creative variation. For anything a parser consumes, structured outputs constrain the response more reliably than a sampling knob ever did.

Why does my agent loop hit a 400 right after tool use?

Almost always modified thinking blocks. Frameworks that trim or reorder history violate the rule that thinking blocks in the latest assistant message must come back unchanged. Pass them back verbatim, empty ones included.

Changelog

  • — Published. Prefill restriction, thinking-block rule, and the Opus 4.7+ sampling-parameter deprecation verified against Anthropic's API error and deprecation docs.

Sources

  • Anthropic API errors · platform.claude.com/docs/en/api/errors (verified June 12, 2026)
  • Anthropic model deprecations · platform.claude.com/docs/en/about-claude/model-deprecations (verified June 12, 2026)
  • benchr api-errors.json · structured entry for this error