Anthropic request_too_large: meaning, cause, and fix

Claude Sonnet 4.6 will hold a million tokens of context. Cloudflare won't pass 33 megabytes of JSON. The 413 is about the second number, and the second number is the one nobody measures.

By the benchr team · Verified against Anthropic's API error documentation, June 12, 2026

Anthropic · HTTP 413 · severity: medium · request size

Bytes, not tokens

Token limits and byte limits fail at different doors. A context-window overflow happens inside the model's accounting, after your request has been accepted and read. A 413 happens at the curb: the raw size of the request body gets checked against a per-endpoint cap, and on the direct API that check belongs to Cloudflare, which rejects the request before it ever reaches Anthropic's servers. On the Messages API, the cap is 32 MB.

That split is why this error confuses careful people. A prompt sitting comfortably inside the context window still bounces, because tokens and megabytes measure different things. Plain text almost never gets you to 32 MB — attachments do. Binary content rides inside the JSON body as base64, which adds roughly a third to its size, so a stack of images plus a long history reaches the wall far sooner than the token count suggests.
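The base64 overhead is easy to put a number on. The sketch below uses only the standard library; encoded_len is a helper name introduced here for illustration, and it assumes standard padded base64 with 1 MB taken as 1,048,576 bytes.

```python
import base64

def encoded_len(raw_len: int) -> int:
    """Length of padded base64 output for raw_len input bytes.

    Base64 emits 4 characters for every 3 input bytes, rounding the
    final partial group up, so the result is 4 * ceil(raw_len / 3).
    """
    return 4 * ((raw_len + 2) // 3)

# Sanity check the formula against the real encoder on a small input.
sample = b"x" * 1000
assert len(base64.b64encode(sample)) == encoded_len(1000)

# 24 MB of raw image bytes is already 32 MB once encoded,
# before any JSON envelope or message history is added.
raw_bytes = 24 * 1_048_576
print(encoded_len(raw_bytes) / 1_048_576)  # 32.0
```

In other words, about 24 MB of attachments is enough to fill a 32 MB body on its own, and the surrounding JSON and history only push the real threshold lower.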

The wall by endpoint

Anthropic request-size caps by endpoint, per the API error docs
Endpoint              Max request size
Messages API          32 MB
Token Counting API    32 MB
Batch API             256 MB
Files API             500 MB

Note that the first two rows match: the Token Counting API shares the Messages cap, so you can't use it to pre-check an oversized payload. The request bounces at the same wall, which means the measuring has to happen on your side of the wire.

What you'll get back

{
  "type": "error",
  "error": {
    "type": "request_too_large",
    "message": "Request exceeds the maximum allowed number of bytes."
  },
  "request_id": "req_011CSHoEeqs5C35K2UUqR7Fy"
}

Same envelope as every Anthropic error: branch on the type field, and in SDK code catch the typed exception class for the status rather than string-matching the message. Responses carry a req_-prefixed request-id header that the SDKs expose; quote it if the failure turns into a support thread.
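For code that handles raw HTTP responses rather than SDK exceptions, branching on the type field might look like the sketch below. It uses only the standard library; parse_error and is_too_large are hypothetical helper names, and the fallback to the status code is an assumption added here in case an edge rejection ever arrives with a body that isn't the JSON envelope.

```python
import json
from typing import Optional, Tuple

def parse_error(body: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (error_type, request_id) from an error body, or (None, None)."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return None, None  # not JSON at all
    if not isinstance(doc, dict) or doc.get("type") != "error":
        return None, None  # JSON, but not the error envelope
    error = doc.get("error")
    error_type = error.get("type") if isinstance(error, dict) else None
    return error_type, doc.get("request_id")

def is_too_large(status_code: int, body: str) -> bool:
    """True for a 413, or for an envelope whose error type says so."""
    error_type, _ = parse_error(body)
    return status_code == 413 or error_type == "request_too_large"

body = (
    '{"type": "error", "error": {"type": "request_too_large", '
    '"message": "Request exceeds the maximum allowed number of bytes."}, '
    '"request_id": "req_011CSHoEeqs5C35K2UUqR7Fy"}'
)
error_type, request_id = parse_error(body)
print(error_type)   # request_too_large
print(request_id)   # req_011CSHoEeqs5C35K2UUqR7Fy
```

Note that the message string is never inspected: only the type field and the status code drive the decision.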

Shrink or relocate

The fix starts with a measurement the SDK won't do for you, since the client libraries send whatever you hand them. One function in your wrapper settles it:

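# Python: measure the body before the edge does
import json

CAP_MB = 32  # Messages API ceiling, in megabytes of request body (not tokens)

def body_size_mb(payload: dict) -> float:
    # An estimate: the SDK's own serializer may produce a slightly different byte count
    return len(json.dumps(payload).encode("utf-8")) / 1_048_576

# payload is the request dict you are about to send;
# reroute is your own handler, not an SDK call
size = body_size_mb(payload)
if size >= CAP_MB:
    # usual culprit: base64 images inline in content blocks
    reroute(payload)  # assets to the Files API, bulk to Batch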

When the number comes back big, the culprit is nearly always embedded media, and the relocation map follows the table above. Big assets belong in the Files API at its 500 MB cap, uploaded once and referenced from the message instead of pasted into it. Bulk jobs belong in the Batch API, which takes 256 MB per request and runs at a 50% discount on standard Claude pricing. Anthropic's docs also steer long-running work, especially anything past 10 minutes, toward streaming or the Batch API rather than one enormous synchronous call.
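As a rough sketch of that relocation, the code below finds inline base64 blocks in a Messages-style payload and swaps the heavy ones for file references. Several things here are assumptions rather than confirmed API surface: inline_media and relocate_media are hypothetical helper names, upload is a callable you supply that sends the bytes to the Files API and returns a file ID, the 1,000,000-character threshold is arbitrary, and the {"type": "file", "file_id": ...} source shape reflects our reading of the Files API docs, which may also require a beta header. Check all of these against the current documentation before relying on them.

```python
import copy
from typing import Callable, List, Tuple

def inline_media(payload: dict) -> List[Tuple[int, int, int]]:
    """List (message_index, block_index, base64_chars), largest first."""
    found = []
    for m_idx, message in enumerate(payload.get("messages", [])):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no inline media
        for b_idx, block in enumerate(content):
            source = block.get("source") if isinstance(block, dict) else None
            if isinstance(source, dict) and source.get("type") == "base64":
                found.append((m_idx, b_idx, len(source.get("data", ""))))
    return sorted(found, key=lambda item: item[2], reverse=True)

def relocate_media(
    payload: dict,
    upload: Callable[[dict], str],  # hypothetical: uploads a source, returns a file ID
    min_chars: int = 1_000_000,     # arbitrary threshold for "heavy"
) -> dict:
    """Return a copy of payload with heavy inline blocks replaced by file references."""
    slim = copy.deepcopy(payload)  # leave the caller's payload untouched
    for m_idx, b_idx, size in inline_media(slim):
        if size < min_chars:
            continue
        block = slim["messages"][m_idx]["content"][b_idx]
        file_id = upload(block["source"])
        # Assumed reference shape; verify against the Files API docs.
        block["source"] = {"type": "file", "file_id": file_id}
    return slim
```

Running inline_media on its own is also a quick diagnostic: the first entry in the result is the block doing the most damage to the byte count.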

If your failure is token-shaped instead of byte-shaped, that's a different page: the request fit down the wire but overflowed the model's window. The context_length_exceeded breakdown covers the token-side playbook, and the context-window comparison shows which models give you room to stop trimming.

Frequently asked

Why did I get a 413 when my tokens fit the window?

Because the cap counts bytes, not tokens. Attachments travel as base64 inside the JSON body and inflate it well past what the token count implies, so a request can be modest in tokens and enormous in megabytes at the same time.

Does the SDK protect me from oversized requests?

No. The client libraries send what you give them, and the rejection happens at the edge. Measure len(json.dumps(payload).encode()) before sending and reroute anything approaching 32 MB.

Where do big files belong?

In the Files API, which accepts up to 500 MB. Upload once, reference the file from your messages, and the Messages payload stays small no matter how heavy the source material gets.

Changelog

  • — Published. Byte caps per endpoint, the Cloudflare boundary, and the response shape verified against Anthropic's API error docs.

Sources

  • Anthropic API errors — platform.claude.com/docs/en/api/errors (verified June 12, 2026)
  • benchr api-errors.json, the structured entry for this error