Reference · June 2026

AI API errors: identify the cause and apply the right fix

A practical reference for common OpenAI, Anthropic, and Gemini errors, checked against provider documentation and organized by the action each error requires.

By the benchr team · Published June 12, 2026 · Every error verified against official provider docs, June 12, 2026


OpenAI · 429 · quota · insufficient_quota

The account has no usable quota. Retrying does not change the billing state.

OpenAI · 429 · rate limit · rate_limit_exceeded

The request or token rate exceeded the account's current limit.

OpenAI · 400 · context · context_length_exceeded

The input and requested output together exceed the model's context window.

OpenAI · 404 · availability · model_not_found

Wrong ID, wrong endpoint, no access — or a model OpenAI retired.

OpenAI · 401 · auth · invalid_api_key

"Incorrect API key provided" — key wrong, revoked, or malformed.

Anthropic · 529 · overload · overloaded_error

Anthropic is temporarily overloaded. Retry with jitter and a firm deadline.

Anthropic · 429 · rate limit · rate_limit_error

Tier ceiling — or acceleration limits if your usage ramped too sharply.

Anthropic · 400 · format · invalid_request_error

Check unsupported sampling parameters, assistant prefills, and modified thinking blocks.

Anthropic · 413 · size · request_too_large

Body over the 32 MB Messages cap, rejected at the network edge before it reaches Anthropic's API servers.

Anthropic · 404 · availability · not_found_error

Since June 15, 2026, the top cause is a retired Claude model ID.

Gemini · 429 · rate limit · RESOURCE_EXHAUSTED

The project exceeded a request, token, or quota limit for its current tier.

Gemini · 400 · format · INVALID_ARGUMENT

Malformed body — or a feature that doesn't exist on your API version.

Gemini · 404 · availability · NOT_FOUND

Expired file references — or a model from a line Google already shut down.

Gemini · 504 · timeout · DEADLINE_EXCEEDED

The request did not finish before the deadline. Reduce it, stream it, or adjust the timeout.

Gemini · 400 · billing · FAILED_PRECONDITION

Free tier unavailable in your region without billing enabled.

Quota and rate limits: the two 429s that need opposite fixes

OpenAI returns both rate limits and exhausted billing as HTTP 429, and confusing them wastes hours: backoff cures the first and does nothing for the second. Anthropic adds a wrinkle worth knowing: acceleration limits, which can fire when usage ramps too sharply even while you are still below your tier ceiling. If limits keep binding, one practical escape is to route bulk traffic to cheaper, higher-throughput tiers; compare what that costs in the calculator against the current rankings.
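The split between the two 429s can be made explicit in client code. The sketch below is a minimal illustration, not a provider SDK: it assumes the error body has already been parsed into a dictionary shaped like {"error": {"code": ..., "type": ...}}, and the function name classify_429 and its return labels are hypothetical.

```python
from typing import Any, Mapping

# Codes on this page that no retry strategy can fix.
NON_RETRYABLE_429_CODES = {"insufficient_quota"}


def classify_429(error_body: Mapping[str, Any]) -> str:
    """Return 'fix_billing' or 'backoff' for a parsed HTTP 429 error body.

    Assumption: the machine-readable code sits under error.code or,
    failing that, error.type. Adjust the lookup to the body you receive.
    """
    error = error_body.get("error")
    if not isinstance(error, Mapping):
        # Unknown shape: treat as an ordinary rate limit and back off.
        return "backoff"
    code = error.get("code") or error.get("type") or ""
    if code in NON_RETRYABLE_429_CODES:
        return "fix_billing"
    return "backoff"
```

A caller would stop retrying and raise an alert on "fix_billing", and hand "backoff" to its normal retry loop.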

Model not found: check the identifier and its lifecycle

Three providers are retiring model lines this year — Claude Sonnet 4 and Opus 4 went dark June 15, Gemini's 2.5 line ends October 16, and OpenAI retires nine IDs on October 23. Every 404 in this category (OpenAI, Anthropic, Gemini) links into the deprecations record, where each retirement carries its replacement and the before-and-after price math. The live tracker shows every model's current status.

Context too large

Token overflow (context_length_exceeded) and byte overflow (request_too_large) fail differently and need different fixes — counting tokens versus measuring payloads. When trimming isn't an option, the fix is a bigger window: the context-window comparison shows what each model's advertised window is really worth in practice.
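The two checks can be kept separate before a request is sent. This is a rough sketch under stated assumptions: the 32 MB cap is read as 32 × 1024 × 1024 bytes, token counts come from whatever counter the caller already trusts (none is implemented here), and the helper names are hypothetical.

```python
import json
from typing import Any

# Assumption: "32 MB" is interpreted as 32 MiB; confirm against provider docs.
MAX_BODY_BYTES = 32 * 1024 * 1024


def body_size_bytes(body: Any) -> int:
    """Approximate the request size by serializing the body as UTF-8 JSON.

    HTTP clients may serialize slightly differently (spacing, escaping),
    so treat the result as an estimate and leave headroom.
    """
    return len(json.dumps(body).encode("utf-8"))


def fits_byte_cap(body: Any, cap: int = MAX_BODY_BYTES) -> bool:
    """Byte check: guards against request_too_large."""
    return body_size_bytes(body) <= cap


def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Token check: input plus requested output must fit the window."""
    return input_tokens + max_output_tokens <= context_window
```

A request can pass one check and fail the other: a large base64 attachment can blow the byte cap with few tokens, and a long plain-text prompt can overflow the window while staying far below the byte cap.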

Authentication and billing walls

For 401 errors, verify the key, project, organization, header, and any IP restrictions before changing application code. Gemini's FAILED_PRECONDITION can instead point to regional availability or a project that needs billing enabled.

Server errors and overload

Anthropic's 529 and Gemini's 504 usually need bounded retries, timeout handling, and monitoring rather than an immediate code rewrite. If availability is critical, use the price history and model reference to price and test a fallback before an incident.
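A bounded retry with jitter and a firm deadline might look like the sketch below. It is illustrative only: retry_with_deadline, DeadlineExceeded, and the default delays are hypothetical choices, and the caller supplies is_retryable to decide which exceptions (for example, ones wrapping a 529 or 504) are worth another attempt.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DeadlineExceeded(Exception):
    """Raised when the retry budget runs out before a call succeeds."""


def retry_with_deadline(
    call: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    deadline_s: float = 30.0,
    base_delay_s: float = 0.5,
    max_delay_s: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `call` until it succeeds, fails permanently, or the deadline hits."""
    start = clock()
    attempt = 0
    while True:
        try:
            return call()
        except Exception as exc:
            if not is_retryable(exc):
                raise  # permanent failure: retrying would not help
            # Full jitter: random delay up to a capped exponential ceiling.
            ceiling = min(max_delay_s, base_delay_s * (2 ** attempt))
            delay = random.uniform(0, ceiling)
            remaining = deadline_s - (clock() - start)
            if delay >= remaining:
                # Sleeping would overrun the budget, so give up now.
                raise DeadlineExceeded(
                    f"gave up after {attempt + 1} attempts"
                ) from exc
            sleep(delay)
            attempt += 1
```

The sleep and clock parameters exist so the loop can be tested without waiting; in production the defaults apply. When DeadlineExceeded is raised, that is the point to switch to a fallback model rather than keep retrying.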

Changelog

June 12, 2026 — Section launched with 15 errors across OpenAI, Anthropic, and Gemini, each verified against official provider documentation.

Sources

OpenAI error codes guide — developers.openai.com/api/docs/guides/error-codes (verified June 12, 2026)
Anthropic API errors — platform.claude.com/docs/en/api/errors (verified June 12, 2026)
Gemini API troubleshooting — ai.google.dev/gemini-api/docs/troubleshooting (verified June 12, 2026)
benchr api-errors.json — the structured dataset behind this section

Frequently asked questions

Why do AI API errors spike in 2026?

Model retirements. OpenAI shuts down nine model IDs on October 23, 2026, Anthropic retired Claude Sonnet 4 and Opus 4 on June 15, and Google's Gemini 2.5 line ends October 16. Code pinned to old IDs starts returning 404s — which is why every model-availability error here links straight into benchr's deprecations record.

Are these error explanations official?

Every page is verified against the provider's own error documentation — OpenAI's error-codes guide, Anthropic's API errors page, and Google's Gemini troubleshooting docs — with the verification date printed on the page. Where benchr adds editorial advice (like cheaper fallback models), it's labeled as benchr's pick.

What's the difference between a 429 rate limit and insufficient_quota?

Both arrive as HTTP 429 from OpenAI, but they need opposite responses. Rate limits are temporary: back off and retry, because per-minute request and token windows reset quickly. insufficient_quota means billing is exhausted, so no retry strategy fixes it; only adding credits or raising your cap does.