What 529 is telling you
Most status codes in Anthropic's table point at something you control: 401 means a bad key, 400 means a malformed body, 429 means your account hit a limit. The 529 is different. It fires when traffic runs high across all users at once, which makes it the one Claude error that's about everyone else. The message is literal: "The API is temporarily overloaded." Launch mornings are the classic setup. A new model drops, half the industry tries it before lunch, and the platform sheds load to stay standing.
One distinction saves real debugging time. Anthropic's docs warn separately that sharp increases in your own usage can trigger 429s from acceleration limits, even with the platform calm and your tier ceiling nowhere in sight. So a flood of errors during your big launch isn't automatically a 529 story. Check the error.type field in the response body: if it reads rate_limit_error rather than overloaded_error, the problem is your ramp, and the fix lives on the 429 page. Grow gradually and keep usage patterns consistent.
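A minimal sketch of that check, assuming the error body has already been parsed into a dict. The function name classify_error and the labels it returns are illustrative choices, not part of any SDK.

```python
def classify_error(body: dict) -> str:
    """Map a parsed error body to a coarse label (labels are hypothetical)."""
    # The error type lives one level down, under the "error" key.
    err_type = (body.get("error") or {}).get("type", "")
    if err_type == "overloaded_error":
        return "platform_overloaded"  # 529: back off and wait it out
    if err_type == "rate_limit_error":
        return "own_rate_limit"       # 429: smooth out your own ramp
    return "other"
```

Branching on the type string rather than the message text keeps the check stable if the wording of the message ever changes.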
The response body
{
  "type": "error",
  "error": {
    "type": "overloaded_error",
    "message": "The API is temporarily overloaded."
  },
  "request_id": "req_011CSHoEeqs5C35K2UUqR7Fy"
}
Three habits worth wiring in while you're here. Branch on the error.type field, and in SDK code catch the typed exception classes (Python raises things like anthropic.RateLimitError) instead of string-matching messages. Save the request_id, which also arrives on every response in the request-id header as a req_-prefixed value; support tickets that include it move faster. And if you stream, remember that with server-sent events an error can land after the 200 already arrived, so a clean status line isn't the end of the story.
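The streaming case is the easiest of the three to miss, so here is a rough sketch of it. It only matters if you parse the event stream yourself rather than through an SDK. It assumes the stream arrives as text lines in the usual event:/data: server-sent-events layout and that a mid-stream failure shows up as an event named error carrying the same JSON body shown above. StreamError and iter_sse_events are hypothetical names invented for this example.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    """Raised when an error event appears mid-stream (hypothetical helper)."""

    def __init__(self, error_type: str, message: str):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple]:
    """Yield (event_name, payload) pairs; raise StreamError on an error event."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and (event is not None or data):
            # A blank line terminates one event.
            payload = json.loads("\n".join(data)) if data else {}
            if event == "error":
                err = payload.get("error", {})
                raise StreamError(err.get("type", "unknown"), err.get("message", ""))
            yield (event or "message", payload)
            event, data = None, []
```

Anything already yielded before the error is partial output, so the caller decides whether to discard it or retry from scratch.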
Retry like a good citizen
A 529 is the platform asking for breathing room, and the retry loop should grant it by design: random jitter so your fleet doesn't march in lockstep, a wall-clock deadline instead of a bare attempt counter, and a circuit that opens once the deadline passes.
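# Python: jittered retries under a deadline, then circuit-break
import random
import time

import anthropic

# Note: the SDK also retries some errors on its own (max_retries, default 2).
client = anthropic.Anthropic()

def create_with_deadline(deadline_s=120, **kwargs):
    start = time.monotonic()
    attempt = 0
    last_error = None
    while time.monotonic() - start < deadline_s:
        try:
            return client.messages.create(**kwargs)
        except anthropic.APIStatusError as e:
            # Retry only rate limits (429) and overload (529); re-raise the rest.
            if e.status_code not in (429, 529):
                raise
            last_error = e
            attempt += 1
            # Full jitter over a capped exponential backoff, never sleeping past the deadline.
            remaining = deadline_s - (time.monotonic() - start)
            delay = min(30, 2 ** attempt) * random.random()
            time.sleep(max(0.0, min(delay, remaining)))
    raise RuntimeError("circuit open: still overloaded at deadline") from last_error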
When the circuit opens, stop calling. Park new work in a queue, serve cached results where they exist, and probe again only after a cool-down. Retry storms are the one way you can make a 529 worse: thousands of clients replaying requests on synchronized timers turn a traffic spike into a long afternoon.
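One way to sketch the "stop calling, then probe after a cool-down" behavior is a small breaker object that callers consult before each request. The class name, the 60-second default, and the injectable clock are all assumptions made for illustration; a production breaker would likely also limit how many probes it lets through at once.

```python
import time


class CircuitBreaker:
    """Minimal open/closed breaker with a cool-down (illustrative sketch)."""

    def __init__(self, cooldown_s: float = 60.0, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self._clock = clock          # injectable so the logic can be tested
        self._opened_at = None       # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a request (or a probe after cool-down) may go out."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self.cooldown_s

    def record_failure(self) -> None:
        """Open the circuit, or restart the cool-down after a failed probe."""
        self._opened_at = self._clock()

    def record_success(self) -> None:
        """Close the circuit once a request gets through."""
        self._opened_at = None
```

While allow() returns False, new work goes to the queue or the cache instead of the API. A failed probe calls record_failure() again, which restarts the cool-down.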
Design for the bad hour
Teams that shrug off 529s made their choices before the bad hour, not during it. Three moves cover most of it. Put a queue with a concurrency cap between your product and the API, so pressure builds in your infrastructure rather than theirs. Route anything that can wait into the Batch API, which runs at a 50% discount on standard Claude pricing and turns a platform spike into a non-event for bulk jobs. And if uptime is contractual, keep a second provider warm: pick the alternate from the rankings ahead of time, price the switch with the calculator, and put it behind a flag you can flip without a war room.
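The first of those moves, a concurrency cap in front of the API, can be sketched with a semaphore. This assumes an asyncio codebase; run_capped, the worker callable, and the default cap of 4 are hypothetical, and the worker stands in for whatever function actually makes the API call.

```python
import asyncio


async def run_capped(jobs, worker, max_concurrency: int = 4):
    """Run worker(job) for every job, never more than max_concurrency at once."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(job):
        # Jobs beyond the cap wait here, inside your process, not at the API.
        async with sem:
            return await worker(job)

    # gather preserves the input order of results.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Because the waiting happens on your side of the wire, a platform spike shows up as a longer local queue rather than as a burst of rejected requests.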