OpenAI insufficient_quota: meaning, cause, and fix

It arrives dressed as a 429, but this is a billing problem wearing a rate-limit costume. No retry loop in the world fixes an empty account.

By the benchr team · · Verified against OpenAI's error documentation, June 12, 2026

OpenAIHTTP 429severity: highquota & billing

Why it happens

OpenAI bills by prepaid credit and optional budget caps. The moment usable balance hits zero (or your cap does), the API stops serving you. Four situations produce almost every case: the account simply ran out of credits mid-month; a budget cap you set months ago finally bound; free-trial credits expired (they have a shelf life); or your key belongs to a project whose quota is exhausted while a sibling project still has money. That last one bites teams that organize work into multiple projects and assume billing is shared. It isn't — quota follows the project, and so does the failure.

The error you'll see

{
  "error": {
    "message": "You exceeded your current quota, please check your plan and billing details.",
    "type": "insufficient_quota",
    "code": "insufficient_quota"
  }
}

HTTP status: 429. That status is the trap — your retry middleware sees 429 and politely backs off, then fails again, forever. The body, not the status, tells you which 429 you have.

The code guard that saves the night

Split the two 429s at the handler level. Quota failures go to an alert; rate limits go to backoff:

# Python — route the two 429s differently
from openai import OpenAI, RateLimitError

client = OpenAI()
try:
    r = client.chat.completions.create(model="gpt-5", messages=msgs)
except RateLimitError as e:
    if "insufficient_quota" in str(e):
        alert_oncall("OpenAI billing exhausted — requests halted")
        raise            # retrying is pointless
    sleep_with_backoff() # a real rate limit — this one heals itself

Prevention

Set a usage alert below your cap, not at it. You want the email before production feels anything. Give each project its own budget so one runaway batch job can't starve the rest. And if quota keeps evaporating faster than planned, the bill itself is the bug: route bulk traffic to a cheaper tier instead of feeding everything to your most expensive model.

Frequently asked

Why do I get insufficient_quota on a brand-new API key?

New keys don't come with money. If the project has no prepaid credits and no live trial balance, the first request fails exactly this way. Fund the account; the key was never the problem.

Will exponential backoff fix it?

No. Rate limits reset every minute; quota persists until billing changes. Backoff against insufficient_quota is an infinite loop with extra steps.

How do I tell it apart from a rate limit programmatically?

Same HTTP status, different body. Check the error's type/code field: insufficient_quota means alert a human; rate_limit_exceeded means back off and retry.

Changelog

  • — Published. Error semantics verified against OpenAI's error-codes guide.

Sources

  • OpenAI error codes guide — developers.openai.com/api/docs/guides/error-codes (verified June 12, 2026)
  • benchr api-errors.json — structured entry for this error