Reference·June 2026

OpenAI insufficient_quota: meaning, cause, and fix

This response uses HTTP 429, but it reports an account quota or billing problem rather than a short-lived request-rate limit.

By the benchr team · Published June 12, 2026 · Verified against OpenAI's error documentation, June 12, 2026

OpenAIHTTP 429severity: highquota & billing

Why it happens

OpenAI uses prepaid credit and optional budget caps. When the usable balance reaches zero or a cap is reached, the API stops serving requests. Common causes include exhausted credits, a project budget limit, or expired trial credit.

The key can also belong to a project whose quota is exhausted while another project still has credit. Check the project attached to the credential rather than assuming that quota is shared across all projects in the account.

The error you'll see

{
  "error": {
    "message": "You exceeded your current quota, please check your plan and billing details.",
    "type": "insufficient_quota",
    "code": "insufficient_quota"
  }
}

HTTP status: 429. That status is the trap — your retry middleware sees 429 and politely backs off, then fails again, forever. The body, not the status, tells you which 429 you have.

Handle quota and rate-limit errors separately

Split the two 429s at the handler level. Quota failures go to an alert; rate limits go to backoff:

# Python — route the two 429s differently
from openai import OpenAI, RateLimitError

client = OpenAI()
try:
    r = client.chat.completions.create(model="gpt-5", messages=msgs)
except RateLimitError as e:
    if "insufficient_quota" in str(e):
        alert_oncall("OpenAI billing exhausted — requests halted")
        raise            # retrying is pointless
    sleep_with_backoff() # a real rate limit — this one heals itself

Prevention

Set a usage alert below your cap, not at it. You want the email before production feels anything. Give each project its own budget so one runaway batch job can't starve the rest. And if quota keeps evaporating faster than planned, the bill itself is the bug: route bulk traffic to a cheaper tier instead of feeding everything to your most expensive model.

Frequently asked

Why do I get insufficient_quota on a brand-new API key?

New keys don't come with money. If the project has no prepaid credits and no live trial balance, the first request fails exactly this way. Fund the account; the key was never the problem.

Will exponential backoff fix it?

No. Rate limits reset every minute; quota persists until billing changes. Backoff against insufficient_quota is an infinite loop with extra steps.

How do I tell it apart from a rate limit programmatically?

Same HTTP status, different body. Check the error's type/code field: insufficient_quota means alert a human; rate_limit_exceeded means back off and retry.

Changelog

June 12, 2026 — Published. Error semantics verified against OpenAI's error-codes guide.

Sources

OpenAI error codes guide — developers.openai.com/api/docs/guides/error-codes (verified June 12, 2026)
benchr api-errors.json — structured entry for this error