Cause one: the file you referenced is gone
Google's troubleshooting docs give one official cause for the 404: the request referenced a file that isn't there. That covers media you attach by reference instead of inline, meaning images, audio, and video. Uploaded references don't live forever; they can expire or get deleted, and nothing pings you when one quietly vanishes. Your code keeps presenting a handle to a resource that stopped existing somewhere between the upload and this call.
The repair is mercifully boring. Upload the file again, capture the new reference, and resend. Anything long-running should treat file handles as perishable: catch the 404, refresh, continue, the way the snippet below does. If a fresh upload doesn't clear it, confirm that your request parameters line up with the API version you're on; a mismatch there can send you chasing a file problem that was never about the file.
Cause two: the model was retired
The second cause is bigger than your request: the entire model line has been switched off. Three verified dates matter as of June 2026. gemini-3-pro-preview has been dead since March 9, 2026. The gemini-2.0-flash family shut down June 1, 2026. And gemini-2.5-pro plus gemini-2.5-flash carry an October 16, 2026 date, which Google publishes as the earliest possible shutdown rather than a promise of extra time. The 2.5 Pro deprecation record and the model tracker keep the full timeline.
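The dates above fit in a small lookup table, which lets a client check for a retirement before it touches any file logic. This is a minimal sketch, not part of any Google SDK: `SHUTDOWN_DATES`, `shutdown_date`, and `is_retired` are hypothetical names. It assumes that matching on a model-name prefix is enough to cover a whole family, which may catch variants with a different schedule. The October 16 entries are the earliest possible shutdown, not a confirmed date.

```python
from datetime import date

# Shutdown dates quoted in this article; keys are model-name prefixes.
# The 2.5 entries are the earliest possible shutdown, not a confirmed date.
SHUTDOWN_DATES = {
    "gemini-3-pro-preview": date(2026, 3, 9),
    "gemini-2.0-flash": date(2026, 6, 1),
    "gemini-2.5-pro": date(2026, 10, 16),
    "gemini-2.5-flash": date(2026, 10, 16),
}


def shutdown_date(model):
    """Return the listed shutdown date for a model name, or None if unlisted."""
    for prefix, when in SHUTDOWN_DATES.items():
        if model.startswith(prefix):
            return when
    return None


def is_retired(model, today=None):
    """True if the model's listed shutdown date is on or before `today`."""
    when = shutdown_date(model)
    if when is None:
        return False
    return (today or date.today()) >= when
```

A 404 on a model for which `is_retired` returns true points at cause two, so re-uploading files will not help.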
Migration has a price tag, literally. The replacements are gemini-3.5-flash and gemini-3.1-pro-preview, and both cost more than what they replace. On the Flash path, input jumps from $0.30 to $1.50 per million tokens (a 5× increase) while output climbs from $2.50 to $9.00 (3.6×). On the Pro path, input/output pricing goes from $1.25/$10 to $2/$12 per million tokens. A one-line config change can multiply a bill, so budget before you ship the swap.
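The effect on a real bill depends on your mix of input and output tokens, so it is worth running the arithmetic. This is a rough sketch using only the list prices quoted above; `PRICES`, `cost_usd`, and `migration_multiplier` are hypothetical names. It assumes the "before" prices belong to gemini-2.5-flash and gemini-2.5-pro, and it ignores caching, batch discounts, and tiered pricing.

```python
# List prices in USD per million tokens, as quoted in this article: (input, output).
# Assumption: the "before" prices belong to gemini-2.5-flash and gemini-2.5-pro.
PRICES = {
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-3.5-flash": (1.50, 9.00),
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-3.1-pro-preview": (2.00, 12.00),
}


def cost_usd(model, input_tokens, output_tokens):
    """Cost in USD for a token volume at the list prices above."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def migration_multiplier(old, new, input_tokens, output_tokens):
    """How many times larger the bill gets when moving from `old` to `new`."""
    return cost_usd(new, input_tokens, output_tokens) / cost_usd(old, input_tokens, output_tokens)
```

For example, a workload of 100 million input tokens and 20 million output tokens goes from about $80 to about $330 on the Flash path, roughly 4.1 times the original bill. The multiplier approaches 5 as the workload becomes more input-heavy.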
The response
Here's a representative 404 body, in Google's standard error shape:
{
  "error": {
    "code": 404,
    "message": "The requested resource wasn't found.",
    "status": "NOT_FOUND"
  }
}
Branch on the status field; NOT_FOUND stays stable while message wording can drift. Notice what the body never tells you: which of the two causes you've hit. The calendar usually does. A 404 that starts on the morning of a published shutdown date is a retirement, not a file glitch.
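Branching on the status rather than the message can be sketched as follows, for cases where you have the raw response body as a string. The helper names `error_status` and `is_not_found` are hypothetical, and the sketch assumes the body follows the error shape shown above. Anything else yields `None` rather than raising.

```python
import json


def error_status(body):
    """Return error.status from a Google-style JSON error body, or None."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all (or not a string): no status to report.
        return None
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    if not isinstance(error, dict):
        return None
    return error.get("status")


def is_not_found(body):
    """True when the body carries the stable NOT_FOUND status."""
    return error_status(body) == "NOT_FOUND"
```

Because the check never reads `message`, a change in Google's wording would not alter the result.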
The fix in code
For the file case, wrap the call so a stale reference triggers one re-upload and one retry instead of a crash loop:
# Python (google-genai): re-upload when a file reference 404s
from google import genai
from google.genai import errors

client = genai.Client()

def ask_about(path, prompt, model="gemini-3.5-flash"):
    ref = client.files.upload(file=path)
    try:
        return client.models.generate_content(model=model, contents=[ref, prompt])
    except errors.APIError as e:
        if e.code != 404:
            raise
        ref = client.files.upload(file=path)  # stale handle: refresh it
        return client.models.generate_content(model=model, contents=[ref, prompt])
For the retirement case, the entire code change is one line of config:
# config.yaml
- model: gemini-2.5-flash
+ model: gemini-3.5-flash
One step remains before re-pointing production traffic: re-quote the bill. That diff raises input cost five-fold at list price, and a jump that's tolerable for a chatbot can wreck a batch pipeline. Pull current numbers from the Gemini 3.5 Flash pricing page, push your own token volumes through the calculator, and ship the migration with the budget already signed off.