Essay · May 2026

When the model remembers you

How persistent memory works across chats, what it buys you, and the privacy trade.

By the benchr team · Reviewed May 30, 2026 · Figures verified against official sources, 30 May 2026

You open a brand-new chat. You haven't pasted anything, haven't reminded it of anything, and yet it greets you by name, knows you write Python, knows you prefer short answers without the throat-clearing. Nothing carried over in the conversation, because there was no prior conversation. So where did that come from?

It came from memory, a layer that sits outside any one chat and quietly feeds the assistant a little dossier about you before it answers. This is new in a way that's easy to miss. For years, every chat with a model started cold. Close the tab and the thing forgot you completely. Now the better assistants keep a running profile, and the felt result is that the tool acts less like a search box and more like a coworker who remembers last week.

Memory is not a context window

Here's the distinction people trip on, and it's worth slowing down for. A context window is the raw amount of text a model can hold in a single session, a hard per-conversation limit measured in tokens. When the window fills, the oldest material falls out. benchr's look at context windows walks through where that ceiling actually bites and why a million-token window is often a ceiling, not a target.

Persistent memory is the opposite axis. It doesn't make any one chat bigger. It carries small facts about you across separate chats, and those facts survive after a conversation's window resets to zero. One is single-session capacity. The other is cross-session recall. A model can have a huge window and no memory, or a modest window and a sharp memory of who you are. They're different machines doing different jobs, and conflating them is how people end up disappointed by both.

A context window is how much the model can read right now. Memory is what it still knows about you tomorrow.
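To make the two axes concrete, here is a toy sketch in Python. Everything in it is an illustrative assumption: the Session class, the word-count stand-in for tokens, and the sample facts are invented here and do not describe how any real assistant is built. It also ignores the fact that injected memory itself takes up window space.

```python
from collections import deque


class Session:
    """One chat: a size-limited window that drops the oldest turns first."""

    def __init__(self, window_tokens, memory):
        self.window_tokens = window_tokens
        self.turns = deque()
        # Memory lives outside the session and is copied in when a chat starts.
        self.preamble = list(memory)

    def add_turn(self, text):
        self.turns.append(text)
        # Crude stand-in for tokens: whitespace-separated words (an assumption).
        while sum(len(t.split()) for t in self.turns) > self.window_tokens:
            self.turns.popleft()

    def visible(self):
        """What the model can read right now: memory facts plus surviving turns."""
        return self.preamble + list(self.turns)


# Hypothetical cross-session facts.
memory = ["name: Sam", "writes Python"]

first = Session(window_tokens=6, memory=memory)
first.add_turn("my cat is called Miso")
first.add_turn("what is a decorator")  # pushes the window past 6 words

second = Session(window_tokens=6, memory=memory)  # a brand-new chat

print(first.visible())
print(second.visible())
```

In the first session the cat has already fallen out of the window, while the two memory facts are still there. The second session starts with an empty window and the same two facts, which is the cold open from the top of this piece.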

Two flavors of remembering

ChatGPT splits memory into two mechanisms, and once you see the split you'll spot it in every assistant. The first is what OpenAI calls saved memories: things you explicitly tell it to keep. You type "remember that I'm vegetarian" and it files that away as a durable fact. Clean, deliberate, easy to audit.

The second is reference chat history, where the assistant automatically gathers insights from your past chats without being asked. OpenAI's own example is lunch: if you once said you like Thai food, it may take that into account next time you ask what to have for lunch. You never told it to remember that. It just noticed. That automatic mode is the one that produces the slightly uncanny "how did it know" moments, and it's also the one worth watching, because you didn't choose what got stored.
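The split between the two mechanisms can be sketched in a few lines of Python. This is a toy model, not OpenAI's implementation: the MemoryStore class, the "remember that" trigger, and the "I like" heuristic are all assumptions made up for illustration. The point is only that one list fills because you asked and the other fills because the system noticed.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    saved: list = field(default_factory=list)     # facts the user asked to keep
    inferred: list = field(default_factory=list)  # facts picked up unprompted

    def observe(self, message):
        text = message.strip()
        lowered = text.lower()
        prefix = "remember that "
        if lowered.startswith(prefix):
            # Explicit request: store exactly what the user said to store.
            self.saved.append(text[len(prefix):])
        elif "i like " in lowered:
            # Toy heuristic standing in for automatic insight gathering.
            self.inferred.append(text[lowered.index("i like "):])

    def audit(self):
        """Show both lists so the user can see what was chosen and what wasn't."""
        return {"saved": list(self.saved), "inferred": list(self.inferred)}


store = MemoryStore()
store.observe("Remember that I'm vegetarian")
store.observe("Honestly I like Thai food")
store.observe("What is a list comprehension?")
print(store.audit())
```

The audit view is the part that matters in practice: the saved list contains only what you typed on purpose, while the inferred list is the one you have to go and check.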

Who remembers you, and where to control it

All three big consumer assistants now keep some kind of cross-chat memory, but they shipped on different timelines and with different defaults. ChatGPT's comprehensive reference-chat-history version reached Plus and Pro subscribers on April 10, 2025, then a lighter, short-term version landed for free users on June 3, 2025. Google announced Gemini personalization from past chats on August 13, 2025, and this is the one to flag: it ships on by default. Anthropic launched Claude memory for Team and Enterprise on September 11, 2025, extended it to Pro and Max with the October 23, 2025 update, and opened it to the free plan on March 2, 2026, where you can also import memory from other AI tools.

That default difference matters. With Gemini, past-chat personalization is working unless you switch it off. With the others, the controls sit a click or two deep but the posture is more opt-in. Here's what each assistant remembers about you and where you go to manage it.

What each assistant remembers about you and where to control it, May 2026
Assistant	What it remembers	Where to control it
ChatGPT	Saved memories you ask for, plus insights it gathers from past chats	Settings > Personalization > Manage Memories; Temporary Chat for one-offs
Gemini	Preferences learned from past chats (on by default)	Settings > Personal context; delete in Keep Activity (formerly Gemini Apps Activity); Temporary Chat
Claude	A separate memory per project, kept distinct from unrelated work	Editable memory summary in settings; Incognito chat for no-memory sessions

Claude's project-isolated memory is the design choice worth calling out. Instead of one giant profile that bleeds your side project into your day job, it keeps a separate memory for each project. Pair that with Incognito chat, a clean slate that doesn't appear in history or save to memory, and you get fairly granular control over what the model knows in which context. Go with Claude if compartmentalizing what it remembers across different work matters to you.
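The compartment idea is easy to picture as a data structure. The sketch below is a minimal illustration, assuming a hypothetical ProjectMemory class with an incognito flag. It is not Anthropic's design or API, only one way to show memory keyed by project with a mode that writes nothing.

```python
from collections import defaultdict


class ProjectMemory:
    """Keeps one list of facts per project so contexts never mix."""

    def __init__(self):
        self._by_project = defaultdict(list)

    def remember(self, project, fact, incognito=False):
        if incognito:
            # A no-memory session: nothing is written anywhere.
            return
        self._by_project[project].append(fact)

    def recall(self, project):
        # .get avoids creating an empty entry for projects never written to.
        return list(self._by_project.get(project, []))


mem = ProjectMemory()
mem.remember("day-job", "client uses Postgres")
mem.remember("side-project", "prefers Rust")
mem.remember("day-job", "salary negotiation notes", incognito=True)

print(mem.recall("day-job"))
print(mem.recall("side-project"))
```

Recalling the day-job project returns only the Postgres fact. The side project never leaks in, and the incognito note was never stored in the first place.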

The privacy trade you should set first

This is where people conflate two things that are genuinely separate, so pull them apart. First question: is the assistant storing personal details about you? Yes, that's the whole feature, and it's controllable and deletable on every provider above. Second question, totally different: is that content also training the model? That's a separate setting, and it doesn't move just because you turned memory on or off.

If you only do one thing after reading this, audit those two switches per assistant. The convenience of being remembered is real, but it's worth knowing whether "remembered" also means "studied." For the full picture of who learns from your conversations by default and how to opt out, read benchr's breakdown of which providers train on your chats, which covers retention windows and the exact toggles in more depth than fits here.
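One way to keep the two questions apart is to treat them as two independent booleans. The sketch below assumes a hypothetical PrivacySettings record and a made-up assistant name. It does not reflect any provider's real defaults or setting names, only the fact that flipping one switch leaves the other where it was.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PrivacySettings:
    memory_enabled: bool    # is a profile of you being stored?
    training_enabled: bool  # is your content used to improve the model?


def audit(assistant, settings):
    """Summarize both switches in one line for a quick per-assistant check."""
    stored = "stores a profile" if settings.memory_enabled else "stores no profile"
    studied = "may train on chats" if settings.training_enabled else "does not train on chats"
    return f"{assistant}: {stored}, {studied}"


# Hypothetical starting state, not any provider's actual default.
before = PrivacySettings(memory_enabled=True, training_enabled=True)
after = replace(before, memory_enabled=False)  # turn memory off only

print(audit("example-assistant", before))
print(audit("example-assistant", after))
```

After memory is switched off, the training flag is still on. That is the trap the section describes: turning off one setting says nothing about the other.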

One more practical note. Memory makes your prompts shorter, because you stop re-explaining yourself every session, which quietly trims what you spend. If you're watching usage, that overlaps with the levers in benchr's guide to cutting your token bill. And as assistants start acting on your behalf, the same stored profile is what lets an agent skip the setup questions entirely, which is the thread benchr's piece on agentic shopping picks up.

So is it worth turning on?

For most people, yes. The everyday payoff is concrete: less repeating yourself, answers tuned to how you actually work, a tool that feels like it's been paying attention. Stick with memory on unless you share an account, work across clients whose context shouldn't mix, or simply don't want a profile built. In those cases, lean on the no-memory modes (Temporary Chat on ChatGPT and Gemini, Incognito on Claude) and keep the durable stuff out of the picture. The feature is a convenience with a cost. Knowing exactly where the controls live is what keeps it on your terms.

Frequently asked

How does AI memory actually work?

It stores small facts about you outside any single chat and pulls them into new ones. ChatGPT does this two ways, per OpenAI's Memory FAQ: "saved memories" are things you explicitly tell it to remember, like your name or that you're vegetarian, and "reference chat history" lets it automatically draw on insights from past chats. Gemini and Claude work on the same idea, each storing a profile of you that informs later answers.

Is memory the same as a big context window?

No, and it's worth keeping them separate. A context window, covered on benchr's context windows compared page, is the raw amount of text a model can hold in one session, a hard per-conversation limit. Persistent memory is the other axis: it carries small facts about you across separate chats and survives after a conversation's window resets. One is single-session capacity, the other is cross-session recall.

Can I turn AI memory off?

Yes, on every major assistant. In ChatGPT you toggle reference chat history and view or remove entries under Settings > Personalization > Manage Memories, and Temporary Chat skips memory entirely. Gemini's past-chat personalization lives in Settings > Personal context and ships on by default, so turning it off is a deliberate step. Claude offers an editable memory summary plus an Incognito chat mode that doesn't save to memory.

Is my memory used to train the model?

Memory and training are two different settings. ChatGPT may use your content to improve its models unless you turn off "Improve the model for everyone" in Data Controls, and Temporary Chat sidesteps both. Gemini's Keep Activity setting, formerly Gemini Apps Activity, governs whether uploads help improve Google services, and Temporary Chats avoid personalization and training. Anthropic moved to an opt-in choice for Claude Free, Pro, and Max: allow it and retention runs five years, decline and it stays at 30 days. See benchr's who-trains-on-you page for the full breakdown.

Which assistants remember you?

All three of the big consumer assistants now do. ChatGPT rolled out cross-chat memory to Plus and Pro on April 10, 2025 and a lighter version to free users on June 3, 2025. Google announced Gemini personalization from past chats on August 13, 2025, on by default. Anthropic launched Claude memory for Team and Enterprise on September 11, 2025, extended it to Pro and Max with the October 23, 2025 update, and brought it to the free plan on March 2, 2026.

Changelog

May 30, 2026 — Originally published. Rollout dates, defaults, and control paths verified against OpenAI, Google, and Anthropic documentation.

References

OpenAI, "Memory FAQ," help.openai.com, accessed May 2026.
OpenAI, "Data Controls FAQ," help.openai.com, accessed May 2026.
OpenAI, "How your data is used to improve model performance," help.openai.com, accessed May 2026.
TechCrunch, "OpenAI updates ChatGPT to reference your other chats," techcrunch.com, accessed May 2026.
The Decoder, "OpenAI brings longer-term memory feature to free ChatGPT users," the-decoder.com, accessed May 2026.
Google, "Temporary chats and privacy controls in Gemini," blog.google, accessed May 2026.
Search Engine Journal, "Google Gemini adds personalization from past chats," searchenginejournal.com, accessed May 2026.
Anthropic, "Memory," claude.com, accessed May 2026.
9to5Mac, "Free Claude users can now use memory and import context from rivals," 9to5mac.com, accessed May 2026.
Anthropic, "Updates to our consumer terms and privacy policy," anthropic.com, accessed May 2026.