Three ways to get AI to do the spreadsheet you're dreading. You can let Copilot live inside Excel. You can let Gemini live inside Google Sheets. Or you can paste a chunk of data into a chat model and read back what it gives you. They're not interchangeable, and the gap between them is widest on the one task everyone cares about: writing a formula that's correct.
Copilot wins the all-in-one slot because it's the only option that sits in the grid you already work in and touches every job at once. Type a request and it writes the formula, builds the pivot, drafts the chart, and flags the cells that look broken, without you copying anything out to a separate window. The new =COPILOT() function takes that further: you write a natural-language prompt straight into a cell and it returns a value, a summary, or a categorization that recalculates like any other formula. Nothing else here is that close to the metal.
The catch is that "in the grid" and "correct" are different promises, and the rest of this guide is mostly about the second one. Here's how the three approaches actually split across the work.
| Task | Copilot in Excel | Gemini in Sheets | Paste into a chat model |
|---|---|---|---|
| Write a formula | Native, in-cell, explained; =COPILOT() lives in the sheet | Good for basic formulas; =AI() shines on bulk text ops | Fine for one-offs, but you copy the result back by hand |
| Clean data | Strong for dedupe and fill; works on your live table | Best for bulk categorize, sentiment, extraction via =AI() | Works, but row limits hit fast on big files |
| Analyze | Trends and summaries inline; warns on high-stakes math | Decent summaries; can build a sheet from a prompt | Claude/ChatGPT reason hardest on multi-sheet models |
| Make a chart | One request, native chart, editable like any other | Builds charts, but fewer formatting controls | ChatGPT renders an image; you rebuild it natively anyway |
Read the table by row, not by column. No single tool sweeps it, which is the whole point. Copilot takes formulas and charts because those are grid-native jobs. Gemini takes data cleaning because the =AI() function turns "categorize these 5,000 support tickets" into one formula you fill down the column. And the chat models take analysis because that's where raw reasoning over a messy multi-sheet model matters more than living in the cell.
When a chat model is the right call
Reach for a chat model when the file is big, the logic is tangled, or you don't have the paid add-in. This is the lane where Claude for Excel earns the runner-up spot. It runs in Excel for the web, Windows, and Mac, needs a Pro, Max, Team, or Enterprise plan, and with Opus 4.7 behind it, it's built for multi-sheet financial models and for finding the broken formula in a workbook nobody wants to audit by hand. It can trace a #REF!, a #VALUE!, or a circular reference back to its source and fix it without snapping the dependency chain that runs through the rest of the book.
The reason it pulls ahead on large files is context. Sonnet 4.6 reached a 1M token window in 2026, which is enough to hold hundreds of thousands of narrow rows of a business CSV in view at once, and fewer as the columns multiply. Opus 4.7 is the pick when you want maximum reasoning over an uploaded XLSX. If you want the long version of why that window matters and where the marketing oversells it, benchr's read on what a million-token context really buys you is the place to start, and the side-by-side of context windows across models shows how the field stacks up. For the underlying model itself, the Claude Opus 4.7 review has the benchmark and pricing detail.
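How many rows a window really holds comes down to how many tokens each row costs, and that swings a lot with column count. Here's a rough sketch of the arithmetic. The tokens-per-row figures are assumptions for illustration, not measurements; real tokenizers will give different numbers for your data, and `rows_that_fit` is a made-up helper name.

```python
def rows_that_fit(context_tokens: int, tokens_per_row: int, reserve_tokens: int = 0) -> int:
    """Estimate how many CSV rows fit in a context window.

    reserve_tokens is space held back for the prompt and the model's reply.
    """
    if tokens_per_row <= 0:
        raise ValueError("tokens_per_row must be positive")
    usable = max(context_tokens - reserve_tokens, 0)
    return usable // tokens_per_row


WINDOW = 1_000_000  # the 1M token window mentioned above

# Assumed row widths: a few short columns vs. a wide export with many fields.
narrow_rows = rows_that_fit(WINDOW, tokens_per_row=4)
wide_rows = rows_that_fit(WINDOW, tokens_per_row=40)

print(f"narrow rows: {narrow_rows:,}")  # 250,000
print(f"wide rows:   {wide_rows:,}")    # 25,000
```

The tenfold swing between those two cases is the thing to check before assuming a whole file fits.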
ChatGPT belongs here too, with a caveat that trips people up. Its Advanced Data Analysis, the old Code Interpreter, is what writes Python against your file and runs real calculations. That's on Plus at $20/month and Pro at $200/month, and it is not on the free tier. Free ChatGPT can open a file and describe its shape, but it can't run the analysis. ChatGPT for Excel, powered by GPT-5.4 and tuned for financial modeling and scenario work, became generally available across plans on May 5, 2026, which closes some of that gap inside the grid.
The cost math here isn't obvious, because a subscription and a per-token API bill are different animals. Claude's API has no free tier: new accounts get about $5 in starter credits, and after that Opus 4.7 runs $5 per million input tokens and $25 per million output tokens, with Sonnet 4.6 at $3 and $15. Whether that beats a flat $20 subscription depends entirely on how much you run. benchr's cost-per-workload breakdown across five models does that arithmetic by job, and if your bills are creeping up, the guide to cutting token spend covers the levers that move it.
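To make the break-even concrete, here's a minimal sketch using the per-token prices quoted above. The workload size (200,000 input tokens and 5,000 output tokens per run) is a hypothetical example, and the dictionary keys are just labels for this sketch, not official API model identifiers.

```python
import math

# USD per million tokens (input, output), as quoted in the text above.
PRICES = {
    "opus-4.7": (5.00, 25.00),
    "sonnet-4.6": (3.00, 15.00),
}


def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single API run at the listed per-token prices."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1_000_000 * in_price + output_tokens / 1_000_000 * out_price


def runs_within_budget(model: str, input_tokens: int, output_tokens: int,
                       budget: float = 20.0) -> int:
    """How many runs of this size fit inside a flat monthly budget."""
    return math.floor(budget / api_cost(model, input_tokens, output_tokens))


# Hypothetical workload: one large sheet in, a short answer out.
for model in PRICES:
    cost = api_cost(model, 200_000, 5_000)
    runs = runs_within_budget(model, 200_000, 5_000)
    print(f"{model}: ${cost:.3f} per run, {runs} runs for $20")
```

Under those assumptions a run costs about $1.13 on Opus 4.7 and about $0.68 on Sonnet 4.6, so the $20 flat fee covers 17 and 29 runs respectively. Run fewer than that and the API is cheaper; run more and the subscription wins.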
Every one of these writes a confident wrong formula
This is the part the demos skip. Each tool here will, on a bad day, hand you a formula that looks right, runs without an error, and quietly returns the wrong number. The failure isn't random; it comes from how language models read a spreadsheet.
So the rule is the same across all three: treat the output as a draft, not an answer. Spot-check the formula on a few rows where you already know the result. Watch for sheet names you don't recognize, which usually means the model invented a reference, and for INDIRECT creeping into a large file, where a volatile function that recalculates on every change can drag the whole workbook down. And never let a model's number be the final word on anything with money, a deadline, or a regulator attached. The same honesty problem shows up wherever these tools touch facts, which is exactly the territory of benchr's guide to research AI that cites honestly instead of inventing sources.
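Two of those checks are mechanical enough to script. The sketch below is one way to do it, not a standard tool: `lint_formula` flags sheet references that aren't in your workbook and any use of INDIRECT, and `spot_check` compares values you exported from the model's formula column against rows you verified by hand. Both function names, the regex, and the sample data are illustrative assumptions, and the regex only covers the common `Sheet!A1` and `'Sheet Name'!A1` reference styles.

```python
import re

# Matches 'Quoted Sheet'! or BareSheet! references inside a formula string.
SHEET_REF = re.compile(r"(?:'([^']+)'|([A-Za-z_][A-Za-z0-9_.]*))!")


def lint_formula(formula: str, known_sheets: set[str]) -> list[str]:
    """Return warnings for unknown sheet names and INDIRECT usage."""
    warnings = []
    seen = set()
    for quoted, bare in SHEET_REF.findall(formula):
        name = quoted or bare
        if name not in known_sheets and name not in seen:
            seen.add(name)
            warnings.append(f"unknown sheet: {name}")
    if re.search(r"\bINDIRECT\s*\(", formula, re.IGNORECASE):
        warnings.append("uses INDIRECT")
    return warnings


def spot_check(actual: dict[int, float], expected: dict[int, float],
               tolerance: float = 0.005) -> list[tuple]:
    """Return (row, expected, actual) for every hand-verified row that disagrees."""
    mismatches = []
    for row, want in expected.items():
        got = actual.get(row)
        if got is None or abs(got - want) > tolerance:
            mismatches.append((row, want, got))
    return mismatches


# A misspelled sheet name is the kind of slip worth catching early.
print(lint_formula("=SUM(Sumary!B2:B10)", {"Summary"}))

# Row 4 was verified by hand as 75.0, but the formula column says 80.0.
print(spot_check({2: 100.0, 3: 250.5, 4: 80.0}, {2: 100.0, 4: 75.0}))
```

A clean lint and an empty mismatch list don't prove the formula is right; they only rule out the cheapest failures before you look harder.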
The free tiers deserve one flat warning, because they're thinner than they look. Free ChatGPT has no Code Interpreter, so no real analysis. The free Claude plan is chat-only, with no Excel add-in and no API. And the free Gemini tier at gemini.google.com is a different product from Gemini in Sheets; it has none of the Workspace integration that makes the in-grid version useful.
The Google Sheets case
If your data already lives in Sheets, Gemini is the obvious default and often one you're already paying for. As of January 2025 it's bundled into Workspace Business Standard and higher at no added cost, which means a lot of teams already have it and don't know it. Its standout is the =AI() function for bulk text work: categorize, score sentiment, or extract a field across thousands of rows with one formula you fill down. As of an April 2026 update it can also build a whole spreadsheet from a plain-language description. It's weaker than the chat models on heavy multi-sheet reasoning, but for cleaning and tagging columns of text, nothing here is faster.
The choice between Copilot and Gemini mostly comes down to which suite you live in, not which model is smarter. If your work is in Excel, Copilot is the answer. If it's in Sheets, Gemini already has the home-field advantage and probably the lower bill. The decision only gets interesting when the file is large or the logic is gnarly enough that you'd rather hand the whole workbook to a chat model and reason about it out loud.