AI Coding Models Leaderboard
Ranked comparison of large language model (LLM) APIs on coding benchmarks. Scores are sourced from official provider documentation, and models are sorted by SWE-bench Verified score.
| Rank | Model | Provider | SWE-bench Verified | HumanEval | Blended $/1M |
|---|---|---|---|---|---|
Methodology
This leaderboard ranks models by their **SWE-bench Verified** score, a widely used benchmark that measures how well a model resolves real-world GitHub issues. Where a provider has not officially published a SWE-bench Verified score, we show an estimated or fallback value instead (marked with *est* or *—*).
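The ranking rule described above can be sketched in a few lines of Python. This is only an illustration, not the site's actual implementation: the `Entry` type, the `rank` and `format_score` helpers, and the choice to place models without any score at the bottom are all assumptions, and the model names and numbers are placeholders rather than real results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """One leaderboard row (hypothetical structure, for illustration only)."""
    model: str
    swe_bench_verified: Optional[float]  # percent resolved; None if no score is available
    estimated: bool = False              # True if the score is an estimate, not an official figure


def rank(entries: List[Entry]) -> List[Entry]:
    """Sort by SWE-bench Verified score, highest first.

    Assumption: entries with no score at all are listed last.
    """
    return sorted(
        entries,
        key=lambda e: (e.swe_bench_verified is None, -(e.swe_bench_verified or 0.0)),
    )


def format_score(entry: Entry) -> str:
    """Render a score cell using the 'est' and '—' markers from the methodology."""
    if entry.swe_bench_verified is None:
        return "—"
    suffix = " est" if entry.estimated else ""
    return f"{entry.swe_bench_verified:.1f}{suffix}"


# Placeholder data: these names and scores are invented for the example.
entries = [
    Entry("model-a", 50.0),
    Entry("model-b", None),
    Entry("model-c", 62.5, estimated=True),
]

for position, entry in enumerate(rank(entries), start=1):
    print(position, entry.model, format_score(entry))
```

Because Python's `sorted` is stable, models with identical scores keep their input order; a real leaderboard would likely need an explicit tie-break rule, which the methodology does not specify.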