The Bitter Lesson
Sutton’s argument is simple but profound: over 70 years of AI research, methods that leverage computation have consistently won over methods that leverage human knowledge. Every time researchers have tried to build in human understanding of a domain, they’ve eventually been beaten by approaches that just throw more compute at the problem.
Chess, Go, speech recognition, computer vision—the story repeats. Hand-crafted features and expert systems get outperformed by learning algorithms that scale with data and compute.
The “bitter” part is that this is hard for researchers to accept. We want our insights and domain expertise to matter. We want the elegant solution, not the brute-force one. But the evidence is overwhelming.
This has major implications for how we should think about AI progress. If Sutton is right (and the last few years of LLMs seem to confirm it), then the path to AGI isn’t through encoding more of our own insight into the algorithms; it’s through general methods that keep improving as data and compute scale. Which is both exciting and concerning.
I think about this essay a lot when I see people dismissing scaling as “just throwing compute at the problem.” That dismissal is exactly the instinct the bitter lesson warns against.