The essay “The Bitter Lesson,” written by Professor Rich Sutton in 2019, has since become essential reading for machine learning practitioners and anyone interested in the future of AI. Its insights anticipated major developments in the field, including the emergence of ChatGPT/GPT-4 and the broad adoption of OpenAI’s scale-driven approach.
The core of “The Bitter Lesson” describes a paradigm shift in AI research. In the past, researchers tended to believe that building advanced AI required a special, handcrafted approach, often described in terms of “inductive bias”: the injection of specialized knowledge or intuitive understanding of a specific problem, which then guides the machine toward a solution.
But a recurring pattern became apparent. Researchers repeatedly found that simply adding more data and computational power could outperform these painstakingly crafted methods. The pattern was not specific to one field: it appeared in chess, Go, StarCraft, and likely NetHack as well. In computer vision, for instance, convolutional neural networks outperform hand-engineered techniques such as SIFT. Notably, the inventor of SIFT later said that if neural networks had been available when he was doing his research, he would have taken that route instead. Likewise, in machine translation, LSTM-based systems outperformed rule-based ones. ChatGPT/GPT-4, a leading example of this trend, used an essentially simple “add more layers” strategy to surpass highly engineered models created by computational linguists.
The core of Sutton’s “bitter lesson” is that general computational methods, unshaped by human intuition, frequently outperform approaches built on it. This understanding has not, however, become widely accepted. Many researchers still pursue complex, intuition-based strategies, often overlooking the potential of general, computation-driven approaches.
Five reasons why GPT triumphed over handcrafted techniques:
- Scalability: Computational methods, especially when augmented with more data, have the potential to evolve and adapt as technology progresses, making them more future-proof.
- Effectiveness: General methods based on computation and data have consistently outperformed specialized, human-intuition-based methods across domains, from games like chess and Go to machine translation and computer vision.
- Broad Applicability: These general, computation-driven methods are versatile and can be applied across various disciplines without the need for domain-specific tweaks.
- Simplicity: Systems built on raw computational power and data tend to be simpler in their approach, without the need for intricate adjustments based on human intuition.
- Consistent Performance: As demonstrated by examples like ChatGPT/GPT-4, computation-based models can achieve consistently high performance, often surpassing specialized methods.
The original essay remains an invaluable resource for understanding Professor Sutton’s viewpoint and the principles guiding this trajectory in AI.
The article was inspired by the Telegram channel “Boris Again.”