Can we get AGI by scaling up architectures similar to current ones, or are we missing key insights?
It's an open question whether we can create AGI by scaling up architectures similar to current ones. Some researchers have formulated empirical scaling laws, which relate a model's performance to the amount of compute (computing power, supplied by hardware such as CPUs or GPUs) used to train it.
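To make the idea concrete, here is a minimal sketch of the kind of power-law relationship these scaling laws describe: loss falls predictably as training compute grows, approaching an irreducible floor. The coefficients below are illustrative assumptions for this sketch, not values fitted from any real model.

```python
def scaling_law_loss(compute: float,
                     a: float = 11.2,
                     alpha: float = 0.05,
                     irreducible: float = 1.7) -> float:
    """Predicted loss L(C) = irreducible + a * C**(-alpha),
    where C is training compute (e.g. in FLOPs).

    a, alpha, and irreducible are made-up example coefficients."""
    return irreducible + a * compute ** (-alpha)

# Each doubling of compute buys a small, predictable improvement:
for c in (1e21, 2e21, 4e21):
    print(f"C = {c:.0e} FLOPs -> predicted loss {scaling_law_loss(c):.3f}")
```

The debate linked below is, roughly, over whether curves like this keep extrapolating all the way to general intelligence or flatten out short of it.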
For a variety of opinions on this question, see:
- Gwern on the scaling hypothesis.
- Daniel Kokotajlo on what we could do with a trillion times as much compute as current models use.
- Rohin Shah on the likelihood that scaling current techniques will produce AGI.
- Rich Sutton's "The Bitter Lesson", which argues that general methods leveraging more computation ultimately beat approaches built on existing human knowledge. (Sutton, a 2024 Turing Award winner, is considered one of the founders of reinforcement learning and popularized the "scaling hypothesis" and the "bitter lesson".)
- Gary Marcus's "The New Science of Alt Intelligence", which argues that current deep learning systems are fundamentally limited and that scaling will not overcome those limits. (Marcus is a professor of psychology and neural science known for his skepticism that current methods will be sufficient to reach AGI.)
- AI Impacts' "Evidence against current methods leading to human level artificial intelligence".
- LessWrong user eggsyntax’s “LLM Generality is a Timeline Crux”.
- Leopold Aschenbrenner on counting the OOMs (orders of magnitude of compute).
- Arvind Narayanan and Sayash Kapoor on AI Scaling Myths.
- Dwarkesh Patel tries to represent both sides.