|Main Question: How might language models be relevant to AI alignment?|
|Child tag(s): gpt|
|Alignment Forum Tag|
Language models are a class of AI trained on text, usually to predict the next word or a word which has been obscured. They can generate novel prose or code from an initial prompt, which gives rise to a kind of natural language programming called prompt engineering. The most popular architecture for very large language models is the transformer, which follows consistent scaling laws: as the model size, training data, and compute are increased, results improve by a predictable amount (when measured by the 'perplexity', or how surprised the AI is by a test set of human-generated text, where lower is better).
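To make 'perplexity' concrete, here is a minimal sketch, assuming we already have the probability a model assigned to each correct next token in a test text. The function name and the example probabilities are made up for illustration and do not come from any real model.

```python
import math


def perplexity(token_probs):
    """Perplexity: exp of the average negative log-probability per token.

    token_probs holds the probability the model gave to each token that
    actually appeared in the test text (hypothetical values here).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# Illustrative, made-up probabilities for the same four-token text.
confident_model = [0.9, 0.8, 0.95, 0.85]   # rarely surprised
uncertain_model = [0.1, 0.2, 0.05, 0.15]   # often surprised

print(perplexity(confident_model))
print(perplexity(uncertain_model))
```

A model that guesses uniformly among four options at every step has a perplexity of 4, so perplexity can be read loosely as the number of equally likely choices the model is hesitating between.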
Making a narrow AI for every task would be extremely costly and time-consuming. By making a more general intelligence, you can apply one system to a broader range of tasks, which is economically and strategically attractive.
Of course, generality is only a good option if some necessary conditions hold. You need an architecture which is straightforward enough to scale up, such as the transformer, which is used for GPT and follows scaling laws. It's also important that by generalizing you do not lose too much capacity at narrow tasks or require so much extra compute that it stops being worthwhile.
Whether or not those conditions actually hold, many important actors (such as DeepMind and OpenAI) seem to believe that they do, and are therefore trying to build an AGI in order to influence the future. We should therefore take actions that make it more likely that AGI will be developed safely.
Additionally, it is possible that even if we tried to build only narrow AIs, given enough time and compute we might accidentally create a more general AI than we intend by training a system on a task which requires a broad world model.
- Reframing Superintelligence - A model of AI development which proposes that we might mostly build narrow AI systems for some time.
GPT-3 was, at its release, the newest and most impressive of the GPT (Generative Pretrained Transformer) series of large transformer-based language models created by OpenAI. Its paper was published in May 2020, and it is more than 100 times larger than its predecessor GPT-2 (175 billion parameters versus 1.5 billion).
Gwern has several resources exploring GPT-3's abilities, limitations, and implications, including:
- The Scaling Hypothesis - How simply increasing the amount of compute with current algorithms might create very powerful systems.
- GPT-3 Nonfiction
- GPT-3 Creative Fiction
Vox has an article which explains why GPT-3 is a big deal.
- GPT-3: What’s it good for? - Cambridge University Press
GPT-3 showed that transformers are capable of a vast array of natural language tasks, and Codex/Copilot extended this to programming. One demonstration of GPT-3 is the post Simulated Elon Musk lives in a simulation. Note that there are several much better language models, but they are not publicly available.
MuZero learned Go, chess, and many Atari games without any directly coded information about those environments. This seems crucial for doing RL in novel environments: we have systems which we can drop into a wide variety of games, and they just learn how to play. A similar approach has reportedly been used in Tesla's self-driving cars for complex route finding. These systems are general.
Generally capable agents emerge from open-ended play - Diverse procedurally generated environments provide vast amounts of training data for AIs to learn generally applicable skills. Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning shows how these kinds of systems can be trained to follow instructions in natural language.
Gato shows that a single network can learn 600+ tasks that would otherwise be trained individually, so we're not limited by the tasks being fragmented.
Codex and GitHub Copilot are AI systems that write and edit code. Codex is a descendant of GPT-3 fine-tuned on code, and Copilot is a coding tool built on it. When given some input code and comments describing the intended function, they write output that extends the prompt as accurately as possible.
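As a rough illustration of what "extending the prompt" means, the sketch below pairs a prompt (a function signature plus a docstring describing the intended behavior) with a continuation. The continuation is hand-written for this example and is not real output from Codex or Copilot; the names `PROMPT`, `COMPLETION`, and `is_palindrome` are hypothetical.

```python
# The prompt: what a programmer might type before asking for a completion.
PROMPT = '''def is_palindrome(text: str) -> bool:
    """Return True if text reads the same forwards and backwards,
    ignoring case and non-alphanumeric characters."""
'''

# A hand-written stand-in for the kind of continuation a code model might
# produce. This is NOT real model output.
COMPLETION = '''    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]
'''

# Joining prompt and completion yields an ordinary, runnable function.
namespace = {}
exec(PROMPT + COMPLETION, namespace)
is_palindrome = namespace["is_palindrome"]

print(is_palindrome("A man, a plan, a canal: Panama"))
print(is_palindrome("language model"))
```

The model only ever sees text and continues it. The docstring works as a natural language specification, which is why writing clear comments is itself a form of prompt engineering.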