literature

From Stampy's Wiki

Canonically answered

What should I read to learn about decision theory?

abramdemski and Scott Garrabrant's post on decision theory provides a good overview of many aspects of the topic, while Functional Decision Theory: A New Theory of Instrumental Rationality seems to be the most up-to-date source on current thinking.

For a more intuitive dive into one of the core problems, Newcomb's problem and regret of rationality is good, and Newcomblike problems are the norm is useful for seeing how it applies in the real world.

The LessWrong tag for decision theory has lots of additional links for people who want to explore further.

Where can I learn about AI alignment?

If you like interactive FAQs, you've already found one! All joking aside, probably the best places to start as a newcomer are The AI Revolution posts on Wait But Why (The Road to Superintelligence and Our Immortality or Extinction) for a fun, accessible intro, or Vox's The case for taking AI seriously as a threat to humanity for a mainstream explainer piece. If you prefer videos, Rob Miles's YouTube (+these) and MIRI's AI Alignment: Why It’s Hard, and Where to Start are great. If you like clearly laid-out reports, AGI safety from first principles might be your best option.

If you're up for a book-length introduction, there are several options.

The Alignment Problem by Brian Christian is the most recent (2020) in-depth guide to the field.

The book which first made the case to the public is Nick Bostrom's Superintelligence. It gives an excellent overview of the state of the field in 2014 and makes a strong case for the subject being important as well as exploring many fascinating adjacent topics. However, it does not cover newer developments, such as mesa-optimizers or language models.

There's also Human Compatible by Stuart Russell, which gives a more up-to-date (2019) review of developments, with an emphasis on the approaches that the Center for Human-Compatible AI is working on, such as cooperative inverse reinforcement learning. There's a good review/summary on SlateStarCodex.

Though not limited to AI safety, Rationality: A-Z covers a lot of skills which are valuable to acquire for people trying to think about large and complex issues, with The Rationalist's Guide to the Galaxy available as a shorter, more accessible, and more AI-focused option.

Various other books explore the issues in an informed way, such as The Precipice, Life 3.0, and Homo Deus.

I’d like to get deeper into the AI alignment literature. Where should I look?

The AGI Safety Fundamentals Course is arguably the best way to get up to speed on alignment. You can sign up to go through it with mentorship alongside many other people studying it, or read the materials independently.

Other great ways to explore include:

You might also want to consider reading Rationality: A-Z which covers a lot of skills that are valuable to acquire for people trying to think about large and complex issues, with The Rationalist's Guide to the Galaxy available as a shorter and more accessible AI-focused option.

What are some good resources on AI alignment?

Unanswered canonical questions

Unanswered non-canonical questions

Have you read 'Superintelligence' by Nick Bostrom? What is your opinion on the book? (I just finished it)