decision theory

From Stampy's Wiki
Main Question: Why might decision theory be important for AI alignment?
Alignment Forum Tag
Wikipedia Page

Description

Decision theory is the study of principles and algorithms for making correct decisions, that is, decisions that allow an agent to achieve better outcomes with respect to its goals. Every action at least implicitly represents a decision under uncertainty: in a state of partial knowledge, something has to be done, even if that something turns out to be nothing (call it "the null action"). Even if you don't know how you make decisions, decisions do get made, and so there has to be some underlying mechanism. What is it? And how can it be done better? These are the questions decision theory sets out to answer.

Note: this page needs to be updated with content regarding Functional Decision Theory, the latest theory from MIRI.

Related: Game Theory, Robust Agents, Utility Functions

A core idea in decision theory is that of expected utility maximization, which is usually intractable to calculate directly in practice but is an invaluable theoretical concept. An agent assigns a utility to every possible outcome: a real number representing the goodness or desirability of that outcome. The mapping of outcomes to utilities is called the agent's utility function. (The utility function is said to be invariant under positive affine transformations: that is, the utilities can be multiplied by a positive constant or shifted by any constant while resulting in all the same decisions. Multiplying by a negative constant would reverse the agent's preferences.) For every action that the agent could take, sum the utilities of the various possible outcomes, each weighted by its probability given that action: this is the expected utility of the action, and the action with the highest expected utility is to be chosen.
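
The calculation described above can be sketched in a few lines of Python. This is only a toy illustration, not a standard implementation: the umbrella scenario, the outcome names, the probabilities, and the helper names `expected_utility`, `best_action`, and `affine` are all made up for the example. It computes the expected utility of each action, picks the maximizer, and then checks that a positive affine transformation of the utilities leaves the choice unchanged while a negative scale factor does not.

```python
from typing import Dict

# Hypothetical toy problem (all names and numbers are illustrative assumptions).
# Each action maps to a lottery: a dict from outcome name to probability.
Lottery = Dict[str, float]
Utility = Dict[str, float]


def expected_utility(lottery: Lottery, utility: Utility) -> float:
    """Sum of outcome utilities weighted by their probabilities."""
    return sum(prob * utility[outcome] for outcome, prob in lottery.items())


def best_action(actions: Dict[str, Lottery], utility: Utility) -> str:
    """Return the action whose lottery has the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(actions[a], utility))


def affine(utility: Utility, scale: float, shift: float) -> Utility:
    """Apply u -> scale * u + shift to every outcome's utility."""
    return {outcome: scale * u + shift for outcome, u in utility.items()}


utility = {"dry": 10.0, "wet": -20.0, "dry_but_burdened": 6.0}
actions = {
    "take_umbrella": {"dry_but_burdened": 1.0},
    "leave_umbrella": {"dry": 0.75, "wet": 0.25},
}

for name, lottery in actions.items():
    print(name, expected_utility(lottery, utility))

print("choice:", best_action(actions, utility))
# A positive affine transformation preserves the ranking of actions.
print("after 3u + 5:", best_action(actions, affine(utility, 3.0, 5.0)))
# A negative scale factor reverses preferences, so the choice can change.
print("after -u:", best_action(actions, affine(utility, -1.0, 0.0)))
```

In realistic settings the set of outcomes is far too large to enumerate like this, which is why expected utility maximization is mostly used as a theoretical ideal rather than computed directly.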

Thought experiments

The limitations and pathologies of decision theories can be analyzed by considering the decisions they recommend in certain idealized situations that stretch the limits of decision theory's applicability. Some of the thought experiments more frequently discussed on LessWrong (LW) include:

Commonly discussed decision theories

Standard theories well-known in academia:

Theories invented by researchers associated with MIRI and LW:

Other decision theories are listed in A comprehensive list of decision theories.

Blog posts

Sequence by AnnaSalamon

Sequence by orthonormal (Decision Theories: A Semi-Formal Analysis)

See also

Canonically answered

abramdemski and Scott Garrabrant's post on decision theory provides a good overview of many aspects of the topic, while "Functional Decision Theory: A New Theory of Instrumental Rationality" seems to be the most up-to-date source on current thinking.

For a more intuitive dive into one of the core problems, "Newcomb's problem and regret of rationality" is a good starting point, and "Newcomblike problems are the norm" is useful for seeing how the problem applies in the real world.

The LessWrong tag for decision theory has lots of additional links for people who want to explore further.