What are the differences between AI safety, AI alignment, AI control, Friendly AI, AI ethics, AI existential safety, and AGI safety?
There are a variety of terms that mean something like "making AI go well." The distinctions between these terms are vague, but loosely speaking, the meanings are as follows:
- AI safety means preventing harm from AI. This often refers to avoiding existential risks, which is how we use it on aisafety.info. It can also encompass smaller-scale risks, like accidents caused by self-driving cars or harmful text produced by language models. People sometimes use “AI existential safety” to refer specifically to risks at the level of human extinction.
- AI alignment means getting AI to pursue the right goals; the problem of accomplishing this is known as the “alignment problem”. "AI alignment" often refers to “intent alignment”, according to which an AI is aligned if it’s trying to do what its operator wants it to do. Others use “AI alignment” for the broader problem of making powerful AI go well, but still emphasize that getting AI “on our side” is the core issue.[1]
- AI ethics broadly refers to the project of making sure that AI systems are designed and used in ethical ways. In practice, the term is associated with concerns about the harmful societal impacts of current-day AI, such as algorithmic bias against marginalized groups, poor treatment of crowd workers used in training AI, the environmental impacts of AI, and artists losing their livelihoods to generative algorithms. The overarching principles guiding this work are fairness, accountability, and transparency. While there is some overlap between AI ethics and AI alignment research, AI ethics researchers have often been critical of AI safety research that focuses on existential risk at the expense of addressing current harms.
- AI governance refers to the institutions and norms that coordinate the development and deployment of AI. Like technical AI safety, AI governance aims to prevent disastrous outcomes from AI, but it focuses on the social context rather than the technical problems, and deals with questions like preventing misuse, implementing good safety practices, and preventing dangerously misaligned systems from being deployed.
Terms that are used less often include:
- AI control (and the “control problem”) is a term that was sometimes used roughly synonymously with “AI alignment” (and the “alignment problem”), though it is less commonly used now. Some people use the term "AI control" to encompass all potential methods of preventing AI systems from behaving dangerously, including incentivizing and constraining them (“capability control”), and use "AI alignment" only to refer to giving AI the right internal values (“motivation selection”).
- Friendly AI (FAI) is a term that was used in early work by MIRI,[2] but has since fallen out of use. It informally referred to AI that acts benevolently toward humans, for example by pursuing “coherent extrapolated volition”, or some other specification of the values of humanity as a whole, as its highest goal.
- AI notkilleveryoneism is a term Eliezer Yudkowsky and others have used facetiously to refer to the project of preventing AI from exterminating humanity, out of a sense that other terms, like “AI alignment” and the others listed above, tend to drift to encompass risks of smaller scope.[3]
1. Why equate making AI go well with AI alignment? Because if we can’t control a superintelligence that is not on our side, then the problem of making AI safe amounts to the problem of getting it on our side. ↩︎
2. Then known as the Singularity Institute. ↩︎
3. For instance, Senator Blumenthal's remark during a Senate hearing with Sam Altman (CEO of OpenAI): "I think you have said, in fact… 'Development of superhuman machine intelligence is probably the greatest threat to the continued existence of humanity.' You may have had in mind the effect on jobs, which is really my biggest nightmare." ↩︎