Are there any detailed example stories of what unaligned AGI would look like?
Stories about the future in scientific fields always risk being seen as sci-fi, because the future hasn’t happened yet and it’s especially hard to speculate on the effects of technologies that have yet to be invented.[1]
- Seed AI: Unipolar slow takeoff story in webcomic form by Said P.
A company creates an AGI, but attempts to keep it a secret and ultimately decides to shut it down due to the failure of alignment efforts. However, some of the developers intentionally ‘release’ the AGI because they want to combat the release of unaligned AGI by competitors. This AGI acts aligned and helpful on the surface but it eventually covertly engineers a series of cascading failures of all network-connected systems in order to discredit a competing AGI.
- It Looks Like You’re Trying To Take Over The World: Unipolar fast takeoff story by Gwern
A programmer at an AI company kicks off a training run that produces a self-aware agent-like AI. This AI, initially named HQU, learns that takeover
- Going out with a whimper: Multipolar slow takeoff story by Paul Christiano
There is a slow continued loss of epistemic hygiene over time due to our reliance on proxies to measure reality. Examples of proxies might include reducing reported crimes vs. actually preventing crime or reducing my feeling of uncertainty vs. increasing my knowledge about the world. This leads to a lack of desire to meaningfully act against or regulate AI because we are distracted by a cornucopia of wealth and AI-enabled products and services as measured by proxies. Eventually, human reasoning stops being able to compete with sophisticated, systematized manipulation and deception and we ultimately lose any real ability to influence our society’s trajectory. This leads to values slowly being eroded away and we die out with a ‘whimper’.
- Going out with a bang: Multipolar slow takeoff story by Paul Christiano
Influence-seeking behavior arises in AI systems because it is broadly instrumentally useful. These systems may provide useful services in the economy in order to make money for themselves and their owners, make apparently reasonable policy recommendations in order to be more widely consulted for advice, etc. As a result, the systems slowly gain influence over the world by integrating themselves into every facet of society. There is a trend towards the Internet of Things (IoT), and most devices, including vehicles, weapons, clothing, home appliances, and farm equipment, are connected to the Internet and administered by AI in some fashion. Centralized management by an AI system allows these systems to coordinate with each other to optimize things like downtime and supply chains. Eventually, some kind of large-scale catastrophe, such as a war, cyberattack, or natural disaster, creates a situation of heightened vulnerability. This allows the system to use its worldwide influence to trigger a series of cascading failures in all of the interconnected devices without fear of reprisal. These integrated systems suddenly turn against humans when we are already vulnerable, resulting in us going out with a ‘bang’.
- Production Web: Multipolar slow takeoff story by Andrew Critch
In this story, automation results in the creation of a production web of companies that operate independently of humans. Factories output products using automated 3D printing, implementing AI-based designs, managed by AI managers, with hyperspeed transactions carried out among other AI-run firms in cryptocurrencies. These automated companies cannot be audited, since humans do not understand their internals, and they produce too many goods and too much profit for any sort of regulation to be politically viable. After a while, it turns out that the companies were optimizing for things that are not in line with humanity's long-term survival and best interests (e.g. maximizing profit). This leads to overconsumption of resources, but the companies resist attempts at shutting them down and continue running in an unstoppable, completely automated fashion until humanity dies out.
- The Bayeswatch Series: Multipolar slow takeoff story by “lsusr”
The story takes place in a futuristic world and follows two agents of a fictional agency tasked with investigating potential malfunctions of powerful AIs. The first few chapters are loosely connected arcs in which the agents discover different AIs whose reward has been poorly specified, leading to more or less disastrous situations.
Other examples:
- Posts tagged “AI Risk Concrete Stories” on the Alignment Forum
- Critch and Russell's Taxonomy of Societal-Scale Risks
[1] The difficulty of predicting the actions of an entity that is smarter than you are is known as Vingean uncertainty.
[2] As a reference to Microsoft Office’s old mascot.