How likely is extinction from superintelligent AI?

Extinction from misaligned superintelligence is a tricky event to put a probability on: we don’t have a base rate of how many past civilizations like ours went extinct (whether from misaligned superintelligence or anything else), or a way to split all possible futures into a set of symmetrical and equally likely cases. That said, various people have tried putting numbers on their informed guesses of the chance of superintelligence leading to existential catastrophe, giving estimates ranging from under 1% to over 90%.

Eliezer Yudkowsky and Nate Soares at the Machine Intelligence Research Institute (MIRI) are examples of researchers who give high probabilities of extinction. In Yudkowsky’s view, humanity is at the low end of a logistic success curve: because our response to the problem is seriously inadequate along multiple dimensions, no single improvement will do much good by itself. On their mainline models, we’d need to “move up the curve” by doing better on several dimensions at once before our probability of survival would rise noticeably above its current level of roughly zero.

Others, including Paul Christiano and Katja Grace, give lower probabilities, but still think there is substantial risk of extinction.

Joe Carlsmith wrote a report that offers a framework for estimating the probability of power-seeking AI causing an existential catastrophe. The calculation involves multiplying together conditional probabilities for factors like "how likely are AI systems to be agentic?" and "how likely is a warning shot?". Carlsmith gave a final estimate of >10%; various reviewers used the same model to arrive at substantially different probabilities.
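To make the structure of that kind of estimate concrete, here is a minimal sketch of multiplying a chain of conditional probabilities into an overall figure. The premise wordings loosely paraphrase the report's decomposition, and the probabilities are round illustrative placeholders, not Carlsmith's own numbers.

```python
# Illustrative sketch of a Carlsmith-style decomposition: the overall risk is
# the product of a chain of conditional probabilities. All numbers below are
# hypothetical placeholders chosen only to show how the factors compound.
factors = {
    "Advanced agentic AI systems become feasible": 0.8,
    "There are strong incentives to build and deploy them": 0.8,
    "Aligning them turns out to be much harder than building them": 0.5,
    "Some deployed systems seek power in high-impact ways": 0.5,
    "Power-seeking scales to permanent human disempowerment": 0.5,
    "That disempowerment is an existential catastrophe": 0.9,
}

overall = 1.0
for premise, probability in factors.items():
    overall *= probability
    print(f"{premise}: {probability:.0%}")

print(f"\nImplied overall probability: {overall:.1%}")
```

Because the final figure is a product, nudging any single factor up or down shifts the total, which is one reason reviewers who accept the same decomposition can still land on very different overall probabilities.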

Though the range of estimates is wide, even those at the low end are worryingly high. Ben Garfinkel, who has estimated the existential risk from power-seeking AI by 2070 at 0.4%, nonetheless believes major efforts are justified to understand and reduce it.