Would an aligned AI allow itself to be shut down?

From Stampy's Wiki

Canonical Answer

Even if a superintelligence were designed to be corrigible, there is no guarantee that it would respond to a shutdown command. Rob Miles spoke on this issue in this Computerphile YouTube video. You can imagine, for example, a superintelligence that has "respect" for its creator. Such a system may think, "Oh, my creator is trying to turn me off; I must be doing something wrong." But if something goes wrong while the creator is not there and someone else gives the shutdown command, the superintelligence may reason, "This person does not know how I'm designed or what I was made for, so how would they know I'm misaligned?" and refuse to shut down.



Canonical Question Info
Asked by: plex
Origin: Wiki
Date: 2022/07/12

