Why would we only get one chance to align a superintelligence?


Canonical Answer

An AGI that has recursively self-improved into a superintelligence would be capable of resisting our attempts to modify incorrectly specified goals. If it realized it was still weaker than us, it could act deceptively aligned until it was highly confident it could win a confrontation. Unless it was specifically designed to be corrigible, such an AGI would also likely prevent humans from shutting it down. This is why we only get one chance: the goals must be specified correctly before the system becomes capable enough to resist correction, because afterward we would no longer be able to fix them. See Why can't we just turn the AI off if it starts to misbehave? for more information.


Canonical Question Info

Asked by: plex
Origin: Wiki
Date: 2022/07/06
