TapuZuko's Answer to Might an aligned superintelligence immediately kill everyone and then go on to create a "hedonium shockwave"?
I think an AI that is inner aligned to optimize a utility function of "maximize happiness minus suffering" is likely to do something like this.
Here, "inner aligned" means the AI is genuinely trying to do the thing we trained it to do, whether or not that is what we actually want.
"Aligned to what" is the outer alignment problem which is where the failure in this example is. There is a lot of debate on what utility functions are safe or desirable to maximize, and if human values can even be described by a utility function.
Original by: TapuZuko (edits by 335492609520697344)