Sudonym's Answer to What does alignment failure look like?

From Stampy's Wiki
Alignment failure, at its core, is any case where an AI's output deviates from what we intended. We have already seen alignment failure in simple AIs. Mostly, these cases amount to the system latching onto a correlation in its training data and treating it as the thing it was meant to detect. A good example is an AI built by YouTube to recognize videos of animals being forced to fight for sport. The videos given to the AI were always set in some kind of arena, so the AI drew the simplest conclusion and matched any video with a similar arena, such as robot combat tournaments.

Link: https://web.archive.org/web/20210807193707/https://www.theverge.com/tldr/2019/8/20/20825858/youtube-bans-fighting-robot-videos-animal-cruelty-roughly-10-years-too-soon-ai-google
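
The arena example can be reproduced in miniature. The sketch below is a toy illustration only: the two features, the six training videos, and the one-feature learner are all invented for this page and say nothing about how YouTube's real system worked. It shows how a learner that picks whichever single feature best fits its training data can end up keying on the arena rather than on the animals.

```python
# Toy illustration only: features and data are invented, not YouTube's.
# Each example is ((has_arena, has_animals), is_animal_fight).
TRAIN = [
    ((1, 1), 1),  # animal fight in an arena
    ((1, 1), 1),  # animal fight in an arena
    ((1, 1), 1),  # animal fight in an arena
    ((0, 1), 0),  # harmless pet video
    ((0, 1), 0),  # wildlife documentary
    ((0, 0), 0),  # cooking video
]
FEATURES = ["has_arena", "has_animals"]


def train_stump(data):
    """Return the index of the single feature that best matches the label."""
    def accuracy(i):
        return sum(x[i] == y for x, y in data) / len(data)
    return max(range(len(FEATURES)), key=accuracy)


def predict(feature_index, x):
    """Predict 'animal fight' exactly when the chosen feature is present."""
    return x[feature_index]


chosen = train_stump(TRAIN)
robot_combat = (1, 0)  # an arena, but no animals
print(FEATURES[chosen], predict(chosen, robot_combat))  # has_arena 1
```

In this made-up training set the arena feature fits the labels perfectly, while the animal feature is muddied by harmless animal videos, so the learner picks the arena. The robot combat video is then flagged even though it contains no animals.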

Answer Info
Original by: Damaged (edits by sudonym)