How would we align an AGI whose learning algorithms / cognition look like human brains?

Steven Byrnes, a full-time, independent alignment researcher, works on answering the question: "How would we align an AGI whose learning algorithms / cognition look like human brains?"

Humans seem to robustly care about things; why is that? If we understood that, could we design AGIs to do the same? As far as I understand it, most of this work is neuroscience-based: learning how various parts of the brain work and applying that understanding to the alignment problem.

Three other independent researchers are working on related projects that Byrnes has proposed.