Edit AnswerTags: 's Answer to How quickly would the AI capabilities ecosystem adopt promising new advances in AI alignment?

From Stampy's Wiki
Log-in is required to edit or create pages.

You do not have permission to edit this page, for the following reason:

The action you have requested is limited to users in the group: Users.


Answer text

A lot of narrow AI alignment advances will improve capabilities too, make it easier for human users to work with the AI tools. Those are going to be adopted almost instantly, for example interpretability might be considered a desirable property by all AI researchers.

However, with our current research methods for search in highly dimensional spaces, it seems exponentially more likely to find a capable AGI than to find a capable and aligned AGI. So even if capabilities research will adopt all new advances in AI alignment as soon as they come along, it is likely capability research will happen faster than alignment research. We need to find ways to incentivize the corporations who will would profit from more capability research to also focus on alignment.