How can I use a background in the social sciences to help with AI alignment?
Nora Ammann, in the post AI alignment as “navigating the space of intelligent behaviour”, describes “three epistemic strategies for making progress on the alignment problem: 1) tinkering, 2) idealization and 3) intelligence-in-the-wild”. Research in the social sciences, biology, philosophy, and other fields can inform alignment efforts by shedding light on “intelligence-in-the-wild”. (As illustrated by the examples below, such research often still involves mathematics as well.)
Some examples of approaches, taken from the post:
- Steve Byrnes’s research on brain-like AGI safety asks how we can align artificial general intelligence if it’s built on the same principles as the human brain, drawing analogies with neuroscience.
- John Wentworth studies agent-like systems in nature to understand agency in general.
- Andrew Critch’s research on multipolar takeoffs and “robust agent-agnostic processes” relates to concepts from sociology.
- Discussions of mesa-optimization use human evolution as a source of analogies.
Principles of Intelligent Behavior in Biological and Social Systems (PIBBSS) is a group that runs a summer research fellowship and has recommendations for books and videos.
Other such research agendas exist. You can consider these as examples of what alignment-relevant research with varying amounts of math and computer science could look like:
- An Open Agency Architecture for Safe Transformative AI is an AI alignment paradigm aimed at ending the acute risk period without creating worse risks.
- Learning Normativity: A Research Agenda aims to develop ways for agents to learn norms like languages and values in the absence of perfect feedback.
- What Should AI Owe To Us? Accountable and Aligned AI Systems via Contractualist AI Alignment tries to ground AI alignment in pluralist and contractualist norms.
- Political Economy of Reinforcement Learning (PERLS) is a workshop studying the societal implications of reinforcement learning systems.
- The Alignment of Complex Systems Research Group studies connections between AI alignment and complex systems theory.
See also the EA Forum post “Social scientists interested in AI safety should consider doing direct technical AI safety research, (possibly meta-research), or governance, support roles, or community building instead”.