What are some AI governance exercises and projects I can try?
This list is largely focused on projects within AI policy rather than other career paths like AI safety (a research field about how to prevent risks from advanced artificial intelligence).
- [Public] Some AI Governance Research Ideas (from GovAI)
- Project page from AGI Safety Fundamentals and their Open List of Project Ideas
- AI Safety Ideas by Apart Research; EAF post
- Competitions like SafeBench (see example ideas)
- Student ML Safety Research Stipend Opportunity – provides stipends for doing ML research.
- course.mlsafety.org projects — CAIS is looking for someone to add details about these projects on course.mlsafety.org
- Distilling / summarizing / synthesizing / reviewing / explaining
- Forming your own views on AI safety (without stress!) – also see Neel Nanda's presentation slides and "Inside Views Resources" document
- "Mostly focused on AI" section of "A central directory for open research questions" – contains a list of links to projects, similar to this document
- Possible ways to expand on "Discovering Latent Knowledge in Language Models Without Supervision" (see the CCS probe sketch after this list)
- Answer some of the application questions from the winter 2022 SERI-MATS application process, such as Vivek Hebbar's problems
- 10 exercises from Akash in “Resources that (I think) new alignment researchers should know about”
- [T] Deception Demo Brainstorm has some ideas (message Thomas Larsen if these seem interesting)
- Alignment research at ALTER – interesting research problems, many have a theoretical mathematics flavor
- Open Problems in AI X-Risk [PAIS #5]
- Steven Byrnes: [Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA
- Evan Hubinger: Concrete experiments in inner alignment, ideas someone should investigate further, sticky goals
- Richard Ngo: Some conceptual alignment research projects, alignment research exercises
- Buck Shlegeris: Some fun ML engineering projects that I would think are cool, The case for becoming a black box investigator of language models
- Implement a key paper in deep reinforcement learning (see the REINFORCE sketch after this list)
- Amplify creative grants (old)
- “Paper replication resources” section in “How to pursue a career in technical alignment”
- ELK – How can we train a model to report its latent knowledge of off-screen events?
- Daniel Filan idea – studying competent misgeneralization without reference to a goal
- Summarize a reading from Reading What We Can
- Zac Hatfield-Dodds: “The list I wrote up for 2021 final-year-undergrad projects is at https://zhd.dev/phd/student-ideas.html - note that these are aimed at software engineering rather than ML, NLP, or AI Safety per se (most of those ideas I have stay at Anthropic, and are probably infeasible for student projects).” These projects are nonetheless good preparation for AI safety engineering careers.
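
To make the "Discovering Latent Knowledge in Language Models Without Supervision" item above more concrete, here is a minimal sketch of the contrast-consistent search (CCS) probe idea from that paper, assuming you have already extracted hidden states for contrast pairs (a statement and its negation) from a language model. The names (`CCSProbe`, `ccs_loss`, `train_probe`) and the hyperparameters are illustrative choices, not taken from the authors' code:

```python
import torch
import torch.nn as nn

class CCSProbe(nn.Module):
    """Linear probe mapping a hidden state to a probability of 'true'."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.linear = nn.Linear(hidden_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.linear(x))

def ccs_loss(p_pos: torch.Tensor, p_neg: torch.Tensor) -> torch.Tensor:
    # Consistency: the probe's probabilities for a statement and its negation
    # should sum to 1.
    consistency = (p_pos - (1 - p_neg)) ** 2
    # Confidence: discourage the degenerate solution where everything is 0.5.
    confidence = torch.minimum(p_pos, p_neg) ** 2
    return (consistency + confidence).mean()

def train_probe(pos: torch.Tensor, neg: torch.Tensor, epochs: int = 1000) -> CCSProbe:
    # pos, neg: hidden states for the "true" and "false" phrasings,
    # each of shape (n_pairs, hidden_dim).
    probe = CCSProbe(pos.shape[-1])
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ccs_loss(probe(pos), probe(neg))
        loss.backward()
        opt.step()
    return probe

if __name__ == "__main__":
    # Random stand-in activations; replace with real hidden states from a model.
    pos = torch.randn(256, 768)
    neg = torch.randn(256, 768)
    probe = train_probe(pos, neg)
```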
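
For the "implement a key paper in deep reinforcement learning" item, one common starting point is REINFORCE (Williams, 1992) on CartPole. The sketch below assumes PyTorch and Gymnasium are installed; the network size, learning rate, and episode count are arbitrary illustrative choices:

```python
import torch
import torch.nn as nn
import gymnasium as gym

def run(episodes: int = 500, gamma: float = 0.99, lr: float = 1e-2) -> None:
    env = gym.make("CartPole-v1")
    # Small policy network: 4 observation dims -> 2 action logits.
    policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
    opt = torch.optim.Adam(policy.parameters(), lr=lr)

    for ep in range(episodes):
        obs, _ = env.reset()
        log_probs, rewards = [], []
        done = False
        while not done:
            logits = policy(torch.as_tensor(obs, dtype=torch.float32))
            dist = torch.distributions.Categorical(logits=logits)
            action = dist.sample()
            obs, reward, terminated, truncated, _ = env.step(action.item())
            log_probs.append(dist.log_prob(action))
            rewards.append(float(reward))
            done = terminated or truncated

        # Discounted returns, computed backwards over the episode, then normalized.
        returns, g = [], 0.0
        for r in reversed(rewards):
            g = r + gamma * g
            returns.append(g)
        returns = torch.tensor(list(reversed(returns)))
        returns = (returns - returns.mean()) / (returns.std() + 1e-8)

        # Policy-gradient loss: maximize expected return.
        loss = -(torch.stack(log_probs) * returns).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()

        if (ep + 1) % 50 == 0:
            print(f"episode {ep + 1}: return {sum(rewards):.0f}")

if __name__ == "__main__":
    run()
```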