What are some exercises and projects I can try?
This document is loosely organized into projects focused on technical AI safety research and projects focused on AI policy.
Consider joining some online AI safety communities (see AISafety.com/communities) and asking for feedback or ideas.
Technical AI safety
- Levelling Up in AI Safety Research Engineering [Public] (LW)
  - Highly recommended list of AI safety research engineering resources for people at various skill levels.
- Research directions from OpenAI’s Superalignment Fast Grants
- Alignment Jams / hackathons from Apart Research
  - Some past / upcoming hackathons: LLM, interpretability 1, AI test, interpretability 2, oversight, governance
  - Resources: black-box investigator of language models, interpretability playground (LW), AI test, oversight, governance
  - Projects on AI Safety Ideas: LLM, interpretability, AI test
  - How to run one as an in-person event at your school
- 200 Concrete Open Problems in Mechanistic Interpretability by Neel Nanda
- Student ML Safety Research Stipend Opportunity – provides stipends for doing ML research.
- Projects week from the alignment track of AI Safety Fundamentals
- "Technical/theoretical AI safety/alignment" section of "A central directory for open research questions" – contains a list of links to projects, similar to this document
- Possible ways to expand on "Discovering Latent Knowledge in Language Models Without Supervision"
- AI Alignment Awards, a contest that concluded in May 2023
- Answer some of the application questions from the winter 2022 round of SERI MATS, such as Vivek Hebbar's problems
- [T] Deception Demo Brainstorm has some ideas (message Thomas Larsen if these seem interesting)
- Alignment research at ALTER – interesting research problems, many of which have a theoretical math flavor
- Steven Byrnes: [Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA
- Evan Hubinger: Concrete experiments in inner alignment, ideas someone should investigate further, sticky goals
- Richard Ngo: Some conceptual alignment research projects, alignment research exercises
- Buck Shlegeris: Some fun ML engineering projects that I would think are cool, The case for becoming a black box investigator of language models
- Implement a key paper in deep reinforcement learning (a minimal sketch of what a small-scale replication can start from appears after this list)
- “Paper replication resources” section in “How to pursue a career in technical alignment”
- Zac Hatfield-Dodds: “The list I wrote up for 2021 final-year-undergrad projects is at https://zhd.dev/phd/student-ideas.html - note that these are aimed at software engineering rather than ML, NLP, or AI Safety per se (most of those ideas I have stay at Anthropic, and are probably infeasible for student projects).” These projects are good preparation for AI safety engineering careers.
- Owain Evans + Stuart Armstrong: “AI Safety Research Project Ideas”
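For the “implement a key paper in deep reinforcement learning” item above, the sketch below shows roughly what the starting skeleton of such a replication can look like: a bare-bones REINFORCE policy-gradient loop on CartPole. It assumes PyTorch and Gymnasium are available; the network size, learning rate, and discount factor are illustrative placeholders rather than values from any particular paper.

```python
# Minimal REINFORCE sketch on CartPole (illustrative only; assumes
# `gymnasium` and `torch` are installed). Hyperparameters are placeholders.
import gymnasium as gym
import torch
import torch.nn as nn

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)
gamma = 0.99

for episode in range(500):
    obs, _ = env.reset()
    log_probs, rewards, done = [], [], False
    while not done:
        # Sample an action from the current policy.
        logits = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, reward, terminated, truncated, _ = env.step(action.item())
        rewards.append(reward)
        done = terminated or truncated
    # Discounted returns-to-go for each timestep, normalized for stability.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.as_tensor(returns, dtype=torch.float32)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    # REINFORCE loss: negative log-probability weighted by return.
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

A replication of a specific paper would grow out of a skeleton like this by adding that paper's particular ingredients (e.g., a value baseline, clipping, or target networks) and comparing learning curves against the published results.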
AI policy/strategy/governance
- [Public] Some AI Governance Research Ideas (from GovAI)
- Compute Research Questions and Metrics - Transformative AI and Compute [4/4]
- Week 9: Projects for the AI Safety Fundamentals course on AI governance
- The Alignment Jams / hackathons from Apart Research sometimes focus on AI governance
- "AI policy/strategy/governance" section of "A central directory for open research questions" – contains a list of links to projects, similar to this document
Both technical research and AI governance
- AI Safety Ideas by Apart Research; EA Forum post
- Distilling / summarizing / synthesizing / reviewing / explaining
- Forming your own views on AI safety (without stress!) – also see Neel's presentation slides and the "Inside Views Resources" doc
- 10 exercises from Akash in “Resources that (I think) new alignment researchers should know about”
- Important, actionable research questions for the most important century (Holden Karnofsky)
- Amplify creative grants (old)
- Summarize a reading from Reading What We Can