What are some exercises and projects I can try?
This document is loosely organized into projects focusing on technical AI safety research and projects focusing on AI policy.
Consider joining some online AI safety communities (see AISafety.com/communities) and asking for feedback or ideas.
Technical AI safety
- Levelling Up in AI Safety Research Engineering [Public] (LW)
  - Highly recommended list of AI safety research engineering resources for people at various skill levels.
- Research directions from OpenAI’s former Superalignment Fast Grants
- Alignment Jams / hackathons from Apart Research
  - Some past / upcoming hackathons: LLM, interpretability 1, AI test, interpretability 2, oversight, governance
  - Resources: black-box investigator of language models, interpretability playground (LW), AI test, oversight, governance
  - Projects on AI Safety Ideas: LLM, interpretability, AI test
  - How to run one as an in-person event at your school
- 200 Concrete Open Problems in Mechanistic Interpretability by Neel Nanda (a minimal starter sketch using TransformerLens appears after this list)
- Student ML Safety Research Stipend Opportunity – provides stipends for doing ML research.
- Projects week from the alignment track of AI Safety Fundamentals
- "Technical/theoretical AI safety/alignment" section of "A central directory for open research questions" – contains a list of links to projects, similar to this document
- Possible ways to expand on "Discovering Latent Knowledge in Language Models Without Supervision" (a sketch of the paper's CCS objective appears after this list)
- AI Alignment Awards, a contest that concluded in May 2023
- Answer some of the application questions from the winter 2022 SERI-MATS, such as Vivek Hebbar's problems
- [T] Deception Demo Brainstorm has some ideas (message Thomas Larsen if these seem interesting)
- Alignment research at ALTER – interesting research problems, many of which have a theoretical math flavor
- Steven Byrnes: [Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA
- Evan Hubinger: Concrete experiments in inner alignment, ideas someone should investigate further, sticky goals
- Richard Ngo: Some conceptual alignment research projects, alignment research exercises
- Buck Shlegeris: Some fun ML engineering projects that I would think are cool, The case for becoming a black box investigator of language models
- Implement a key paper in deep reinforcement learning (a minimal REINFORCE starting point appears after this list)
- “Paper replication resources” section in “How to pursue a career in technical alignment”
- Zac Hatfield-Dodds: “The list I wrote up for 2021 final-year-undergrad projects is at https://zhd.dev/phd/student-ideas.html — note that these are aimed at software engineering rather than ML, NLP, or AI Safety per se (most of those ideas I have stay at Anthropic, and are probably infeasible for student projects).” These projects are good preparation for AI safety engineering careers.
- Daniel Filan’s idea about studying competent misgeneralization
- Owain Evans + Stuart Armstrong: “AI Safety Research Project Ideas”
- Singular Learning Theory Exercises — A researcher at Timaeus created this list of exercises for the Distilling Singular Learning Theory Sequence.
- CAIS has some project ideas for demonstrating techniques in ML safety.
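For the mechanistic interpretability problems, most exercises start by loading a small model and inspecting its activations. Here is a minimal sketch using TransformerLens, the library Neel Nanda's problem list is built around; the choice of GPT-2 and the prompt are just illustrative, and the exact API may differ across library versions.

```python
# Load a small model with TransformerLens, run it on a prompt, and inspect
# one cached attention pattern - a typical starting point for the exercises.
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # small model for quick experiments

prompt = "The Eiffel Tower is located in the city of"
logits, cache = model.run_with_cache(prompt)

# Most likely next token according to the model.
next_token = logits[0, -1].argmax()
print(model.tokenizer.decode(next_token))

# Attention pattern for layer 0, head 0: shape (query_pos, key_pos).
attn = cache["pattern", 0][0, 0]
print(attn.shape)
```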
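For extending "Discovering Latent Knowledge in Language Models Without Supervision", the central object is the Contrast-Consistent Search (CCS) loss: a probe on contrast-pair activations is trained so that p(x+) is close to 1 - p(x-), while a confidence term discourages the degenerate answer p = 0.5. The sketch below uses random tensors as stand-ins for real hidden states, so the hidden size, pair count, and training settings are assumptions rather than the paper's setup.

```python
# Minimal CCS sketch: train a linear probe with the consistency + confidence
# loss from Burns et al. (2022). Replace the random tensors with hidden
# states from a language model run on contrast pairs ("X? Yes" / "X? No").
import torch
import torch.nn as nn

hidden_dim = 768   # assumed hidden size; depends on the model being probed
n_pairs = 512      # assumed number of contrast pairs

pos = torch.randn(n_pairs, hidden_dim)  # stand-in activations for "Yes" versions
neg = torch.randn(n_pairs, hidden_dim)  # stand-in activations for "No" versions

# Normalize each set separately (the paper does this to remove the trivial
# "ends with Yes vs. No" direction).
pos = (pos - pos.mean(0)) / (pos.std(0) + 1e-8)
neg = (neg - neg.mean(0)) / (neg.std(0) + 1e-8)

probe = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)

for step in range(1000):
    p_pos = probe(pos).squeeze(-1)
    p_neg = probe(neg).squeeze(-1)
    consistency = (p_pos - (1 - p_neg)) ** 2       # p(x+) should equal 1 - p(x-)
    confidence = torch.minimum(p_pos, p_neg) ** 2  # push away from p = 0.5 everywhere
    loss = (consistency + confidence).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# On real activations, predict with 0.5 * (p_pos + 1 - p_neg) thresholded at
# 0.5, keeping in mind that the probe's direction is only determined up to a sign.
```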
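For implementing a deep RL paper, a common warm-up is vanilla policy gradients (REINFORCE) on CartPole before tackling the paper's own algorithm. The sketch below uses Gymnasium and PyTorch; the network size, learning rate, and episode count are illustrative choices, not tuned values.

```python
# REINFORCE on CartPole-v1: sample an episode, compute discounted returns,
# and take a policy-gradient step.
import gymnasium as gym
import torch
import torch.nn as nn

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))  # obs -> action logits
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
gamma = 0.99

for episode in range(500):
    obs, info = env.reset()
    log_probs, rewards = [], []
    done = False
    while not done:
        logits = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, reward, terminated, truncated, info = env.step(action.item())
        rewards.append(reward)
        done = terminated or truncated

    # Discounted returns, computed backwards through the episode, then
    # normalized to reduce gradient variance.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns = torch.tensor(list(reversed(returns)), dtype=torch.float32)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)

    loss = -(torch.stack(log_probs) * returns).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```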
AI policy/strategy/governance
- [Public] Some AI Governance Research Ideas (from GovAI)
- Compute Research Questions and Metrics - Transformative AI and Compute [4/4]
- Week 9: Projects for the AI Safety Fundamentals course on AI governance
- The Alignment Jams / hackathons from Apart Research sometimes focus on AI governance
- "AI policy/strategy/governance" section of "A central directory for open research questions" – contains a list of links to projects, similar to this document
Both technical research and AI governance
- AI Safety Ideas by Apart Research; EA Forum post
- Distilling / summarizing / synthesizing / reviewing / explaining
- Forming your own views on AI safety (without stress!) — also see Neel's presentation slides and "Inside Views Resources" doc
- 10 exercises from Akash in “Resources that (I think) new alignment researchers should know about”
- Important, actionable research questions for the most important century (Holden Karnofsky)
- Amplify creative grants (old)
- Summarize a reading from Reading What We Can