|Main Question: Which organizations are working on AI safety?|
|Child tag(s): ai safety camp, future of humanity institute, miri|
The organizations which most regularly give grants to individuals working towards AI alignment are the Long-Term Future Fund, the Survival and Flourishing Fund (SFF), the OpenPhil AI Fellowship and early-career funding, the Future of Life Institute, the Future of Humanity Institute, and the Center on Long-Term Risk Fund. If you're able to relocate to the UK, CEEALAR (also known as the EA Hotel) can be a great option, as it offers free food and accommodation for up to two years, as well as contact with others who are thinking about these issues. The FTX Future Fund only accepts direct applications for $100k+, with an emphasis on massively scalable interventions, but their regranters can make smaller grants to individuals. There are also opportunities from smaller grantmakers which you might be able to pick up if you get involved.
Each grant source has their own criteria for funding, but in general they are looking for candidates who have evidence that they're keen and able to do good work towards reducing existential risk (for example, by completing an AI Safety Camp project), though the EA Hotel in particular has less stringent requirements as they're able to support people at very low cost. If you'd like to talk to someone who can offer advice on applying for funding, AI Safety Support offers free calls.
Another option is to get hired by an organization which works on AI alignment; see the follow-up question for advice on that. It's also worth checking the AI Alignment tag on the EA funding sources website for up-to-date suggestions.
Each major organization has a different approach. The research agendas are detailed and complex (see also AI Watch). Getting more brains working on any of them (and more money to fund them) may pay off in a big way, but it’s very hard to be confident which (if any) of them will actually work.
The following is a massive oversimplification; each organization actually pursues many different avenues of research. Read the 2020 AI Alignment Literature Review and Charity Comparison for much more detail. That being said:
- The Machine Intelligence Research Institute focuses on foundational mathematical research to understand reliable reasoning, which they think is necessary to provide anything like an assurance that a seed AI, if built and activated, will do good things.
- The Center for Human-Compatible AI focuses on Cooperative Inverse Reinforcement Learning and Assistance Games, a new paradigm for AI in which systems try to optimize for doing the kinds of things humans want rather than for a pre-specified utility function.
- Paul Christiano's Alignment Research Center focuses on prosaic alignment, particularly on creating tools that empower humans to understand and guide systems much smarter than themselves. His methodology is explained on his blog.
- The Future of Humanity Institute does work on crucial considerations and other x-risks, as well as AI safety research and outreach.
- Anthropic is a new organization exploring natural language, human feedback, scaling laws, reinforcement learning, code generation, and interpretability.
- OpenAI is in a state of flux after major changes to their safety team.
- DeepMind’s safety team is working on various approaches designed to work with modern machine learning, and does some communication via the Alignment Newsletter.
- EleutherAI is a Machine Learning collective aiming to build large open source language models to allow more alignment research to take place.
- Ought is a research lab that develops mechanisms for delegating open-ended thinking to advanced machine learning systems.
The major AI companies are thinking about this. OpenAI was founded specifically with the intention of countering risks from superintelligence, many people at Google, DeepMind, and other organizations are convinced by the arguments, and few genuinely oppose work in the field (though some claim it's premature). For example, the paper Concrete Problems in AI Safety was a collaboration between researchers at Google Brain, Stanford, Berkeley, and OpenAI.
However, the vast majority of the effort these organizations put forward goes towards capabilities research rather than safety.