What are Responsible Scaling Policies (RSPs)?
The concept comes from METR: an RSP specifies what level of AI capabilities an AI developer is prepared to handle safely with its current protective measures, and the conditions under which it would be too dangerous to continue scaling or deploying AI systems until those measures improve.
Anthropic was the first company to publish an RSP, in September 2023, defining four AI Safety Levels (ASLs):
“A very abbreviated summary of the ASL system is as follows:
- ASL-1 refers to systems which pose no meaningful catastrophic risk, for example a 2018 LLM or an AI system that only plays chess.
- ASL-2 refers to systems that show early signs of dangerous capabilities – for example ability to give instructions on how to build bioweapons – but where the information is not yet useful due to insufficient reliability or not providing information that e.g. a search engine couldn’t. Current LLMs, including Claude, appear to be ASL-2.
- ASL-3 refers to systems that substantially increase the risk of catastrophic misuse compared to non-AI baselines (e.g. search engines or textbooks) OR that show low-level autonomous capabilities.
- ASL-4 and higher (ASL-5+) is not yet defined as it is too far from present systems, but will likely involve qualitative escalations in catastrophic misuse potential and autonomy.”
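To make the if-then structure of such a policy concrete, here is a minimal, hypothetical Python sketch of the gating logic an RSP encodes. It is not taken from any published policy; the names (ASL, EvaluationResult, may_continue_scaling) and the simple level-comparison rule are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class ASL(IntEnum):
    """AI Safety Levels as in Anthropic's RSP (higher = more dangerous)."""
    ASL_1 = 1  # no meaningful catastrophic risk (e.g. a chess engine)
    ASL_2 = 2  # early signs of dangerous capabilities, not yet useful
    ASL_3 = 3  # substantially increased catastrophic misuse risk
    ASL_4 = 4  # not yet defined; qualitative escalation in risk


@dataclass
class EvaluationResult:
    """Outcome of a hypothetical capability-evaluation run."""
    capability_level: ASL   # level the model's demonstrated capabilities imply
    safeguards_level: ASL   # level of safeguards currently implemented


def may_continue_scaling(result: EvaluationResult) -> bool:
    """Core if-then commitment of an RSP: scaling and deployment may
    continue only while safeguards keep pace with capabilities."""
    return result.safeguards_level >= result.capability_level


# Example: a model showing ASL-3 capabilities under ASL-2 safeguards
# would trigger a pause until ASL-3 safeguards are in place.
result = EvaluationResult(capability_level=ASL.ASL_3,
                          safeguards_level=ASL.ASL_2)
assert not may_continue_scaling(result)
```

The point of the sketch is the gating rule: evaluations that reveal capabilities above the current safeguards level are meant to trigger a predefined pause, rather than a case-by-case judgment call.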
Other AI companies have since published similar frameworks:
- OpenAI’s 2023 beta version of their Preparedness Framework
- Google DeepMind’s 2024 Frontier Safety Framework
- Microsoft’s 2025 Frontier Governance Framework
- Meta’s 2025 Frontier AI Framework
- Amazon’s 2025 Frontier Model Safety Framework
RSPs have received both positive and negative reactions from the AI safety community. Evan Hubinger of Anthropic, for instance, argues that they are “pauses done right”; others are more skeptical. Objections include that RSPs relieve regulatory pressure, shift the “burden of proof” from the people working on capabilities to those concerned about safety, and serve only as a promissory note rather than an actual policy.
Further reading:
- METR’s key components of an RSP
- SaferAI’s comparison of OpenAI’s Preparedness Framework and Anthropic’s RSP
- The Center for Governance of AI’s proposal of a grading rubric for RSPs
- longerrambling’s analysis of published RSPs