What are Responsible Scaling Policies (RSPs)?

METR[1] defines a responsible scaling policy (RSP) as a specification of “what level of AI capabilities an AI developer is prepared to handle safely with their current protective measures, and conditions under which it would be too dangerous to continue deploying AI systems and/or scaling up AI capabilities until protective measures improve.”

Anthropic was the first company to publish an RSP, in September 2023, defining four AI Safety Levels (ASLs).

“A very abbreviated summary of the ASL system is as follows:

  • ASL-1 refers to systems which pose no meaningful catastrophic risk, for example a 2018 LLM or an AI system that only plays chess.

  • ASL-2 refers to systems that show early signs of dangerous capabilities – for example ability to give instructions on how to build bioweapons – but where the information is not yet useful due to insufficient reliability or not providing information that e.g. a search engine couldn’t. Current LLMs, including Claude, appear to be ASL-2.

  • ASL-3 refers to systems that substantially increase the risk of catastrophic misuse compared to non-AI baselines (e.g. search engines or textbooks) OR that show low-level autonomous capabilities.

  • ASL-4 and higher (ASL-5+) is not yet defined as it is too far from present systems, but will likely involve qualitative escalations in catastrophic misuse potential and autonomy.”
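To make the structure concrete, here is a minimal, purely illustrative sketch of the “if-then” commitment an RSP encodes: scaling or deployment may continue only while protective measures keep pace with measured capabilities. The level names and the `may_continue_scaling` function are hypothetical; this is not Anthropic’s actual evaluation process.

```python
# Toy encoding of the RSP "if-then" structure. Illustrative only:
# the levels and function below are hypothetical, not Anthropic's process.

from enum import IntEnum

class ASL(IntEnum):
    ASL_1 = 1  # no meaningful catastrophic risk
    ASL_2 = 2  # early signs of dangerous capabilities
    ASL_3 = 3  # substantially increased misuse risk or low-level autonomy

def may_continue_scaling(capability_level: ASL, protection_level: ASL) -> bool:
    """An RSP commits a developer to pause scaling/deployment whenever
    measured capabilities outrun the protective measures in place."""
    return protection_level >= capability_level

# Example: a model evaluated at ASL-3 with only ASL-2 protections
# would trigger a pause until protective measures improve.
print(may_continue_scaling(ASL.ASL_3, ASL.ASL_2))  # False
```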

Other AI companies[2] have released their own versions of such documents under various names, such as OpenAI’s Preparedness Framework and Google DeepMind’s Frontier Safety Framework.

RSPs have received mixed reactions from the AI safety community. Evan Hubinger of Anthropic, for instance, argues that they are “pauses done right”; others are more skeptical. Objections include that RSPs relieve regulatory pressure, shift the “burden of proof” from those developing capabilities onto those concerned about safety, and serve only as a promissory note rather than an actual policy.

  1. Formerly known as ARC Evals.

  2. Several more companies have committed to publishing such documents.


