Summary
TL;DR
Responsible Scaling Policies (RSPs) have recently been proposed as a way to continue scaling frontier large language models safely.
While they are a commendable attempt to commit to specific practices, the RSP framework:
1. misses core components of basic risk management procedures (Sections 2 & 3)
2. paints a rosy and misleading picture of the risk landscape (Section 4)
3. is built in a way that allows overselling while underdelivering (Section 4)
Given that, I expect the RSP framework to be negative by default (Sections 3, 4 and 5). Instead, I propose building upon risk management as the core underlying framework for assessing AI risks (Sections 1 and 2), and I suggest changes to the RSP framework that would make it more likely to be positive and allow it to demonstrate what it claims to do (Section 5).
Section by Section Summary:
General Considerations on AI Risk Management
This section provides background on risk management and a motivation for its relevance in AI.
* Proving risks are below acceptable levels is the goal of risk management.
* To do that, acceptable levels of risk (not only of risk sources!) have to be defined.
* An inability to show that risks are below acceptable levels is itself a failure. Hence, the less we understand a system, the harder it is to claim it is safe.
* Low-stakes failures are symptoms that something is wrong; their existence makes high-stakes failures more likely.
Read more.
What Standard Risk Management Looks Like
This section describes the main steps of most risk management systems, explains how they apply to AI, and provides examples from other industries of what they look like in practice.
1. Define Risk Levels: Set acceptable likelihood and severity.
2. Identify Risks: List all potential threats.
3. Assess Risks: Evaluate their likelihood and impact.
4. Treat Risks: Adjust to bring risks within acceptable levels.
5. Monitor: Continuously track risk levels.
6. Report: Update stakeholders on risks they incur and measures t