AI Decipher File · September 2023 (RSP v1.0 publication) through May 2024 (RSP v1.1 update) and ongoing
Anthropic Responsible Scaling Policy September 2023: When a Frontier Lab Published Capability-Tied Deployment Commitments
On 19 September 2023 Anthropic published its Responsible Scaling Policy (RSP), the first public commitment by a major foundation-model lab to gate deployment decisions on capability-evaluation thresholds. The RSP defines AI Safety Levels (ASL-1 through ASL-4+) and commits Anthropic to specific deployment, security, and oversight measures at each level. Subsequent frontier-lab frameworks followed its template: OpenAI's Preparedness Framework (December 2023) and Google DeepMind's Frontier Safety Framework (May 2024). It is the positive-engineering counterpart to the unstructured-deployment incidents elsewhere in this catalog.
Failure pattern
None. This file documents positive engineering: an industry standard set by formalizing capability-tied deployment commitments.
Organizations involved
Anthropic PBC, OpenAI (subsequent Preparedness Framework, December 2023), Google DeepMind (subsequent Frontier Safety Framework, May 2024), UK AI Safety Institute (founded November 2023), US AI Safety Institute (founded February 2024 at NIST)
Incident summary
On 19 September 2023 Anthropic published the first version of its Responsible Scaling Policy (RSP). The policy defined a series of AI Safety Levels (ASL-1 through ASL-4, with provision for ASL-5+ as needed) and committed Anthropic to specific deployment-readiness, security-posture, and oversight measures at each level. ASL-2 (the level Anthropic assigned to its then-current models) carried specific security and deployment commitments; ASL-3 and higher carried progressively more stringent commitments that would apply once Anthropic's evaluation framework determined that a model had reached the corresponding capability level.
The RSP is the first publicly committed capability-tied deployment framework from a major foundation-model lab. The framework's core innovation is binding deployment-readiness language to evaluation results rather than to release-date or competitive-pressure considerations. A model whose evaluations place it at ASL-3 triggers ASL-3 obligations; the obligations are not optional once triggered.
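The trigger described above can be illustrated with a small sketch. It is a hypothetical illustration only, not Anthropic's implementation: the `ASL` enum, the `OBLIGATIONS` table, its placeholder labels, and both function names are assumptions introduced here, and the sketch assumes a model with no positive evaluation result sits at the lowest level.

```python
from enum import IntEnum


class ASL(IntEnum):
    """Hypothetical encoding of the safety levels named in the text."""
    ASL_1 = 1
    ASL_2 = 2
    ASL_3 = 3
    ASL_4 = 4


# Placeholder obligation labels: illustrative only, not the RSP's wording.
OBLIGATIONS = {
    ASL.ASL_1: {"asl1-baseline"},
    ASL.ASL_2: {"asl2-security", "asl2-deployment"},
    ASL.ASL_3: {"asl3-security", "asl3-deployment"},
    ASL.ASL_4: {"asl4-security", "asl4-deployment"},
}


def triggered_level(capability_flags: dict[ASL, bool]) -> ASL:
    """Return the highest level whose capability evaluation came back positive.

    Assumption: with no positive result, the model is treated as ASL-1.
    """
    reached = [level for level, hit in capability_flags.items() if hit]
    return max(reached, default=ASL.ASL_1)


def triggered_obligations(level: ASL) -> set[str]:
    """Look up the obligations for a level; there is no opt-out parameter."""
    return set(OBLIGATIONS[level])


# Example: evaluations indicate ASL-3 capability but not ASL-4.
flags = {ASL.ASL_2: True, ASL.ASL_3: True, ASL.ASL_4: False}
level = triggered_level(flags)
print(level.name, sorted(triggered_obligations(level)))
```

The design point is that the level is computed from evaluation results alone, and the obligations follow from the level by lookup, so nothing in the call path lets a caller waive them.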
Anthropic published RSP v1.1 in May 2024 with refined ASL definitions, additional security commitments, and expanded discussion of the evaluation methodology. The framework continues to be the most-cited foundation-lab safety-commitment artifact.
Failure technique
There is no failure technique to describe; this is positive engineering. The pattern being studied is Anthropic's formalization of capability-tied deployment commitments and the industry follow-on the publication produced.
The technical contribution is the binding of deployment readiness to evaluation results rather than to schedule or competitive pressure. Per the published RSP, the framework defines what evaluations must pass before specific deployment categories (API access, public chat product, agentic deployment, autonomous deployment) are permissible at each capability level.
The framework's most measurable effect is its industry impact. OpenAI published its Preparedness Framework on 18 December 2023, structurally similar to the RSP (tracked risk categories: cybersecurity, CBRN, persuasion, model autonomy; gating commitments at each level). Google DeepMind published its Frontier Safety Framework in May 2024 with a similar gating structure. The three frameworks together formed the baseline that the UK AI Safety Institute (founded November 2023) and US AI Safety Institute (founded February 2024 at NIST) have engaged with.
Impact and consequences
Direct impact: Anthropic's product-launch decisions in 2024 and 2025 reference the RSP framework explicitly. Model launches include statements about which ASL the model has been evaluated at and what obligations apply.
Industry impact: the three major capability-tied safety frameworks (Anthropic RSP, OpenAI Preparedness Framework, DeepMind Frontier Safety Framework) are now the operational floor that frontier-lab safety teams reference. The UK AI Safety Institute's pre-deployment evaluation work, the US AI Safety Institute's Memoranda of Understanding with foundation labs, and the EU AI Act's foundation-model provisions all interact with these published frameworks.
Policy impact: the RSP and its peers are now reference points in policy discussions. The frameworks are not regulation, but they are the operational language that regulators and labs share. The Bletchley Park AI Safety Summit (November 2023) and the Seoul Summit (May 2024) both engaged with the published frameworks as anchors for the diplomatic conversation.
Lessons for builders
Capability-tied deployment commitments are operationally feasible at frontier-lab scale. The Anthropic RSP demonstrates the pattern; the OpenAI and DeepMind frameworks demonstrate that it can be adopted across labs with different organizational cultures. The Applied AI roles that own this work are Research Scientist and Senior Research Scientist (evaluation methodology) and AI Strategy Lead (the public-commitment language).
Bind deployment readiness to evaluation results rather than to schedule. The frameworks all do this; the result is that a model that has not cleared the evaluations and safeguards required for a deployment surface cannot be deployed there, even if the launch schedule calls for it. This is the structural innovation; the specific level definitions matter less than the binding.
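A minimal sketch of such a binding follows, under stated assumptions. The `REQUIRED` table, the surface names, the safeguard labels, and the `deployment_gate` function are hypothetical placeholders rather than anything taken from a published framework. The sketch also fails closed when a combination has no defined requirements, which is a design assumption made here.

```python
from dataclasses import dataclass

# Hypothetical table: safeguards that must be in place before a deployment
# surface is permitted at a given evaluated level. Labels are placeholders.
REQUIRED = {
    ("ASL-2", "api"): {"asl2-security"},
    ("ASL-2", "chat"): {"asl2-security", "asl2-deployment"},
    ("ASL-3", "api"): {"asl3-security", "asl3-deployment"},
    ("ASL-3", "chat"): {"asl3-security", "asl3-deployment"},
}


@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    missing: frozenset[str]


def deployment_gate(level: str, surface: str, safeguards_in_place: set[str]) -> GateDecision:
    """Decide from evaluation level and safeguards only.

    The signature deliberately has no launch-date or priority argument,
    so schedule pressure cannot be passed in as an input.
    """
    required = REQUIRED.get((level, surface))
    if required is None:
        # Assumption: undefined combinations are blocked rather than allowed.
        return GateDecision(False, frozenset({"undefined-requirements"}))
    missing = frozenset(required - safeguards_in_place)
    return GateDecision(allowed=not missing, missing=missing)


# Example: an ASL-3 model with only part of its safeguards in place.
decision = deployment_gate("ASL-3", "api", {"asl3-security"})
print(decision.allowed, sorted(decision.missing))
```

Returning the missing safeguards alongside the decision gives the launch team a concrete list to close out instead of a bare refusal.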
Publish the framework rather than keep it internal. The published frameworks have produced industry-wide alignment, a shared vocabulary for regulatory engagement, and anchors for diplomatic conversation that internal frameworks could not. The cost of publishing is some competitive-intelligence exposure; the benefit is the industry-wide floor that publication sets.
Update the framework as evaluation methodology matures. Anthropic published RSP v1.1 in May 2024; the framework is treated as a maintained artifact, not a one-time release. The Applied AI roles that own ongoing updates are Senior Research Scientist and AI Strategy Lead.
Mitigations
What builders should put in place to adopt this pattern. Each mitigation maps to operational practice that the relevant Applied AI roles own.
- Bind deployment-readiness decisions to capability-evaluation results rather than to release schedule or competitive pressure.
- Publish the safety framework rather than keeping it internal; published frameworks produce industry-wide alignment and regulatory-engagement vocabulary.
- Define specific deployment, security, and oversight obligations that trigger at each capability level; vague commitments do not bind decisions under schedule pressure.
- Treat the framework as a maintained artifact, updating it as evaluation methodology matures.
- Engage with regulators and AI Safety Institutes using the published framework as anchor vocabulary.
- Use the framework as the public language for explaining individual product-launch decisions; Anthropic's product-launch communications now reference RSP levels explicitly.
Related Applied AI roles
The Applied AI roles whose day-to-day work builds and maintains capability-tied deployment frameworks of this kind.
- AI Research Scientist: An AI Research Scientist conducts original research in AI capabilities, safety, and alignment.
- Senior Research Scientist: A Senior Research Scientist leads research direction within an AI lab or organization.
- AI Strategy Lead: An AI Strategy Lead owns organizational AI strategy and prioritization at the company level.
- Foundation Model Researcher: A Foundation Model Researcher specializes in large model architecture, training methodology, and scaling.
Companies central to this incident
Read the DecipherU Applied AI company profiles for the organizations whose decisions, products, or research shaped this incident.
- Anthropic: Claude model family and AI safety research applied to product
- OpenAI: Frontier large language models and consumer + API AI products
- Google DeepMind: Frontier AI research integrated with Google Cloud and consumer surfaces
Frequently asked questions
What is the Anthropic Responsible Scaling Policy?
Per Anthropic's 19 September 2023 publication, the RSP is a public commitment to gate deployment decisions on capability-evaluation thresholds, the first from a major foundation-model lab. It defines AI Safety Levels (ASL-1 through ASL-4+) and commits Anthropic to specific deployment-readiness, security-posture, and oversight measures at each level. Anthropic published RSP v1.1 in May 2024.
What is the RSP's central innovation?
The framework binds deployment readiness to evaluation results rather than to release-date or competitive-pressure considerations. A model whose evaluations place it at ASL-3 triggers ASL-3 obligations; the obligations are not optional once triggered. This is the structural pattern subsequent frontier-lab frameworks (OpenAI Preparedness, DeepMind Frontier Safety) have adopted.
How have other labs responded?
OpenAI published its Preparedness Framework on 18 December 2023, structurally similar to the RSP (tracked risk categories, gating commitments at each level). Google DeepMind published its Frontier Safety Framework in May 2024 with a similar gating structure. The three frameworks together formed the baseline that the UK and US AI Safety Institutes have engaged with.
What does the RSP teach Applied AI engineers and product managers?
Capability-tied deployment commitments are operationally feasible at frontier-lab scale. Bind deployment readiness to evaluation results rather than to schedule. Publish the framework rather than keep it internal. Update the framework as evaluation methodology matures.
Which Applied AI roles work on capability-tied deployment frameworks?
Research Scientist and Senior Research Scientist own the evaluation methodology that determines which capability level a model has reached. AI Strategy Lead owns the public-commitment language and the regulatory-engagement posture. Foundation Model Researcher owns the upstream capability-evaluation work that feeds the framework.
Sources
- Anthropic, "Anthropic's Responsible Scaling Policy" (Anthropic blog, 19 September 2023)
- Anthropic, Responsible Scaling Policy v1.1 (PDF, May 2024)
- OpenAI, "Preparedness Framework (Beta)" (OpenAI, 18 December 2023)
- Google DeepMind, "Frontier Safety Framework" (DeepMind, May 2024)
- NIST AI 600-1, Generative AI Profile (sections aligned with capability-tied evaluation frameworks)
DecipherU is not affiliated with, endorsed by, or sponsored by any company listed in this directory. Information compiled from publicly available sources for educational purposes.