Anthropic, widely regarded as the most safety-conscious major AI company, has formally abandoned the central commitment of its flagship safety policy. The decision marks a pivotal moment not just for Anthropic, but for the entire AI industry’s approach to self-regulation.
A Three-Year Promise, Now Revoked
Since 2023, Anthropic’s Responsible Scaling Policy (RSP) had held firm on one core rule: the company would not train or release a frontier AI model unless it could guarantee that adequate safety measures were already in place. For years, executives cited that pledge as evidence Anthropic would resist the commercial pressure to rush powerful — and potentially dangerous — technology to market.
That guarantee is now gone. In an exclusive interview with Time, Anthropic confirmed the release of RSP Version 3.0, a revised framework that no longer bars the company from training new models before adequate safety guardrails are in place.
Chief Science Officer and co-founder Jared Kaplan explained the company’s reasoning plainly. “We felt that it wouldn’t actually help anyone for us to stop training AI models,” he told Time. “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”
What Changes Under the New Framework
The revised policy replaces categorical restrictions with more flexible, transparency-based commitments. Under RSP Version 3.0, Anthropic will:
- Publish detailed “Frontier Safety Roadmaps” that outline concrete plans for future risk mitigations
- Release comprehensive “Risk Reports” every three to six months
- Commit to matching or exceeding competitors’ safety efforts
- Delay AI development only if Anthropic believes it holds a clear lead in the industry and assesses that catastrophic risks are significant
The original RSP worked like a hard tripwire — specific AI capability levels automatically halted development until safety measures caught up. The new version removes that automatic stop. Critics caution this could produce a slow-burn effect where risk gradually escalates without triggering any clear alarm.
Chris Painter, director of policy at METR, a nonprofit focused on evaluating AI models for dangerous behavior, reviewed an early draft of the revised policy at Anthropic’s invitation. He acknowledged the shift was understandable but called it a sobering sign for the world’s ability to manage AI risks. Anthropic, he said, appears to believe it “needs to shift into triage mode with its safety plans, because methods to assess and mitigate risk are not keeping up with the pace of capabilities.” He added that the change is “more evidence that society is not prepared for the potential catastrophic risks posed by AI.”
Pentagon Tensions and Political Headwinds
The timing of the announcement attracted immediate scrutiny. Anthropic revealed the policy change on the same day CEO Dario Amodei met with Defense Secretary Pete Hegseth. Hegseth reportedly presented Amodei with an ultimatum: ease the company's AI safeguards or risk losing a $200 million Pentagon contract. The Defense Department had earlier moved to restrict use of Anthropic's Claude models and reportedly threatened to label the company a supply chain risk, following Anthropic's refusal to allow its AI to be used for autonomous weapons or domestic surveillance.
It remains unclear, however, whether the policy revision is directly connected to that Pentagon meeting. Anthropic’s own blog post announcing the changes pointed to broader structural factors, explicitly citing an “anti-regulatory political climate” as part of the reasoning. The Trump administration has taken a largely hands-off stance on AI regulation, even attempting to override state-level AI laws, leaving companies without the federal regulatory framework Anthropic had long hoped would emerge.
An Industry Caught in a Race
Anthropic’s reversal does not stand alone. Demis Hassabis, CEO of Google DeepMind, has previously warned that competitive urgency — the race to outpace rivals or nations — can push AI developers into corners they would otherwise avoid. That warning now carries new weight.
Following Anthropic’s Pentagon standoff, OpenAI announced a new deal to supply AI models for classified government networks. OpenAI claims it shares Anthropic’s safety concerns, but critics argue the arrangement opens the door to sweeping military applications, including surveillance of U.S. citizens.
Max Tegmark, founder of the Future of Life Institute, argued that AI companies themselves helped create the very competitive pressures they now cite as reasons to loosen safeguards. He suggested that had companies worked earlier to convert their voluntary commitments into binding legal requirements, the current race dynamic might never have intensified to this degree.
Nik Kairinos, CEO of RAIDS AI, an independent AI monitoring organization, put the core problem bluntly: “Voluntary commitments can be rewritten. Regulation, backed by real-time oversight, cannot.”
Ironically, Anthropic’s earlier refusal to yield to Pentagon demands won the company unexpected public support. After that standoff became public, the company climbed to the top of Apple’s App Store download rankings — a reminder that transparency and principled resistance still resonate with users, even as competitive forces push in the opposite direction.
