When Safety Becomes Negotiable:
What Anthropic’s Reversal Means for Europe
In September 2023, Anthropic published a document that was hailed as a landmark in the AI industry: the Responsible Scaling Policy (RSP). Its core promise was unusually clear: if an AI model exceeded certain capability thresholds, development would be paused — until safety could be demonstrably guaranteed. OpenAI and Google DeepMind adopted similar frameworks shortly after.
On 24 February 2026, Anthropic withdrew that promise. The new RSP 3.0 replaces binding development pauses with “nonbinding but publicly-declared targets”. The company once seen as the industry’s conscience has moved its red line.
This shift coincides with the week the US Department of Defense gave Anthropic an ultimatum: either deliver its model Claude to the Pentagon without safety restrictions, or face the Defense Production Act and designation as a “supply chain risk”.
The question this raises is not whether Anthropic is a responsible company. The question is what this means for everyone who relies on this technology — as an individual, as a business, as a European economy.
What has changed — in plain language
Anthropic’s previous safety model worked like a traffic light system. So-called AI Safety Levels (ASL) classified a model’s capabilities — the higher the level, the stricter the safeguards. The critical point was: before a model could advance to the next level, the corresponding protections had to be proven to work. If they didn’t, development stopped. A hard stop signal.
The new RSP 3.0 replaces this stop signal with what Anthropic itself describes as “nonbinding but publicly-declared targets”. Instead of firm if-then commitments, there is now a Frontier Safety Roadmap — a plan with ambitious but nonbinding milestones.
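The difference between the two regimes can be made concrete with a small sketch. This is an editorial analogy, not Anthropic’s actual evaluation process: the ASL terminology is real, but the data structure, function names, and decision strings below are invented for illustration.

```python
# Illustrative sketch of the policy change -- NOT Anthropic's actual process.
# "ASL" is real terminology; the checks and thresholds here are invented.

from dataclasses import dataclass

@dataclass
class Evaluation:
    asl_level: int           # capability level the model was assessed at
    safeguards_proven: bool  # were the required protections demonstrated?

def old_rsp_gate(ev: Evaluation) -> str:
    """Old RSP logic: a hard stop when safeguards lag capabilities."""
    if not ev.safeguards_proven:
        return "PAUSE development"          # binding if-then commitment
    return "CONTINUE development"

def new_rsp_gate(ev: Evaluation) -> str:
    """RSP 3.0 logic: the same finding now yields a nonbinding target."""
    if not ev.safeguards_proven:
        return "CONTINUE development, publish roadmap milestone"
    return "CONTINUE development"

ev = Evaluation(asl_level=4, safeguards_proven=False)
print(old_rsp_gate(ev))  # PAUSE development
print(new_rsp_gate(ev))  # CONTINUE development, publish roadmap milestone
```

The structural point is in the first branch: under the old policy an unproven safeguard was a blocking condition; under the new one, the identical finding merely changes what gets published.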
Anthropic is remarkably transparent about its reasons. In the official announcement, the company cites three factors:
- A “zone of ambiguity”: The science of model evaluation is not mature enough to definitively determine whether a model has crossed a capability threshold.
- An “anti-regulatory political climate”: The US government is moving towards prioritising competitiveness and economic growth, while safety discussions “have yet to gain meaningful traction at the federal level”.
- Requirements impossible to meet unilaterally: The highest safety levels (ASL-4 and ASL-5) may be “outright impossible to implement without collective action” — and that collective action is being politically blocked.
The core message, in Anthropic’s own words: they preferred to adjust the RSP before reaching safety levels they cannot meet — rather than defining standards that would be “easy to achieve” but would undermine the policy’s purpose.
Why: Government pressure meets market logic
The softening does not happen in a vacuum. It is the result of pressure from two directions simultaneously.
From above: The Pentagon. On 24 February, US Defense Secretary Pete Hegseth gave Anthropic CEO Dario Amodei until Friday to lift Claude’s safety restrictions for military use. The threat: either cooperate, or the government forces cooperation through the Defense Production Act. Axios reports that Hegseth said he would not allow any company to dictate the terms under which the Pentagon makes operational decisions.
Whether the government can actually compel Anthropic to deliver a product without safety guarantees is legally contested. Alan Rozenshtein, a professor at the University of Minnesota and editor at Lawfare, analyses the legal landscape in detail. James Baker, former chief judge on the US Court of Appeals for the Armed Forces, puts it this way: “Sometimes the availability of potential authority is sufficient leverage to achieve a result through consultation without invoking the law.”
From the side: The competition. While Anthropic held its line on safety, other AI companies had already yielded. OpenAI and Google provide their models to the Pentagon for “all lawful purposes”. Elon Musk’s xAI has signed a contract for use in classified environments. Anthropic stands increasingly alone — with the technically strongest model for military applications, but the strictest restrictions.
The result is a classic prisoner’s dilemma: whoever abandons safety first wins the contract; whoever holds out longest loses the market. For a deeper analysis of the Pentagon-Anthropic confrontation, see our previous article.
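The prisoner’s dilemma framing can be spelled out with a stylized payoff matrix. The numbers below are invented for illustration; only their ordering matters (winning the contract alone beats mutual restraint, which beats a race to the bottom, which beats holding out alone).

```python
# A stylized prisoner's dilemma for the safety race. Payoff values are
# invented for illustration; only the ordering of outcomes matters.
# Strategies: "hold" = keep safety restrictions, "yield" = drop them.

payoffs = {
    # (firm_a, firm_b): (payoff_a, payoff_b)
    ("hold", "hold"):   (3, 3),  # both keep standards
    ("hold", "yield"):  (0, 5),  # the firm that yields wins the contract
    ("yield", "hold"):  (5, 0),
    ("yield", "yield"): (1, 1),  # race to the bottom
}

def best_response(opponent: str) -> str:
    """Whatever the rival does, yielding pays more -- the dilemma's core."""
    return max(["hold", "yield"],
               key=lambda me: payoffs[(me, opponent)][0])

assert best_response("hold") == "yield"
assert best_response("yield") == "yield"
# Yet mutual yielding (1, 1) leaves both firms worse off than
# mutual holding (3, 3) -- the defining feature of the dilemma.
```

Individually rational behaviour thus converges on the collectively worst outcome, which is precisely why Anthropic argues the highest safety levels require collective action rather than unilateral restraint.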
Two worlds: Who tightens, who loosens
What makes Anthropic’s reversal particularly striking is the contrast with the rest of the world. While the US weakens its AI safety standards, virtually every other major economic region is moving in the opposite direction.
European Union: The EU AI Act has been in force since August 2024 — the world’s first binding AI law. Since February 2025, eight explicitly prohibited AI practices apply, including social scoring and mass biometric surveillance. Since August 2025, transparency obligations apply to general-purpose AI models. More than 230 companies have signed the EU AI Pact — a voluntary commitment to early compliance.
United Kingdom: The AI Safety Institute (AISI) employs over 100 technical staff, receives pre-deployment access to leading AI models for safety testing, and has an annual budget of £66 million. The Bletchley Declaration of November 2023 — signed by 29 countries, including both the US and China — acknowledged that advanced AI poses “significant risks, including serious, even catastrophic, harm”.
California: Even within the US, counter-movements exist. SB 53, signed by Governor Newsom in September 2025, requires providers of frontier models to publish safety frameworks and report critical incidents within 15 days.
China: The Interim Measures for Generative AI, in effect since August 2023, require safety assessments, algorithm registration, and labelling of AI-generated content.
The US at the federal level: The opposite direction. President Biden’s Executive Order 14110 — 36 pages of mandatory safety reporting, testing, and transparency requirements — was revoked on the new administration’s first day in office. Its replacement: Executive Order 14179 — two pages, zero safety requirements, one declared objective: “sustain and enhance America’s global AI dominance.”
The contrast could hardly be starker. In November 2023, the US signed the Bletchley Declaration acknowledging AI risks as potentially catastrophic. Fourteen months later, it revoked every binding safety measure at the federal level.
What this means for the European economic area
For European businesses and individuals, this development is more than a transatlantic regulatory comparison. What is emerging is a fundamentally new risk profile — and a dilemma with no easy resolution.
The product risk. When an AI provider loosens its safety commitments under government pressure, the impact extends beyond military applications. Anthropic does not develop two separate models — one for the Pentagon and one for the rest of the world. The architecture, the internal priorities, the allocation of resources between safety research and feature development: all of it shifts. Chris Painter of METR, a non-profit that evaluates AI risks, warns of a “frog-boiling effect”: without binary thresholds that serve as warning signals, risks accumulate gradually — until there is no clear moment that triggers an alarm.
The compliance risk. The EU AI Act does not automatically make the use of AI systems in Europe safer — but it makes the users liable. Companies deploying high-risk AI systems must document and assess their risks — even when the provider sits in the US and has just lowered its standards.
The sovereignty risk. The confrontation between the Pentagon and Anthropic reveals a pattern that extends beyond AI: US companies operate under a legal framework in which the government can change the rules at any time — not through transparent legislation, but through executive threats. The CLOUD Act concerns data. The Defense Production Act concerns products. The logic is the same: what a US company promises its customers is subject to whatever the US government permits.
The dilemma: Stricter regulation or faster progress
This is where it becomes uncomfortable for Europe. The weakening of AI safety standards does not only carry risks — it also brings speed.
Fewer safety requirements mean faster iteration, faster product launches, faster technological progress. US companies freed from safety evaluations and documentation obligations can invest resources in development rather than compliance. This is not a theoretical scenario — it is already happening.
For European businesses, this creates a situation in which both available options carry costs:
Staying within the European regulatory framework means: clear rules, legal certainty, consumer protection — but also higher costs, longer development cycles, and the growing risk that the technological gap with US and Chinese providers continues to widen. The regulatory burden falls disproportionately on European start-ups and mid-sized companies, while US competitors serving their home market face no comparable obligations.
Exposing oneself to unregulated dynamics means: access to the most capable models without restrictions — but also dependence on providers whose safety promises can be revised under political pressure. Anyone who relies today on a US AI product without a European alternative is implicitly accepting that the rules can change at any time without notice.
The decisive open question: will Europe maintain its strict regulatory framework — and pay the price of slower innovation? Or will the pressure of a new AI arms race, in which artificial intelligence becomes a military power tool, push the EU to soften its own standards?
A neutral position — strict safety without competitive disadvantage — no longer seems to exist.
What this means for you in concrete terms
For individuals: The AI systems you use every day — from chatbots to translation services to AI-powered search — are built by companies whose safety promises are under political pressure. What was guaranteed yesterday is negotiable today. This does not mean these tools will be unsafe tomorrow. It means the mechanisms that were supposed to guarantee their safety have become weaker — and that no independent authority can intervene.
For businesses: Every European organisation using US AI products should reassess its dependency profile. Not on principle, but as sober risk management. If the provider of your AI system can be compelled to change its safety standards under political pressure, that is an operational risk — comparable to a supply chain running through a single country with unstable rule of law. Our digital risk audit provides a structured entry point for this assessment.
For the debate: Anthropic’s RSP softening is not an isolated case. It is a symptom of a development in which commercial AI safety and state power interests collide. Nik Kairinos, CEO of Raids AI, puts it bluntly: voluntary commitments are revised when it matters. This is not a moral failure — it is a structural reality. And it is the strongest argument that binding regulation — for all its downsides — is the only mechanism that can withstand this pressure.
One thing the Anthropic case shows with painful clarity: in a world where AI becomes a strategic weapon, there is no risk-free position. Neither inside regulation nor outside it.
Sources
- Responsible Scaling Policy: Version 3.0 (Anthropic, Feb. 2026)
- Exclusive: Hegseth gives Anthropic until Friday to back down on AI safeguards (Axios, Feb. 2026)
- What the Defense Production Act Can and Can’t Do to Anthropic (Lawfare, Feb. 2026)
- Regulation (EU) 2024/1689 — Artificial Intelligence Act (EUR-Lex)
- EU AI Pact (European Commission)
- The Bletchley Declaration (GOV.UK, Nov. 2023)
- Introducing the AI Safety Institute (GOV.UK)
- SB 53 — Transparency in Frontier AI Act (California Legislature, Sept. 2025)
- Interim Measures for Generative AI (China Law Translate, 2023)
- Executive Order 14110: Safe, Secure, and Trustworthy AI (Federal Register, Oct. 2023)
- Executive Order 14179: Removing Barriers to American Leadership in AI (Federal Register, Jan. 2025)
- Pentagon vs. Anthropic: A Strategic Analysis (digital-independence.org, Feb. 2026)