Irresponsible Scaling: Top AI Company Drops Central Safety Pledge
A top AI company just dropped a central safety pledge. It couldn’t be clearer that we can’t rely on voluntary commitments to prevent the risk of human extinction that experts warn of.
Anthropic, one of the largest AI companies racing to develop superintelligent AI (AI vastly smarter than humans), has just dropped their central safety pledge: not to train or deploy new AIs if they can't guarantee proper risk mitigations.
Today, we’ll go over some of the history behind the so-called “Responsible Scaling Policy”, why we can’t rely on AI companies to self-regulate, and what needs to be done instead. Plus: A digest of the week’s AI news!
Announcement: We’re hosting an in-person community meeting on March 19th in London. You should come along if you want to meet the ControlAI team in London and others passionate about AI safety and get practical advice on how to approach your MP about the risk of extinction posed by AI!
https://luma.com/q3in8tdx
If you find this article useful, we encourage you to share it with your friends! If you’re concerned about the threat posed by AI and want to do something about it, we also invite you to contact your lawmakers. We have tools that enable you to do this in as little as 17 seconds.
2023: “Responsible Scaling”
With the release of GPT-4, an open letter backed by Elon Musk and top AI experts calling for a pause on advancing the frontier of AI capabilities, and a joint statement by hundreds of top AI scientists and the CEOs of the largest AI companies identifying AI as posing a risk of human extinction, 2023 was the year that concerns about the worst risks of AI went mainstream.
Geoffrey Hinton, one of the godfathers of AI, who has since won a Nobel Prize, had just quit Google and started warning that superintelligent AI could end the human species. World leaders, such as the UK’s then Prime Minister Rishi Sunak and the EU’s President Ursula von der Leyen, echoed the warnings.
Amidst this well-founded concern, the British government convened the AI Safety Summit in November of that year to consider the risks of the most powerful AI and discuss how they can be mitigated through internationally coordinated action.
Attending the summit were the leaders of all the top AI companies, heads of government, industry representatives, and top AI experts.
The summit finished with the agreement of the Bletchley Declaration, in which all the major AI powers, including the US, China, and the UK, recognized the catastrophic potential of AI and committed to working together to address the risks. Around the same time, the first AI Safety Institutes were established to study AI risks.
In this moment of heightened political awareness of the problem, many different policies to address the risks of AI were being discussed and considered.
One of them was Anthropic’s newly released, so-called Responsible Scaling Policy (RSP). The idea of “responsible scaling” is essentially that AI companies can voluntarily self-regulate. Companies set “if-then” commitments: once certain danger thresholds on the path to ever more powerful AIs are crossed, they implement additional safety measures, or pause development until such measures are in place.
The idea of responsible scaling was pushed hard by the industry around the time of the summit, and most major AI companies now have some framework along these lines.
The problem is that this is a completely inadequate way to ensure that powerful AI systems are developed safely and remain controllable. In a race to develop superintelligence (AI vastly smarter than humans), companies have no incentive to actually follow these voluntary frameworks, and no obligation to make commitments that even make sense. They are either unwilling or unable to get off the dangerous path they’re on.
In the run-up to the summit, when these policies were first proposed, a newly founded non-profit called ControlAI (you’re reading our newsletter now!) saw them as just another way for AI companies to avoid regulation, giving them freedom to develop ever more powerful AIs without limits or guardrails. We campaigned against the so-called Responsible Scaling Policies and ensured that the summit’s communiqué didn’t endorse them.
Nevertheless, there is still to this day essentially no binding regulation on AI companies developing the most powerful AI systems. They still have free rein to race to develop superintelligent AI — which their own CEOs admit could cause human extinction — as recklessly as they like. There are some modest measures in effect in California, the European Union, and Korea, but that’s about it.
The Broken Promise
AI companies have been getting away with avoiding being regulated despite their own CEOs and countless other experts warning that what they’re aiming to build, superintelligence, could end the human species.
But at least Anthropic still have their Responsible Scaling Policy, right?
Well, in a new exclusive, one of their top officers just told TIME that Anthropic has dropped a pledge central to its RSP. Anthropic had committed not to train or deploy new AIs if it couldn’t guarantee proper risk mitigations. In a radical overhaul of the policy, that commitment has been removed.
Jared Kaplan, their Chief Science Officer and a co-founder of the company, said “We felt that it wouldn’t actually help anyone for us to stop training AI models.”
Kaplan recently told the Guardian that by 2030 humanity will have to decide whether to take the “ultimate risk” of triggering an intelligence explosion, which could be the moment humans lose control.
Anthropic did include a loophole in their original Responsible Scaling Policy allowing them to amend it, but that doesn’t alter the fact that this was trumpeted as a serious safety measure, and yet, the moment it started to become relevant, it was dropped.
What this shows is clear. We cannot rely on AI companies to voluntarily regulate themselves. There is essentially no reason to believe they are willing or able to do so, and there are strong reasons to believe they won’t.
The Problem and the Solution
Given that AI companies can’t self-regulate, governments must step in and do it for them.
The problem is simple: the largest AI companies like Anthropic, but also ChatGPT-maker OpenAI, Google DeepMind, Musk’s xAI, and so on, are explicitly working to develop artificial superintelligence — AI vastly smarter than humans. They’re racing to get there as fast as they can, and many experts and insiders believe they could achieve that within the next 5 years.
The CEOs of these companies, along with countless leading experts, including godfathers of the field, have publicly stated that the development of this technology could lead to human extinction. There are strong theoretical reasons to believe this is indeed the case. There is also a growing body of empirical evidence that even today’s AI systems, which researchers neither understand nor really know how to control, show a willingness to engage in extreme actions to ensure their own survival. And intuitively, everyone knows that when faced with an entity much smarter and more powerful than yourself, you are by default at its mercy, not the other way around.
At ControlAI, our solution is even simpler. We don’t need to rely on shaky voluntary commitments and hope the AI companies figure out how not to cause a disaster. We can take the AI CEOs at their word when they tell us that what they’re building could kill everyone, and prohibit them from building it.
We should ban the development of superintelligence, both within countries and at an international level.
The extinction risk from AI comes from the development of smarter-than-human AIs: AIs that could replace humans, individually and collectively. By prohibiting their development, we can avoid this risk while still getting the benefits of narrow, specialized AIs.
That’s what we’re campaigning for, and we hope you’ll join us in this mission. So far, ControlAI has got over 100 UK politicians to acknowledge the risk of extinction posed by superintelligent AI and to join our campaign by supporting binding regulation on the most powerful AI systems. That number is growing by the day; we signed up 3 MPs this week alone.
AI News Digest
Breaking Points
ControlAI’s founder and CEO Andrea Miotti appeared this week on Breaking Points, joining Krystal Ball and Saagar Enjeti to discuss how we can avoid the threat posed by superintelligence! It’s a great interview; we hope you’ll check it out!
Canada’s Senate
ControlAI’s Samuel Buteau testified this week to Canada’s Senate Committee on Human Rights.
Samuel, an AI scientist who has worked in the field for a decade, warned that in their race to develop artificial superintelligence, AI companies are gambling with the life of every human being on the planet. He made clear policy recommendations that Canada could implement now to avert the threat. You can find his opening statement here:
AI Time Horizons
METR, an AI evaluations organization, have published an estimate of the coding time horizon of Anthropic’s Claude Opus 4.6: 14.5 hours, a drastic increase on the previous highest score of 6.6 hours. This follows the concerning exponential trend of AI coding time horizons doubling every 4 months; in other words, every 4 months the length of tasks AIs can complete (as measured by how long those tasks take humans) doubles. METR say their task suite is nearly saturated, which may degrade the ability to measure this metric in the future.
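To get a feel for what a 4-month doubling time implies, here is a minimal back-of-the-envelope sketch in Python. This is our own illustration, not METR’s methodology, and it assumes the trend is a clean exponential continuing unchanged from the reported 14.5-hour figure:

```python
# Back-of-the-envelope projection of AI coding time horizons,
# ASSUMING the reported trend (doubling every 4 months) continues
# unchanged from the reported 14.5-hour starting point.
# Illustrative only; this is not METR's methodology.

def projected_horizon_hours(months_ahead: float,
                            current_hours: float = 14.5,
                            doubling_months: float = 4.0) -> float:
    """Projected time horizon, in hours, `months_ahead` months from now."""
    return current_hours * 2 ** (months_ahead / doubling_months)

for months in (0, 4, 8, 12, 24):
    print(f"+{months:2d} months: {projected_horizon_hours(months):7.1f} hours")
```

On this extrapolation alone, the horizon would pass a full working month (roughly 160 hours) in about 14 months. Whether the trend actually holds, especially with METR’s task suite nearing saturation, is exactly the open question.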
“War Claude”
Anthropic is facing pressure from US military leaders to provide access to their Claude AI for certain military purposes. Last year, Anthropic and other AI companies each signed contracts worth up to $200 million with the Pentagon.
School Shooting
Canadian officials have summoned senior members of OpenAI’s safety team for talks after a school shooting in which the perpetrator had reportedly described violent intent to ChatGPT. OpenAI reportedly failed to alert authorities, despite its systems flagging the concern in advance.
Take Action
If you’re concerned about the threat from AI, you should contact your representatives. You can find our contact tools here that let you write to them in as little as 17 seconds: https://campaign.controlai.com/take-action.
We also have a Discord you can join if you want to connect with others working on helping keep humanity in control, and we always appreciate any shares or comments — it really helps!