The Ultimate Risk: Recursive Self-Improvement
What happens when AI R&D starts snowballing at machine speed?
Recursive Self-Improvement hits the headlines, as OpenAI writes in a recent blog post that they’re working to develop AI capable of recursive self-improvement, while Anthropic’s chief scientist — at a company racing OpenAI to achieve that — says that doing so is “the ultimate risk”. Plus: Updates from ControlAI!
If you find this article useful, we encourage you to share it with your friends! If you’re concerned about the threat posed by AI and want to do something about it, we also invite you to contact your lawmakers. We have tools that enable you to do this in as little as 17 seconds.
And if you have 5 minutes per week to spend on helping make a difference, we encourage you to sign up to our Microcommit project! Once per week we’ll send you a small number of easy tasks you can do to help. You don’t even have to do the tasks, just acknowledging them makes you part of the team.
We’ve written about AIs improving AIs before, but we thought it would be worth going over in detail what this capability is, why it’s so dangerous, and why AI companies are pursuing it anyway.
Recursive Self-Improvement
“At OpenAI, we research how we can safely[1] develop and deploy increasingly capable AI, and in particular AI capable of recursive self-improvement (RSI).”
Those are the words of OpenAI’s blog post published earlier this week.
Recursive self-improvement would be the capability for AIs to improve AIs, and it’s an essential part of the plan top AI companies are pursuing to get ahead in a dangerous race to develop artificial superintelligence — AI vastly smarter than humans.
What does it mean, and why do they want it?
AI progress is driven by two key inputs: brute resources (mainly computing power, but also data) and algorithmic progress. AI companies are already scaling up resources as quickly as they can, with hundreds of billions of dollars’ worth of AI compute in the form of data centers being announced just this year.
Algorithmic progress is driven by researchers: employees at AI companies who work on finding ways to develop AIs more efficiently. Competition for this talent has been heating up dramatically too, with reports that Meta has been offering researchers compensation packages in the hundreds of millions of dollars to join its superintelligence project, with one offer reportedly reaching a billion dollars (it was rejected!).
AI companies find these efficiency gains, known as algorithmic improvements, at a steady enough rate that the amount of computation needed to train an AI of a given capability falls by roughly 3x per year. That means if it takes 100 units of computation to train an AI one year, it takes only about 33 the next year, and roughly 11 the year after that.
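To make the compounding effect concrete, here is a minimal sketch of that arithmetic (the 3x rate and the 100-unit starting point are just the illustrative figures from the paragraph above, not measured values):

```python
# Illustrative only: compute needed to train an AI of fixed capability,
# assuming algorithmic progress cuts the requirement ~3x per year.
compute_needed = 100.0  # arbitrary starting units

for year in range(4):
    print(f"Year {year}: ~{compute_needed:.0f} units of compute")
    compute_needed /= 3  # a steady 3x annual efficiency improvement

# Prints roughly: 100, 33, 11, 4
```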
If a company could develop an AI to significantly improve other AIs (or itself), this could mean a rapid acceleration of algorithmic progress.
This is exactly what top AI companies like OpenAI and Anthropic are aiming for. Just a month ago, OpenAI announced that they are aiming to build a “true automated AI researcher by March of 2028”, and to have an “AI research intern” by September 2026. A few months ago, an Anthropic employee wrote that “We want Claude n to build Claude n+1, so we can go home and knit sweaters.”
This acceleration could be understood as an “intelligence explosion”, a self-reinforcing cycle in which AI systems rapidly improve their own capabilities until their intelligence far exceeds that of humans. This is a concept that has been around in the field since as early as I.J. Good in 1965:
Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultra-intelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind.
Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.
The reason these companies are working on recursively self-improving AI is clear: it's the shortest plausible path to developing superintelligence, which they are racing to build. As soon as they can develop an AI that is good enough at finding algorithmic improvements, they can throw vast computational resources at it and set off an avalanche that could quickly snowball into artificial superintelligence. The reason this is so salient now, and wasn't in 1965, is that AIs are actually getting very good at the sorts of tasks needed to do this, like writing code, and they're improving at them exponentially. In the AI 2027 scenario forecast, this process begins in 2027 and plays out over the following months.
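As a purely hypothetical toy model (our own illustration, not anyone's forecast, with made-up numbers), the feedback loop can be sketched like this: if each capability gain also speeds up the research that produces the next gain, the same step rule that gives linear progress under fixed human effort gives compounding growth instead:

```python
# Hypothetical toy model (illustrative numbers only, not a forecast).
# Human-only research: the pace of finding improvements stays fixed.
# Automated research: every capability gain also speeds up the research
# that produces the next gain, so the same step rule compounds.

STEP = 0.3     # improvement found per period, per unit of research speed
PERIODS = 12

human_capability = 1.0
for _ in range(PERIODS):
    human_capability += STEP * 1.0             # research speed never changes

auto_capability = 1.0
for _ in range(PERIODS):
    auto_capability += STEP * auto_capability  # gains feed back into research speed

print(f"Human-only research after {PERIODS} periods:  ~{human_capability:.1f}x starting capability")
print(f"Self-improving loop after {PERIODS} periods: ~{auto_capability:.1f}x starting capability")
# Linear growth versus compounding growth: the gap widens every period.
```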
Why is it so dangerous?
An intelligence explosion could easily go very badly wrong. Despite racing to build superintelligence, AI companies have no credible plan for how they’re going to ensure smarter-than-human AIs are safe or controllable.
It might surprise you to learn that, although it is well understood how to make modern AIs more capable, nobody really understands how they work, let alone how to ensure that they don't eventually turn against us.
This comes back to the black box problem of AI: modern AIs are grown, rather than built. They're not coded in the traditional sense. Instead, a small computer program is used to process terabytes of data across vast arrays of chips, learning the billions of numbers that make up the AI's “weights”. Nobody has a good way to understand what these numbers mean, but they essentially form the “brain” of the AI. When you run the AI, it works and does things, but we don't really understand how it does them. Worse, because of this, we have no way to really verify, let alone set, goals for the AI.
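To see what “grown, not built” means in practice, here is a minimal toy sketch (a made-up two-layer network with a handful of weights; real systems differ enormously in scale and architecture, but the basic point carries over): even with full access to every number, the weights don't spell out any goals or rules you could read off.

```python
import random

# Toy "model": a tiny neural network whose behaviour lives entirely in
# its numeric weights. Frontier models are the same in kind, just with
# billions to trillions of such numbers learned from terabytes of data.
random.seed(0)
W1 = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]  # 4 hidden units, 3 inputs each
W2 = [random.gauss(0, 1) for _ in range(4)]                      # 1 output unit, 4 hidden inputs

def forward(x):
    """Run the model: it produces an answer, but the weights don't say why."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sum(w * h for w, h in zip(W2, hidden))

print(forward([1.0, 0.5, -0.2]))  # the model "does something"
print(W1[0])  # just opaque floats: no goals, no rules, nothing readable
```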
You can read more about this problem, known as the alignment problem, here:
So the basic fact that these companies have no way to control what would result from the process is one problem. Without the ability to control or otherwise ensure that vastly smarter-than-human AI is safe, this proposition looks bleak for humans.
The other, related, problem is that once you initiate an intelligence explosion, it's no longer you doing the AI research. It's the AIs doing AI research at AI speeds. That means maintaining oversight of such a process, even if we knew how, would be incredibly difficult, if it is possible at all. The process could easily get out of control.
To spell it out: if superintelligence is built, humanity could be wiped out. That's not a metaphor; we really do mean extinction. This is the view of top AI experts, including godfathers of the field, based on strong arguments and, increasingly, on evidence from the AIs themselves, such as AIs now showing self-preservation tendencies. The risk was recently cited by a huge coalition of such experts and other leaders as a reason to ban the technology, in the statement on superintelligence, an initiative we at ControlAI are proud to have been early supporters of and to have helped with.
The Reaction
OpenAI’s admission that, despite their awareness of this danger, they are pursuing recursively self-improving AI was met with critical responses online, including from two former employees of the company.
Steven Adler, who previously led OpenAI's dangerous capabilities evaluations (see our podcast episode with him here), wrote on Twitter:
I am glad that OpenAI is being this clear about its intentions.
I am very not glad that this is the world we find ourselves in:
Recursive self-improvement - AI that makes itself progressively smarter - makes the safety challenges a heck of a lot harder.
Meanwhile, Miles Brundage, the former head of policy at OpenAI, made the observation that AI companies have yet to explain what recursively self-improving AI means, why they think it’s good, or why the greater risks are justified.
The Ultimate Risk
In a timely interview with The Guardian, Anthropic co-founder and chief scientist Jared Kaplan stresses the risks of recursive self-improvement and an intelligence explosion, calling it the “ultimate risk”.
“it’s kind of like letting AI kind of go”
Expressing that he’s concerned about what happens when AIs exceed humans in terms of intelligence, Kaplan says:
If you imagine you create this process where you have an AI that is smarter than you, or about as smart as you, it’s [then] making an AI that’s much smarter. It’s going to enlist that AI help to make an AI smarter than that. It sounds like a kind of scary process. You don’t know where you end up.
He also raised the possibility that we could lose control over it, and urged governments and society to engage with this “biggest decision”, adding that the moment could come as soon as 2027 to 2030.
Despite this, his company Anthropic is gunning hard to develop this capability. Anthropic places a particular focus on improving their AIs’ ability to write code, which if achieved to a sufficient level could be the critical capability that unlocks an intelligence explosion.
When their CEO Dario Amodei says that “a country of geniuses” in a datacenter could be developed in the coming years, this is likely predicated on the assumption that they'll be able to pull off recursive self-improvement.
Civic Engagement
Enabling society to understand and respond to this problem is something that we believe is crucial, but we won’t just say nice words about it. We’re helping make it happen, and you can too!
We’ve built contact tools that enable anyone to write to their elected representatives about the danger posed by AI and ask for it to be addressed. Using our tools, this can be done in just seconds, and thousands of you readers have already used them! This has helped our UK campaign get MPs to support binding regulations on the most powerful AI systems.
It really helps if you do this, so we strongly encourage you to check out our tools!
https://campaign.controlai.com/take-action
The way we can prevent the risk of extinction posed by superintelligence, which is compounded by the drive to develop recursively self-improving AIs, is to make sure everyone understands the problem and to get clear policies implemented. In particular, we need to prohibit the development of artificial superintelligence.
ControlAI: Update
We have a couple of nice updates for you from ControlAI this week!
In a new interview that came out yesterday, Max Winga discussed superintelligence risks, ControlAI, and what we can do to prevent the danger.
And in a recent committee hearing in Canada’s House of Commons, ControlAI advisors Connor Leahy (CEO of Conjecture) and Gabriel Alfour (CTO, Conjecture) testified on the extinction risk posed by artificial superintelligence, and how we can tackle it. You can find Connor’s opening statement here, and Gabe’s here.
We hope you’ll enjoy these!
Take Action
If you’re concerned about the threat from AI, you should contact your representatives. You can find our contact tools here that let you write to them in as little as 17 seconds: https://campaign.controlai.com/take-action.
We also have a Discord you can join if you want to connect with others working on helping keep humanity in control, and we always appreciate any shares or comments — it really helps!