Welcome to the ControlAI newsletter! This week we decided we’d provide an update on some of the most important recent AI safety news and events we’ve been tracking.
Join our Discord to continue the conversation.
1. Foom and Fast Takeoff
In the wake of OpenAI's o1 model and the new test-time compute paradigm, some have suggested that we may be close to a ‘foom’, or intelligence explosion, where AI systems become drastically more capable within a short period of time.
Also known as a “fast takeoff”, this would be an exceptionally dangerous event if it occurred, as it would result in AI systems far more intelligent and far more powerful than ourselves, and there is still no known method to ensure that such systems are safe or controllable. It seems most likely to occur if and when AI systems become capable enough to significantly improve themselves, a process known as recursive self-improvement.
Jeff Clune, a Senior Research Adviser to Google DeepMind, one of the top AI companies, wrote on Twitter: “There’s a good (and increasing) chance that FOOM is around the corner.”
Meanwhile, OpenAI’s CEO Sam Altman commented on Reddit that he now considers a fast takeoff more plausible than he previously thought:
So it seems that top people at AI companies think there is a possibility that a dangerous intelligence explosion could be initiated, likely via recursive self-improvement. But nobody would actually be so foolish as to tell an AI to improve itself, right?
Here’s Anthropic’s CEO Dario Amodei, in a recent essay, openly calling for the use of recursive self-improvement in order for the US and allies to “take a commanding and long-lasting lead on the global stage”:
2. Escalating Capabilities
AI systems are becoming increasingly powerful, and this trend hasn’t relented in recent weeks.
On February 2nd, OpenAI launched their Deep Research agent, currently available only to those paying for the $200/month subscription tier: “a new agentic capability that conducts multi-step research on the internet for complex tasks. It accomplishes in tens of minutes what would take a human many hours.”
On Humanity’s Last Exam, it set a new state of the art, significantly outperforming other AIs, including OpenAI’s o3-mini models launched just a few days earlier.
This rate of advance is alarming, as research on solving the technical problem of ensuring that superhuman AI systems are safe is in a dismal state.
But more is coming. Last week, Sam Altman announced a roadmap for GPT-4.5 and GPT-5 on Twitter:
He characterizes GPT-4.5 as a “feel the AGI” moment.
Within a day of Deep Research’s launch, its agentic framework was replicated by Hugging Face and open sourced, indicating that the jump in capabilities from o3 to Deep Research didn’t require any particularly difficult innovations.
There have been some concerning findings about what happens as AI systems become more capable. Researchers published a paper which they say shows that as AI systems become smarter, they develop their own value systems. Concerningly, the researchers also find that smarter systems become less corrigible:
As for what the AIs value, the findings include things like the models valuing their own wellbeing over that of middle-class Americans.
3. Other Developments
The AI Action Summit took place in Paris, though action was somewhat lacking: no commitments to address AI risks were agreed, and the US and UK chose not to sign the joint statement. It’s important that future AI summits build on the progress made at the Bletchley and Seoul summits.
Google’s parent company Alphabet reversed their pledge not to use AI for military or surveillance purposes.
OpenAI announced a $500 billion AI infrastructure project, Stargate. We covered how this is inconsistent with previous comments by Sam Altman on the need to avoid a compute overhang in our article last week.
And it would be remiss of us not to mention our own campaign launch. We’ve assembled a cross-party coalition of UK politicians in support of binding regulations on powerful AI, acknowledging the threat of extinction posed by AI. The number of politicians supporting the campaign continues to grow, with 21 now backing us.
The launch was covered in a great piece in TIME, along with polling commissioned with YouGov showing that there is overwhelming support among the British public for requirements that AI developers prove their systems are safe before release.
Thank you for reading our newsletter. You can join our Discord to continue the conversation.
Is there anything that can put the brakes on? It seems these people are pushing the throttles up when all common sense is telling people to start slowing down.
Is there any way to package this as a trust gambit: that we need to work together to hash out a framework that includes national, international, and local governance?
I also believe that investment in AI safety lags behind investment in AI model development, and we need to work together and recruit others to constantly lobby the government and AI labs to ensure safe alignment now, in a way that can be scaled. But I think we lose support in this by sensationalising the issue. There is currently no model that can exercise recursive self-improvement that might, even in theory, become sufficiently agentic to threaten existential harm. As far as I know, there is no known LLM scaffold that could create fluid, multi-modal, unified cognition such as to problem-solve to that extent, even if it possessed such a goal. Sure, we need to act and seek influence now, and that is what I am working towards in the Church of England, with passionate urgency. But I believe we shoot ourselves in the foot by talking up the time frame. 5 years maybe, but not next week. In my opinion!