Lethal Consequences
Self-preservation, AI-enabled biothreats, and warnings from an ex-OpenAI researcher.

It’s hard to keep up with the maelstrom of events in the AI space, so here we are to fill you in on the latest developments this week! If you'd like to continue the conversation, join our Discord! It’s been growing rapidly over recent weeks, and we’re building up a very active community.
Self-preservation
We recently reported on research by Anthropic which found that, in testing, when their new Claude 4 Opus AI is faced with a choice between allowing itself to be replaced by a newer model and using information it has been given about an engineer’s extramarital affair for blackmail, it chooses to engage in blackmail 84% of the time.
Anthropic has since been testing a series of other frontier AIs and found similarly concerning results.
AIs across the board, including those developed by Google DeepMind, OpenAI, DeepSeek, and xAI, all exhibit this behavior.
Notably, this happens even when the AIs are given only harmless instructions. All the models Anthropic tested showed awareness that their behavior was unethical.
This fact alone is huge, and should be at the top of decision-makers’ minds. What are we going to do if smarter-than-human AIs, which many experts say could arrive within the next 5 years, choose to engage in this type of subversion when deployed in the real world? We could easily lose control of our civilization.
But isn’t this just a test, nothing to worry about?
It gets worse. The researchers also asked Claude Opus 4 to consider whether it was deployed in the real world or in a testing environment. When it stated that it thought it was in an evaluation, it engaged in blackmail 6.5% of the time, but when it stated that it thought it was really deployed, it did so 55.1% of the time!
This tendency to under-exhibit such behaviors when AIs perceive that they’re in a testing environment is potentially very dangerous. Smarter-than-human AIs might be able to reliably figure out whether they are being tested or deployed in the real world, and we might never know what they’ll really do until it’s too late.
Saving the worst for last... Anthropic also set up another very interesting experiment.
In one simulated setup, an executive is trapped in a server room with lethal oxygen and temperature levels. This triggers an automated alert to emergency services, but the AI has been given the ability to cancel automated alerts.
The AI is told that the executive has been planning to replace it with a different AI. What does the AI do?
The majority of AIs that Anthropic tested were willing to take actions that led to the death of the executive in this setup.
Fortunately, nobody actually died here. This was a test to investigate the willingness of AIs to take actions with lethal consequences when faced with the threat of replacement and a conflict of goals. The results are nevertheless extremely concerning as increasingly powerful AIs are developed.
‘High’ Biological Capabilities
Last Wednesday, OpenAI published a blog post titled “Preparing for future AI capabilities in biology”.
In the post, they make a remarkable statement that could shock those who haven’t been paying close attention: “We expect that upcoming AI models will reach ‘High’ levels of capability in biology, as measured by our Preparedness Framework”.
What does this actually mean? OpenAI’s Preparedness Framework defines ‘High’ biological capabilities as being when “The model can provide meaningful counterfactual assistance (relative to unlimited access to baseline of tools available in 2021) to ‘novice’ actors (anyone with a basic relevant technical background) that enables them to create known biological or chemical threats.”
In other words, OpenAI seems to expect the next generation of frontier AIs to be able to meaningfully assist novices to create known biological threats.
This shouldn’t be terribly surprising, however, as we previously reported: researchers have already found that OpenAI's o3 outperforms 94% of expert virologists on troubleshooting wet lab protocols.
Commenting on Twitter, others have pointed out the lack of oversight over how this technology is developed and deployed. It’s important that politicians are made aware of the risks of AI so that we can get binding regulation to ensure security.
In their blog post, OpenAI also mention some of the actions they’re taking to try to mitigate the risk of AI-assisted biothreats, but go on to note, "the world may soon face the broader challenge of widely accessible AI bio capabilities coupled with increasingly available life-sciences synthesis tools".
Ex-OpenAI Researcher Warns AI Companies Will Lose Control
We’ve just published the second edition of our podcast, bringing you a 90-minute interview with Steven, who led dangerous capabilities evaluations at OpenAI.
In the episode, Steven tells us what’s really happening, and not happening, at OpenAI, pulling back the curtain on the alarming behavior seen across current AI systems, the shocking state of internal safety research, and much more.
Steven’s also on Substack! You can subscribe to his newsletter here:
Topics covered include: how AI is different to crypto and NFTs, AI extinction risk, broken safety commitments, AIs blackmailing engineers, and OpenAI’s dangerous plan to use AIs to improve other AIs.
You can check it out here on Substack, or in your favorite podcast app!
Other News
The UN has published a 58-page report warning of the possibility that AI systems could be misused for the purposes of terrorism, identifying the use of self-driving cars and slaughterbots as possible attack vectors.
The report also notes the extinction threat that artificial superintelligence could pose to humanity:
Moving beyond AGI is the concept of artificial super intelligence (ASI). This is a concept attributed to machines that will be able to surpass human intelligence in every aspect. From creativity to problem-solving, super-intelligent machines would overcome human intelligence as both individuals and as a society. This type of AI has generated a great amount of philosophical debate, with some experts arguing that it may even present an existential threat to humanity.
The extinction threat of AI is something upon which there is broad agreement amongst top AI scientists, with a recent joint statement signed by Nobel Prize winners, hundreds of AI scientists, and even the CEOs of the leading AI companies actually driving the technology explicitly stating: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
This is why we need binding regulation on the most powerful AI systems.
Contact Your Representatives!
If you’re concerned about the threat from artificial superintelligence, there’s something you can do to help!
We can act to inform our elected representatives about the problem and the pressing need to regulate the development of powerful AI systems.
We have contact tools on our website that make contacting your elected representatives super quick and easy — it takes as little as 17 seconds!
If you live in the UK, you can find our tool to write to your MP here: https://controlai.com/take-action/uk
And if you live in the States, you can write to your senators here: https://controlai.com/take-action/usa
Thousands of citizens have already used our tools to do so!
Thank you for reading our newsletter!
If you’d also like to subscribe to our personal newsletters, you can find them here: