Should AI Have a Kill Switch?
Governments scramble to respond to the announcement of Anthropic's dangerous new Mythos AI, while discussion around "kill switches" comes to the fore.
Anthropic’s announcement that they’ve developed an AI so dangerous they won’t release it has sparked a wave of concern across politics and industry, with President Trump saying in an interview on Tuesday that there should be a “kill switch” for AI.
This week, we’re bringing you the most significant developments on this story, plus a small digest of other news in AI.
Announcement: We’re hiring!
ControlAI is hiring for a new policy advisory role in Washington DC. If you’re interested, you should apply! Likewise if you know someone you think would be suitable, please let them know.
You can find it on our careers page: https://controlai.com/careers
If you find this article useful, we encourage you to share it with your friends! If you’re concerned about the threat posed by AI and want to do something about it, we also invite you to contact your lawmakers. We have tools that enable you to do this in as little as a minute.
The Latest on Claude Mythos
As we wrote about last week, top AI company Anthropic has developed a model that they report has what we can only describe as breathtaking cyberhacking abilities. In a major jump in AI capabilities, Claude Mythos has already discovered thousands of high-severity vulnerabilities, including in all major browsers and operating systems.
These vulnerabilities could be exploited by hackers to gain unauthorized access to computer systems at scale, risking unimaginable damage across a society so much of which depends on computers.
Initially, some voices expressed skepticism about the magnitude of the jump and the associated risks, but the UK’s AI Security Institute has now tested Claude Mythos and its findings are consistent with what we’ve been learning about this AI’s advanced hacking capabilities.
Mythos is the first AI to complete AISI’s simulated 32-step corporate network attack from start to finish, and AISI says Mythos represents “a step up over previous frontier models”, noting that cyber performance was already improving rapidly.
The reaction to these developments has been remarkable, with US Federal Reserve Chair Jerome Powell and Treasury Secretary Scott Bessent meeting with the heads of major American banks to discuss the threat. CNBC reports that the meeting signals that advanced AI capabilities are now a top concern in government, seen as a potential threat to the foundations of the financial system. Goldman Sachs’ CEO David Solomon has said he is “hyper-aware” of Mythos’s capabilities.
Concern has been growing across the Atlantic too, with the UK’s tech secretary, Liz Kendall, writing an open letter to business leaders warning that “The trajectory is clear” and businesses need to be prepared for AI capabilities to rapidly increase over the next year, telling them to “take cybersecurity seriously”.
The UK’s Labour government promised in its manifesto that it would introduce “binding regulation on the handful of companies developing the most powerful AI models”, and over 100 UK politicians have backed ControlAI’s campaign to regulate the most powerful AIs, recognising the risk of human extinction posed by superintelligent AI. Yet nearly two years into its term, the government has still not taken action on this.
Last Friday, the Bank of Canada, along with major Canadian banks and financial firms, met to discuss the risks posed by Mythos. Meanwhile, in the European Union, the European Central Bank is set to quiz bankers about the issue.
And now, this week, President Trump was asked in an interview whether he thought there should be government AI safeguards and a kill switch for some AI agents. He responded that he thought there should be.
Kill Switches and Superintelligence
So, should we have kill switches for AI? What does that even mean?
A kill switch is the ability to quickly shut powerful AI systems down in an emergency. This could be done by giving the government kill switch powers over datacenters and AI development, to be used in such an event.
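Conceptually, a kill switch works like a software watchdog: a supervisor outside the AI system’s control that can cut off its compute. The toy sketch below is purely illustrative; the flag file, path, and mechanism are our own assumptions, not any real deployment or regulatory proposal:

```python
import signal
import subprocess
import time
from pathlib import Path

# Hypothetical flag file that an authorized emergency process would
# create. The path and mechanism here are illustrative assumptions,
# not any real deployment's API.
SHUTDOWN_FLAG = Path("/var/run/ai_emergency_stop")

def run_with_kill_switch(cmd, flag_path=SHUTDOWN_FLAG, poll_seconds=1.0):
    """Run a workload, terminating it if the shutdown flag appears."""
    proc = subprocess.Popen(cmd)
    try:
        while proc.poll() is None:
            if flag_path.exists():
                proc.send_signal(signal.SIGTERM)  # request graceful stop
                try:
                    proc.wait(timeout=10)
                except subprocess.TimeoutExpired:
                    proc.kill()  # hard stop if it ignores SIGTERM
                return "killed"
            time.sleep(poll_seconds)
        return "finished"  # workload exited on its own
    finally:
        if proc.poll() is None:
            proc.kill()
```

Even this toy exposes the assumption the rest of this section examines: the switch only works if the workload cannot delete the flag, disable the supervisor, or copy itself elsewhere.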
The concept has some merit. It’s probably better to have kill switches than not. For example, they could potentially be used to limit the damage done by an AI with Mythos-class cyberhacking capabilities in an emergency.
At ControlAI, we recently advised a member of our coalition of UK parliamentary supporters on introducing a kill switch amendment to the UK’s Cyber Security and Resilience Bill, which is making its way through parliament. The amendment recognizes superintelligent AI as “systems that can autonomously compromise national security, escape human oversight, and upend international stability”, and would provide the government with kill switch powers in the event of an emergency.
The difficulty with kill switches becomes apparent when we consider AI within the context of the trajectory of AI development overall.
AI is advancing rapidly across the board. Top AI companies such as Anthropic, OpenAI, and Google DeepMind are racing to develop a form of AI known as artificial superintelligence. Superintelligent AI would be vastly smarter than humans. It wouldn’t only be better than us at hacking, but at everything else as well. It would be capable of replacing humans both individually and as a species.
In recent months and years, countless top AI scientists, including godfathers of the field Yoshua Bengio and Geoffrey Hinton, have been warning that AI poses a risk of human extinction. They’ve been joined by industry leaders, and even the CEOs of the top AI companies have acknowledged the risk. This risk comes from the development of superintelligent AI.
We’ve written about how this could actually happen here:
The current state of AI is that we know how to reliably build more powerful systems, but we don’t know how to ensure that they’re safe or controllable. Many experts and insiders believe superintelligence could be reached within the next 5 years, yet the AI companies have no credible plan to ensure it won’t lead to catastrophically bad outcomes.
Their plan amounts to essentially saying they’ll get AI to solve the problem and hope for the best. Ex-OpenAI researcher Daniel Kokotajlo recently said this is an obvious chicken-and-egg problem.
So what would a superintelligence that we don’t control, and whose goals conflict with ours, do when faced with a kill switch? It doesn’t take a genius to see that it would likely first make sure the kill switch either could not be used or would not function, before taking any actions that might prompt us to use it.
And given that, by definition, it would be able to outsmart us, we should expect that it would figure out a way to do that.
On balance, kill switches may have a role in reducing risk from some AI systems. But because they would likely prove ineffective against superintelligence, they could provide a false sense of security if governments are not informed of their limitations.
At ControlAI, our focus is squarely on the risk of extinction posed by superintelligent AI. The way to prevent this risk can be expressed very simply: prohibit the development of superintelligent AI, both domestically and via international agreement.
Countries have come together before to address global threats: the Montreal Protocol to close the hole in the ozone layer, and nuclear arms reduction agreements and the Non-Proliferation Treaty to reduce the risk of a catastrophic nuclear war. It’s been done before, and we can do it again.
Currently, the main obstacle to this is a lack of awareness among policymakers and the public about the danger, a problem we’re working hard to address at ControlAI.
We hope you’ll join us in our efforts to do this. A quick action you could take right now is to contact your elected representatives and let them know you care about this issue. We have contact tools that enable you to do this in a couple of minutes:
https://controlai.com/take-action
AI News Digest
Here are some more developments that happened in the last week:
Liability
OpenAI backed a bill in Illinois that would let AI developers largely off the hook for mass deaths and disasters caused by their technology. Anthropic has come out against the bill.
OpenAI Cybersecurity
Following the announcement of Anthropic’s Mythos, OpenAI has announced its own cybersecurity AI, GPT-5.4-Cyber.
Canadian Parliamentary Hearings
As we wrote about in our impact report, our work has led to a series of hearings on AI risk and superintelligence at the Canadian Parliament.
This week, our US Director Connor Leahy testified before Senators, explaining why superintelligence could lead to human extinction and making recommendations for what policymakers can do to address the threat.
Claude Opus 4.7
Anthropic has launched its latest AI for public deployment, Claude Opus 4.7. They say it’s an advancement on Opus 4.6 in coding, but is less broadly capable than their recently announced Claude Mythos, which they have said is too dangerous to release.
Take Action
If you’re concerned about the threat from AI, you should contact your representatives. You can find our contact tools here that let you write to them in as little as a minute: https://controlai.com/take-action
And if you have 5 minutes per week to spend on helping make a difference, we encourage you to sign up to our Microcommit project! Once per week we’ll send you a small number of easy tasks you can do to help.
We also have a Discord you can join if you want to connect with others working on helping keep humanity in control, and we always appreciate any shares or comments — it really helps!