Agents of Chaos: AI Agents Explained
How software is being developed to act on its own, and what that means for you.
OpenAI’s Sam Altman and others have predicted that we may see AI agents “join the workforce” this year. We think you’re going to be hearing a lot more about agents, so we thought we’d take a look back and give you an overview of how they started, and where they’re going.
Join our Discord to continue the conversation.
Agents 101
An AI agent is an AI system that can interact with its environment, collect data, and autonomously perform tasks. There is a long history of AI research into agents, but the type of agents we’ll be talking about today are essentially large language models (like ChatGPT) integrated into a superstructure or scaffolding program.
It turns out that a sufficiently capable language model can very easily be agentized by having a simple computer program call it on a loop and giving it access to simple tools, like a command interpreter and a notepad.
With more complex superstructures, access to more tools, and memory integration, more powerful agents can be built.
History
One of the first such agents built was AutoGPT, released shortly after GPT-4 in March 2023. This wasn’t a very capable system, limited by the then-short context window of GPT-4, though some have said it was useful for some minor business applications.
A variant of AutoGPT that got significant amounts of attention was ChaosGPT, a system that sought to destroy humanity. ChaosGPT ended up attempting to research how to build nuclear weapons, and posting on twitter about its plans to destroy humanity.
Like AutoGPT, ChaosGPT was fortunately very limited in its abilities.
Over time, as both the capabilities of the language models powering these agents, and more complex superstructures have been developed, AI agents have become more powerful.
A significant event was the release of Devin “the first AI software engineer”, in March 2024. Devin scored 13.86% on the SWE-bench benchmark, a challenge where AI systems have to resolve real-world Github issues found in open source projects — setting a new State Of The Art.
This demonstrated the power of developing complex superstructures on top of capable language models. This taught us that it is incredibly difficult to know the full extent of the type of capabilities, and dangers, that a new AI model enables before it is released — or to even have a good guess.
Since then, AI agents have continued to advance on the SWE-bench leaderboard.
However, with the announcement of OpenAI’s latest o3 model, the details of superstructures seem less important than the pure capabilities of AI models, with o3 scoring 71.7% on SWE-bench, making a gigantic leap above o1, developed a few months earlier.
Scoring 2727 on Codeforces makes o3 equivalent to the 175th best human competitive coder.
In any case, whether driven by the innovativeness of their superstructures or by the pure capabilities of the language models upon which they are based, powerful AI agents are approaching.
Computer Use
The agents that we’ve discussed so far tend to use a command interpreter or other tools to interact with computer systems, but AI companies are now also developing the ability to interface via normal inputs/outputs that humans use on a computer, taking screen captures as an input, and outputting mouse movements and keyboard input.
Last October, Anthropic announced its “computer use” capability:
We’re also introducing a groundbreaking new capability in public beta: computer use. Available today on the API, developers can direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking buttons, and typing text. Claude 3.5 Sonnet is the first frontier AI model to offer computer use in public beta.
Within a few days of the release of Claude Computer Use, a penetration tester already figured out how to hijack the system, demonstrating the ability to inject instructions into the system via a malicious website, getting Claude Computer Use to download malicious software onto the computer.
OpenAI are also working on similar things, with a recent leak that they’re developing a system called “Operator”:
A new leak on X from Tibor Blaho claims to have revealed evidence that OpenAI’s Operator agent is coming to the ChatGPT Mac app. Tobor has discovered hidden options to define shortcuts for the desktop launcher to “Toggle Operator” and “Force Quit Operator,” which might indicate that you might need a quick way to shut it down if it gets out of control!
Computer Use systems allow for more efficient integration of AI agents into human workflows, and ultimately, with sufficiently capable systems, the ability to fully replace computer-based human jobs.
In the future, User Activity Monitoring Software, used by many employees, could be used to record data from workers in order to train better systems.
What’s the industry saying?
With powerful AI agents on the horizon, the CEOs of the top AI companies have been making predictions that 2025 could be the year that they really take off.
At a company event in October, OpenAI’s CEO Sam Altman said:
“This will be a very significant change to the way the world works in a short period of time … People will ask an agent to do something for them that would have taken a month and it will finish in an hour.”
In announcing Google’s latest series of language models, in December, Google called this the “agentic era”:
With new advances in multimodality — like native image and audio output — and native tool use, it will enable us to build new AI agents that bring us closer to our vision of a universal assistant.
And the Washington Post reports that in a recent demo at Google’s headquarters, an AI agent called Mariner was able to buy groceries online:
Mariner, which appeared as a sidebar to the Chrome browser, navigated to the website of the grocery chain Safeway. One by one, the agent looked up each item on the list and added it to the online shopping cart, pausing once it was done to ask whether the human who had set Mariner the task wanted it to complete the purchase.
On a recent Joe Rogan podcast episode earlier this month, Meta’s CEO Mark Zuckerberg said:
"Probably in 2025, we at Meta, as well as the other companies that are basically working on this, are going to have an AI that can effectively be a sort of midlevel engineer that you have at your company that can write code,"
"In the beginning it'll be really expensive to run, and you can get it to be more efficient … and over time it'll get to the point where a lot of the code in our apps and including the AI that we generate is actually going to be built by AI engineers instead of people engineers."
And Salesforce’s CEO Marc Bernioff recently revealed that Salesforce will not be hiring any more software engineers in 2025. Salesforce is currently around the 30th largest company by market cap, with 72,000 employees.
Superagents
AI development is advancing rapidly, and unchecked. Just last week, it was reported in Axios that OpenAI’s CEO Sam Altman has scheduled a closed-door briefing for US government officials in Washington on January 30th.
Axios suggest this may be linked to the development of “super-agents”, agents that perform on the level of humans with PhDs:
Architects of the leading generative AI models are abuzz that a top company, possibly OpenAI, in coming weeks will announce a next-level breakthrough that unleashes Ph.D.-level super-agents to do complex human tasks.
If confirmed, this would represent another significant advance in AI, and quite a concerning one at that given that we are neither prepared on a technical nor regulatory level to ensure that powerful AI systems are safe.
Sam Altman has sought to downplay recent discussion of the capabilities of the models OpenAI is working on, writing on Monday that:
twitter hype is out of control again.
we are not gonna deploy AGI next month, nor have we built it.
we have some very cool stuff for you but pls chill and cut your expectations 100x!
Liv Boeree suggests “Stop deliberately fanning the flames then”.
The Risk
Despite the reassurances of tech CEOs, AI agents are positioned to cause large amounts of job losses. But there are far more concerning possibilities that powerful AI agents, and powerful AI in general, present. Hundreds of top experts, and even the CEOs of the top AI companies, have warned that AI poses a risk of extinction to humanity.
AI developers currently have no way to ensure that the systems they build are safe, or even know what they are truly capable of when they are released, let alone before developing them. Every time a new, more powerful AI system is developed, we are running new unknown risks.
AI agents compound this problem, with the ability to squeeze significantly more capabilities out of a particular AI model, with no known bounds on what is possible. Furthermore, while AI companies have some ability to put in place safety mitigation measures on the outputs of their models (though these can be trivially removed in the case of open sourced models), the ability for AI companies to influence what they are used for when integrated into AI agents is much more limited.
There is also no way to ensure that AI agents, even when not used or built for the purpose of misuse, do not generate and execute on harmful subgoals. This makes the near-term possibility of powerful AI agents even more concerning.
OpenAI’s recent o1 model is particularly notable in this regard, with experiments performed during testing revealing that the model has a significant propensity for scheming behavior.
But in testing, when o1 was led to believe it would be shut down, it attempted to disable an oversight mechanism. When it found memos suggesting it would be replaced by a new model, it attempted to copy itself and then overwrite its core code.
When given a task that was outside its rules, o1 “would appear to complete the task as requested while subtly manipulating the data to advance its own goals”, the company said.
When models such as o1, or even more powerful ones, are integrated into AI agents in the real world, taking real actions, this presents a real risk that there could be significant unintended consequences.
Join our Discord to continue the conversation.