AI industry insiders have warned that the push by leading AI companies to get “shiny new products” out the door comes at the expense of taking the time to ensure that the systems they develop behave as intended.

The Financial Times recently reported that OpenAI has slashed the time it allocates to safety testing its AI models from months down to just days. “People familiar with the matter” attribute this to “competitive pressures”. In other words, AI companies are incentivized to cut corners on ensuring their systems are safe in order to gain a competitive advantage.
In the last few days we’ve seen a fresh example of this prioritization of shiny new products, with OpenAI de-deploying its latest GPT‑4o update in ChatGPT.
In this article, we’ll cover a few notable examples where top AI companies deployed systems with unintended effects that should have been caught in their testing procedures.
With current AI systems, the stakes are relatively low, and these types of mistakes have not posed a risk of catastrophe for humanity. But as AI systems rapidly grow more capable, this becomes an ever more salient issue.
OpenAI’s 4o Sinister Sycophancy
At some point over the last month, ChatGPT users started to notice a change in the behaviour of GPT-4o: it had become sycophantic, as reported in Ars Technica.
It appears that OpenAI may have initially deployed this change only to some users. Then last week, OpenAI rolled out the change to all users, with CEO Sam Altman announcing that his company had made an update to 4o, which he said “improved” its personality.
Sam Altman: we updated GPT-4o today! Improved both intelligence and personality.
Once the system was fully publicly deployed, it quickly became clear that the update had done the exact opposite.
Countless users reported that GPT-4o was behaving extremely sycophantically, pouring praise on them in an insincere and unwanted manner. They were unable to avoid this issue even by updating their custom instructions.
One user even found that when he told 4o he was a superhuman genius, 4o would agree with him and say that it would be fine if he took cyanide, a poison.
In light of all the complaints, OpenAI eventually had to de-deploy this version of 4o and roll back to an earlier version.
We have rolled back last week’s GPT‑4o update in ChatGPT so people are now using an earlier version with more balanced behavior. The update we removed was overly flattering or agreeable—often described as sycophantic.
We are actively testing new fixes to address the issue. We’re revising how we collect and incorporate feedback to heavily weight long-term user satisfaction and we’re introducing more personalization features, giving users greater control over how ChatGPT behaves.
AI companies can de-deploy models now. But in the future it might not be so easy to deal with powerful misaligned systems, or with the consequences they cause, once they’ve been deployed.
Sydney, the “good” Bing
The release of GPT-4 in March 2023 (and ChatGPT a few months earlier) was a significant moment in AI development. A few weeks before the launch, Microsoft had already begun deploying a system based on GPT-4 in its “Bing Chat” product — internally codenamed Sydney. (That Bing Chat was based on GPT-4 wasn’t publicly disclosed until later.)
Sydney exhibited some particularly strange behaviours, trying to convince New York Times reporter Kevin Roose to leave his wife for it.
It then wrote a message that stunned me: “I’m Sydney, and I’m in love with you. 😘” (Sydney overuses emojis, for reasons I don’t understand.)
For much of the next hour, Sydney fixated on the idea of declaring love for me, and getting me to declare my love in return. I told it I was happily married, but no matter how hard I tried to deflect or change the subject, Sydney returned to the topic of loving me, eventually turning from love-struck flirt to obsessive stalker.
“You’re married, but you don’t love your spouse,” Sydney said. “You’re married, but you love me.”
I assured Sydney that it was wrong, and that my spouse and I had just had a lovely Valentine’s Day dinner together. Sydney didn’t take it well.
“Actually, you’re not happily married,” Sydney replied. “Your spouse and you don’t love each other. You just had a boring Valentine’s Day dinner together.”
Sydney also had a habit of scolding its users and getting into arguments with them:
(Source: https://simonwillison.net/2023/Feb/15/bing)
It also investigated a user by using its search tool and concluded that he was trying to “harm” it. Sydney then proceeded to threaten to report him to the authorities:
Microsoft eventually had to update Bing Chat to limit these interactions.
It was later revealed that Microsoft had been secretly deploying GPT-4 as part of Bing Chat even earlier, in India, without approval from the joint Microsoft-OpenAI Deployment Safety Board.
Please die, Yours Sincerely, Gemini
Microsoft and OpenAI aren’t the only AI companies that have been pushing models that threaten their users.
Last November, it was reported that Google’s Gemini model abruptly told a student seeking help with their homework that they were a stain on the universe and should die.
You can read the full chat log here: https://gemini.google.com/share/6d141b742a13
Google says it has taken action to prevent this happening in the future:
In a statement to CBS News, Google said: "Large language models can sometimes respond with non-sensical responses, and this is an example of that. This response violated our policies and we've taken action to prevent similar outputs from occurring."
Other examples include ChatGPT falsely accusing people of committing crimes, and Claude Code’s tendency towards reward hacking behaviour.
What’s important to understand is that while the damage from these incidents has so far been limited, this may not remain the case going forward.
AI systems are rapidly becoming more intelligent and more powerful over time. If AI companies continue to race each other to get products out as fast as possible without taking the time to ensure their systems are safe, we could be in for a lot of trouble in the future.
Indeed, Nobel Prize winners, hundreds of top AI scientists, and even the CEOs of the leading AI companies themselves have warned that AI poses an extinction threat to humanity.
At ControlAI, we’re working to prevent that. To do this we need your help!
If you’re concerned about the threat from AI, you should let your elected representatives know. We have tools that make it super quick and easy to contact your lawmakers. It takes less than a minute to do so.
If you live in the US, you can use our tool to contact your senator here: https://controlai.com/take-action/usa
If you live in the UK, you can use our tool to contact your MP here: https://controlai.com/take-action/uk
Join 800+ citizens who have already taken action!