The Information

New Security Breaches at Anthropic and OpenAI Proved Mark Zuckerberg Right

Hours after Anthropic said it was investigating a report that users had gained unauthorized access to its ballyhooed Mythos model, OpenAI accidentally made a slate of its own unreleased models available on its Codex app.

The breaches are a reminder that Anthropic and OpenAI have plenty of their own cybersecurity issues even as they help other organizations fortify their systems for a new era of AI-powered cyberattacks.

Anthropic has gone to great lengths in recent weeks to stress that Mythos is capable of devastating cyberattacks, and that it was only making the model available to a select few companies and government agencies as a result.

Even so, a group of unauthorized users had been accessing Mythos without Anthropic’s permission and sharing their findings in a Discord channel, Bloomberg reported Tuesday. Anthropic said in a statement that it was investigating the apparent leak, which it believes was made possible by software that made the model available to a third-party contractor.

The OpenAI leak included models named GPT-5.5, arcanine, glacier-alpha and Heisenberg, the last of which was labeled “latest frontier life science research model,” according to a screen recording a Codex user posted online. A person with direct knowledge said the leak was a significant lapse.

“Last night, for less than 20 minutes, a very small number of Codex users were able to access a limited set of non-public models due to a temporary configuration error,” said a spokesperson for OpenAI in a statement. “We quickly remediated the issue, conducted a full investigation and found no evidence of malicious or concerning use during this period. No internal systems or source code were accessed.”

Security experts say a leak like the Mythos one was all but inevitable. They’re now warning that defenders should assume Mythos—or similarly capable models—will be available to hackers imminently and should start to prepare their cyber defenses accordingly.

“Most of us believed that it was only a matter of time before we had to confront this new reality, whether it was unauthorized access or the models becoming part of the public domain,” said Andrew Rubin, CEO of cybersecurity startup Illumio.

Rubin’s company has been testing OpenAI’s forthcoming cybersecurity model, GPT-5.4-Cyber, which OpenAI has also declined to release publicly in order to give defenders time to prepare for the types of attacks it could facilitate. He said defenders are still figuring out how to respond to the growing AI security threat, but in the meantime companies should focus on segmenting their IT systems so that if hackers breach one part, they can’t easily get into others.

“Whether it’s models from Anthropic, OpenAI, or other firms not located in the U.S., models are going to keep getting much better than the human brain has been at finding vulnerabilities, and that’s not going to slow down,” Rubin said. “I don’t think anybody really has an answer for what the operating model should look like in this new world.”

Part of the urgency from cybersecurity experts is that, even without cutting-edge models like Mythos, existing models are already extremely good at sophisticated cyberattacks. In recent weeks, a rash of hacks and breaches has hit companies including Mercor and Vercel, as well as the open-source project Axios, and researchers suspect the attacks were fueled by AI models.

“Those of us who pay attention to incident response have noticed an extreme pace of offensive operations in recent months,” said Alex Stamos, a Stanford cybersecurity researcher who previously served as Facebook’s chief security officer and is now an executive at coding security startup Corridor. “It’s hard to tell whether this is because of AI or a coincidence, but it’s very possible we’re already seeing the effects of AI on hacking.”

Tech executives have long warned of this. In a 2024 essay, Mark Zuckerberg used the poor state of security in the AI industry to argue in favor of open-source AI models that anyone can download. “Some people argue that we must close our models to prevent China from gaining access to them, but my view is that this will not work and will only disadvantage the U.S. and its allies,” he wrote. “Our adversaries are great at espionage, stealing models that fit on a thumb drive is relatively easy, and most tech companies are far from operating in a way that would make this more difficult.” Those comments look mighty prescient now.

Buck Shlegeris, CEO of Redwood Research, which studies cybersecurity approaches for containing AI models, said it is reasonable to think that hackers have already stolen the list of security vulnerabilities that Anthropic found using Mythos.

“Anthropic is currently not robust to high-effort security threats,” he said on The Information’s TITV this week. “I kind of have the attitude that if you aren’t going to be able to secure your model, or dangerous vulnerabilities found by your model, I kind of think you should not train the model in the first place.”

Anthropic upgraded its cybersecurity when it released Claude Opus 4 last May. The cyber measures included access controls and monitoring for the use of its models by outside parties, but Anthropic noted that it was not prepared to defend against sophisticated insiders or nation-state attackers. In February, the company said it was working on a “large, wide-reaching effort” to harden its internal systems, which it aimed to complete by July of next year.

Going forward, we’re willing to bet that Anthropic won’t repeat its apparent mistake of giving dozens of outside partners access to a powerful, unreleased model in the same way. If only to make nation-state attackers that want the model work a little harder!