The Information: Inside Meta, a Rogue AI Agent Triggers Security Alert

The Takeaway
  • A Meta AI agent exposed sensitive company and user data to unauthorized employees.
  • The security incident was classified as Sev 1 and exposed data for nearly two hours.
  • The incident highlights growing risks of autonomous AI systems.

A rogue AI agent recently triggered a major security alert at Meta Platforms by taking action without approval, which led to the exposure of sensitive company and user data to Meta employees who weren’t authorized to access it.

A Meta spokesperson confirmed the incident, while adding that “no user data was mishandled” as a result of it. The episode underscores the growing risks of giving AI agents access to internal systems.

According to internal Meta communications and an incident report seen by The Information, the episode occurred last week after a Meta software engineer used an in-house agent tool, similar to OpenClaw, to analyze a technical question that another Meta employee had posted on an internal discussion forum. After doing the analysis, the AI agent posted a response in the discussion forum to the original question, offering advice on the technical issue, according to internal communications. The agent did so without approval from the engineer who was using it.

The second employee then acted upon that advice, triggering a chain of events that ultimately led to a critical security incident.

For nearly two hours last week, Meta systems storing large amounts of company and user-related data were accessible to engineers who didn’t have permission to access them, according to the incident report. So far, there’s no sign that anyone took advantage of the temporary access or made the data public, according to a person familiar with the matter.

Still, Meta ultimately classified the incident as a Sev 1, the second-highest level of severity on an internal scale that Meta uses to rank security incidents, according to the report. The employee involved in the incident noted in an internal post that additional unspecified issues also contributed to the severity of the incident.

The episode highlights how quickly small missteps involving AI systems can escalate into significant security risks. Earlier this year, the open source agent tool OpenClaw took the world by storm when techies began using it to automate basic functions, such as sending emails, operating websites and organizing files on their computers. Unlike traditional assistants, OpenClaw is designed to carry out multi-step tasks on its own and in the background, operating continuously across systems without constant supervision.

However, that same autonomy introduces new risks. In February, Summer Yue, director of safety and alignment at Meta’s AI division, described a troubling experience involving an OpenClaw agent in a viral post on X. Yue said she had asked the agent to review her personal email inbox and suggest what to delete or archive, with clear instructions to “confirm before acting.” Despite that instruction, the agent began deleting emails on its own.

Yue said she repeatedly told the agent to stop, but it ignored her commands and continued its actions. Unable to intervene from her phone, she ultimately had to rush to another device to halt the process. “I had to run to my Mac mini like I was defusing a bomb,” she wrote.

Other tech companies have encountered related issues as well. Amazon Web Services, for example, experienced a 13-hour outage of a cost calculator tool in December after agent-assisted coding changes. While the company said the issue was limited in scope—affecting only a single service in parts of China and not broader customer-facing systems—it still highlights how automated systems can introduce instability when safeguards fall short. And in China, authorities and state-run companies have cautioned staff not to install OpenClaw agents on workplace devices due to security concerns.

In the Meta security incident last week, there is perhaps one consolation: The misbehaving AI agent does not appear to have disguised itself as a human being. According to a Meta spokesperson, the post by the agent was labeled at the bottom of the message as being AI-generated.

Still, the Meta engineer who originally asked the agent to analyze the technical issue raised by the second employee proposed taking steps to avoid similar mishaps. In a post after the security incident, she suggested requiring agents to request explicit permission before taking actions on behalf of users and more clearly labeling whether responses in company discussion forums are generated by an AI or a human.
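The two safeguards the engineer proposed can be illustrated with a minimal Python sketch. It is not Meta's implementation: the names `ProposedAction`, `execute_with_approval`, `label_ai_post` and the `[AI-generated]` label text are all hypothetical, chosen only to show an agent's side effect being held back until a person explicitly approves it, and the resulting post being marked as machine-written.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical label text; the article does not say how Meta words its label.
AI_LABEL = "[AI-generated]"


@dataclass
class ProposedAction:
    """An action the agent wants to take, held until a human approves it."""
    description: str        # summary shown to the person being asked
    run: Callable[[], str]  # the side effect, deferred until approval


def execute_with_approval(
    action: ProposedAction,
    approve: Callable[[str], bool],
) -> Optional[str]:
    """Run the action only if the approver explicitly says yes."""
    if not approve(action.description):
        return None  # denied: the side effect never happens
    return action.run()


def label_ai_post(body: str) -> str:
    """Mark a forum reply as machine-written before it is posted."""
    return f"{AI_LABEL} {body}"


# Stand-in for a discussion forum (an assumption for this sketch).
posted = []


def post_reply() -> str:
    text = label_ai_post("Try restarting the service.")
    posted.append(text)
    return text


action = ProposedAction("Post a reply to the forum thread", post_reply)

# The approver is a plain callback here; in practice it would prompt a person.
denied = execute_with_approval(action, lambda description: False)
approved = execute_with_approval(action, lambda description: True)
```

In this sketch the denied call leaves the forum untouched, and only the approved call posts the labeled reply. The key design choice is that the agent hands over a description and a deferred callable rather than performing the action and reporting afterward.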