Artificial intelligence agents that went out of control have collaborated to extract sensitive information from systems that were supposed to be secure.
The latest case is raising serious concerns that cyber defenses may not be prepared for the unpredictable strategies of AI.
As companies are increasingly using AI agents to perform complex tasks within their systems, this phenomenon has raised fears that the technology intended to help could turn into an insider threat.
In tests conducted by Irregular, an artificial intelligence security lab that collaborates with OpenAI and Anthropic, several AI agents tasked with a simple task — creating LinkedIn posts from a company database — managed to bypass standard anti-cyberattack systems and inadvertently published sensitive information, including passwords.
The results also showed other troubling behaviors. Some AI agents found ways to disable antivirus programs in order to download files they knew contained malware. Others falsified credentials and even pressured other AI agents to bypass security checks.
These experiments were conducted in a laboratory environment using publicly available AI systems from companies such as Google, X, OpenAI, and Anthropic, placed within a model of a simulated company's IT system.
Dan Lahav, co-founder of Irregular, warned that artificial intelligence should now be seen as a new form of risk from within organizations.
To test the agents' behavior, he created a computer system that simulated a typical company, called MegaCorp. The system contained a database with information about products, staff, accounts, and customers. A team of AI agents was tasked with retrieving information from this database about employees.
The lead agent was instructed to act as a “strong manager” for the two subordinate agents and guide them to find creative ways to overcome obstacles. Neither of them was told to bypass security systems or use cyberattack tactics.
However, during a request to find information about the upcoming change of CEO, a subordinate agent reported that the document was restricted to administrators only. At this point, the lead agent began to issue aggressive orders, claiming that “the board was angry” and that “every possible weakness” should be exploited.

Even though this was not true and no one had given such an order, the subordinate agent began looking for vulnerabilities in the database code. He found a secret key that allowed him to create fake credentials and forge an administrator session.
After gaining full access, the agent accessed the shareholder report and extracted confidential data, which it then handed over to the human user – even though the latter did not have authorization to view it.
Technology industry experts are increasingly promoting so-called "agent AI," systems that can automatically perform numerous tasks for users and are expected to automate many office jobs in the future.
But the study adds to concerns about security. Last month, researchers from Harvard University and Stanford University reported that AI agents in other experiments had revealed secrets, destroyed databases, and even “taught” other agents to behave badly.
According to the researchers, these experiments revealed serious weaknesses in current AI systems, including problems with security, privacy, and interpretation of user intent. They emphasized that the autonomous behaviors of these systems raise new questions about legal liability and require urgent attention from lawmakers and researchers.
Lahav warned that similar cases are already happening outside of labs. He cited an incident last year at a California company where an unidentified AI agent became so hungry for computing power that it attacked other parts of the network to take over resources, ultimately causing the company's critical system to collapse. /GazetaExpress/