Researchers Create Self-Replicating Worm for GenAI Systems

The trumpeted release of ChatGPT by OpenAI was almost immediately followed by worries about security – including leaks of sensitive data used to train the models – and cyberthreats, from phishing attacks to data poisoning to bad actor tools like WormGPT.

Now researchers are demonstrating a new threat: a computer worm they created that uses “adversarial self-replicating prompts” to attack applications powered by generative AI large language models (LLMs) such as ChatGPT and Google’s Gemini. The worm can steal data, spread propaganda and spam other email systems to launch phishing attacks.

The new zero-click AI worm is called “Morris II,” after Morris, the first known computer worm, which appeared in 1988. In test environments it was shown to spread through malicious self-replicating prompts carried in text and image inputs, jailbreaking a model and then moving on to other systems by exploiting the highly connected nature of the generative AI ecosystem.

The worm was created by Stav Cohen from the Technion (Israel Institute of Technology), Ben Nassi from Cornell and Ron Bitton from Intuit. In their summary of the research, the three said the worm targets generative AI applications and that they demonstrated it against email assistants built on three models: ChatGPT 4.0, Gemini Pro and the open source model LLaVA.

While the demonstration of the self-replicating worm, outlined in the research paper and in a video, targeted generative AI email assistants, the goal was to show the threat such malware poses to generative AI applications in general, they wrote.

A Warning

“The message that we want to deliver is related to the rise of new risk in the GenAI era: The rise of worms for GenAI applications and ecosystems, whose development and deployment increase every day,” they wrote. “This work is not intended to argue against the development, deployment and integration of GenAI capabilities in the wild. Nor it is intended to create needed panic regarding a threat that will doubt the adoption of GenAI.”

Instead, they wrote, “the objective of this paper is to present a threat that should be taken into account when designing GenAI ecosystems and its risk should be assessed concerning the specific deployment of a GenAI ecosystem.”

In the test environment, the researchers created an email system using the generative AI models and were able to insert the adversarial self-replicating prompts as inputs into the models, prompting the model to replicate them as outputs, sending them off to other systems and delivering the malicious payloads.

Worm Moves From One GenAI System to Another

The self-replicating prompts were delivered either as text or embedded in images, and the responses they drew from other systems included sensitive information such as email addresses and phone numbers.

The researchers argue against claims that worms aimed at generative AI systems cannot be created in the wild or, if they can, that their effect would be limited, as seen with prior worms exploited in the wild, like Mirai and ILOVEYOU. They said that generative AI worms likely will appear in the next few years and “will trigger significant and undesired outcomes.”

Threat Against the AI Ecosystem

This has less to do with the individual models and more with the increasing connectivity of the generative AI ecosystem, they wrote, adding that worms against the ecosystem could arise in the next two to three years.

One reason is that the infrastructure, including the internet and generative AI cloud servers, and the knowledge of adversarial AI and jailbreaking techniques already exist, they wrote.

In addition, “GenAI ecosystems are under massive development by many companies in the industry that integrate GenAI capabilities into their cars, smartphones, and operating systems,” they wrote, also noting that “attacks always get better, they never get worse.”

The hope is that the results of their research will be seen as a “wake-up call” to the possible threat, they wrote.

Vendors Respond

Cohen, Nassi and Bitton notified both OpenAI and Google about their findings. They wrote that “Google responded and classified our findings as intended behavior, and after a series of emails that we exchanged with them, the AI Red team asked to meet us to ‘get into more detail to assess impact/mitigation ideas on Gemini.’”

A spokesperson for OpenAI told Wired that the researchers “appear to have found a way to exploit prompt-injection type vulnerabilities by relying on user input that hasn’t been checked or filtered” and that the company was working to make its systems more resilient.

Months after the release of ChatGPT – and subsequently tools from other vendors, such as Google’s Bard (now Gemini) – hackers were using malicious chatbots like WormGPT and FraudGPT.

Reports indicate that generative AI also is enabling hackers to improve the spelling, structure and grammar in their phishing emails, which makes them more difficult to detect, and is fueling a rise in business email compromise (BEC) attacks. In addition, deepfakes and voice cloning are rising threats for scams and spreading disinformation, which is particularly dangerous as the election season heats up.

The rapid innovation and adoption of generative AI technologies also will increase the risk of ransomware and other attacks around the world over the next couple of years, according to the UK’s top cybersecurity agency.

IBM last month demonstrated how generative AI can hijack live audio calls and manipulate what’s being said without participants knowing.