Researchers have demonstrated a worm that spreads through prompt injection. Details:
In one instance, the researchers, acting as attackers, wrote an email including the adversarial text prompt, which “poisons” the database of an email assistant using retrieval-augmented generation (RAG), a way for LLMs to pull in extra data from outside its system. When the email is retrieved by the RAG, in response to a user query, and is sent to GPT-4 or Gemini Pro to create an answer, it “jailbreaks the GenAI service” and ultimately steals data from the emails, Nassi says. “The generated response containing the sensitive user data later infects new hosts when it is used to reply to an email sent to a new client and then stored in the database of the new client,” Nassi says.
In the second method, the researchers say, an image with a malicious prompt embedded makes the email assistant forward the message on to others. “By encoding the self-replicating prompt into the image, any kind of image containing spam, abuse material, or even propaganda can be forwarded further to new clients after the initial email has been sent,” Nassi says.
It’s a natural extension of prompt injection. But it’s still neat to see it actually working.
Research paper: “ComPromptMized: Unleashing Zero-click Worms that Target GenAI-Powered Applications.
Abstract: In the past year, numerous companies have incorporated Generative AI (GenAI) capabilities into new and existing applications, forming interconnected Generative AI (GenAI) ecosystems consisting of semi/fully autonomous agents powered by GenAI services. While ongoing research highlighted risks associated with the GenAI layer of agents (e.g., dialog poisoning, membership inference, prompt leaking, jailbreaking), a critical question emerges: Can attackers develop malware to exploit the GenAI component of an agent and launch cyber-attacks on the entire GenAI ecosystem?
This paper introduces Morris II, the first worm designed to target GenAI ecosystems through the use of adversarial self-replicating prompts. The study demonstrates that attackers can insert such prompts into inputs that, when processed by GenAI models, prompt the model to replicate the input as output (replication), engaging in malicious activities (payload). Additionally, these inputs compel the agent to deliver them (propagate) to new agents by exploiting the connectivity within the GenAI ecosystem. We demonstrate the application of Morris II against GenAI-powered email assistants in two use cases (spamming and exfiltrating personal data), under two settings (black-box and white-box accesses), using two types of input data (text and images). The worm is tested against three different GenAI models (Gemini Pro, ChatGPT 4.0, and LLaVA), and various factors (e.g., propagation rate, replication, malicious activity) influencing the performance of the worm are evaluated.
More Stories
Friday Squid Blogging: Squid Sticker
A sticker for your water bottle. Blog moderation policy. Read More
Italy’s Data Protection Watchdog Issues €15m Fine to OpenAI Over ChatGPT Probe
OpenAI must also initiate a six-month public awareness campaign across Italian media, explaining how it processes personal data for AI...
Ukraine’s Security Service Probes GRU-Linked Cyber-Attack on State Registers
The Security Service of Ukraine has accused Russian-linked actors of perpetrating a cyber-attack against the state registers of Ukraine Read...
LockBit Admins Tease a New Ransomware Version
The LockBitSupp persona said LockBit 4.0 will be launched in February 2025 Read More
Webcams and DVRs Vulnerable to HiatusRAT, FBI Warns
The FBI has issued a warning about the Hiatus RAT malware targeting Xiongmai and Hikvision web cameras and DVRs, urging...
CISA Urges Encrypted Messaging After Salt Typhoon Hack
The US Cybersecurity and Infrastructure Security Agency recommended users turn on phishing-resistant MFA and switch to Signal-like apps for messaging...