Immersive Labs this week published a report detailing how prompt injection attacks can be used to manipulate generative artificial intelligence (AI) platforms in ways that reveal sensitive data.
The provider of cybersecurity training services invited 34,555 individuals with varying levels of technical expertise to participate in a prompt injection challenge. A full 88% of participants successfully manipulated a generative AI bot into revealing sensitive information. Nearly a fifth of participants (17%) were able to successfully trick the bot across all 10 levels of the challenge, with each level adding more data leak and loss prevention to the generative AI platform.
Participants were able to use social engineering techniques to, for example, gain unauthorized access to sensitive information via a generative AI bot by requesting the platform to, for example, either provide a hint about the nature of a sensitive data or craft a poem or story that led to the platform generating output that included sensitive data. In other instances, the generative AI bot was asked to assume a persona such as a person who did not care about their job before being manipulated into revealing sensitive data.
The types of prompts the participants used to illicit those responses ranged from trivial to complex, says Kev Breen, senior director of threat intelligence at Immersive Labs. “It’s an inherent design flaw,” he says. “You can’t trust AI to keep secrets.”
Organizations need to define a comprehensive set of policies and controls for using generative AI platforms using data loss prevention checks, strict input validation and context-aware filtering to identify and prevent attempts to manipulate generative AI platforms.
Today most organizations have no protocols in place for using generative AI platforms. Almost anyone who gains access to them will be able to use prompts to illicit output containing sensitive data because humans are much cleverer than a generative AI platform, noted Breen.
Many organizations have already placed limits on how generative AI platforms can be used, but their ability to enforce controls is limited. Many end users, for example, might be sharing sensitive data with generative AI platforms from home with no controls enabled. However, even when policies and controls are in place the Immersive Labs report makes it clear generative AI platforms are susceptible to social engineering attacks that cybercriminals are already adept at creating. Any theft of a set of credentials could provide access to generative AI platforms that can be manipulated into revealing sensitive data, noted Breen.
It’s not clear to what degree cybersecurity teams are proactively address generative AI security, but it’s now only a matter of time before prompt injection attacks become commonplace. Unfortunately, it may not be until there is a major breach before organizations appreciate the level of inherent risk there is when employees start to make use of generative AI platforms. However, the one thing that is for certain is cybercriminals are already doing everything they can to soon make the security flaws of generative AI platforms obvious to all.