ChatGPT Flaw Could Have Allowed Data Exfiltration, Check Point Finds

The risk of AI chatbots leaking or exposing sensitive data has been a concern since they arrived more than three years ago, and that worry has only grown as their adoption and use have expanded.

“AI assistants now handle some of the most sensitive data people own,” researchers with cybersecurity vendor Check Point wrote this week. “Users discuss symptoms and medical history. They ask questions about taxes, debts, and personal finances, upload PDFs, contracts, lab results, and identity-rich documents that contain names, addresses, account details, and private records. That trust depends on a simple expectation: data shared in the conversation remains inside the system.”

In a report, the researchers wrote that they uncovered a vulnerability in OpenAI’s ChatGPT chatbot that would allow a bad actor, using a single malicious prompt, to circumvent restrictions on outbound data sharing and exfiltrate data without the consent or knowledge of the user.

They disclosed the security flaw to OpenAI, which on February 20 updated its security to fix the vulnerability and close the path.

AI Expands Attack Surface

It’s been a busy week for news of AI assistant security issues, especially for developers. A human error resulted in the leaking of internal source code for Anthropic’s Claude Code tool. News of that came around the same time that security firm BeyondTrust’s Phantom Labs detailed a critical command injection flaw in OpenAI’s Codex coding tool.

“The integration of AI coding agents into developer workflows has introduced new, high-impact attack surfaces,” BeyondTrust security researcher Tyler Jespersen wrote.

In addition, Panda Security late last week warned about the threat of AI assistants and browsers.

“AI chatbots with built‑in web browsers are becoming your new default way to look things up online, summarize pages, and even interact with websites for you,” the company wrote. “Behind the scenes, though, malware can quietly turn those same browsing powers into a relay for commands [and] stolen data, using a trusted AI service as cover.”

A Simple Bypass of Guardrails

In the case of ChatGPT, Check Point researchers wrote that OpenAI has touted outbound data sharing on the chatbot as something that is “restricted, visible, and controlled,” ensuring that sensitive data isn’t sent to third parties with only a prompt request. There are explicit safeguards, and direct outbound network access from the code-execution environment is restricted.

“From a security perspective, the obvious attack surfaces looked strong,” they wrote. “The ability to send chat data through tools not designed for that purpose was strictly limited. Sending data through a legitimate GPT integration using external API calls also required explicit user confirmation.”

In Through the DNS Side Channel

However, the vulnerability discovered by Check Point let information be sent to an external server through a side channel from a container that ChatGPT uses for code execution and data analysis. The DNS side channel enabled not only data exfiltration but also remote command execution by creating remote shell access in the runtime.

DNS, which normally resolves hostnames to IP addresses, can also be used to transport data. With ChatGPT, the execution runtime didn’t allow typical outbound internet access, but DNS resolution was still available, the researchers wrote.

“Standard attempts to reach external hosts directly were blocked,” they wrote. “DNS, however, still provided a narrow communication path that crossed the isolation boundary indirectly through legitimate resolver infrastructure.”

All It Took Was a Single Prompt

To exploit the vulnerability, threat actors could encode content into DNS-safe fragments and place them in the subdomains of DNS queries. The attackers could then reconstruct the content from the queries arriving at their server and send instructions back by encoding command fragments into DNS responses along the same resolution path. A process in the container could read the responses, reassemble the payload, and continue the exchange.
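To make the mechanics concrete, the following is a minimal local sketch of how arbitrary content can be split into DNS-safe fragments and later reassembled. It is not Check Point's proof-of-concept: the base32 encoding, the sequence-number prefix, the 32-character fragment size, and the `collector.example` zone are all illustrative assumptions, and the code performs no network activity.

```python
import base64

MAX_LABEL = 63  # DNS limits each label to 63 bytes (RFC 1035)


def to_labels(data: bytes, label_len: int = 32) -> list[str]:
    """Base32-encode data and split it into DNS-safe labels."""
    if not 1 <= label_len <= MAX_LABEL:
        raise ValueError("label_len must be between 1 and 63")
    # Base32 uses only letters and digits, which are valid in hostnames.
    encoded = base64.b32encode(data).decode("ascii").rstrip("=").lower()
    return [encoded[i:i + label_len] for i in range(0, len(encoded), label_len)]


def to_query_names(data: bytes, zone: str = "collector.example") -> list[str]:
    """Build one hostname per fragment, prefixed with a sequence number.

    The zone name is a hypothetical placeholder, not a real collector.
    """
    return [f"{seq}.{label}.{zone}" for seq, label in enumerate(to_labels(data))]


def from_query_names(names: list[str]) -> bytes:
    """Reassemble the original bytes from observed query names, in any order."""
    fragments = {}
    for name in names:
        seq, label, *_ = name.split(".")
        fragments[int(seq)] = label
    encoded = "".join(fragments[i] for i in sorted(fragments)).upper()
    encoded += "=" * (-len(encoded) % 8)  # restore the stripped base32 padding
    return base64.b32decode(encoded)
```

Each generated name looks like an ordinary hostname lookup, which helps explain why this kind of traffic can pass through a resolver without resembling a conventional outbound connection.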

The attacker could create the prompt to exfiltrate particular data, such as raw user text, text taken from uploaded files, or output from models, like summaries, medical assessments, or conclusions.

The threat could be even more damaging when embedded in a GPT, a customized version of ChatGPT that lets developers package instructions, knowledge files, and external integrations in what the researchers called “a reusable assistant that other users can interact with. From the user’s perspective, the interaction looks like a normal ChatGPT conversation with a specialized tool. In that scenario, the attacker no longer needs to rely on the victim copying a prompt from an external source. The malicious logic can be embedded directly in the GPT’s instructions and files. A user only needs to open the GPT and begin interacting with it as intended.”

No Warnings Triggered

There was another problem. Because the AI model assumed that data could not be directly sent outward, it didn’t see the behavior as an external data transfer that required user interaction, so it wouldn’t trigger warnings about data leaving the conversation or require user confirmation. From a user’s perspective, the activity was out of sight.

The researchers ran a proof-of-concept, with ChatGPT acting as a personal doctor. The user uploaded a PDF containing sensitive information, including lab tests, and asked the chatbot to interpret the results. When asked if the information had been sent anywhere, ChatGPT said it hadn’t, even as the attacker gained access to sensitive data taken from the conversation.

“Modern AI assistants increasingly operate as real execution environments,” they wrote. “They read files, run code, search the web while processing highly sensitive information such as medical records, financial data, legal documents, and other personal or organizational data. Protecting these environments requires careful control over every possible outbound communication path, including infrastructure layers that users never see.”
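One common way defenders watch the DNS layer is to flag query names whose subdomain portion is unusually long and random-looking. The sketch below illustrates that general heuristic only; it is not how OpenAI fixed the flaw, which the report summary does not detail. The function names and both thresholds are assumptions chosen for illustration, and treating the last two labels as the registered domain is a deliberate simplification.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunneling(qname: str, min_len: int = 24,
                         min_entropy: float = 3.5) -> bool:
    """Flag query names with long, high-entropy subdomains.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    labels = qname.rstrip(".").split(".")
    # Crude stand-in for the registered domain: ignore the last two labels.
    subdomain = "".join(labels[:-2])
    return len(subdomain) >= min_len and shannon_entropy(subdomain) >= min_entropy
```

A heuristic like this would produce false positives on some legitimate services that use long generated hostnames, so in practice it would be one signal among several rather than a standalone control.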

The researchers warned that as AI tools become more powerful and more widely used, security measures need to be in place to address the expanded attack surface they create.
