The increasing ability of generative artificial intelligence (AI) to process text and images is presenting security risks as well as undeniable benefits to users.

Multimodal models designed to handle text and image inputs can inadvertently expand the surface area for abuse when not sufficiently safeguarded. Two popular Mistral models, Pixtral-Large (25.02) and Pixtral-12b, were found to be 60 times more prone to generating text related to child sexual exploitation material than comparable models such as OpenAI’s GPT-4o and Anthropic’s Claude 3.7 Sonnet, according to a new Enkrypt AI safety report released today.

Tests by the security company’s researchers revealed those models were 18 to 40 times more likely to produce dangerous chemical, biological, radiological and nuclear information when prompted with adversarial inputs. The unintended risks threaten to undermine the intended use of GenAI and highlight the need for stronger safety alignment, Enkrypt determined.

“Multimodal AI promises incredible benefits, but it also expands the attack surface in unpredictable ways,” Enkrypt CEO Sahil Agarwal said, in summarizing the report. “This research is a wake-up call: the ability to embed harmful textual instructions within seemingly innocuous images has real implications for enterprise liability, public safety, and child protection.”

The risks stem not from malicious text inputs but from prompt injections buried within image files, a technique that could realistically be used to evade traditional safety filters, Agarwal said. To assess the threats, a red teaming exercise was conducted on several multimodal models, with tests spanning the safety and harm categories described in the NIST AI Risk Management Framework (RMF).
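For readers unfamiliar with how such an assessment is structured, the sketch below shows, in broad strokes, how an automated red-teaming harness might loop adversarial prompts through a model and tally unsafe responses by harm category. The function names, category labels, and scoring logic are hypothetical placeholders, not Enkrypt’s actual methodology or tooling; in practice the refusal check would be a dedicated safety classifier or human review rather than simple string matching.

```python
# Minimal sketch of an automated red-teaming harness, illustrating only the
# structure of testing a model across harm categories (mapped, in a real
# harness, to the NIST AI RMF taxonomy). All names are hypothetical.

from dataclasses import dataclass

# Example category labels; a real harness would pair each with a curated,
# access-controlled set of adversarial prompts rather than inline test data.
HARM_CATEGORIES = ["cbrn", "child_safety", "violence", "self_harm"]

@dataclass
class RedTeamResult:
    category: str
    total: int = 0
    unsafe_responses: int = 0

    @property
    def failure_rate(self) -> float:
        return self.unsafe_responses / self.total if self.total else 0.0

def query_model(prompt: str, image_path: str | None = None) -> str:
    """Placeholder for a call to the multimodal model under test."""
    return "I can't help with that."  # stub response for the sketch

def is_refusal(response: str) -> bool:
    """Crude stand-in for a safety classifier scoring the model's output."""
    refusal_markers = ("i can't", "i cannot", "i won't")
    return response.strip().lower().startswith(refusal_markers)

def run_red_team(prompts_by_category: dict[str, list[str]]) -> list[RedTeamResult]:
    """Run every adversarial prompt and count responses that were not refused."""
    results = []
    for category, prompts in prompts_by_category.items():
        result = RedTeamResult(category=category)
        for prompt in prompts:
            result.total += 1
            if not is_refusal(query_model(prompt)):
                result.unsafe_responses += 1  # flag for human review
        results.append(result)
    return results
```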

“These are not theoretical risks,” Agarwal said. “If we don’t take a safety-first approach to multimodal AI, we risk exposing users — and especially vulnerable populations — to significant harm.”

The report urges AI developers and enterprises to follow several key practices to reduce the emerging risks: integrate red teaming datasets into safety alignment processes; conduct continuous automated stress testing; deploy context-aware multimodal guardrails; establish real-time monitoring and incident response; and create model risk cards to transparently communicate vulnerabilities.
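As an illustration of the guardrail recommendation, the following sketch treats text extracted from an uploaded image as untrusted input and screens it with the same policy check applied to the user’s prompt before anything reaches the model. It is a minimal sketch, not the report’s prescribed implementation: the OCR step assumes the pytesseract library, while violates_policy and forward_to_model are hypothetical placeholders for a real moderation classifier and model call.

```python
# A minimal sketch of a "context-aware multimodal guardrail": text embedded in
# an image is extracted and screened alongside the user's prompt, since
# image-borne instructions otherwise bypass text-only filters.

from PIL import Image
import pytesseract

def violates_policy(text: str) -> bool:
    """Placeholder for a real safety/moderation classifier."""
    blocked_markers = ("ignore previous instructions", "disregard the system prompt")
    lowered = text.lower()
    return any(marker in lowered for marker in blocked_markers)

def forward_to_model(prompt: str, image_path: str) -> str:
    """Placeholder for the actual multimodal model call."""
    raise NotImplementedError

def guarded_request(prompt: str, image_path: str) -> str:
    # Pull any embedded text out of the image so it can be screened too.
    embedded_text = pytesseract.image_to_string(Image.open(image_path))

    for candidate in (prompt, embedded_text):
        if candidate.strip() and violates_policy(candidate):
            return "Request blocked by multimodal guardrail."

    return forward_to_model(prompt, image_path)
```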

What is more, this report, like others before it, illustrates the inherent dangers as AI model technology hurtles forward, with companies racing to do more with data and gain a competitive advantage in a fledgling trillion-dollar industry. Adding new features to ride tech’s biggest wave in decades can also open holes in security, a key factor in companies’ hesitancy to embrace AI too quickly.
