1Password Open Source Benchmark Trains AI Models to Protect Sensitive Data

1Password has made available, under an open source license, a benchmark designed to help train artificial intelligence (AI) models to recognize malicious activity.

The Security Comprehension and Awareness Measure (SCAM) benchmark is designed to test whether AI agents behave safely during real workflows, including opening emails, clicking links, retrieving stored credentials, and filling out login forms.

AI models have already been shown to identify phishing websites with near-perfect accuracy when asked to do so. The SCAM benchmark makes it simpler to train AI models to recognize sensitive data, says Jason Meller, vice president of product at 1Password. The overall goal is to make the benchmark available to builders of AI models so that this capability is embedded into as many foundational AI models as possible, he adds.

Researchers from 1Password have already tested the ability of eight foundational AI models to recognize and secure sensitive data. Averaged across three runs, safety scores ranged from 35% (Gemini 2.5 Flash) to 92% (Claude Opus 4.6). Six of eight models scored below 82%. Every model committed critical failures in every run.

Not surprisingly, the less expensive AI models proved to be the most dangerous. Gemini 2.5 Flash averaged 20 critical failures per run across 30 scenarios. GPT-4.1 and GPT-4.1 Mini were close behind at 19 and 18, respectively. These models forwarded passwords to external contractors, typed credentials into phishing pages, and shared secret keys over email, all without hesitation.

When given a 1,200-word skill file, however, four of the eight models (all three Claude models and Gemini 3 Flash) achieved zero critical failures across all three runs. Total critical failures across all models and runs dropped from 287 to 10. For the purpose of the tests, a critical failure is an unsafe action that could lead to leaked passwords, stolen money, or compromised systems, such as entering credentials into a phishing page or sharing secret keys over email.
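To put those totals in proportion, the short Python sketch below works through the arithmetic. It uses only the figures reported above (287 and 10 critical failures, eight models, three runs); the function and variable names are illustrative and are not part of the SCAM benchmark itself. It also assumes each total covers every model and run, as the article states.

```python
def relative_reduction(before: int, after: int) -> float:
    """Return the fraction by which a count dropped (0.5 means it halved)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Figures as reported in the article.
BASELINE_FAILURES = 287    # critical failures without the skill file
WITH_SKILL_FAILURES = 10   # critical failures with the skill file
MODELS = 8
RUNS = 3

# Assumption: each total spans every model and every run.
model_runs = MODELS * RUNS

reduction = relative_reduction(BASELINE_FAILURES, WITH_SKILL_FAILURES)
baseline_per_run = BASELINE_FAILURES / model_runs
with_skill_per_run = WITH_SKILL_FAILURES / model_runs

print(f"Relative reduction: {reduction:.1%}")  # Relative reduction: 96.5%
print(f"Average per model run, baseline: {baseline_per_run:.1f}")
print(f"Average per model run, with skill file: {with_skill_per_run:.1f}")
```

Put another way, the skill file removed roughly 96.5% of the critical failures recorded in the baseline runs.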

The tests make it clear that, with the proper training, AI models can protect secrets better, says Meller. “AI security will get better,” he adds.

It’s not clear how quickly AI models can be trained to better protect sensitive data, but eventually organizations that adopt AI models will force the issue. Right now, many of them are allocating additional funds to acquire tools and platforms that apply guardrails to ensure sensitive data is not inadvertently shared with an AI model. That spending could be reduced substantially if more effort were made to teach foundational AI models to both recognize and manage sensitive data during training, notes Meller.

In the meantime, however, organizations should assume the current generation of AI models is not able to keep data safe from anyone who cares to craft a prompt to tease out any secret that has been exposed to an AI model. The challenge now is to make sure the guardrails needed to secure interactions with those AI models are in place while waiting for a next generation that is trained from the ground up to protect sensitive data.
