Commvault today revealed it is now making it possible to use backup data to train artificial intelligence (AI) models and govern agentic workflows without impacting production environments.

Additionally, Commvault is making available an AI Studio tool for creating AI agents along with a repository to access them. IT teams will be able to build custom agents that automatically and securely invoke a Model Context Protocol (MCP) server to access enterprise data.

Finally, Commvault has added an AI Protect tool to identify vulnerabilities, understand the impact of agent-driven changes, recover affected applications, and perform full-stack recovery across AI environments. In addition to discovering and creating an inventory of AI agents, AI Protect provides the ability to map their activity.

At the core of these offerings is a Data Activate capability that can classify and curate copies of data and convert them into Apache Iceberg or Parquet data formats that can be readily used to train large language models (LLMs), says Commvault Field CTO Vidya Shankaran.

Previously, Commvault provided a Data Activate capability that could be used to provide AI applications with access to data residing in production environments. Now that capability is being extended to backup data to reduce any potential disruption that might occur when AI applications access data residing in production systems, says Shankaran.

The overall goal is to make it simpler for IT teams to train LLMs more safely and to limit the types of data that might be exposed to an AI agent, she adds.

It’s not clear to what degree IT teams are revamping data management in the age of AI, but it’s apparent that many workflows will need to be either extended or reengineered altogether. In fact, a recent Futurum Group report projects the global data intelligence, analytics, and infrastructure (DIAI) market will grow at a 17% compound annual growth rate through 2031, off a base of $541.1 billion in 2026, to exceed $1.2 trillion by 2031.
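As a rough check on those figures, the short Python sketch below compounds the 2026 base at the quoted rate. It assumes the 17% rate compounds once a year over the five years from 2026 to 2031; the function names are illustrative and do not come from the Futurum Group report.

```python
def project_market(base_billions: float, annual_rate: float, years: int) -> float:
    """Compound a starting market size forward by a fixed annual growth rate."""
    return base_billions * (1 + annual_rate) ** years


def implied_cagr(start_billions: float, end_billions: float, years: int) -> float:
    """Back out the compound annual growth rate implied by two market sizes."""
    return (end_billions / start_billions) ** (1 / years) - 1


# Figures quoted in the article; the five-year horizon is an assumption.
base_2026 = 541.1  # billions of dollars
projected_2031 = project_market(base_2026, 0.17, 5)
rate_needed_for_1_2_trillion = implied_cagr(base_2026, 1200.0, 5)

print(f"Projected 2031 market at 17%: ${projected_2031:,.0f} billion")
print(f"CAGR implied by $1.2 trillion in 2031: {rate_needed_for_1_2_trillion:.1%}")
```

At exactly 17%, the projection lands just under $1.2 trillion, which suggests the report's unrounded growth rate is slightly higher than the headline figure.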

The AI development and operations segment is specifically forecast to grow 24% in 2026, while demand for tools needed to observe data will see a similar 22% increase. Demand for data management tools that operate at the semantic level, providing a higher level of abstraction above the raw data stored in, for example, a data lake, is forecast to increase 19%.

In comparison, demand for data integration tools and storage platforms will grow at slower rates of 12% and 11%, respectively, in 2026. However, as the volume of data being generated using AI tools continues to increase, the data storage platform market will be growing at a rate of 18% by 2030, according to the report.

In general, the role of IT teams is evolving in the age of AI. As data management continues to advance, there will be a fundamental shift away from manual data engineering workflows as more IT teams embrace automated extract, transform, and load (ETL) pipelines, also known as Zero-ETL. In effect, data engineering teams are evolving into shepherds of data that is increasingly being used to drive AI applications and agents. “It’s a natural progression,” says Shankaran.

The challenge and the opportunity, of course, is to make these adjustments before the amount of data that needs to be managed overwhelms existing workflows and processes.