
In 2025, preventing risks from both cyber criminals and AI use will be top mandates for most CIOs.

AI has moved from being experimental to mainstream, with all the major tech companies and cloud providers making significant investments in building turnkey GenAI and AI solutions for enterprise customers. Most CXOs say they want to leverage AI – but not at the cost of damaging customer relationships, reputation and market share through irresponsible use. At the staff level, IT professionals responsible for data and infrastructure will need to be prepared as employees start sending company data to AI.

Here are some predictions based on these observations that focus on the urgency of getting AI data governance right – from systems and policies to IT skills. 

Systematic Data Ingestion for AI Will Be the First Data Storage Mandate

So far, enterprise participation in AI has been largely led by employees who are using GenAI tools to assist with daily tasks such as writing, research and basic analysis. AI model training has been primarily the responsibility of specialists – and storage IT has not been involved with AI. But this will change swiftly in the coming year.

Business leaders know that if they get left behind in the AI gold rush, they may lose market share, mindshare and relevance. In 2025, corporate data will be used with AI for RAG and inferencing, which will constitute 90% of AI investment over time. Everyone touching data and infrastructure will need to step up to the plate as AI tools become a staple in workplaces.

Storage IT will need to create systematic ways for users to search across corporate data stores, curate the right data, check for sensitive data and move data to AI with audit reporting. And storage managers will need to get clear on the requirements to support their business and IT counterparts.  
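The search, curate, check and audit steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than any product's method: the catalog layout (`path`, `keywords`, `sensitive`, `content`), the function names `curate_for_ai` and `to_jsonl`, and the action labels are all hypothetical, and the "move to AI" step is represented only by the list of selected paths.

```python
import hashlib
import json
from datetime import datetime, timezone


def curate_for_ai(catalog, keyword, user):
    """Pick files matching a keyword, block sensitive ones, and log every decision.

    `catalog` is a hypothetical in-memory index: a list of dicts with
    'path', 'keywords' (lowercase strings), 'sensitive' (bool) and
    'content' (bytes).
    """
    selected, audit_log = [], []
    for entry in catalog:
        if keyword.lower() not in entry["keywords"]:
            continue  # not part of this search result
        allowed = not entry["sensitive"]
        if allowed:
            selected.append(entry["path"])
        # Record both approvals and blocks so the trail is complete.
        audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "path": entry["path"],
            "sha256": hashlib.sha256(entry["content"]).hexdigest(),
            "action": "sent_to_ai" if allowed else "blocked_sensitive",
        })
    return selected, audit_log


def to_jsonl(audit_log):
    """Serialize audit entries as JSON Lines for an append-only report."""
    return "\n".join(json.dumps(entry) for entry in audit_log)
```

Logging the blocked files alongside the approved ones is a deliberate choice here: an auditor can then see not only what reached an AI tool, but also what was requested and withheld.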

Unstructured Data Governance Processes for AI Will Mature 

Protecting corporate data from leakage and misuse, and preventing erroneous results from AI, are top of mind for executives today. A lack of agreed-upon standards, guidelines and regulations in North America is pushing these goals further out for organizations.

IT leaders can get started by leveraging data management technology to get visibility of all their unstructured data across storage silos. This will be the starting point to understanding how the growing mass of data can be governed and managed for AI.

Data classification is another key step in AI data governance, and it involves enriching file metadata with tags to identify sensitive data that cannot be used in AI programs. Metadata enrichment also aids researchers and data scientists who need to quickly curate datasets for their projects by searching on keywords that identify file contents. With automated processes for data classification, IT can create workflows to continually send protected datasets to secure locations and, separately, send AI-ready datasets to object storage where they can be ingested by AI tools.
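As a rough sketch of the tag-then-route idea described above, the Python below enriches each file record with tags and sends it to one of two destinations. The two regular expressions, the `FileRecord` type and the destination names `secure` and `ai_ready` are assumptions made for illustration; real classifiers cover far more data types and use more than pattern matching.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


@dataclass
class FileRecord:
    path: str
    text: str
    tags: set = field(default_factory=set)  # enriched metadata


def classify(record):
    """Add a tag for every sensitive pattern found in the file's text."""
    for tag, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(record.text):
            record.tags.add(tag)
    return record


def route(records):
    """Send tagged files to 'secure' and untagged files to 'ai_ready'."""
    routed = {"secure": [], "ai_ready": []}
    for record in records:
        classify(record)
        destination = "secure" if record.tags else "ai_ready"
        routed[destination].append(record.path)
    return routed
```

Because the tags stay on each record, the same metadata that blocks a file from AI use can later be searched by researchers curating datasets.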

Automated data workflow orchestration tools will be important for efficiently managing these tasks across petabyte-scale data estates. Additionally, AI-ready unstructured data management solutions will deliver a means to monitor workflows in progress and audit outcomes for risk.

Role of Storage Administrator Evolves to Embrace Security and AI Data Governance 

Pressing demands on both the data security and AI fronts are changing the roles of data storage professionals. The job of managing storage has evolved, with technologies now being more automated and self-healing, cloud-based and easier to manage. At the same time, there is an increasing overlap and interdependency between cybersecurity, data privacy, storage and AI.

Storage pros will need to make data easily accessible and classified for AI, while working across functions to create data governance programs that combat ransomware and guard against the misuse of corporate data in AI. Teams will need to stay on top of where sensitive data resides, and have tools at their disposal to develop auditable data workflows that prevent sensitive data leakage.

Ransomware Defense of Unstructured Data Becomes More Urgent 

Historically, data protection efforts have been focused on mission-critical data because that’s the data that needs the fastest restore. Yet the landscape has changed with unstructured data growing to encompass 90% of all data generated in the last 10 years. The large surface area of petabytes of unstructured data, coupled with its widespread use and rapid growth, makes it highly vulnerable to ransomware attacks. Cybercriminals can make use of this data as a Trojan horse to infect the enterprise.

Cost-effectively protecting unstructured data from ransomware will become a critical defense tactic, starting with moving cold, inactive data to immutable object storage where modification is not possible. 
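The first step, finding cold data, can be approximated from file timestamps. The sketch below assumes that "cold" means neither accessed nor modified within a chosen number of days; the function name `find_cold_files` and the threshold are hypothetical, and the actual move to immutable object storage is left out because it depends on the storage platform in use.

```python
import os
import time


def find_cold_files(root, max_idle_days, now=None):
    """Return sorted paths under `root` not accessed or modified within `max_idle_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds per day
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            stats = os.stat(path)
            # Use the most recent of access and modification time, since
            # some filesystems are mounted without access-time updates.
            if max(stats.st_atime, stats.st_mtime) < cutoff:
                cold.append(path)
    return sorted(cold)
```

A call such as `find_cold_files("/mnt/share", 365)` would list candidates for tiering; the path shown is only an example.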

Unstructured Data Management Solutions Broaden to Serve AI Data Governance and Monitoring Needs 

The Komprise 2024 State of Unstructured Data Management report reveals that IT leaders are prioritizing AI data governance and security as the top future capability for solutions. AI data governance covers protecting data from breaches or misuse, maintaining compliance with industry regulations, managing biases in data, and ensuring that AI does not lead to false, misleading or libelous results.

Monitoring and alerting for capacity issues or anomalies, which was last year’s top pick, remains high on the list along with analytics and reporting. IT and storage directors will look for unstructured data management solutions that offer automated capabilities to protect, segment and audit sensitive and internal data use in AI, a use case that is bound to expand as AI matures.
