At a recent AI Infrastructure Field Day event, CTERA presented its strategy for transforming enterprise data management into what it calls “enterprise intelligence.” The company, known for distributed data management, addressed the fundamental issue plaguing corporate AI initiatives: without quality data infrastructure, even the most sophisticated AI tools deliver poor results.

The Problem: Why Most GenAI Projects Don’t Deliver

CTERA CTO Aron Brand cited MIT research showing that 95% of GenAI pilots fail to deliver measurable ROI. The core issue? Organizations assume they can point AI at their existing data and get useful results. Instead, they often get what Brand describes as “more convincing nonsense.”

Through customer conversations, CTERA identified three critical obstacles:

  • Messy Data: Enterprise repositories mix current, relevant information with decades of obsolete files. Without proper curation, AI models can’t distinguish between valuable data and digital clutter, producing unreliable outputs.
  • Data Silos: Information sits fragmented across SharePoint, Confluence, file systems, and geographic locations with different sovereignty requirements. Data formats range from scanned PDFs to video and audio files that AI models can’t natively process.
  • Compliance Risks: In regulated industries like healthcare, finance, and government, AI systems that accidentally expose PII or sensitive data create serious legal and reputational exposure. Access controls must carry through to AI-generated responses, ensuring users only receive answers based on data they’re authorized to access (see the sketch after this list).
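
The last point lends itself to a concrete pattern: filter retrieved content against the requesting user’s permissions before it ever reaches the model, so the model simply never sees unauthorized data. Below is a minimal sketch of that idea; the document and ACL structures are hypothetical, since CTERA didn’t show implementation details.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """A retrieved chunk carrying the ACL inherited from its source file."""
    source: str
    text: str
    allowed_groups: set[str] = field(default_factory=set)

def filter_by_permission(docs: list[Document], user_groups: set[str]) -> list[Document]:
    """Drop chunks the user cannot read before they reach the model.

    Filtering pre-generation (rather than redacting answers afterward)
    means the model never ingests data the user isn't cleared for.
    """
    return [d for d in docs if d.allowed_groups & user_groups]

# Example: a finance analyst must never see HR-restricted content.
retrieved = [
    Document("q3-report.pdf", "Revenue grew 12%...", {"finance", "exec"}),
    Document("salaries.xlsx", "Compensation bands...", {"hr"}),
]
visible = filter_by_permission(retrieved, user_groups={"finance"})
assert [d.source for d in visible] == ["q3-report.pdf"]
```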

CTERA’s Strategy: Three Pillars for Enterprise Intelligence

CTERA outlined a three-part approach to make private enterprise data AI-ready.

Pillar 1: AI-Powered Security and Operations

This pillar embeds AI into CTERA’s data management platform to strengthen security. The company’s ransomware detection system performs real-time I/O inspection to identify anomalous activity like data exfiltration, automatically blocking compromised users. This approach narrows recovery scope from entire file systems to just affected files.
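
CTERA didn’t disclose its detection algorithm, but real-time I/O inspection generally reduces to watching per-user operation rates and flagging activity that deviates sharply from baseline. A toy sketch of that shape, with invented thresholds and event fields:

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
READ_LIMIT = 500          # invented threshold: reads per user per window

windows: dict[str, deque] = defaultdict(deque)
blocked: set[str] = set()

def inspect(user: str, op: str) -> None:
    """Record one I/O event and block the user if the rate looks anomalous."""
    if user in blocked or op != "read":
        return
    now = time.monotonic()
    q = windows[user]
    q.append(now)
    # Evict events that have fallen out of the sliding window.
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()
    if len(q) > READ_LIMIT:   # mass reads often precede exfiltration or encryption
        blocked.add(user)     # a real system would also scope recovery to touched files

# Simulated burst of reads trips the block.
for _ in range(600):
    inspect("alice", "read")
assert "alice" in blocked
```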

Pillar 2: Unified Data Infrastructure

The second pillar positions CTERA’s global file system as the foundation for AI workflows. The platform aggregates data from distributed edge locations into a centralized object store, creating a single source of truth. A real-time notification service based on Kafka lets data pipelines subscribe to file system changes worldwide, triggering automated processes like metadata labeling.
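
CTERA hasn’t published its event schema, but the subscription model it describes maps directly onto an ordinary Kafka consumer. A sketch using the kafka-python client, with a hypothetical topic name and event shape:

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

# Topic name and event fields are assumptions for illustration only.
consumer = KafkaConsumer(
    "file-system-changes",
    bootstrap_servers="kafka.example.internal:9092",
    group_id="metadata-labeler",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for event in consumer:
    change = event.value  # e.g. {"path": "/finance/q3.pdf", "op": "write", "site": "nyc"}
    if change.get("op") in ("write", "create"):
        # Trigger the downstream pipeline step, e.g. metadata labeling.
        print(f"labeling {change['path']} from site {change['site']}")
```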

CTERA’s “Direct Object Access” protocol allows data science applications to bypass the CTERA filer and retrieve data chunks directly from underlying object storage, enabling scalable data access for AI training and analysis.
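
The wire details of Direct Object Access aren’t public, but the pattern described — resolve a file to its chunks, then read those chunks straight from S3-compatible storage — would look roughly like this with boto3 (the bucket layout and chunk naming are invented):

```python
import boto3  # pip install boto3

s3 = boto3.client("s3")

def read_file_chunks(bucket: str, chunk_keys: list[str]) -> bytes:
    """Reassemble a file by fetching its chunks directly from object storage,
    bypassing the filer's file-protocol front end entirely."""
    parts = []
    for key in chunk_keys:
        obj = s3.get_object(Bucket=bucket, Key=key)
        parts.append(obj["Body"].read())
    return b"".join(parts)

# Hypothetical: the platform's metadata service would supply the chunk list.
data = read_file_chunks("ctera-datalake", ["chunks/abc001", "chunks/abc002"])
```

Because each reader talks to the object store directly, aggregate throughput scales with the object backend rather than with a single filer.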

Pillar 3: Building a Virtual Workforce

The third pillar, CTERA Data Intelligence, creates a semantic layer over unstructured data sources including CTERA’s file system, SharePoint, and Confluence. The platform works with data in place without requiring migration.

The goal is to create curated datasets and enable “virtual employees”—AI agents that users can provision themselves. These agents are trained on specific, reliable datasets to handle tasks, answer complex questions, and increase productivity. The system grounds answers in source documents with citations to ensure verifiability.
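
Grounding with citations usually means the prompt carries tagged source passages and the response is checked against them afterward. A minimal, model-agnostic sketch of that assembly and verification step (the tag format is an assumption, not CTERA’s):

```python
import re

def build_grounded_prompt(question: str, passages: dict[str, str]) -> str:
    """Tag each retrieved passage with a source ID so the model can cite it."""
    context = "\n".join(f"[{sid}] {text}" for sid, text in passages.items())
    return (
        f"Answer using ONLY the sources below and cite them as [id].\n"
        f"{context}\n\nQuestion: {question}"
    )

def citations_are_valid(answer: str, passages: dict[str, str]) -> bool:
    """Verify every [id] the model cited refers to a real retrieved passage."""
    cited = set(re.findall(r"\[(\w+)\]", answer))
    return bool(cited) and cited <= set(passages)

passages = {"doc1": "The 2024 policy caps expenses at $500.", "doc2": "..."}
prompt = build_grounded_prompt("What is the expense cap?", passages)
assert citations_are_valid("The cap is $500 [doc1].", passages)
```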

Making It Work: The Model Context Protocol

CTERA has adopted the Model Context Protocol (MCP), an open standard for connecting GenAI applications to enterprise tools and data sources. Any compliant AI client, such as Claude or Microsoft Copilot, can interact securely with any MCP-enabled system.

CTERA implements MCP at two levels. Its global file system functions as an MCP server, letting GenAI agents read, write, and search files while respecting permissions. The Data Intelligence platform uses MCP for both client and server functions, allowing virtual employees to interact with external tools and be invoked by external AI assistants. This creates an open hub for building multi-system automations driven by natural language.
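
To make this concrete: exposing a capability like file search as an MCP tool takes only a few lines with the official MCP Python SDK. The sketch below is a generic illustration, not CTERA’s server; the tool name and stubbed index are invented.

```python
# pip install mcp
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("file-search-demo")

@mcp.tool()
def search_files(query: str, max_results: int = 10) -> list[str]:
    """Return paths of files matching the query (stubbed for illustration).

    A real server would enforce the caller's permissions here, so agents
    only ever see files the underlying user is authorized to read.
    """
    index = {"/finance/q3-report.pdf", "/hr/handbook.docx"}
    return [p for p in index if query.lower() in p.lower()][:max_results]

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default; MCP clients connect here
```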

The Trade-offs

CTERA’s approach comes with considerations. MCP is a new protocol that has evolved rapidly to address initial security gaps, but attack vectors remain a concern. Organizations need strict access controls to limit potential damage from compromised AI agents.

The data curation process requires effort. Defining metadata extraction schemas currently needs JSON knowledge, which may challenge non-technical subject matter experts. And despite CTERA’s guardrails and citations, AI hallucinations remain possible—a limitation of current AI models themselves.
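
For a sense of what that JSON requirement means in practice, an extraction schema might resemble the following; the field names and structure are purely illustrative, as CTERA’s actual format wasn’t shown.

```python
import json

# Hypothetical extraction schema: every name and type here is invented to
# illustrate the kind of JSON a subject matter expert would need to author.
schema = {
    "document_type": "contract",
    "fields": [
        {"name": "counterparty", "type": "string"},
        {"name": "effective_date", "type": "date"},
        {"name": "total_value", "type": "number", "unit": "USD"},
    ],
}
print(json.dumps(schema, indent=2))
```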

The Bottom Line

CTERA’s presentation reinforces a simple premise: AI initiatives fail when data infrastructure isn’t ready. Messy, fragmented, and insecure data undermines even well-designed AI implementations.

The company’s three-pillar strategy addresses this reality systematically—enhancing storage with AI-driven security, building unified data infrastructure for scale, and adding a semantic layer for practical AI applications. By integrating MCP deeply, CTERA positions itself as an open platform rather than another proprietary AI tool.

For organizations struggling to move GenAI projects from pilot to production, CTERA’s framework offers a practical path forward. It tackles the unglamorous but essential work of data preparation, security enforcement, and standardized access that makes AI initiatives viable. 

You can watch all of the CTERA presentations on the Tech Field Day website.
