Citi’s strategic investment underscores the value of processing data where it lives, while new AI Agent and workflow capabilities aim to democratize AI across enterprises.

Starburst, the company built on the open-source Trino query engine, announced a comprehensive suite of AI capabilities designed to help enterprises accelerate AI adoption without the traditional barriers of data centralization. The “Lakeside AI” approach allows organizations to process data where it resides — whether on-premises, in the cloud, or in hybrid environments — addressing a fundamental challenge for enterprises implementing AI at scale.

“AI is raising the bar for enterprise data platforms, but most architectures aren’t ready,” said Justin Borgman, CEO and co-founder of Starburst. “At the end of the day, your AI is only as powerful as the data it can access. Starburst is removing the friction between data and AI by bringing a distributed, hybrid data lake.”

Strategic Investment From Citi

Highlighting the platform’s enterprise value, Starburst also announced a strategic investment from Citi, building on a multi-year relationship where Starburst has become the bank’s enterprise-wide data access solution across all lines of business in 150 countries.

For Citi’s anti-money laundering (AML) team, Starburst provides instant access to data across more than 100 companies, helping avoid potentially hundreds of millions in regulatory fines. That cross-silo access gives Citi a critical advantage. As Borgman noted, “banks may be siloed when it comes to their data approach, but the criminals aren’t.”

Comprehensive AI Workflow Capabilities

The company’s new AI offering introduces three key capabilities:

1. Starburst AI Agent – A conversational interface for governed natural language data product documentation and insight generation that works within secure environments.

2. AI Workflows – A suite of capabilities including:

AI Search – Vector embedding generation and search capabilities with embeddings stored directly in Apache Iceberg tables, eliminating the need for separate vector databases

AI SQL Functions – Tools that enable SQL users to leverage LLMs without prompt engineering expertise

AI Model Access Management – Governance controls for LLM usage

3. Air-Gapped AI Deployment – The ability to deploy the entire AI stack on-premises, a critical requirement for regulated industries like financial services.
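
To make the AI Search idea concrete, here is a minimal sketch of similarity search over embeddings that live in the same rows as the data they describe, rather than in a separate vector store. It is not Starburst's implementation: the `ROWS` list, the three-dimensional vectors and the `top_k` helper are all hypothetical stand-ins, and a real deployment would read the embedding column from an Iceberg table and use model-generated vectors.

```python
import math

# Hypothetical rows standing in for a table with an embedding column.
# Real embeddings would come from a model and have hundreds of dimensions.
ROWS = [
    {"id": 1, "text": "wire transfer flagged for review", "embedding": [0.9, 0.1, 0.0]},
    {"id": 2, "text": "quarterly marketing newsletter", "embedding": [0.0, 0.2, 0.9]},
    {"id": 3, "text": "suspicious cash deposit pattern", "embedding": [0.8, 0.3, 0.1]},
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_embedding, rows, k=2):
    """Return the ids of the k rows most similar to the query, best first."""
    scored = [(cosine_similarity(query_embedding, r["embedding"]), r["id"]) for r in rows]
    scored.sort(reverse=True)
    return [row_id for _, row_id in scored[:k]]


# A query vector close to the two fraud-related rows.
nearest = top_k([1.0, 0.2, 0.0], ROWS)
```

Because the vectors sit alongside the source rows, the same access controls that govern the table can, in principle, govern the search results as well.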

“We think about the enterprises that we serve; they’re going to have dozens, possibly hundreds or more SQL developers,” explained Nathan Vega, senior director of product marketing at Starburst. “It’s complicated to find AI-specific engineers for prompt engineering, so taking a common resource that’s in most businesses and empowering them to use this new technology will accelerate adoption.”

Democratizing AI With SQL Integration

A particularly innovative aspect is how Starburst bridges the gap between traditional SQL expertise and modern AI capabilities. The platform provides pre-tested, task-specific AI functions, including sentiment analysis, classification and translation functions, that SQL developers can use without prompt engineering knowledge.
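
One way to picture a task-specific function is as a thin wrapper that hides a fixed, vetted prompt behind a plain function call. The sketch below assumes that design; the `TEMPLATES` wording, the `make_task` helper and the `fake_llm` stub are hypothetical and are not Starburst's actual functions or prompts.

```python
from typing import Callable, Dict

# Hypothetical prompt templates; a real platform would ship its own tested wording.
TEMPLATES: Dict[str, str] = {
    "sentiment": (
        "Label the sentiment of this text as positive, negative, or neutral.\n"
        "Text: {text}\nLabel:"
    ),
    "translate": "Translate this text into {language}.\nText: {text}\nTranslation:",
}


def make_task(task: str, llm: Callable[[str], str]) -> Callable[..., str]:
    """Build a function that fills the task's template and calls the model."""
    template = TEMPLATES[task]

    def run(text: str, **params: str) -> str:
        # The caller supplies only data; the prompt wording stays hidden.
        prompt = template.format(text=text, **params)
        return llm(prompt).strip()

    return run


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call so the sketch runs offline."""
    return " negative " if "outage" in prompt else " positive "


analyze_sentiment = make_task("sentiment", fake_llm)
label = analyze_sentiment("The outage lasted all day")
```

The caller passes text and gets a label back, which is roughly the experience the article describes for a SQL developer calling a function in a query.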

This democratization extends to data governance through AI-powered auto-tagging, automatically identifying and classifying sensitive data across 20+ categories, including PII. The system supports custom classifiers to address industry-specific sensitive data types, enabling secure data sharing inside and outside organizations.
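
The auto-tagging workflow, including custom classifiers, can be sketched in a few lines. This example swaps in simple regular-expression matching for the AI-powered classification the article describes, so it only illustrates the shape of the process. The category names, patterns, 80% threshold and `tag_column` helper are all assumptions.

```python
import re

# Hypothetical built-in classifiers, keyed by tag name.
BUILT_IN = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "US_SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def tag_column(values, classifiers, threshold=0.8):
    """Tag a column with every category matched by enough of its sampled values."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return []
    tags = []
    for name, pattern in classifiers.items():
        matches = sum(1 for v in non_empty if pattern.match(v))
        if matches / len(non_empty) >= threshold:
            tags.append(name)
    return sorted(tags)


# An industry-specific classifier added alongside the built-ins (simplified IBAN shape).
custom = dict(BUILT_IN, IBAN=re.compile(r"^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$"))

email_tags = tag_column(["a@b.com", "c@d.org", ""], BUILT_IN)
iban_tags = tag_column(["GB82WEST12345698765432"], custom)
```

Tags produced this way could then drive masking or access policies, which is the link to secure data sharing that the article draws.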

Unified Platform Approach

Starburst is positioning these capabilities as part of a comprehensive data platform that reduces the proliferation of point solutions. Borgman believes the standalone vector database market will shrink as platform vendors incorporate these capabilities.

“By being integrated, you’ve got one access control mechanism, so everything is secured in the same way and enforced across all the data you’re working with,” Borgman explained. “And then economically, you’re removing individual line items in your budget and getting greater economies of scale.”

The platform approach extends to Starburst’s broader ecosystem, with additional announcements including a new Starburst Data Catalog, enhanced Iceberg lakehouse support, ODBC performance improvements and automated query routing capabilities.

This approach appears to resonate with enterprises trying to balance innovation with the practical realities of distributed data. As one head of AML technology at Citigroup noted, “I’ve talked to a lot of vendors over the last 18 years, and nothing came close to what Starburst can do.”

These announcements signal Starburst’s evolution from being “the Trino company” to a full-fledged data platform for apps and AI, competing more directly with Snowflake and Databricks while maintaining its commitment to open standards like Apache Iceberg.

The new features will be generally available starting May 19, 2025, with some capabilities in private or public preview.
