The State of Real-Time Data

In the world of data, “real time” means almost, but not quite, instantaneous. Real-time data does move with a delay, but that delay lasts only milliseconds, which is short enough for computers to run applications that appear to work at lightning speed.

With always-on apps now expected to respond in the blink of an eye, real-time data (and the platforms that feed and manage it) has given software engineers and data science teams a way to produce a new breed of supercharged services that span both enterprise and consumer markets. As this approach to information management becomes ubiquitous, where do we stand with the state of real-time data in 2025?

Jay Kreps is co-founder of the data streaming platform company Confluent. Always vocal on this subject, Kreps has called the use of data streaming tools an essential means of running the “central nervous system with intelligent connective tissue” for a modern business.

In software-defined businesses that aim to move beyond the batch processing methods of pre-millennial times, data streaming is inherently more active than passive. It enables data teams to process and analyse data events in motion rather than after the fact. Kreps has described this as a way of building “connected islands” across all of an organisation’s disparate applications, data systems and microservices throughout every company department.
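To make the contrast concrete, here is a minimal sketch in plain Python (not Confluent or Kafka code). The function names and the sample order values are assumptions for illustration only. The batch function produces a single answer once all events have been collected, while the streaming function produces an updated answer as each event arrives.

```python
from typing import Iterable, Iterator


def batch_total(events: list[float]) -> float:
    """Batch style: wait until every event is collected, then compute once."""
    return sum(events)


def streaming_totals(events: Iterable[float]) -> Iterator[float]:
    """Streaming style: emit an updated result as each event arrives."""
    running = 0.0
    for amount in events:
        running += amount
        yield running


# Hypothetical order amounts, used only to illustrate the difference.
orders = [20.0, 35.0, 5.0]
print(batch_total(orders))              # one answer, after the fact
print(list(streaming_totals(orders)))   # an answer after every event
```

In a real deployment the list would be replaced by a consumer reading from a stream, but the shape of the logic is the same.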

Real-Time Data, In Context

Looking deeper, Kreps states that the rise of generative AI is the single biggest driver for the adoption of data streaming right now. Because AI is only ever as “intelligent” as the information that feeds it, AI needs real-time data in context and delivered in high quality. Data streaming provides a real-time stream that enables services such as hyper-personalization. The next stage here sees automated streaming agents capable of executing actions on their own, which are argued to be highly effective in use cases such as fraud detection.

“We started Confluent to take on one of the hardest problems in data: Helping information move freely across a business so companies can act in real time,” said Confluent’s Kreps. “That same foundation positions Confluent to close the AI context gap. Off-the-shelf models are powerful, but without the continuous flow of data, they can’t deliver decisions that are timely and uniquely valuable to a business. That’s where data streaming becomes essential.”

Shift-Left, Avoid Prototype Purgatory

Kreps says that we now need to shift left on data processing and governance to eliminate duplicate pipelines, reduce the risk and impact of bad data at the source, and enable a business to use high-quality data products for both operational and analytical use cases. He suggests that real-time data processing is now at an inflection point as the technology moves from a niche position to become a mainstream requirement.
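One way to picture shifting left is to check each record against an agreed contract at the point where it is produced, rather than cleaning it up in several downstream pipelines. The sketch below is a plain-Python illustration under stated assumptions: the field names, the contract and the `route` helper are hypothetical and are not part of any Confluent product.

```python
# Hypothetical data contract: field name -> expected Python type.
REQUIRED_FIELDS = {"order_id": str, "amount": float}


def validate(record: dict) -> list[str]:
    """Return a list of contract violations (empty if the record is clean)."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            errors.append(f"missing {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"{field} is not {expected.__name__}")
    return errors


def route(records: list[dict]) -> tuple[list[dict], list[tuple[dict, list[str]]]]:
    """Split records at the source into clean ones and rejected ones."""
    good, rejected = [], []
    for record in records:
        errors = validate(record)
        if errors:
            rejected.append((record, errors))  # would go to a dead-letter destination
        else:
            good.append(record)
    return good, rejected


good, rejected = route([
    {"order_id": "A1", "amount": 9.5},
    {"order_id": "A2"},                 # missing amount
    {"order_id": 3, "amount": 1.0},     # wrong type for order_id
])
print(len(good), len(rejected))
```

Because the bad records are stopped once, at the source, operational and analytical consumers can read the same clean stream without each maintaining its own copy of the checks.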

“Agentic AI is on every organization’s roadmap. But most companies are stuck in prototype purgatory, falling behind as others race toward measurable outcomes,” said Shaun Clowes, chief product officer at Confluent. “Even your smartest AI agents are flying blind if they don’t have fresh business context.”

To make all this happen, Confluent this year introduced Streaming Agents on Confluent Cloud, a service that allows teams to build, deploy and orchestrate event-driven agents natively on Apache Flink. Embedded in data streams, Streaming Agents can monitor and act on what’s happening in the business in real time to power intelligent context-aware automation.

But once again, we know that agents are only as useful as the data they can access. Because Streaming Agents live inside the event streams at the core of a business, they process events as they happen rather than operating on static snapshots. According to the Confluent team blog, this fresh context gives Streaming Agents the most accurate, up-to-date understanding of a business, improving decision quality whether automating anomaly investigation, driving real-time personalization, responding to customer activity, or adapting to operational changes.

“With advancements in agentic AI and data processing automation, real-time data processing and streaming analytics capabilities are essential,” states the IDC MarketScape: Worldwide Data Platform Software 2025 Vendor Assessment. “The ability to process data as it enters the system is crucial for time-sensitive applications such as fraud detection, personalisation and operational monitoring, where delays can result in lost opportunities or increased risks.”

The Death Of Batch?

Attendees at any of Confluent’s annual user conferences or symposia will be well used to picking up “Batch Is Dead” type stickers for their laptops; the company has been widely quoted on the subject and suggests that traditional batch processing (processing large volumes of data in groups, often during off-peak hours and usually overnight, to maximise system efficiency) is, for the most part, a thing of the past.

Industry counterparts appear to mostly agree.

“I see batch processing becoming obsolete for most real-time use cases,” said Anil Inamdar, global head of data services at NetApp Instaclustr. “Modern enterprises operate in a world of continuous streams where insights, alerts and customer actions need to happen as data arrives. Pure open source technologies like Kafka, OpenSearch and Spark Streaming have made real-time architectures practical and scalable. The shift enables timely decisions, context-aware automation and data freshness that batch pipelines can’t deliver. The future of data won’t be processed overnight; it’ll be processed as it happens.”

Ashok Reddy, CEO of high-performance time-series database company KX, agrees that we’re further along in the real-time revolution for enterprises, but not all the way there yet.

“Most organizations can now stream, process and analyze data in real time, but the real breakthrough will come when systems don’t just react instantly, but reason over time. The shift to adaptive, time-aware intelligence is redefining what ‘real-time’ means. Batch isn’t just dying, it’s being replaced by continuous, closed-loop feedback where decisions are made closer to the moment of action, creating competitive advantage through speed, accuracy and adaptability. Real-time isn’t just faster reporting, it’s becoming the operating system for modern enterprises.”

A Mixed Bag: Real-Time Reality

Tim Yocum, director of engineering at InfluxData, says that “batch will never die” and that always-on (or real-time) will serve an equally impactful role in any company’s analytics portfolio.

“In practice, many organizations use data warehouses and ETL jobs that are batch-driven alongside real-time data systems. There are huge cost trade-offs, so you have to design a backend architecture that can accommodate both strategies without siloing datasets by technology used,” said Yocum.

He reminds us that for some applications, like industrial IoT, real-time is table stakes because data can be acted on in far less time than historical data imports allow. Imagine a factory robot that’s suffering a timing malfunction – seeing that trend in real time can prompt a fix that could avoid costlier failures. In the same manufacturing scenario, you may very well have offline machinery that brings in telemetry on a scheduled basis.
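The factory robot example can be sketched as a rolling check over incoming readings. This is a minimal illustration in plain Python, not InfluxData code. The cycle times, baseline, tolerance and window size are all assumed values chosen for the example.

```python
from collections import deque
from typing import Iterable, Iterator


def detect_drift(
    cycle_times_ms: Iterable[float],
    baseline_ms: float,
    tolerance_ms: float,
    window: int = 3,
) -> Iterator[tuple[int, float]]:
    """Yield (index, rolling_mean) when the rolling mean exceeds baseline + tolerance."""
    recent: deque[float] = deque(maxlen=window)
    for i, reading in enumerate(cycle_times_ms):
        recent.append(reading)
        if len(recent) == window:
            mean = sum(recent) / window
            if mean - baseline_ms > tolerance_ms:
                yield i, mean


# Hypothetical telemetry: the robot's cycle time slowly creeps upward.
readings = [100, 101, 100, 104, 108, 112]
for index, mean in detect_drift(readings, baseline_ms=100, tolerance_ms=5):
    print(f"drift at reading {index}: rolling mean {mean} ms")
```

The same function works whether readings arrive one at a time from a live stream or in a scheduled upload from offline machinery, which is the hybrid situation Yocum describes.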

“A hybrid of batch and real-time, where the speed of ingest and analysis of historical data makes a difference. Purpose-built time series databases are often well-suited to this hybrid approach, since they can handle both real-time and historical workloads efficiently. Organisations need solutions that reduce time to detection across all data (regardless of how it arrives), without requiring a rearchitecture of the data pipeline or platform to get there,” said InfluxData’s Yocum.

He insists that there is “no clear winner” between batch and real-time, only strategies to accommodate both efficiently, economically and at scale.

Stream-First Backbone

Sijie Guo is CEO and founder of StreamNative, a real-time data streaming platform based on Apache Pulsar. He argues that AI value shows up when agents ship outcomes, not when pipelines finish – and that demands a stream-first backbone.

“Kafka-compatible ingress for continuity, a leaderless architecture that scales compute and storage independently and direct writes to Iceberg/Delta for online–offline parity. On top, replayable, event-driven agents act on fresh context and can rehearse decisions from the exact same log for safety and audit. This collapses ETL layers, reduces failure domains, and cuts tail latency – turning real-time data into production AI products that are faster, cheaper and demonstrably governed,” said Guo.
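The idea of rehearsing decisions from the same log can be illustrated with a small sketch. It assumes an append-only list standing in for the event log and a deterministic rule standing in for an agent. The `Event` fields, the `decide` rule and its limit are hypothetical and do not represent StreamNative’s implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    offset: int     # position in the append-only log
    account: str
    amount: float


def decide(event: Event, limit: float = 1000.0) -> str:
    """Hypothetical deterministic rule: same event in, same decision out."""
    return "review" if event.amount > limit else "approve"


def replay(log: list[Event], from_offset: int = 0) -> list[tuple[int, str]]:
    """Re-run decisions over the log, starting at a chosen offset."""
    return [(e.offset, decide(e)) for e in log if e.offset >= from_offset]


log = [
    Event(0, "acct-a", 50.0),
    Event(1, "acct-b", 5000.0),
    Event(2, "acct-a", 10.0),
]
live = replay(log)        # decisions as originally made
audit = replay(log)       # rehearsal or audit over the exact same log
print(live == audit)
```

Because the log is immutable and the rule is deterministic, a replay reproduces the original decisions, which is what makes this kind of rehearsal useful for safety checks and audits.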

Hojjat Jafarpour, a former Confluent employee, is the founder of DeltaStream. Jafarpour says that real-time data has moved from a technical ambition to a business necessity. “In 2025, companies can no longer afford to wait on batch systems to drive decisions, whether it’s preventing fraud in financial services, optimizing supply chains, or powering generative AI agents,” he said. “The shift we’re seeing is not just speed, but accessibility: SQL-first, cloud-native approaches are making streaming as easy to adopt as databases once were. The next frontier is context, especially for generative AI agents. Organizations are asking not just for real-time data, but for real-time understanding, so that every application and every customer interaction can be intelligent by default.”

Grethe Brown, CEO of DiffusionData, thinks that the real-time data streaming market is at a pivotal inflection point. She says that the DiffusionData team has spent 17 years perfecting what others are just beginning to recognize: that real-time isn’t just about speed, it’s about relevance.

Buckle Up, Data In-Flight

“When we talk about ‘real-time,’ we’re not just referring to millisecond delivery; we’re talking about maintaining the contextual integrity of data across its entire journey. This distinction is crucial in a world where AI systems are making increasingly consequential decisions,” said Brown. “What differentiates DiffusionData in this landscape is our unique architecture designed specifically for the last mile of data delivery. While platforms like Confluent have built remarkable capabilities for high-throughput data streaming across the enterprise backbone, DiffusionData excels at the edge – where data meets applications, users and increasingly, AI systems. Our technology was purpose-built to handle the challenges that emerge at this critical junction: Managing millions of concurrent connections, dynamically filtering and transforming data in-flight and ensuring secure, reliable delivery even in constrained environments.

“This is why I see our relationship with Confluent as fundamentally complementary. Confluent’s Kafka ecosystem excels at the heavy lifting of enterprise data movement – the reliable, scalable backbone for data streaming. DiffusionData’s technology extends this capability to the edge, enabling organizations to maintain that same level of reliability and intelligence all the way to the point of action. Together, we represent the complete data streaming journey – from source systems through enterprise pipelines to the ultimate points of consumption and decision-making,” asserted Brown.

She suggests that the implications for AI are particularly significant. When AI is built on batch-processed data or periodic updates, it’s essentially making decisions based on history rather than reality. This disconnect fuels irrelevant recommendations and could potentially lead to outdated business outcomes. Brown says that her team’s vision is to create the real-time nervous system for intelligent enterprises – where data flows continuously and contextually from source to decision point, whether that decision is made by a human, an algorithm, or increasingly, an AI agent.

Our Real-Time Future

Returning to where Confluent sees the market for real-time information management moving next, the company used Confluent Current 2025 to announce Delta Lake and Databricks Unity Catalog integrations in Confluent Tableflow, along with Early Access availability on Microsoft OneLake. Tableflow simplifies access to data that can be used to inform and instruct analytics and AI applications; the technology also enhances data accessibility, governance and integration.

These advancements are said to make Tableflow a fully managed, end-to-end solution that connects operational, analytical and AI systems across hybrid and multicloud environments. Confluent now gets Apache Kafka topics directly into Delta Lake or Apache Iceberg tables with automated quality controls, catalog synchronization, and enterprise-grade security.

“Customers want to do more with their real-time data, but the friction between streaming and analytics has always slowed them down,” said Confluent CPO Clowes. “With Tableflow, we’re closing that gap and making it easy to connect Kafka directly to governed lakehouses. That means high-quality data ready for analytics and AI the moment it’s created.”

What comes next is anyone’s guess. Will current (pun not intended) real-time technologies see us through the next decade, or will we need to reinvent the current batch (pun not intended) of platforms and look to richer agentic streams, possibly powered by still-emerging quantum technologies? Whatever happens, things – and data – will move fast.
