
Databricks this week launched a series of initiatives, including a beta release of Agent Bricks, a framework that makes it simpler to create and modify artificial intelligence (AI) agents using techniques developed by Mosaic AI Research and multiple types of large language models (LLMs).
Announced at the company’s Data + AI Summit conference, Agent Bricks enables end users to describe a task they want to automate; Agent Bricks then uses synthetic data to create an AI agent for that task. An end user can refine that AI agent by exposing it to additional data until the desired level of accuracy is achieved.
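Databricks has not yet published a stable public API for the Agent Bricks beta, so the following is a purely hypothetical sketch of the describe-then-refine workflow outlined above; the `agent_bricks` module and every call and parameter in it are invented for illustration.

```python
# Hypothetical sketch of the Agent Bricks workflow described above.
# The module name `agent_bricks` and every call below are invented
# for illustration; consult Databricks documentation for the real API.
import agent_bricks  # hypothetical module

# Step 1: describe the task in natural language; Agent Bricks
# generates synthetic data and builds a first-pass agent.
agent = agent_bricks.create_agent(
    task="Extract invoice totals and due dates from uploaded PDFs",
)

# Step 2: refine the agent against real examples until accuracy
# on a held-out evaluation set reaches the desired threshold.
agent.refine(examples="catalog.finance.labeled_invoices")
print(agent.evaluate(metric="accuracy"))
```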
Additionally, Databricks is making available a series of AI agents it has already trained, including an Information Extraction Agent that extracts structured data from documents such as PDF files; a Knowledge Assistant Agent that exposes enterprise data to other AI agents; a Multi-Agent Supervisor that orchestrates workflows across multiple AI agents; and a Custom LLM Agent for creating content.
Databricks has also updated its MLflow monitoring framework to add support for AI agents. Coupled with integrated prompt management, quality metrics, human feedback and LLM-based evaluation, the update makes it easier for teams to visualize, compare and debug the performance of AI agents.
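As a minimal sketch of what that instrumentation can look like, assuming a recent version of MLflow with tracing support (2.14 or later), the agent function and the logged score below are placeholders:

```python
# A minimal sketch of instrumenting an agent with MLflow tracing and
# metrics; assumes MLflow 2.14+ for the tracing decorator.
import mlflow

@mlflow.trace  # records inputs, outputs and latency as a trace
def answer_question(question: str) -> str:
    # Placeholder for a call into an LLM-backed agent.
    return f"stub answer to: {question}"

with mlflow.start_run(run_name="agent-eval"):
    answer_question("What were Q2 sales?")
    # In practice this score would come from human feedback or an
    # LLM-based judge; a hard-coded value stands in for both here.
    mlflow.log_metric("answer_relevance", 0.92)
```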
At the same time, Databricks revealed that a managed Lakebase service that reinvents how an open-source Postgres relational database processes transactions is now available in public preview. Instead of relying solely on Postgres to process transactions, Databricks has added a layer of software that makes it possible to run online transaction processing (OLTP) applications using data stored in low-cost object storage systems. That layer is based on technology that Databricks gained last month with the acquisition of Neon, a provider of a serverless computing framework designed for Postgres databases that can process transactions in less than 10 milliseconds.
The overall goal is to make it simpler to infuse analytics into OLTP applications at scale in a way that reduces the total cost of building these types of applications. Lakebase also provides a branching capability that makes it possible to create copy-on-write database clones, which can be used, for example, to run application tests. Lakebase additionally provides an online feature store for model serving.
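Because Lakebase presents a standard Postgres interface, existing drivers should connect to it unchanged. A minimal sketch using psycopg2 follows; the endpoint, credentials and table are placeholders, not values from a real Lakebase instance:

```python
# Because Lakebase is Postgres-compatible, a standard driver such as
# psycopg2 can connect; host, credentials and table are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="my-lakebase-instance.example.com",  # placeholder endpoint
    port=5432,
    dbname="appdb",
    user="app_user",
    password="...",  # supply a real credential in practice
)
with conn, conn.cursor() as cur:
    # An ordinary OLTP write; the data lands in low-cost object
    # storage behind the scenes rather than on local disks.
    cur.execute(
        "INSERT INTO orders (customer_id, total) VALUES (%s, %s)",
        (42, 99.50),
    )
```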
Databricks is also previewing Databricks One, an environment that includes a generative AI assistant, dubbed AI/BI Genie, for interacting with dashboards using natural language queries and is integrated with identity and access management (IAM) tools, along with Lakeflow Designer, a no-code extract, transform and load (ETL) tool that uses generative AI to provide a natural language chat interface. Lakeflow, a framework for orchestrating ETL workflows, including the required connectors, is now generally available. In addition, the declarative ETL framework the company developed is now available under an open-source license as Apache Spark Declarative Pipelines.
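To illustrate the declarative style such pipelines use, here is a minimal sketch in the Databricks (DLT) Python syntax from which Spark Declarative Pipelines descends; the source path and table names are placeholders, and the `spark` session is provided by the pipeline runtime:

```python
# A minimal declarative pipeline in the Databricks (DLT) Python
# syntax that Spark Declarative Pipelines grew out of; the source
# path and table names are placeholders.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw events loaded from object storage.")
def raw_events():
    # `spark` is supplied by the pipeline runtime; the path is a placeholder.
    return spark.read.format("json").load("/mnt/raw/events/")

@dlt.table(comment="Events cleaned for downstream analytics.")
def clean_events():
    # Reading raw_events declares a dependency; the framework infers
    # ordering and orchestration from these declarations.
    return dlt.read("raw_events").where(F.col("event_type").isNotNull())
```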
Databricks is also making available a free edition of its platform as part of a $100 million data and AI education initiative, has allied with Google to bring Gemini AI models to the Databricks service, and has extended an alliance with Microsoft under which Databricks is made available on the Azure cloud.
Finally, Databricks announced that Unity Catalog, a data governance framework, now provides full support, in preview, for Apache Iceberg tables, including native support for the Apache Iceberg REST Catalog application programming interfaces (APIs), along with access to a Discover tool for organizing data.
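For clients, those REST Catalog APIs mean standard Iceberg tooling can read governed tables directly. A minimal sketch using PyIceberg follows; the endpoint URI, token and table name are placeholders:

```python
# A sketch of reading an Iceberg table through a REST catalog with
# PyIceberg; the endpoint, token and table name are placeholders.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "unity",
    **{
        "type": "rest",
        "uri": "https://my-workspace.example.com/iceberg",  # placeholder
        "token": "...",  # supply a real credential in practice
    },
)
table = catalog.load_table("sales.transactions")  # placeholder table
df = table.scan().to_pandas()
```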
Databricks CEO Ali Ghodsi told conference attendees that the single most important strategic decision any enterprise IT organization can make today is to ensure that its data is stored in open data formats such as Apache Iceberg or Delta Lake. At this point, there are only nominal differences between the two formats, he added.
Any organization that does not adopt one of these open-source formats will inevitably find itself locked into a specific platform, making it too difficult to share data easily across multiple applications, Ghodsi noted.
It’s not clear how widely AI agents are being employed, but The Futurum Group projects that AI agents will drive $6 trillion in economic value by 2028. The one certain thing is that the platform that provides AI agents with access to the most data is likely to have an inherent advantage over rivals that train AI agents on generic data pulled from wherever they can find it.