Oracle today announced the general availability of HeatWave GenAI, a managed database-as-a-service (DBaaS) platform that includes embedded large language models (LLMs).
The approach enables organizations to deploy LLMs at the point where their data already resides, using standard CPUs rather than graphics processing units (GPUs), says Nipun Agarwal, senior vice president for MySQL Database and HeatWave development at Oracle.
The HeatWave platform already includes HeatWave Lakehouse, HeatWave Autopilot, HeatWave AutoML, and HeatWave MySQL alongside a vector store that can be used to expose unstructured data to an LLM.
The in-database LLM enables IT teams to search data, generate or summarize content, and perform retrieval-augmented generation (RAG). In addition, they can combine generative AI with other built-in HeatWave capabilities, such as AutoML, to build applications that invoke either an in-database LLM or a foundation model running on the Oracle Cloud Infrastructure (OCI) Generative AI service. Oracle HeatWave is also available on Amazon Web Services (AWS) and Microsoft Azure, or it can be deployed in an on-premises IT environment.
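To give a sense of the model, an in-database generation call can be issued directly from SQL. The sketch below is illustrative only: the sys.ML_GENERATE routine, the model name, and the option keys are assumptions modeled on Oracle's published HeatWave GenAI examples rather than details confirmed in this announcement, so exact signatures should be taken from the official documentation.

```sql
-- Minimal sketch: invoke an in-database LLM from standard SQL.
-- sys.ML_GENERATE, the model_id value, and the option keys are
-- assumptions; verify them against Oracle's HeatWave GenAI docs.
SELECT sys.ML_GENERATE(
  "Summarize our refund policy in two sentences.",
  JSON_OBJECT("task", "generation",
              "model_id", "mistral-7b-instruct-v1")
);
```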
Regardless of platform, IT teams no longer need to export data to another platform to customize LLMs, notes Agarwal. Instead, the LLM essentially becomes a feature of the database platform itself. “We’re democratizing LLMs,” he says.
All the steps needed to create a vector store and embeddings are automated and executed inside the database, including discovering the documents in object storage, parsing them, generating embeddings and inserting them into the vector store.
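That pipeline is typically exposed as a single call. The following sketch assumes a loader procedure in the style of Oracle's documented vector store routines; the procedure name, bucket URI, and options are hypothetical placeholders introduced for illustration, not details from the announcement.

```sql
-- Hypothetical sketch: point HeatWave at documents in object storage
-- and let it parse, embed, and populate the vector store in one step.
-- The routine name, URI, and option keys are assumptions; check the docs.
CALL sys.VECTOR_STORE_LOAD(
  'oci://my-bucket@my-namespace/contracts/',   -- source documents
  '{"table_name": "contract_embeddings"}'      -- target vector store table
);
```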
Additionally, organizations can perform semantic queries using standard SQL. In-memory hybrid columnar representation and the scale-out architecture of HeatWave enable vector processing to execute at near-memory bandwidth and parallelize across up to 512 HeatWave nodes. IT teams can also combine semantic search with other SQL operators to, for example, join several tables with different documents and perform similarity searches across all documents.
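A sketch of what such a hybrid query can look like appears below. The embedding routine, the DISTANCE function, and the table and column names are all assumptions introduced for illustration; actual names and signatures should be taken from the HeatWave documentation.

```sql
-- Hypothetical sketch: embed the search text, then combine a vector
-- similarity ranking with an ordinary relational join, all in SQL.
-- sys.ML_EMBED_ROW, DISTANCE, and the schema below are assumptions.
SET @query_vec = sys.ML_EMBED_ROW(
  "termination clauses for late delivery",
  JSON_OBJECT("model_id", "all_minilm_l12_v2")
);

SELECT d.doc_name,
       s.segment,
       DISTANCE(s.segment_embedding, @query_vec, 'COSINE') AS dist
FROM   contract_embeddings s
JOIN   documents d ON d.doc_id = s.doc_id   -- join vector hits to metadata
ORDER  BY dist
LIMIT  5;
```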
There is also HeatWave Chat, a plug-in for MySQL Shell for VS Code that provides a graphical interface for HeatWave GenAI and can be used to launch either natural language or SQL queries. An integrated Lakehouse Navigator enables users to select files from object storage and create a vector store. Users can search across the entire database or restrict the search to a folder. HeatWave maintains context with the history of questions asked, citations of the source documents, and all the prompts used, making it easier to verify the source of answers generated by the LLM.
Oracle claims creating a vector store for documents in PDF, PPT, Word, and HTML formats is up to 23x faster with HeatWave GenAI, at one-quarter the cost, than using Knowledge Bases for Amazon Bedrock. On a variety of similar search queries against tables ranging from 1.6GB to 300GB, Oracle also claims HeatWave GenAI is 30x faster than Snowflake at 25% less cost, 15x faster than Databricks at 85% less cost, and 18x faster than Google BigQuery at 60% less cost.
In addition, Oracle claims HeatWave GenAI is 10x to 80x faster than Amazon Aurora PostgreSQL with pgvector while generating more accurate outputs.
Each organization will need to determine what size LLM makes the most sense to deploy in a database, but as generative AI continues to evolve, it’s apparent that the need to depend on an external service to add those capabilities to an application is going to be considerably less.