Dell Technologies revealed today that it is working with Meta to make instances of open source Llama 2 large language models (LLMs) available on servers that can be deployed in on-premises IT environments.
The goal is to give organizations the ability to build generative AI applications by using their own data to extend an LLM through retrieval-augmented generation (RAG) techniques, which rely on vector databases to let an LLM draw on data beyond the set it was initially trained on, says Matt Baker, head of AI strategy for Dell.
Organizations will also need to learn how to deploy and manage some type of vector database to customize an existing LLM by presenting their own unstructured data in a format that an LLM can recognize. The LLM then uses that external data alongside the data it was originally trained on to generate better-informed responses and suggestions. Organizations can then go a step further by using a framework such as LangChain to build and deploy an AI application.
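The retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Dell's or Meta's implementation: the `embed` function is a toy word-count stand-in for a real embedding model, `ToyVectorStore` is a hypothetical in-memory substitute for a vector database, and the three sample documents are invented. The resulting prompt is what would be handed to an LLM such as Llama 2; the model call itself is not shown.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a sparse word-count vector, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (vector, original text) pairs

    def add(self, text):
        self._items.append((embed(text), text))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(
            self._items,
            key=lambda item: cosine(query_vec, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


def build_prompt(question, passages):
    """Combine retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Invented sample documents standing in for an organization's own data.
store = ToyVectorStore()
store.add("The warranty on the X100 laptop lasts three years.")
store.add("Support tickets are answered within one business day.")
store.add("The cafeteria serves lunch from noon until two.")

question = "How long is the X100 warranty?"
passages = store.search(question, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

The design point is that the LLM itself is never retrained: the organization's data lives in the store, and only the passages most relevant to each question are placed in the prompt.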
The Dell approach provides the added benefit of lowering the total cost of building generative AI applications by employing an open source model under a commercial license that Meta provides, rather than paying for usage under a pricing model that requires IT organizations to track tokens issued by a cloud service provider, he adds.
In addition, organizations that pursue an on-premises approach to generative AI don’t need to be as concerned about data sovereignty, data privacy and intellectual property issues that might arise when relying on a general-purpose generative AI cloud service such as ChatGPT.
Llama 2 models will also make it simpler to deploy AI models running on an inference engine deployed at the network edge, notes Baker.
Rather than build or customize an LLM, most organizations are going to leverage RAG techniques to build their first generative AI application. Pre-installed Llama 2 LLMs on Dell servers will make generative AI accessible to a much wider range of organizations using an LLM that has been made available under an open source license rather than an external service that needs to be called via, for example, an application programming interface (API), says Baker. “Meta is making its LLMs available under generous commercial licensing terms,” he adds.
It’s not clear at what pace organizations are building their own generative AI applications. Most organizations are at the very least experimenting with generative AI, but the number that have the data scientists, data engineers, application developers and cybersecurity experts required to build and deploy generative AI applications in production environments is still relatively small.
Most of the LLMs that organizations deploy will have some type of derivative child relationship with a parent foundation LLM, and as each one is updated, the need to meld machine learning operations (MLOps), IT service management and DevOps workflows will become more pronounced.
In the meantime, IT leaders should be working toward determining how they operationalize what one day soon will become hundreds of LLMs being used to automate any number of existing and, as yet, unimagined business processes.