Dataiku today added a tool for monitoring the cost of invoking generative artificial intelligence (AI) models to the mesh platform through which it enables organizations to manage multiple large language models (LLMs).
As an extension of the Dataiku LLM Mesh platform, LLM Cost Guard makes it simpler for organizations to track in real time how much is being spent on tokens, the unit of pricing most LLM providers use to monetize access to their platforms.
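The underlying bookkeeping is straightforward: every call consumes tokens, each model has a per-token price, and spend is attributed to whoever made the call. The following Python sketch is purely illustrative (it is not Dataiku's API, and the model names and prices are hypothetical) but shows the kind of per-team token accounting such a tool performs.

```python
# Illustrative sketch only -- not Dataiku's API. Model names and per-1K-token
# prices below are hypothetical; real provider pricing varies.
from collections import defaultdict

PRICE_PER_1K_TOKENS = {
    "large-model": 0.03,    # hypothetical USD per 1,000 tokens
    "small-model": 0.0005,
}

spend_by_team = defaultdict(float)

def record_usage(team: str, model: str, tokens: int) -> None:
    """Accumulate the cost of one LLM call against the calling team."""
    cost = tokens / 1000 * PRICE_PER_1K_TOKENS[model]
    spend_by_team[team] += cost

record_usage("marketing", "large-model", 12_000)
record_usage("support", "small-model", 250_000)

for team, cost in spend_by_team.items():
    print(f"{team}: ${cost:.2f}")
```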
The challenge organizations quickly encounter when invoking those LLMs is that they lack visibility into costs as various teams invoke multiple LLMs at scale, says Kurt Muehmel, Everyday AI Strategic Advisor for Dataiku. In addition to detailing costs, LLM Cost Guard makes it possible to identify potential cost overruns, he adds.
Reining in costs is especially critical for enterprise IT organizations that are trying to prioritize how limited resources will be applied, notes Muehmel. “They need visibility into where their gen AI spending is going,” he adds.
Dataiku LLM Mesh provides a secure gateway through which organizations can access LLMs from OpenAI, Microsoft, Amazon Web Services (AWS), Google Cloud Platform (GCP), Databricks, Anthropic, AI21 Labs and Cohere. LLM Cost Guard adds visibility into which LLM use cases are driving the most cost, notes Muehmel.
Determining which LLMs to use is becoming more challenging as organizations move beyond experimenting with generative AI, because costs generally rise in proportion to the size of the model being invoked. Many use cases can be adequately handled by smaller LLMs that are far less expensive to use because their per-token prices are much lower.
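The difference is easy to see with simple arithmetic. Assuming a hypothetical monthly volume of 50 million tokens and hypothetical per-1,000-token prices (real pricing varies by provider), the same workload can differ in cost by orders of magnitude depending on the model chosen:

```python
# Illustrative arithmetic only: hypothetical token volume and prices,
# showing why routing a workload to a smaller model costs far less.
MONTHLY_TOKENS = 50_000_000   # assumed monthly token volume for one use case

price_large = 0.03     # hypothetical USD per 1,000 tokens for a large model
price_small = 0.0005   # hypothetical USD per 1,000 tokens for a small model

cost_large = MONTHLY_TOKENS / 1000 * price_large   # 1,500.00
cost_small = MONTHLY_TOKENS / 1000 * price_small   # 25.00

print(f"Large model: ${cost_large:,.2f}/month")
print(f"Small model: ${cost_small:,.2f}/month")
```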
Dataiku has been making a case for a gateway through which organizations can centrally manage access to LLMs. A recent Total Economic Impact (TEI) study conducted by Forrester Consulting on behalf of Dataiku found that this approach reduces the time data scientists and data engineers spend on data analysis and extraction by more than 70%, which contributes to a more than 40% average time reduction for model lifecycle activities, including training, deployment and monitoring.
Overall, the report finds the time required to develop products infused with AI can be reduced from 12 to 18 months to three to six months.
It’s not clear yet how many LLMs the average enterprise might wind up invoking, but given the pace of innovation in the space, it’s already clear organizations will need to be able to move from one LLM to another as advances continue to be made. Few enterprise organizations are going to lock themselves into a single provider when the LLM they are invoking might become obsolete overnight.
Of course, the more LLMs invoked, the more expensive generative AI becomes. The challenge now is finding a way to keep those costs under control while at the same time keeping all options as open as possible until it becomes clearer which platforms will ultimately dominate.