
Hitachi Vantara has made generally available an infrastructure platform optimized for running artificial intelligence (AI) workloads.

Based on the NVIDIA DGX BasePOD reference design, the first instance of the Hitachi iQ platform includes Hitachi Content Software for File, a distributed file system designed for high-performance computing (HPC) environments.

In addition, Hitachi Vantara is launching a new AI Discovery Service through which it provides access to consultants who help assess an organization's AI readiness and then help it deploy AI models.

The Hitachi iQ platform is the first in a series of infrastructure offerings that will be optimized for AI, says Jason Hardy, Hitachi Vantara CTO for AI. “We’re making implementation easier,” he says.

That effort comes at a time when a recent survey found that 40% of business and IT executives are not well-informed regarding the planning and execution of GenAI projects. Nevertheless, nearly two-thirds (63%) report their organizations have already identified at least one use case for GenAI, with the most cited use cases centered around process automation and optimization (37%), predictive analytics (36%) and fraud detection (35%). Nearly all respondents (97%) said GenAI is among their organization’s top five priorities, but only slightly more than one-third (37%) believe their infrastructure and data ecosystem are well prepared for implementing generative AI applications.

It’s not immediately clear which teams within an organization will be acquiring and managing AI infrastructure, but IT operations teams are increasingly assuming responsibility for managing the platforms that run AI inference engines, while data science teams focus more of their efforts on training AI models.

Over time, those IT operations teams will be augmented by co-pilots and AI agents that will make it simpler to manage the HPC platforms needed to optimally run AI software, notes Hardy.

Each organization will, of course, need to determine for itself at what rate to acquire that infrastructure. Many of the generative AI models being built today are still at the proof-of-concept (PoC) stage, but in time the inference engines that serve these models will need to be deployed close to where the data used to drive them is located, most of which still resides in on-premises IT environments.

In some cases, those AI models will need to take advantage of GPUs, but it’s also becoming apparent that other, less costly and more readily available classes of processors can be used to run inference workloads.
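
As a rough illustration of that point, the same inference code can often run unchanged on whichever processor happens to be available. The minimal sketch below uses PyTorch and Hugging Face Transformers with a small stand-in model; it is not tied to any Hitachi offering, and the model name and prompt are purely illustrative assumptions.

```python
# Minimal sketch: run the same inference code on a GPU if present,
# otherwise fall back to the CPU. Model name and prompt are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "distilgpt2"  # stand-in for a small inference model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
model.eval()

inputs = tokenizer("Summarize last quarter's incident reports:",
                   return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The only device-specific line is the `device` selection; everything else is identical whether the model lands on a GPU or a general-purpose CPU.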

Of course, the size of the AI models being deployed will also vary widely. Rather than building and deploying large language models (LLMs), many organizations will opt for smaller language models trained for specific domains. Many of those models will be based on foundation LLMs that have been customized for that purpose.
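
One common way such a domain-specific model is produced is parameter-efficient fine-tuning, in which a small set of adapter weights is trained on domain data while the underlying foundation model stays frozen. The sketch below uses the Hugging Face `peft` library for a LoRA-style adaptation; the base model name and configuration are illustrative assumptions and are not part of the Hitachi iQ platform.

```python
# Minimal sketch of LoRA-style parameter-efficient fine-tuning, one common way
# a foundation model is customized for a specific domain. Names are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model_name = "facebook/opt-350m"  # stand-in foundation model
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# Attach small, trainable low-rank adapters; the base weights stay frozen,
# which keeps the domain-specific model far cheaper to train and deploy.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base weights

# From here, only the adapter weights would be trained on domain-specific text
# (support tickets, claims data, etc.) using a standard training loop.
```

Because only the adapters are trained, the resulting model can be served on far more modest infrastructure than the foundation model it was derived from.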

Regardless of approach, the number and types of AI models deployed in IT environments are about to explode. The challenge now is finding the best way to deploy those models on platforms specifically designed to run them.
