Continuing its efforts to help organizations deploy AI rapidly, Cisco is introducing the FlexPod with NVIDIA NIM Cisco Validated Design (CVD) for Retrieval Augmented Generation (RAG) pipelines.
Soon to join Cisco’s archive of published AI-ready infrastructure solutions, the new CVD will provide reference architectures and guidelines for building high-performance, scalable computing infrastructures for generative AI models.
“Model performance is tightly coupled to the infrastructure,” said Siva Sivakumar, VP of Compute Product Strategy, while presenting the solution at the AI Field Day event in California, hosted by Tech Field Day, part of The Futurum Group.
FlexPod with NIM, a converged infrastructure product, combines powerful solutions from Cisco’s partners NVIDIA and NetApp. The stack pairs NVIDIA Inference Microservices (NIM), designed to accelerate the deployment of foundation models, and NVIDIA L40S GPUs, known for exceptional graphics and AI performance, with Astra, NetApp’s storage orchestrator for containers.
The solution is deployed on Cisco Unified Computing System (UCS) X-Series modular servers and the Nexus network fabric, both of which form the foundation of Cisco’s AI-ready infrastructure portfolio.
“AI infrastructure is quite different and unique and that presents IT with new things to do,” Sivakumar said.
Optimized for AI computation, the X-Series servers combine the functions of blade and rack systems, packing compute density, scalability and storage into one unit. The systems also allow GPUs to be plugged in via a PCIe node for extra compute capacity.
“We support a full breadth of GPUs as a sidecar to compute in the X-Series,” he said. “Once connected, the system bridges the two together.”
As modular entities, the GPUs work independently while sitting adjacent to the compute.
Infrastructure Manuals for Fast and Frictionless Deployment
“Putting together all the pieces and personas that make up what is a fully operated AI system is by no means a trivial undertaking,” Sivakumar said.
The process entails drafting architecture blueprints, figuring out what hardware and software solutions to use, testing and tuning components, configuring the network and finally deploying everything cohesively.
This system should have all the compute, network and storage capabilities required to meet the demands of AI workloads, and it must work seamlessly with the rest of the infrastructure.
“IT is saddled with new silos of platforms and infrastructure and the more we go from training to fine tuning to actually having a real-time system that is deployed in inferencing, it is a really significant challenge to integrate everything and make it work with the existing systems.”
The Cisco Validated Design program creates a catalog of pretested hardware and software system designs to make AI deployments predictable and successful for customers who do not have the skills and expertise to navigate the technical complexities. The designs provide customizable blueprints with workflows, guides and supporting documentation that can be used repeatably to deploy and integrate technologies with existing systems.
Also included are product requirements, best practices, optimizations and automation scripts to support the systems throughout their lifecycles.
All Cisco Validated Designs are rigorously tested and validated for their intended use cases. For example, the hardware components are stress-tested by Cisco and its partners to ensure that they deliver adequate performance in every intended scenario.
“When a customer deploys these CVDs, we come in and look at everything from the whole stack perspective,” he said.
Cisco also works with many other industry players, including Red Hat, Pure Storage, Cloudera and Nutanix, with whom it has dozens of converged and co-supported architecture bundles and solutions like FlexPod.
“We have partnerships across ISVs and OSVs, and of course with the GPU vendors for the UCS servers.”
Approximately 266 million companies worldwide use or plan to use AI in their business operations. Use cases across key sectors like retail, life sciences, healthcare, finance, manufacturing, agriculture, energy and utilities are rising, but enterprises’ readiness to handle AI workloads on their infrastructures trails at 13%.
Enterprise-ready converged AI infrastructure solutions deliver end-to-end technology stacks that are primed for AI workloads and ready for organizations to use. Pretested, pre-validated designs that work out of the box make deploying them considerably simpler.
Check out Cisco’s presentations on AI-ready infrastructure solutions from an AI Field Day event at Techfieldday.com, and visit Cisco.com for more collateral on Cisco Validated Designs.