Pure FlashBlade, Pure Storage

AI’s need for storage is on a steady climb as industries graduate from simpler, baseline models to more elegant and robust ones. But it’s not just the growing hunger for capacity; wide variability across the pipeline is making it tricky for companies to design storage solutions for AI workloads.

The main motivation should be to guarantee reliability and resilience, not extreme performance, says Pure Storage. Enterprises, as they move along their AI journey, require an adaptable storage solution that can “deliver predictable performance rather than peak burst performance that tapers off under duress, at scale or in failures,” said Pure Storage’s lead principal technologist, Hari Kannan.

Although the AI revolution kicked off in earnest only a few years back, innovation has since advanced at a breakneck pace. “It’s been really exciting to see how this industry has transformed,” Kannan said at AI Data Infrastructure Field Day hosted by Tech Field Day, an arm of The Futurum Group, where Pure Storage showed off the FlashBlade platform.

“Six to seven years ago, we were doing a lot of work on convolutional neural networks (CNNs), images, advanced driver assistance systems (ADAS) and so on. About 18 months ago, a ton of that shifted over to text-based LLMs, and you could fit all of Wikipedia within 5 terabytes of storage. So the needs of storage changed quite rapidly.”

This shift has caused many storage solutions to fall in and out of favor quickly.

“Infrastructure for AI is evolving at a much more rapid clip than anything we’ve ever seen before. So, it’s incumbent on us as providers in this ecosystem to be fungible and flexible to adapt to these evolving needs and be able to serve them as the technology advances,” Kannan stressed.

Depending on the AI application, the storage needs of workloads can vary enormously. At a basic level, AI depends on a storage system that can move data to the GPUs quickly enough to keep them churning round the clock. But at a more granular level, ingestion, processing, training, checkpointing, and inferencing each have their own unique demands of throughput and latency. These needs are greatly influenced by model parameters and weights, datasets, and associated factors.

“The ideal storage system does more than just service reads at peak bandwidth to GPUs. It participates in the entire pipeline, performing all of these operations simultaneously,” Kannan said.
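To see why one stage alone can gate the whole pipeline, consider checkpointing: if training pauses while model state is written out, storage write bandwidth translates directly into GPU idle time. A rough back-of-the-envelope sketch (the model size and bandwidths below are hypothetical, for illustration only):

```python
# Hypothetical example: a 70B-parameter model checkpointed as fp16
# weights writes roughly 140 GB of state per checkpoint.
params = 70e9
bytes_per_param = 2  # fp16
checkpoint_bytes = params * bytes_per_param

def checkpoint_stall_seconds(write_gbps: float) -> float:
    """Seconds the GPUs sit idle for one synchronous checkpoint,
    given sustained storage write bandwidth in GB/s."""
    return checkpoint_bytes / (write_gbps * 1e9)

# At a sustained 10 GB/s, the cluster stalls ~14 s per checkpoint;
# at 100 GB/s, ~1.4 s.
print(checkpoint_stall_seconds(10))
print(checkpoint_stall_seconds(100))
```

The same arithmetic applies in reverse to reads when restoring from a checkpoint after a failure, which is one reason sustained (not burst) bandwidth matters for these workloads.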

A few core elements bring this ideal storage solution to life: predictable scalability, so the system can grow when required; consistent performance, to handle all operations concurrently without being stilted or strained; and low power draw, because AI’s energy demands can run amok very quickly. Above all, flexibility should be woven into its foundation so that it can power through frequent changes without breaking a sweat.

Designed with the must-haves of high-performance storage, Pure Storage FlashBlade is an enterprise-grade all-flash file and object storage platform that checks all the boxes, said Kannan. The platform’s two biggest selling points are reliable performance, and power and space efficiency. These elements set FlashBlade apart in a growing sea of cookie-cutter storage solutions for AI.

The most important innovation is the DirectFlash Module (DFM), Pure Storage’s proprietary SSD equivalent that powers the platform.

FlashBlade is built from the ground up to deliver extreme performance across multiple dimensions (reads, writes, streaming, metadata, and so on) evenly and predictably across the pipeline.

Pure Storage keeps it simple by abstracting away technical complexities as much as possible. FlashBlade does not need to be tuned and retuned to fit changing use cases, Kannan said. Out of the box, the system comes with the built-in resilience to adjust continuously to a diversity of use cases.

The platform’s energy and space efficiency are driven primarily by the DFMs which, according to Pure Storage, are three to five times more energy efficient than commodity SSDs.

Pure Storage’s biggest rollout to date is a 150TB DFM, which ships at a maximum of 40 drives per chassis.
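A quick back-of-the-envelope calculation from those two figures shows the density this implies (raw capacity, before any data reduction or parity overhead is accounted for):

```python
dfm_tb = 150            # largest DFM shipped to date, per the article
drives_per_chassis = 40 # maximum drives per chassis

# Raw capacity per fully populated chassis: 150 TB x 40 = 6,000 TB,
# i.e. 6 PB raw in a single chassis.
raw_tb = dfm_tb * drives_per_chassis
print(raw_tb)  # 6000
```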

Kannan shared data points showing that a DFM, on a module basis, is five to seven times more reliable, and boasts a longer lifespan than commodity SSDs and HDDs.

FlashBlade affords users a flexible and economical scaling model: they can add individual drives to a blade, or add whole blades, depending on their requirements.

FlashBlade is offered in two flavors that cater to two primary classes of users: FlashBlade//S, which is performance-optimized and well-suited for high-performance file and object workloads, and FlashBlade//E, an all-flash capacity repository designed to capture economies of scale.

“[It] allows you to pick optimization points across the entire price, performance, cost, power spectrum to tailor it towards the customer’s needs,” said Kannan.

Check out Pure Storage’s presentations from the AI Data Infrastructure Field Day event to learn about the engineering aspects, and which AI use cases are being powered by FlashBlade.
