Presented by F5
As enterprises pour billions into GPU infrastructure for AI workloads, many are discovering that their expensive compute resources sit idle far more than anticipated. The culprit isn't the hardware. It's the often-invisible data delivery layer between storage and compute that is starving GPUs of the data they need.
"While people are focusing their attention, justifiably so, on GPUs, because they're significant investments, these are rarely the limiting factor," says Mark Menger, solutions architect at F5. "They're capable of more work. They're waiting on data."
AI performance increasingly depends on an independent, programmable control point between AI frameworks and object storage, one that most enterprises haven't deliberately architected. As AI workloads scale, bottlenecks and instability arise when AI frameworks are tightly coupled to specific storage endpoints during scaling events, failures, and cloud transitions.
"Traditional storage access patterns weren't designed for highly parallel, bursty, multi-consumer AI workloads," says Maggie Stringfellow, VP of product management for BIG-IP. "Efficient AI data movement requires a distinct data delivery layer designed to abstract, optimize, and secure data flows independently of storage systems, because GPU economics make inefficiency immediately visible and expensive."
Why AI workloads overwhelm object storage
AI workloads generate bidirectional traffic patterns: massive ingestion from continuous data capture, simulation output, and model checkpoints, combined with read-intensive training and inference workloads. Together they stress the tightly coupled infrastructure on which storage systems depend.
While storage vendors have done significant work scaling data throughput into and out of their systems, that focus on throughput alone creates knock-on effects across the switching, traffic management, and security layers coupled to storage.
The stress AI workloads place on S3-compatible systems is multidimensional and differs significantly from traditional application patterns. It is less about raw throughput and more about concurrency, metadata pressure, and fan-out. Training and fine-tuning create particularly challenging patterns, such as massive parallel reads of small to mid-size objects. These workloads also involve repeated passes through the training data across epochs and periodic checkpoint write bursts.
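To make that pattern concrete, here is a minimal Python sketch of what training-style access looks like to an S3-compatible backend: highly parallel small-object reads repeated every epoch, punctuated by checkpoint write bursts. The bucket name, keys, and worker count are hypothetical, and the snippet assumes boto3 is installed with valid credentials configured.

```python
import concurrent.futures

import boto3  # assumes boto3 with S3-style credentials configured

s3 = boto3.client("s3")
BUCKET = "training-shards"  # hypothetical bucket name

def fetch_shard(key: str) -> bytes:
    """One of thousands of concurrent small-object reads per step."""
    return s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()

def run_epoch(shard_keys: list[str], workers: int = 256) -> None:
    # Each epoch re-reads the same objects, so the backend sees the
    # same highly parallel GET burst over and over.
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in pool.map(fetch_shard, shard_keys):
            pass

def write_checkpoint(step: int, state: bytes) -> None:
    # Periodic checkpoint bursts: large writes arriving all at once,
    # colliding with the read-heavy training traffic.
    s3.put_object(Bucket=BUCKET, Key=f"checkpoints/step-{step}", Body=state)
```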
RAG workloads introduce their own complexity through request amplification. A single request can fan out into dozens or hundreds of additional data chunks, cascading into further detail, related chunks, and larger documents. The pressure point is less about capacity or storage speed and more about request management and traffic shaping.
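A minimal sketch of that amplification effect, using hypothetical in-memory stand-ins for the retrieved chunks and their links rather than a real retrieval stack:

```python
def retrieve_context(query_chunks: list[str],
                     related: dict[str, list[str]]) -> list[str]:
    """Collect every chunk a single request ends up touching."""
    fetched: list[str] = []
    for key in query_chunks:  # dozens of chunks per request...
        fetched.append(key)
        # ...each of which can cascade into further related chunks
        # and larger source documents, multiplying backend reads.
        fetched.extend(related.get(key, []))
    return fetched

# Hypothetical numbers: 50 top-k chunks, each linked to 4 related reads,
# turn one user request into roughly 250 storage operations.
chunks = [f"chunk-{i}" for i in range(50)]
links = {key: [f"{key}-rel-{j}" for j in range(4)] for key in chunks}
print(len(retrieve_context(chunks, links)))  # -> 250
```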
The risks of tightly coupling AI frameworks to storage
When AI frameworks connect directly to storage endpoints without an intermediate delivery layer, operational fragility compounds quickly during scaling events, failures, and cloud transitions, with potentially major consequences.
"Any instability in the storage service now has an uncontained blast radius," Menger says. "Anything here becomes a system failure, not a storage failure. Or frankly, aberrant behavior in one application can have knock-on effects for all consumers of that storage service."
Menger describes a pattern he has seen with three different customers, where tight coupling cascaded into full system failures.
"We see large training or fine-tuning workloads overwhelm the storage infrastructure, and the storage infrastructure goes down," he explains. "At that scale, recovery isn't measured in seconds. Minutes if you're lucky. Usually hours. The GPUs are now not being fed. They're starved for data. Those high-value resources, for the entire time the system is down, are negative ROI."
How an independent data delivery layer improves GPU utilization and stability
The financial impact of introducing an independent data delivery layer extends beyond preventing catastrophic failures.
Decoupling allows data access to be optimized independently of storage hardware, improving GPU utilization by reducing idle time and contention while making costs more predictable and performance more stable as scale increases, Stringfellow says.
"It enables intelligent caching, traffic shaping, and protocol optimization closer to compute, which lowers cloud egress and storage amplification costs," she explains. "Operationally, this isolation protects storage systems from unbounded AI access patterns, resulting in more predictable cost behavior and stable performance under growth and variability."
Using a programmable control point between compute and storage
F5's answer is to position its Application Delivery and Security Platform, powered by BIG-IP, as a "storage front door" that provides health-aware routing, hotspot avoidance, policy enforcement, and security controls without requiring application rewrites.
"Introducing a delivery tier between compute and storage helps define boundaries of responsibility," Menger says. "Compute is about execution. Storage is about durability. Delivery is about reliability."
The programmable control point, which uses event-based, conditional logic rather than generative AI, enables intelligent traffic management that goes beyond simple load balancing. Routing decisions are based on real backend health, with monitoring of leading indicators to detect early signs of trouble. And when problems emerge, the system can isolate misbehaving components without taking down the entire service.
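A simplified Python sketch of that health-aware, event-based routing logic; the thresholds and health fields below are assumptions for illustration, not BIG-IP's actual rules:

```python
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    p99_latency_ms: float   # leading indicator: rising tail latency
    error_rate: float       # leading indicator: creeping error ratio
    isolated: bool = False  # set when the node has been sidelined

def healthy(backend: Backend) -> bool:
    # Conditional rules flag trouble before a hard failure occurs.
    return (not backend.isolated
            and backend.p99_latency_ms < 250
            and backend.error_rate < 0.01)

def route(backends: list[Backend]) -> Backend:
    candidates = [b for b in backends if healthy(b)]
    if not candidates:
        raise RuntimeError("no healthy storage backends available")
    # Send traffic to the least-stressed healthy node; a misbehaving
    # backend is simply excluded rather than taking down the service.
    return min(candidates, key=lambda b: b.p99_latency_ms)
```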
"An impartial, programmable information supply layer turns into essential as a result of it permits coverage, optimization, safety, and visitors management to be utilized uniformly throughout each ingestion and consumption paths with out modifying storage programs or AI frameworks," Stringfellow says. "By decoupling information entry from storage implementation, organizations can safely soak up bursty writes, optimize reads, and defend backend programs from unbounded AI entry patterns."
Handling security in AI data delivery
AI isn't just pushing storage teams on throughput; it's forcing them to treat data movement as both a performance and a security problem, Stringfellow says. Security can no longer be assumed simply because data sits deep inside the data center. AI introduces automated, high-volume access patterns that must be authenticated, encrypted, and governed at speed. That's where F5 BIG-IP comes into play.
"F5 BIG-IP sits directly in the AI data path to deliver high-throughput access to object storage while enforcing policy, inspecting traffic, and making payload-informed traffic management decisions," Stringfellow says. "Feeding GPUs quickly is necessary, but not sufficient; storage teams now need confidence that AI data flows are optimized, controlled, and secure."
Why data delivery will define AI scalability
Looking ahead, the requirements for data delivery will only intensify, Stringfellow says.
"AI data delivery will shift from bulk optimization toward real-time, policy-driven data orchestration across distributed systems," she says. "Agentic and RAG-based architectures will require fine-grained runtime control over latency, access scope, and delegated trust boundaries. Enterprises should start treating data delivery as programmable infrastructure, not a byproduct of storage or networking. The organizations that do this early will scale faster and with less risk."
Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they're always clearly marked. For more information, contact sales@venturebeat.com.

