DayFR Euro

Storage: DDN raises $300 million to develop its AI offering

DDN, a long-standing provider of storage arrays for supercomputers, has just received a $300 million investment from the American fund Blackstone. The goal is to enable DDN to turn its leadership in supercomputing into leadership in storage solutions for AI.

While in both cases the aim is to produce extremely fast storage arrays to feed high-performance GPUs, the approaches differ. A supercomputer reads a small number of mathematical formulas and produces enormous amounts of simulation data. AI does the opposite: it must read a huge dataset very quickly to produce a smaller synthetic model (training) or to generate the response to a prompt on screen (inference with RAG).

An offer that already goes from supercomputing to AI

For the supercomputer market, DDN sells EXAScaler arrays running Lustre, an open-source parallel file system that originated in the early 2000s. An EXAScaler array is a cluster of several disk nodes. Among them, one node serves only to index the contents of the others, much like the directory area of a classic file system. The compute servers query this metadata server to learn on which node to read or write the blocks of a file, then communicate directly with the right node to transfer those blocks.
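The metadata/data split described above can be illustrated with a toy sketch: the client asks a metadata server where each stripe of a file lives, then exchanges the actual bytes directly with the storage nodes. The class and variable names here are hypothetical illustrations of the general Lustre-style design, not DDN's or Lustre's actual API.

```python
# Toy sketch of a Lustre-style split between a metadata server (MDS)
# and storage nodes; names are hypothetical, for illustration only.

STRIPE_SIZE = 4  # bytes per stripe, tiny for demonstration purposes

class MetadataServer:
    """Knows, for each file, which storage node holds each stripe."""
    def __init__(self, storage_nodes):
        self.nodes = storage_nodes
        self.layouts = {}  # filename -> list of (node_index, object_id)

    def create(self, filename, n_stripes):
        # Round-robin striping across the available storage nodes.
        layout = [(i % len(self.nodes), f"{filename}:{i}") for i in range(n_stripes)]
        self.layouts[filename] = layout
        return layout

    def lookup(self, filename):
        return self.layouts[filename]

class StorageNode:
    """Holds raw stripe data; never consulted for file layout."""
    def __init__(self):
        self.objects = {}
    def write(self, object_id, data):
        self.objects[object_id] = data
    def read(self, object_id):
        return self.objects[object_id]

# The client queries the MDS once, then transfers stripes directly
# with each storage node -- the MDS never touches the file data itself.
nodes = [StorageNode() for _ in range(3)]
mds = MetadataServer(nodes)

data = b"hello, parallel world!"
stripes = [data[i:i + STRIPE_SIZE] for i in range(0, len(data), STRIPE_SIZE)]
for (node_idx, obj_id), chunk in zip(mds.create("demo.txt", len(stripes)), stripes):
    nodes[node_idx].write(obj_id, chunk)

restored = b"".join(nodes[i].read(o) for i, o in mds.lookup("demo.txt"))
assert restored == data
```

Because the metadata server handles only layout lookups, the heavy data traffic is spread across all disk nodes in parallel, which is what makes this design fast at supercomputer scale.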

For the system to work, the compute servers must run a Lustre client and have a direct network connection to the storage nodes. This is typically an InfiniBand network, which loses no packets and lets the controller card copy data directly into the host machine's RAM or NVMe SSDs.

DDN has packaged this know-how into AI400X2 storage arrays designed for AI workloads. These are the same 2U EXAScaler nodes, but fitted with Nvidia Spectrum-X Ethernet controller cards. Equipped with BlueField DPUs, also from Nvidia, these cards provide the same benefits as InfiniBand over an Ethernet network, which is better suited to corporate servers. Their RoCE (RDMA over Converged Ethernet) protocol likewise works without packet loss, and writes data directly into the memory of Nvidia GPU cards (the GPUDirect protocol).

DDN even already has solutions for inference

The AI400X2 arrays are primarily designed to communicate as quickly as possible with GPUs when training an AI model. But they are too expensive for storing the enormous amounts of data that a company wants to submit daily to an already-trained model. For this second use case, DDN has offered Infinia arrays since 2023. These work in object mode, with a standard S3 protocol, which allows disk nodes to be hot-added.
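Hot-adding nodes to an object store is commonly made cheap with consistent hashing: when a new disk node joins, only a fraction of the objects relocate. The sketch below illustrates that general technique under assumed names (`HashRing`, `node-a`, etc.); it does not describe Infinia's actual internals.

```python
# Minimal consistent-hashing sketch of how an object store can accept
# hot-added disk nodes: only a fraction of keys move when a node joins.
# Generic illustration; not a description of DDN Infinia internals.
import hashlib
from bisect import bisect

class HashRing:
    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) virtual points
        for n in nodes:
            self.add_node(n)

    def _h(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        # Each physical node owns many virtual points on the ring,
        # which evens out the key distribution.
        for v in range(self.vnodes):
            self.ring.append((self._h(f"{node}#{v}"), node))
        self.ring.sort()

    def node_for(self, key):
        # An object lands on the first virtual point at or after its hash.
        h = self._h(key)
        i = bisect(self.ring, (h,)) % len(self.ring)
        return self.ring[i][1]

keys = [f"s3://bucket/object-{i}" for i in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # hot-add a fourth disk node
after = {k: ring.node_for(k) for k in keys}

# Roughly a quarter of the keys relocate to the new node; the rest stay put.
moved = sum(before[k] != after[k] for k in keys)
```

The point of the design is that adding capacity never requires rehashing the whole store, which is what makes live expansion of an S3-style array practical.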

DDN has split each S3 storage function into its own container: the metadata server, the storage server, and so on. As a result, DDN can reproduce with Infinia an operation similar to Lustre's, provided certain functional S3 containers are installed on the compute servers. The Infinia arrays have the advantage of also being equipped with Spectrum-X cards to maximize transfer speed.

Finally, DDN prides itself on knowing better than anyone how intensive storage works. When GPUs write data in parallel and then reread it to continue their calculations, several consistency problems can arise. These are usually resolved with regular checkpoints, an operation that is potentially very compute-intensive while generating no useful data. DDN claims to know how to avoid these delays by streamlining transfer flows, which it orchestrates with clever use of caches.
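The checkpointing pattern mentioned above can be sketched in a few lines: the job periodically flushes its state to storage so a crash costs at most one interval of work, but the flush itself produces no new results. This is a generic illustration with made-up names (`train`, `CHECKPOINT_EVERY`); DDN's cache-based optimization of it is not shown.

```python
# Sketch of periodic checkpointing during a long computation: the job
# saves its state every N steps so a crash loses at most N steps of work.
# Generic pattern; DDN's cache-orchestrated variant is not modeled here.
import os
import pickle
import tempfile

CHECKPOINT_EVERY = 10  # steps between checkpoints

def train(total_steps, ckpt_path):
    state = {"step": 0, "acc": 0}
    if os.path.exists(ckpt_path):  # resume from the last checkpoint after a crash
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)
    while state["step"] < total_steps:
        state["step"] += 1
        state["acc"] += state["step"]  # stand-in for real GPU work
        if state["step"] % CHECKPOINT_EVERY == 0:
            # The expensive part: work pauses while state is flushed to
            # storage, producing no useful output during the write.
            with open(ckpt_path, "wb") as f:
                pickle.dump(state, f)
    return state

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.pkl")
final = train(25, ckpt)
assert final["acc"] == sum(range(1, 26))  # 325
```

In real training clusters the checkpoint write is a burst of parallel I/O from every GPU at once, which is exactly the traffic pattern a storage vendor wants to absorb without stalling the computation.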

An investment that primarily benefits Blackstone

Not only does DDN already have an AI offering, it already sells it to large customers. Among them is xAI, Elon Musk's company, which deployed an AI supercomputer, Colossus, equipped with 100,000 H100 GPUs. The usefulness of this new $300 million investment is therefore not entirely clear.

The motivation likely comes mainly from the Blackstone investment fund, which is seeking to position itself – it joins DDN's board of directors – in several strategic AI companies. Last year, the fund also provided financial backing to CoreWeave, an infrastructure-on-demand (IaaS) host dedicated exclusively to AI processing.

In any case, DDN is now teasing on its website a phenomenal AI announcement for February 20. If it follows the strategy that Blackstone seems to want to push, it should be an "AI" storage product for all businesses.
