DayFR Euro

‘Nvidia’s Blackwell chip struggles with thermal issues’

Nvidia’s latest AI chip for data centers can cause servers to overheat. The company, for its part, says that nothing abnormal is happening.

In March, Nvidia presented its Blackwell GPU series, which offers computing power of up to 20 petaflops. One version combines two GPUs in a single chip and can run large language models (LLMs) up to 30 times faster. This should deliver faster responses, reduced power consumption, or both.

But the chip also has problems. Blackwell was originally scheduled for release in the second quarter of this year (April-June), but its release was postponed. In October, the company announced that it had resolved a design flaw in collaboration with TSMC, which manufactures Nvidia’s chips.

Now, The Information reports that Blackwell also suffers from a thermal problem, particularly in servers housing up to 72 of these chips in a single rack. Nvidia has reportedly asked its suppliers several times to adjust the design of the racks.

‘Not abnormal’

Nvidia itself told Reuters that it is working closely with cloud service providers and that everything is proceeding normally, as expected. This suggests that it is not unusual for a design to be modified to avoid these kinds of problems.

The setbacks, in turn, pose problems for some major cloud and AI players. As a leading supplier of AI chips, Nvidia is a crucial partner for companies currently looking to build data centers dedicated specifically to artificial intelligence (Meta, Microsoft, OpenAI, Google, etc.). Until the chips are available, their computing power cannot be used.
