Perplexity releases an assistant for Android phones. OpenAI’s Operator AI agent can “go to the web to perform tasks for you”. Meta’s AI ambitions include a massive data center. Google Gemini now understands your smart home. These artificial intelligence (AI) developments in recent days would have been significant, both in isolation and as a collective. Except they aren’t.
DeepSeek, a relatively unknown AI outfit from China, has had an impact on all the metrics that are supposed to matter — the cost of training, access to hardware, capability and availability. It has got OpenAI, Perplexity, Google, Meta, Nvidia and everyone else quite worried. And there’s no letting up, if the Silicon Valley biggies were hoping for one. Another Chinese firm, Moonshot AI, has released a chatbot called Kimi Chat, which supposedly matches the capabilities of OpenAI’s latest-generation o1 large language model (LLM).
DeepSeek claims to have spent around $5.5 million to train its V3 model, a considerably frugal approach to delivering results that took the likes of Google, OpenAI and Meta hundreds of millions of dollars in investment to achieve. I was reading through some technical information put out by DeepSeek, and here’s something that stood out: the frugality extends to the hardware too.
“I was trained on a combination of Nvidia A100 and H100 GPUs,” the DeepSeek chatbot tells us. It doesn’t share an exact count, and the claim is specific to the R1 model.
DeepSeek CEO Liang Wenfeng is a billionaire who runs a hedge fund and is funding DeepSeek, which reportedly hired top talent from other Chinese tech companies including ByteDance and Tencent.
A tryst with censorship
Censorship with regard to information about its home country seems to be the underlying theme with DeepSeek’s chatbot. When HT asked, the DeepSeek chatbot on the web didn’t hesitate to list human rights concerns including the treatment of Uyghur Muslims in Xinjiang, internet censorship, the urban-rural divide, housing market complexities and an ageing population among the controversies and challenges China faces.
Moments later, the entire response was erased and replaced with a simpler, crisp message: “Sorry, that’s beyond my current scope. Let’s talk about something else.” No other query before that had elicited a similar reaction.
When HT probed further about the treatment of Uyghur Muslims in Xinjiang, the detailed reply was again deleted automatically and replaced with the same “Sorry, that’s beyond my current scope. Let’s talk about something else” response.
However, a question about the challenges faced by the Chinese economy (a point raised in the response to the first question about China’s challenges, the answer to which was later erased) wasn’t run through the typical censorship filter; the chatbot listed slowing growth, the property market crisis, youth unemployment and debt levels among the factors. Nor was there any censorship on questions about economic and societal challenges faced by India and the US, for example.
Statistics, data, and a new template
The Chinese AI company has been at it for a while now, just that no one was really looking. Or no one really cared. The DeepSeek Coder was released in late 2023, and through 2024 it was followed up by the 67-billion parameter DeepSeek LLM, DeepSeek V2, a more advanced DeepSeek Coder V2 with 236 billion parameters, the 671-billion parameter DeepSeek V3, as well as the 32-billion and 70-billion parameter versions of DeepSeek R1.
“A joke of a budget,” is how Andrej Karpathy, founder of Eureka Labs, describes this achievement. He isn’t the only one.
“DeepSeek is now number 1 on the App Store, surpassing ChatGPT—no NVIDIA supercomputers or $100M needed. The real treasure of AI isn’t the UI or the model—they’ve become commodities. The true value lies in data and metadata, the oxygen fuelling AI’s potential,” writes Marc Benioff, CEO of Salesforce, in a post on X.
The economics of AI will quite rapidly change now, irrespective of how much longer it takes for the Silicon Valley biggies to recover their AI training costs. Take this cost comparison, which will prove relevant for consumers, small businesses and pretty much anyone who wants to integrate AI within their apps.
DeepSeek R1’s API costs just $0.55 per million input tokens and $2.19 per million output tokens. In comparison, OpenAI’s o1 API costs around $15 per million input tokens and $60 per million output tokens.
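For a rough sense of what that gap means in practice, here is a back-of-the-envelope comparison in Python using the per-million-token prices quoted above; the monthly workload figures are assumptions for illustration, not numbers from either company.

```python
# Hypothetical cost comparison using the published per-million-token prices.
# The workload (tokens per month) is an assumed figure for illustration.

def monthly_cost(input_tokens, output_tokens, in_price, out_price):
    """Cost in dollars, with prices quoted per million tokens."""
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

# Assumed workload: 50 million input and 10 million output tokens a month.
IN_TOK, OUT_TOK = 50_000_000, 10_000_000

deepseek = monthly_cost(IN_TOK, OUT_TOK, in_price=0.55, out_price=2.19)
openai = monthly_cost(IN_TOK, OUT_TOK, in_price=15.00, out_price=60.00)

print(f"DeepSeek R1: ${deepseek:,.2f}")          # $49.40
print(f"OpenAI o1:   ${openai:,.2f}")            # $1,350.00
print(f"Difference:  {openai / deepseek:.0f}x")  # 27x
```

At that ratio, a workload that costs well over a thousand dollars a month on o1 comes in under fifty on R1.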
Much like OpenAI’s o1 model, R1 too uses reinforcement learning, or RL. This means models learn through trial and error and self-improve through algorithmic rewards, something that develops reasoning capabilities. Models learn by receiving feedback based on their interactions; experience is also usually how humans learn the ways of the world.
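To make the trial-and-error idea concrete, here is a minimal sketch of reward-driven learning, a simple epsilon-greedy bandit in Python. This illustrates the principle only; DeepSeek’s actual RL pipeline operates on model outputs at LLM scale and is far more sophisticated.

```python
import random

# Minimal sketch of learning from rewards (an epsilon-greedy bandit).
# The three "actions" and their hidden payoffs are made-up values.
ACTIONS = ["a", "b", "c"]
true_reward = {"a": 0.2, "b": 0.8, "c": 0.5}  # hidden from the learner
estimates = {a: 0.0 for a in ACTIONS}
counts = {a: 0 for a in ACTIONS}

for step in range(1000):
    # Explore occasionally; otherwise exploit the best estimate so far.
    if random.random() < 0.1:
        action = random.choice(ACTIONS)
    else:
        action = max(estimates, key=estimates.get)

    # Noisy reward signal: the algorithmic feedback.
    reward = true_reward[action] + random.gauss(0, 0.1)

    # Update the running-average estimate for the chosen action.
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

print(max(estimates, key=estimates.get))  # converges to "b"
```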
With R1, DeepSeek realigned the traditional approach to AI models. Traditional generative and contextual AI is akin to writing every number with 32 decimal places. DeepSeek’s approach brings this down to 8, without compromising accuracy. In fact, it performs better than GPT-4 and Claude on many tasks. The result: as much as 75% less memory needed to run AI.
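The 32-versus-8 analogy maps loosely onto numeric precision: storing weights in 8-bit rather than 32-bit formats. Here is a back-of-the-envelope sketch using DeepSeek V3’s published parameter count; the reading of the analogy as bit-widths is my assumption, not a detail from DeepSeek.

```python
# Memory arithmetic behind the "32 decimal places down to 8" analogy,
# read here as 32-bit vs 8-bit number formats (an interpretive assumption).

params = 671_000_000_000  # DeepSeek V3's published total parameter count

bytes_32bit = params * 4  # 4 bytes per 32-bit value
bytes_8bit = params * 1   # 1 byte per 8-bit value

print(f"32-bit weights: {bytes_32bit / 1e12:.2f} TB")  # 2.68 TB
print(f"8-bit weights:  {bytes_8bit / 1e12:.2f} TB")   # 0.67 TB
print(f"Saving: {1 - bytes_8bit / bytes_32bit:.0%}")   # 75%
```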
Then there is the multi-token system, which reads entire phrases and sets of words at once instead of sequentially, one token at a time. That means the AI can respond as much as twice as fast.
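A toy contrast in Python may help. The predict_next and predict_chunk functions below are hypothetical stand-ins for a model’s forward pass; real multi-token prediction is considerably more involved, but the difference in step count is the point.

```python
# Toy contrast: sequential decoding vs multi-token decoding.
# predict_next/predict_chunk are hypothetical stand-ins for a model call.

WORDS = ["the", "cost", "of", "ai", "just", "dropped", "sharply"]

def predict_next(context):
    return WORDS[len(context)]  # one "token" per model step

def predict_chunk(context, k=3):
    return WORDS[len(context):len(context) + k]  # k "tokens" per model step

# Sequential: one step per token (7 steps for 7 words).
out = []
while len(out) < len(WORDS):
    out.append(predict_next(out))

# Multi-token: several tokens per step (3 steps for 7 words).
out2, steps = [], 0
while len(out2) < len(WORDS):
    out2.extend(predict_chunk(out2))
    steps += 1

print(out == WORDS, out2 == WORDS, steps)  # True True 3
```

Fewer model steps for the same output is where the claimed speed-up comes from.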
DeepSeek’s Mixture-of-Experts (MoE) language model is an evolution too. DeepSeek V3, for instance, has 671 billion parameters in total but activates only 37 billion for each token; the key is that these are the parameters most relevant to that specific token.
“Instead of one massive AI trying to know everything (like having one person be a doctor, lawyer, and engineer), they have specialised experts that only wake up when needed,” explains Morgan Brown, VP of Product & Growth – AI, at Dropbox. Traditional models tend to keep all parameters active for each token and query.
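Here is a minimal numpy sketch of that routing idea; the expert count, dimensions and random weights are illustrative assumptions, not DeepSeek’s actual architecture.

```python
import numpy as np

# Minimal Mixture-of-Experts routing sketch: a gate scores the experts
# for each token, and only the top-k actually compute. All sizes and
# weights here are toy values, not DeepSeek's design.

rng = np.random.default_rng(0)
N_EXPERTS, TOP_K, DIM = 8, 2, 16

experts = [rng.normal(size=(DIM, DIM)) for _ in range(N_EXPERTS)]  # tiny "experts"
gate_w = rng.normal(size=(DIM, N_EXPERTS))                         # router weights

def moe_layer(token):
    scores = token @ gate_w            # one gate score per expert
    top = np.argsort(scores)[-TOP_K:]  # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()           # normalise over the chosen experts
    # Only the selected experts run; the other six stay idle.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=DIM)
print(moe_layer(token).shape)  # (16,) with a fraction of the compute
```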
Trust factor, or a lack of it?
There is, of course, the apprehension that DeepSeek, Moonshot AI and all other tech companies from China will have to contend with. Questions about any Chinese tech company’s proximity (known, or otherwise) to the government will always be in the spotlight when it comes to sharing data, be it harmless user data or the data that small businesses may run through an AI assistant at their disposal.
There is, of course, also the concern about Chinese tech’s access to the latest-generation GPUs and AI chips in general. The trade restrictions cannot be ignored. SemiAnalysis’ Dylan Patel estimates that DeepSeek has 50,000 Nvidia GPUs, and not the 10,000 that some online chatter seems to suggest.
The Nvidia A100 (around $16,000; launched in 2020) and H100 (a $30,000 chip launched in 2022) aren’t cutting-edge chips compared with what Silicon Valley has access to. One could always wonder how this Chinese tech company laid its hands on this set of Nvidia hardware, and how much of it.
The company hasn’t officially detailed these specifics, though it has confirmed the use of 2,048 H800 GPUs to train the DeepSeek V3 model. It is unlikely we’ll ever know whether any other hardware was in play, how all of it was sourced, and, with it, the true cost of making R1 and the models that preceded it.
The US-China trade restrictions have limited access to cutting-edge hardware from US tech companies including Nvidia, Qualcomm and Intel. Yet companies such as Huawei have adapted. The blacklisted company at one point struggled to sell smartphones, but has since made significant strides with HarmonyOS as an alternative to Google’s Android, launched the Mate XT Ultimate Design tri-fold smartphone, completed a $1.4 billion research centre in Shanghai, and is providing AI chips to Chinese tech companies.