“We live in an era in which a non-American company carries the torch of the original OpenAI mission: truly open research that empowers everyone,” exclaims Jim Fan, Senior Research Manager and Lead of Embodied AI at Nvidia, on LinkedIn. He is referring to DeepSeek, a Chinese start-up that this week unveiled its first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.
A release under the MIT license
Their strong point? Performance on par with OpenAI's o1-1217, but much cheaper, and open source for both researchers and businesses. “To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama,” the start-up specifies. Published under the MIT license, DeepSeek-R1 is therefore free to modify, fine-tune and commercialize, unlike OpenAI and its closed ecosystem.
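In practice, the MIT license means a developer can pull one of the distilled checkpoints and run it locally. A minimal sketch, assuming the models are hosted on Hugging Face under the deepseek-ai organization (the exact repository ID below is an assumption) and that transformers and accelerate are installed:

```python
# pip install transformers accelerate torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository ID is an assumption about where the checkpoint is hosted.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "How many prime numbers are there below 20?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```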
The coup de grâce? DeepSeek is merely a subsidiary of High-Flyer Capital Management, a quantitative fund manager launched in 2015. Developing models is therefore not, strictly speaking, its core business, but a side project run by employees to put its GPUs to work when they would otherwise sit idle. This small outfit thus built models with performance matching OpenAI's using very limited resources, without having to invest hundreds of billions of dollars, or even hundreds of millions.
A model trained in several stages
The company explains that it started from DeepSeek-R1-Zero, a model trained with large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step. Thanks to RL, it “demonstrates remarkable reasoning capabilities.” It nevertheless runs into challenges such as poor readability and language mixing. “To address these issues and further improve reasoning performance, we introduce DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL,” the start-up states.
Trained purely with RL, without SFT, “it is reminiscent of AlphaZero, which mastered Go, shogi and chess from scratch, without first imitating the moves of human grandmasters,” comments Jim Fan. Interestingly, the model's thinking time increases steadily as training progresses; this is not preprogrammed but an emergent property.
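To make the “reward-only, no supervised traces” idea concrete, here is a toy REINFORCE loop: the policy is never shown labeled reasoning, it is only scored by an automatic check of its final answer. This is a schematic illustration of the principle, not DeepSeek's actual training recipe, whose algorithm and reward design this article does not detail:

```python
import torch

# Candidate final answers to a toy question ("2 + 2 = ?") and the checker's key.
answers = ["4", "5", "22"]
reference = "4"

logits = torch.zeros(len(answers), requires_grad=True)  # the entire "policy"
opt = torch.optim.Adam([logits], lr=0.1)

def reward(answer: str) -> float:
    # Rule-based reward: correct final answer scores 1, anything else 0.
    return 1.0 if answer == reference else 0.0

for step in range(300):
    probs = torch.softmax(logits, dim=0)
    idx = torch.multinomial(probs, 1).item()              # sample an answer
    loss = -reward(answers[idx]) * torch.log(probs[idx])  # REINFORCE objective
    opt.zero_grad()
    loss.backward()
    opt.step()

final = torch.softmax(logits, dim=0).tolist()
print({a: round(p, 3) for a, p in zip(answers, final)})  # mass shifts onto "4"
```

After a few hundred steps, nearly all probability mass lands on the correct answer: the same dynamic, writ very small, that lets a pure-RL model improve without supervised data.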
“Our goal is to explore the potential of LLMs to develop reasoning capabilities without any supervised data, focusing on their self-evolution through a pure RL process,” says the DeepSeek team. Note that while DeepSeek-R1 has a 671-billion-parameter architecture and was trained on top of the DeepSeek-V3 MoE model, only 37 billion parameters are activated during most operations, as with the V3 model.
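That gap between total and active parameters is the hallmark of a mixture-of-experts (MoE) design: a router sends each token to a handful of experts, so only a fraction of the weights run per step. The following toy PyTorch layer sketches the general mechanism; the sizes and routing here are illustrative, not DeepSeek-V3's actual architecture:

```python
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    """Toy mixture-of-experts layer: a learned router sends each token to
    its top-k experts, so only a fraction of the parameters run per token."""

    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)  # scores experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, dim)
        scores = self.router(x)                        # (n_tokens, n_experts)
        weights, chosen = scores.topk(self.k, dim=-1)  # top-k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyMoELayer(dim=64)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts ran per token
```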
Performance matching OpenAI's o1 model, for 96% less
On the performance side, DeepSeek-R1 achieves results comparable to, or even better than, the o1-1217 version and o1-mini on most benchmarks. The distilled versions also appear able to compete with OpenAI's models: DeepSeek-R1-Distill-Qwen-32B, for example, outperforms o1-mini on various benchmarks, making it a new reference among so-called “dense” models. All at a much lower price for developers who want to use them.
Where one million tokens costs $0.55 on input and $2.19 on output with the DeepSeek API, the price is $15 on input and $60 on output with the o1 API. Concretely, this means that the o1 API's input and output prices are respectively 27.27 and 27.40 times those of DeepSeek, or, put differently, around 2,627% and 2,640% higher. Comparing the overall cost for one million input tokens plus one million output tokens, the figure is even more striking: the DeepSeek API is about 96% cheaper than OpenAI's.
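The arithmetic is easy to check from the published per-million-token prices quoted above:

```python
# Per-million-token API prices (USD) quoted in the article.
deepseek = {"input": 0.55, "output": 2.19}
openai_o1 = {"input": 15.00, "output": 60.00}

for kind in ("input", "output"):
    ratio = openai_o1[kind] / deepseek[kind]
    print(f"{kind}: o1 costs {ratio:.2f}x the DeepSeek price")

# Overall cost for a workload of 1M input + 1M output tokens.
# The exact savings depend on the input/output mix; this 1:1 blend
# gives roughly the 96% figure cited in the article.
total_ds = sum(deepseek.values())
total_o1 = sum(openai_o1.values())
print(f"DeepSeek is about {(1 - total_ds / total_o1) * 100:.0f}% cheaper overall")
```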