DeepSeek Coder V2, the new reference model for code

DeepSeek Coder V2, an open source code model released on June 17, outperforms OpenAI’s GPT-4 Turbo in benchmarks.

China is starting to catch up in the AI race. After several Chinese laboratories released reference LLMs and multimodal models, code models are next in line to be disrupted. The latest example is the launch on June 17 of the second version of DeepSeek-Coder, the code-optimized LLM from the Chinese lab DeepSeek. DeepSeek Coder V2 outperforms OpenAI’s GPT-4 Turbo on code generation tasks, a first in the world of open source AI.

DeepSeek Coder V2 goes so far as to beat GPT-4o

Specifically, DeepSeek-Coder-V2 significantly outperforms closed models like GPT-4 Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. DeepSeek-Coder-V2 stands out on code generation benchmarks like HumanEval (programming problems), MBPP+ (Python generation problems) and LiveCodeBench (diverse coding problems). This demonstrates its advanced abilities in mathematical reasoning and code understanding.

Additionally, on code completion benchmarks like RepoBench (mainly based on Python and Java), DeepSeek-Coder-V2-Lite-Base achieves very competitive results, demonstrating its ability to grasp context and offer relevant suggestions. Finally, on code correction benchmarks like Defects4J (bug resolution) and SWE-Bench (issues collected from GitHub), DeepSeek-Coder-V2-Instruct once again greatly outperforms its competitors. On Aider (code editing), DeepSeek Coder V2 even outperforms OpenAI’s latest model, GPT-4o.

In the benchmarks, DeepSeek-Coder-V2 ultimately proves to be the reference LLM in June 2024 for code generation, automatic completion and even for code editing and bug resolution.

A powerful and efficient MoE architecture

To train DeepSeek Coder V2, DeepSeek researchers started from an intermediate version of DeepSeek V2’s weights and continued training it on new data. The dataset is made up of 60% source code, 10% mathematical content and 30% natural language. The source code (around 1,170 billion tokens) comes mainly from GitHub and CommonCrawl. In theory, 338 programming languages are represented (Python, Java, C, C++, C#, JavaScript, PHP, Ruby, Go, Swift, R, Scala, Rust, etc.), compared with 86 for the first version of the model.

DeepSeek Coder V2 is based, quite logically, on the same architecture as DeepSeek V2: Mixture of Experts (MoE). This architecture is becoming standard practice thanks to its main advantage, a very good performance/efficiency ratio. DeepSeek Coder V2 is available in two versions: Lite, with 16 billion total parameters and 2.4 billion active parameters at inference, and the classic version, the most capable, with 236 billion total parameters and 21 billion active parameters at inference. The model can handle up to 128,000 tokens, a context window large enough to process long code files.
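
To make the distinction between total and active parameters concrete, here is a minimal sketch that computes the share of parameters used at inference for both versions, using only the figures quoted above. The function name `active_fraction` and the dictionary layout are illustrative choices, not part of any DeepSeek tooling.

```python
# Parameter counts in billions, as quoted in the article.
MODELS = {
    "DeepSeek-Coder-V2-Lite": {"total": 16.0, "active": 2.4},
    "DeepSeek-Coder-V2": {"total": 236.0, "active": 21.0},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters that are active at inference."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


if __name__ == "__main__":
    for name, params in MODELS.items():
        share = active_fraction(params["total"], params["active"])
        print(f"{name}: {share:.1%} of parameters active at inference")
```

With these figures, the Lite version activates 15% of its parameters and the classic version about 9%, which is where the performance/efficiency ratio attributed to MoE comes from.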

Efficient and well-commented code

A quick test: we asked GPT-4o and DeepSeek Coder V2 to generate a Python program that takes a list of integers as input and returns the sum of all even numbers in the list.

GPT-4o:

    def sum_of_even_numbers(int_list):
        # Initialize sum to 0
        even_sum = 0

        # Iterate over each number in the list
        for num in int_list:
            # Check if the number is even
            if num % 2 == 0:
                # Add the even number to the sum
                even_sum += num

        return even_sum

    # Example usage
    if __name__ == "__main__":
        # Example list of integers
        numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        # Call the function and print the result
        print("Sum of even numbers:", sum_of_even_numbers(numbers))

DeepSeek Coder V2:

    def sum_even_numbers(numbers):
        """
        This function takes a list of integers and returns the sum of all even numbers in the list.

        :param numbers: List of integers
        :return: Sum of even numbers
        """
        return sum(num for num in numbers if num % 2 == 0)

    # Example usage:
    if __name__ == "__main__":
        user_input = input("Enter a list of integers separated by spaces: ")
        numbers = list(map(int, user_input.split()))
        result = sum_even_numbers(numbers)
        print("The sum of even numbers is:", result)

GPT-4o produces well-structured code that could nonetheless be more concise. The DeepSeek Coder V2 code is almost perfect, and its docstring is particularly well written. It is also the more efficient of the two, thanks to the use of the built-in “sum()” function. Advantage DeepSeek Coder V2 here.
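
As a sanity check, the two generated functions can be compared on a few inputs. The sketch below copies both function bodies from the outputs above. The sample lists are hypothetical and chosen only for illustration. It checks that the results agree and does not measure speed.

```python
def sum_of_even_numbers(int_list):
    """Loop-based version, as generated by GPT-4o."""
    even_sum = 0
    for num in int_list:
        if num % 2 == 0:
            even_sum += num
    return even_sum


def sum_even_numbers(numbers):
    """Generator-based version, as generated by DeepSeek Coder V2."""
    return sum(num for num in numbers if num % 2 == 0)


# Hypothetical sample inputs: empty list, no even numbers, mixed, negatives.
SAMPLES = [
    [],
    [1, 3, 5],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [-4, -3, 0, 7],
]

if __name__ == "__main__":
    for sample in SAMPLES:
        loop_result = sum_of_even_numbers(sample)
        generator_result = sum_even_numbers(sample)
        # Both implementations are expected to agree on every input.
        assert loop_result == generator_result
        print(sample, "->", loop_result)
```

Both versions give the same result on these inputs, so the difference between them is one of conciseness and style, not correctness.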

How to use DeepSeek Coder V2?

While awaiting its wider deployment by the main cloud providers, DeepSeek Coder V2 can be downloaded free of charge from Hugging Face. Four versions are offered. The classic model, with 236 billion parameters, comes in base (not fine-tuned) and instruct (fine-tuned for natural language interaction) variants. The Lite model, with 16 billion parameters, comes in the same base and instruct variants. Running inference on either the Lite or the classic model still requires a substantial hardware configuration. Only more advanced Nvidia RTX GPU models would be suitable.
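
A rough way to gauge the hardware involved is to estimate the memory taken by the weights alone. The sketch below assumes common storage precisions (2 bytes per parameter for fp16/bf16, 1 for int8, 0.5 for 4-bit quantization). These precisions are assumptions for illustration, not formats confirmed by DeepSeek, and the estimate ignores the KV cache and activations. With an MoE model, all parameters generally have to be held in memory even though only a fraction of them is active at inference.

```python
# Parameter counts in billions, as quoted in the article.
PARAMS_BILLIONS = {"Lite": 16, "classic": 236}

# Assumed storage precisions, in bytes per parameter (illustrative only).
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 10**9 bytes)."""
    if params_billions < 0 or bytes_per_param <= 0:
        raise ValueError("invalid parameter count or precision")
    # (params_billions * 1e9 parameters) * bytes_per_param / 1e9 bytes per GB
    return params_billions * bytes_per_param


if __name__ == "__main__":
    for version, params in PARAMS_BILLIONS.items():
        for precision, nbytes in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(params, nbytes)
            print(f"{version} ({params}B) at {precision}: ~{gb:.0f} GB")
```

Under these assumptions, the Lite model needs roughly 32 GB for its weights at 16-bit precision and about 8 GB at 4-bit, while the classic model needs roughly 472 GB at 16-bit precision.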

For developers curious to try the model, a version is accessible online on the DeepSeek website in the form of a conversational agent. Note, however, that the Chinese laboratory states in its terms of use that it collects user data, in particular the prompts sent to the AI. DeepSeek also offers an alternative: low-cost API access. DeepSeek Coder V2 is priced at $0.14 per million input tokens and $0.28 per million output tokens. The API appears, for the moment, to be hosted by Huawei Cloud in Singapore.

Price per million tokens of models for code. © DeepSeek
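
The API prices quoted above translate into a simple cost estimate. The following is a minimal sketch. The function name `estimate_cost_usd` and the example workload are hypothetical, and the prices are those given in this article, which may change.

```python
# Prices in USD per million tokens, as quoted in the article.
INPUT_PRICE_PER_MILLION = 0.14
OUTPUT_PRICE_PER_MILLION = 0.28


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in USD for a given number of tokens."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2 million input tokens, 500,000 output tokens.
    cost = estimate_cost_usd(2_000_000, 500_000)
    print(f"Estimated cost: ${cost:.2f}")
```

For this hypothetical workload, the estimate comes to about $0.42.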

DeepSeek Coder V2 is therefore an excellent open source model for coding, whether for generation, editing (code review) or even auto-completion. The model has even managed to impress founders of American start-ups that also develop AI for code generation, as reported by our colleagues at The Information. A player to follow closely.
