With Gemma 2, Google wants to set new standards in AI models

Google has unveiled Gemma 2, the latest version of its suite of powerful and efficient AI models for researchers and developers. Designed for fast and cost-effective inference, these new models developed by DeepMind set new standards for performance.

What you should remember:

  • Gemma 2 outperforms similarly sized models; the 27B version competes with models more than twice its size.
  • This suite is designed to run efficiently on common GPUs and TPUs, significantly reducing deployment costs.
  • It is compatible with major AI frameworks and available under a commercially-friendly license.
  • It includes advances in security and tools for responsible AI development.

A new open model standard focused on efficiency and performance

In its official announcement, Google highlights the exceptional performance of Gemma 2's 9B and 27B models. The 27B model (27 billion parameters) offers a competitive alternative to models more than twice its size. For its part, the 9B model (9 billion parameters) outperforms competitors such as Llama 3 8B, setting a new performance standard for its category.

The 27B model is designed to run full-precision inference efficiently on a single Google Cloud TPU host, NVIDIA A100 80GB GPU, or H100 Tensor Core GPU. This could make deploying capable AI more accessible and economical.
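As a rough illustration of why a single 80 GB accelerator can be enough, here is a back-of-the-envelope estimate of weight memory, assuming the 16-bit (bfloat16) weights the model ships in; activations and the KV cache add overhead on top of this figure.

```python
# Rough weight-memory estimate for a 27B-parameter model in bfloat16.
# This is an illustrative calculation, not an official figure from Google.
params = 27e9                # 27 billion parameters
bytes_per_param = 2          # bfloat16 = 2 bytes per weight
weight_gb = params * bytes_per_param / 1e9
print(f"~{weight_gb:.0f} GB of weights")  # ~54 GB, within an 80 GB A100/H100
```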

Optimized for high inference speed, Gemma 2 is designed to work with a wide variety of hardware, from laptops to cloud infrastructure. It can also be tested in Google AI Studio or with Gemma.cpp on a local CPU.
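For local CPU experimentation, one possible route is a quantized GGUF conversion of the 9B model run through the llama-cpp-python bindings for Llama.cpp (listed among the compatible runtimes below). This is only a minimal sketch; the file name is illustrative, not an official artifact.

```python
# Minimal local-CPU sketch using llama-cpp-python (bindings for Llama.cpp).
# The GGUF file name is illustrative; any quantized Gemma 2 conversion will do.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-2-9b-it-Q4_K_M.gguf",  # hypothetical local file
    n_ctx=4096,   # context window
    n_threads=8,  # CPU threads to use
)
result = llm("Summarize what Gemma 2 is in one sentence.", max_tokens=64)
print(result["choices"][0]["text"])
```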

Gemma 2 will soon include a 2.6B-parameter model, further bridging the gap between lightweight accessibility and high-end performance.

Open Source AI and more responsible development

Note that Gemma 2 is available under a commercially-friendly license, allowing developers and researchers to share and commercialize their innovations. It is compatible with major AI frameworks such as Hugging Face Transformers, JAX, PyTorch, and TensorFlow via Keras 3.0, as well as vLLM, Gemma.cpp, Llama.cpp, and Ollama. Gemma 2 is also optimized with NVIDIA TensorRT-LLM. Additionally, starting next month, Google Cloud users will be able to easily deploy and manage Gemma 2 on the Vertex AI platform.
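As a concrete example of that framework compatibility, the sketch below loads the instruction-tuned 9B checkpoint through Hugging Face Transformers; the model id follows Hugging Face's published naming, and gated access plus a recent Transformers release are assumed.

```python
# Minimal sketch: running the instruction-tuned Gemma 2 9B model with
# Hugging Face Transformers (assumes access to the gated checkpoint).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half the memory of float32 weights
    device_map="auto",           # spread layers across available devices
)

prompt = "Explain in one sentence what makes an AI model efficient."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```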

Google also provides resources to help developers build and deploy AI responsibly, including a Responsible Generative AI Toolkit.

Attention to safety

Google followed rigorous internal processes to ensure safety when training Gemma 2, including filtering pre-training data and extensive testing to identify and mitigate potential biases and risks. The results are published across numerous public benchmarks related to safety and representational harm.

According to Google, since its initial launch, Gemma has been downloaded more than 10 million times, giving rise to countless inspiring projects. For example, the Navarasa Project used Gemma to create a model rooted in India’s linguistic diversity.
