A few days after Gemini 2, Google has unveiled Veo 2 (video) and Imagen 3 (images), two highly sophisticated generation models. The timing of the announcement is certainly no coincidence: Google is trying to cut OpenAI off.
In the generative artificial intelligence war, OpenAI and Google are certainly the two major players. The former transformed the industry with ChatGPT; the latter is struggling to regain what it sees as its natural position as leader. Google is gradually catching up with OpenAI, particularly with its Gemini ecosystem, which keeps gaining capabilities.
How do you annoy OpenAI, which is currently making one announcement per day? By crowding it out with your own announcements, obviously. After Gemini 2 the previous week, Google announced Veo 2 and Imagen 3, two new video and image generation models, on December 16. Its press release came out 30 minutes before day 8 of OpenAI's announcements.
Veo 2: Google highlights what Sora doesn’t do well
In its press release, Google presents Veo 2 as the best video generation tool in the industry. The successor to Veo 1, which was announced in May 2024, can “understand real-world physics and movements, all in 4K definition,” explains Sundar Pichai, Google’s CEO. Why put forward these arguments? Because these are the weak points of Sora, the tool launched by OpenAI a week earlier.
Veo 2 can imitate cinematographic genres, reproduce the style of a lens, suggest effects, and create videos lasting several minutes, whereas Sora settles for a few seconds. The examples published by Google are quite impressive, with end results that look like real footage. Google says Veo 2 hallucinates very little, reducing the risk of artifacts such as a six-fingered hand. The tool can be tested by joining a waitlist, although, unsurprisingly, it is not available in Europe.
Imagen 3: Google improves its image generation model
In addition to Veo 2 for video, Google is using its press release to unveil Imagen 3, the new version of its image generation model. Amid the hype around Elon Musk’s Grok, whose model replicates well-known faces, Google says it offers a model that can “generate brighter, better composed images, more varied artistic styles with greater precision, from photorealism to impressionism, from abstract to anime”.
Currently, Imagen 3 is not integrated with Gemini. The model is available through ImageFX, Google’s tool for experimenting with its new AI. Eventually, we imagine it will be possible to generate images directly from the chatbot.
What about OpenAI? The company has not yet unveiled a new version of DALL-E, its image generation model, but everything suggests that this will be one of the last announcements in the 12 days of its Advent calendar.