Until now, AI systems have generated content such as text, images and videos. With a new generation of its Gemini system, Google now wants to go a step further: in the future, the AI is meant to carry out certain tasks independently as an assistant.
Google announced this on Wednesday in Mountain View, on the anniversary of the first Gemini premiere. One conceivable use: the "AI agent" could, for example, find the components for a hobby project in online stores and place them in the shopping cart on its own. The actual order, however, would still be placed by the user.
“AI should become more useful”
Google CEO Sundar Pichai said the first generation, Gemini 1.0, was about organizing and understanding information. "Gemini 2.0 is about being much more useful."
The new assistant capabilities are part of "Project Mariner" and were built with the new generation of the AI. Gemini 2.0 was developed for the age of agents, said Google manager Tulsee Doshi.
The system is capable of intelligent tool use and can directly access Google products such as Search. It can even execute program code. "These capabilities result in agents that can think, remember, plan and even take action on behalf of users."
Clicking like a human
"Mariner" behaves in a browser exactly the way a human user would, Doshi emphasized. "It can click, type and scroll, just like you as a user." At the same time, the system identifies a number of tasks that the agent should not perform on behalf of a user.
"A good example of this is that Mariner does not complete a purchase without first asking the user whether they really want to do so." Mariner will initially be tested with trusted testers before being made available to the general public.
New AI glasses
With "Project Astra", Google is also pushing ahead with a research project based on Gemini 2.0 for exploring one's surroundings, which was first presented at the Google I/O developer conference last spring.
In the future, users will receive useful information not only on a smartphone but also on smart glasses similar to the Ray-Ban glasses from Facebook parent Meta. The assistant can, for example, recognize buildings or works of art, or help with cooking.
For software developers who offer their own solutions based on Google's AI, the Gemini Flash model variant is particularly relevant. It can run not only on large computers in data centers but also locally on personal computers and certain smartphone models. On Wednesday, Google presented Gemini 2.0 Flash, which, according to the company, offers improved performance with similarly fast response times.