SAN FRANCISCO – Alphabet subsidiary Google on Wednesday introduced the second generation of its Gemini artificial intelligence model and announced a series of new ways to use AI beyond chatbots, including through a pair of glasses.
In a blog post, CEO Sundar Pichai called this moment the start of a “new agentic era,” referring to virtual assistants capable of completing tasks with greater autonomy.
“They are able to better understand the world around you, to anticipate and act on your behalf, under your supervision,” Pichai wrote.
The releases underscore Google’s push to regain the lead in the race to dominate the emerging technology. Microsoft-backed OpenAI captured global attention when it released the ChatGPT chatbot in November 2022.
Google unveiled Gemini in December 2023 and now offers four versions.
On Wednesday, it released an update to Flash, its second-cheapest model, with improved performance and additional capabilities for processing images and sound. Other models will follow next year.
OpenAI has announced a slew of new offerings in recent days to diversify its business, including a $200-per-month ChatGPT subscription for advanced research use and the public availability of its Sora text-to-video model.
Google’s game is to inject its AI advances into applications that are already widely adopted. Search, Android and YouTube are among seven products that the company says are used by more than 2 billion people every month.
This user base is a huge advantage over rivals such as AI search startup Perplexity, which is seeking a $9 billion valuation, and newer research labs like OpenAI, Anthropic and Elon Musk’s xAI.
The Gemini 2.0 Flash model will power features such as AI Overviews in Google’s search engine.
Alphabet’s biggest bet is AI for search, Ruth Porat, president and chief investment officer, said at the Reuters NEXT conference in New York on Tuesday.
Google also showed journalists new capabilities for Project Astra, a prototype universal agent that can talk to users in near real time about anything captured by their smartphone camera.
The tool can now hold a conversation in multiple languages, as well as draw on information from Maps and the image recognition tool Lens, Bibo Xu, a group product manager at DeepMind, told reporters.
Astra will also be tested on prototype glasses, the company’s first return to the category since the failure of Google Glass. Others have since entered the market, including Meta, which unveiled prototype augmented-reality glasses in September.
Google also showed journalists Project Mariner, a Chrome browser extension that can automate keystrokes and mouse clicks, in the vein of the “computer use” feature from rival lab Anthropic. The company further presented Jules, a feature aimed at improving software coding, and a tool to help consumers make decisions in video games, such as what to do next or which items to buy.