DayFR Euro

INFO OUEST-FRANCE. Breton, Occitan, Cantonese… Google Translate will “speak” 110 new languages

Breton is coming to Google Translate. Like 109 other languages and dialects, Breton arrives in Google Translate this Thursday, as Ouest-France reveals in today’s edition. It will be joined in particular by Occitan, but also by Afar, Cantonese, Punjabi (Shahmukhi), Tamazight, Assyrian, Romani, Wolof, Limburgish, Swati, Jamaican Patois, Kalaallisut and even Q’eqchi’.

A screenshot of the Google Translate tool, which translated a sentence into Breton. To discover its meaning in French, go to the bottom of the article. | DR

These new languages join the 133 already available and represent more than 660 million speakers worldwide, according to the American firm. “The project lasted four years,” says Isaac Caswell, a software engineer who works on Google Translate. “This update is the largest in Google Translate’s history and extends to many other non-national languages and disenfranchised languages. It also includes more Creole languages than ever before, including the French-derived Seychellois Creole and Mauritian Creole.”


So how were the new languages chosen? “Based on a combination of the requests received, the number of speakers and the amount of data available for training the models,” replies Isaac Caswell. Google relied on artificial intelligence and its PaLM 2 language model, which is similar to Gemini, the American giant’s tool.

“Very dedicated communities”

“Some languages do not have many speakers, such as Manx, Sami or Breton (about 200,000, editor’s note), but have very dedicated communities who have published a lot of content. We consulted linguists, experts, translators and native speakers. They assessed the quality of the model and provided translations.”

Three examples. Punjabi (Shahmukhi), a variety of Punjabi written in a Perso-Arabic alphabet (Shahmukhi), is the most widely spoken language in Pakistan. Manx, the Celtic language of the Isle of Man, almost disappeared with the death of its last native speaker in 1974 but, thanks to an island-wide revival movement, it now has thousands of speakers. Afar, a tonal language spoken in Djibouti, Eritrea and Ethiopia, is the one that benefited from the greatest number of voluntary contributions from the community.

Before this Thursday’s update, Ouest-France had access to some sample translations between French and Breton. They were submitted to our in-house specialist, who found them correct. The quality of the service remains to be tested on a larger scale starting this Thursday. But Isaac Caswell promises: “We will monitor feedback and comments and resolve issues, if any.”

Other local actions in Brittany

In February, a collective urgently called for the integration of Breton into Google Translate and organized a Datathon in Quimper. “Essential for Breton and Brittany to count in the world, at a time when cultural consumption is dematerialized,” said at the time David Lesvenan, president of the digital Brittany endowment fund, created by the .bzh association. Another initiative: in November 2023, the Public Office of the Breton Language put online a set of tools, including a historical dictionary, a toponymic database and an automatic translator, aimed at strengthening the presence of Breton in the digital space. 60,000 words are translated there.


With this update, Breton and Occitan join three regional languages used in France and already present on Google Translate: Basque, Catalan and Corsican. Gallo, Moselle and Picard, among others, will still have to wait.

Goal: 1,000 languages

But Google Translate does not intend to stop there. Its objective is ambitious: the American giant has announced, without giving a final deadline, the launch of the “1,000 Languages Initiative”, which aims to build artificial intelligence models supporting the 1,000 most spoken languages in the world. For comparison, there are about 7,000 languages in total.

The French translation of the Breton text published at the beginning of the article. | DR
