Google’s invisible SynthID watermark will now identify AI-generated text and video, but it becomes less detectable when text has been completely rewritten or translated

Google is extending its content detection and watermarking technology to two new media types. The new version of the SynthID watermarking system can now digitally mark video and text generated by Google’s Gemini AI models in the Gemini app or on the web. Previously, SynthID could only watermark images and audio generated by Google’s AI. The software should help address concerns about the influence of AI-generated content during elections. OpenAI also recently launched a tool capable of detecting images created by its DALL-E 3 model.

As AI-generated content begins to flood the Internet, the need to detect and distinguish it has increased. One of the main solutions businesses are exploring is watermarking, and it will become increasingly important as the technology gains popularity, especially when AI is used for malicious purposes. AI is already being used to generate and spread political misinformation, to make it appear that someone said something they never did, and to create sexual content featuring celebrities without their consent.

However, watermarking tools have proven much easier to develop for images than for text. On Tuesday, Google DeepMind CEO Demis Hassabis took the stage for the first time at the Google I/O developer conference to talk not only about the team’s new AI tools, such as the Veo video generator, but also about the new SynthID watermarking system, which can now mark digitally generated video as well as AI-generated text. Hassabis provided little information about the tool, but a company blog post notes:

Quote from Google DeepMind

Today, we’re expanding SynthID’s capabilities to watermark AI-generated text in the Gemini app and web experience, and video in Veo, our most powerful generative video model.

SynthID for text is designed to complement most available AI text generation models and to deploy at scale, while SynthID for video builds on our image and audio watermarking method to cover every frame of the generated videos. This innovative method embeds an imperceptible watermark without affecting the quality, accuracy, creativity or speed of the text or video generation process.

For images, Google says it designed the SynthID watermark to remain detectable even after the image has been modified by adding filters, changing colors or adjusting brightness. And unlike visible watermarks, SynthID cannot be removed by cropping. As for text, Google DeepMind briefly explains that SynthID builds on how large language models (LLMs) generate content: an LLM predicts the next phrase, word, or character based on what is most likely to appear next in the sequence.

LLMs generate sequences of text in response to a prompt such as “Explain quantum mechanics to me as if I were five years old” or “What is your favorite fruit?”. These phrases, words and characters are called “tokens”, and each token is assigned a probability score. Google DeepMind explains:

Quote from Google DeepMind

Tokens are the building blocks that a generative model uses to process information. In this case, it can be a single character, a word or part of a sentence. Each possible token is assigned a score, which represents the percentage chance that it is the correct one. Tokens with a higher score are more likely to be used. LLMs repeat these steps to construct a coherent answer.
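The token-scoring step described above amounts to a weighted draw from the scored candidates. A minimal Python sketch of that idea (the tokens and probability values are made up for illustration, not real model output):

```python
import random

# Toy next-token distribution: each candidate token gets a probability
# score; higher-scoring tokens are more likely to be chosen.
# (Illustrative values only -- not output from a real LLM.)
next_token_probs = {
    "apple": 0.55,
    "banana": 0.30,
    "mango": 0.15,
}

def sample_token(probs: dict, rng: random.Random) -> str:
    """Pick the next token, weighted by its probability score."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

token = sample_token(next_token_probs, random.Random(0))
```

Repeating this draw, feeding each chosen token back in as context, is how the model "constructs a coherent answer" token by token.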

SynthID is designed to embed imperceptible watermarks directly into the text generation process. To do this, it introduces additional information into the distribution of tokens at the point of generation by modulating the probability of token generation; all without compromising the quality, accuracy, creativity or speed of text generation.
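Google has not published SynthID's exact algorithm in this post, but the probability-modulation idea it describes resembles "green list" watermarking schemes from published research: pseudo-randomly favor a keyed subset of tokens at each generation step. A hedged sketch of that general idea, not Google's actual method (the hashing rule, bias value, and toy vocabulary are all assumptions for illustration):

```python
import hashlib
import math

def green_set(context: str, vocab: list) -> set:
    """Hypothetical keying: deterministically mark half of the
    vocabulary as 'green' based on a hash of the preceding context."""
    ranked = sorted(
        vocab,
        key=lambda t: hashlib.sha256((context + "|" + t).encode()).hexdigest(),
    )
    return set(ranked[: len(vocab) // 2])

def watermark_probs(probs: dict, context: str, bias: float = 2.0) -> dict:
    """Add a constant bias to the log-probability of green tokens,
    then renormalize -- nudging sampling toward green tokens while
    keeping every token possible, so output quality barely changes."""
    green = green_set(context, list(probs))
    logits = {
        t: math.log(p) + (bias if t in green else 0.0)
        for t, p in probs.items()
    }
    total = sum(math.exp(v) for v in logits.values())
    return {t: math.exp(v) / total for t, v in logits.items()}

probs = {"apple": 0.5, "banana": 0.3, "mango": 0.2}
biased = watermark_probs(probs, "my favorite fruit is")
```

Over many tokens, this slight statistical tilt becomes detectable to anyone who knows the keying rule, while remaining imperceptible in any individual sentence.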

Pushmeet Kohli, vice president of research at Google DeepMind, says: “This is specifically about modifying the AI-generated content so that it remains detectable in the future.” Although Google claims that “SynthID for text is compatible with most text-generating AI models,” it remains to be seen whether its competitors, including OpenAI, Microsoft and Meta, will adopt it or come up with their own approaches. In February, Meta announced that it would begin labeling images generated by AI models from rivals such as OpenAI, Google, Midjourney, Adobe and many others.

Meta has also called for the adoption of standards for labeling AI-generated material. Nick Clegg, Meta’s president of global affairs, said the social media giant is collaborating with various entities, including the Partnership on AI (a non-profit organization made up of academics, civil society professionals and media organizations whose goal is to ensure that AI has positive outcomes for people and society), to develop standards that can be used to identify AI images across the web. OpenAI is also working on similar tools.

OpenAI launched a tool earlier this month that can detect images generated by its DALL-E 3 image generation model, in part to address concerns about the influence of AI-generated content on elections. The company said the tool correctly identified images created by DALL-E 3 about 98 percent of the time in internal testing and could handle common edits such as compression, cropping, and saturation changes with minimal impact. OpenAI also plans to add a tamper-resistant watermark to mark AI-generated images and audio.

Watermarks remain limited, however, and can be removed by combining different techniques. Google DeepMind says that AI-generated text carrying its watermark can still be detected after “light paraphrasing”, but that the watermark becomes less detectable when the content has been completely rewritten or translated. SynthID for text is also less effective on factual documents, where there are few possible answers to a given question. The company plans to open-source the text watermarking technique sometime this summer so others can integrate it into their services.
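Detection in a probability-modulation scheme is statistical, which explains why heavy rewriting or translation weakens it: the detector counts how often the text’s tokens fall in the keyed “favored” set, and paraphrasing replaces those tokens, washing the signal back toward chance. A self-contained sketch of that counting step (the hash rule is hypothetical; real systems would key it with a secret):

```python
import hashlib

def is_green(prev_token: str, token: str) -> bool:
    """Hypothetical keyed rule: a token counts as 'green' for a given
    predecessor if its hash lands in the lower half of the range.
    A real deployment would mix in a secret key."""
    digest = hashlib.sha256((prev_token + "|" + token).encode()).digest()
    return digest[0] < 128

def green_rate(tokens: list) -> float:
    """Fraction of (predecessor, token) pairs that are green.
    Unwatermarked or heavily rewritten text hovers near 0.5, since
    the rule is effectively a coin flip; generation that was biased
    toward green tokens scores measurably higher."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    return sum(is_green(p, t) for p, t in pairs) / len(pairs)
```

This also suggests why factual answers are harder to watermark: when only a few token choices are plausible, the generator has little room to prefer green tokens, so the rate stays near baseline.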

SynthID is not a silver bullet for identifying AI-created content, but it is an important part of developing more reliable AI identification tools and can help millions of people make informed decisions about how they interact with AI-generated content. “Sometime this summer, we plan to open-source SynthID for text watermarking, so developers can use this technology and incorporate it into their models,” says Google DeepMind.

Source: Google DeepMind

And you?

What is your opinion on the subject?

What do you think about the new version of the SynthID watermark tool?

Will it be effective against AI abuse? What do you think about the limitations of SynthID?

See also

Google embeds inaudible watermarks called SynthID in its AI-generated music, counterfeit protection should not compromise users’ privacy

Meta will begin labeling images generated by AI models from companies like OpenAI and Google, and calls for the adoption of standards to label AI-generated material

OpenAI will soon add watermarks to images generated by DALL-E 3, adoption of C2PA standards is essential to increase the reliability of digital information, according to OpenAI


