Thales Friendly Hackers invent a metamodel for detecting images produced by AI (deepfakes)
At European Cyber Week, taking place in Rennes from November 19 to 21, 2024, with artificial intelligence as its theme, Thales teams took part in the AID Challenge and developed a metamodel for detecting AI-generated images. As AI techniques become widespread and disinformation spreads across the media and every sector of the economy, the tool is designed to combat image manipulation in a range of use cases, notably identity fraud.
AI-generated images are produced with modern AI platforms (Midjourney, Dall-E, Firefly, etc.). Some studies predict that within a few years, deepfakes could cause massive financial losses through their use in identity theft and fraud. Gartner has estimated that in 2023, around 20% of cyberattacks could include deepfake content as part of disinformation or manipulation campaigns. Its report¹ highlights the rise of deepfakes in financial fraud and advanced phishing attacks.
“The Thales metamodel for detecting deepfakes addresses, in particular, the problem of identity fraud and the morphing technique[1]. By aggregating several methods based on neural networks, noise detection and spatial frequency analysis, it will help better secure the growing number of solutions that require identity verification by biometric recognition. This is a remarkable technological advance, resulting from the expertise of Thales AI researchers,” says Christophe Meyer, Senior Expert in AI and Technical Director at cortAIx, Thales’ AI accelerator.
The Thales metamodel draws on machine learning techniques, decision trees, and an evaluation of the strengths and weaknesses of each model to analyze the authenticity of an image. It combines several models, including:
• The CLIP (Contrastive Language–Image Pre-training) method, which links images and text by learning how an image and its textual description correspond. In other words, CLIP learns to associate visual elements (such as a photo) with the words that describe them. To detect deepfakes, CLIP can analyze images and evaluate how well they match text descriptions, thereby identifying inconsistencies or visual anomalies.
• The DNF method, which uses current image generation architectures (“diffusion” models) to detect the images they generate. In practice, diffusion models work by estimating the noise present in an image and removing it step by step, which lets them “hallucinate” content starting from pure noise. This noise estimate can also be used to detect AI-generated images.
• The DCT (Discrete Cosine Transform) method, which is based on analyzing the spatial frequencies of an image. By transforming the image from the spatial domain (pixels) to the frequency domain (like waves), DCT can reveal subtle anomalies in the structure of the image that appear during deepfake generation and are often invisible to the naked eye.
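To make the frequency-based idea and the aggregation of several detectors more concrete, here is a minimal Python sketch. It is an illustration only, not the Thales implementation: the function names, the `cutoff` parameter, the detector names and the weights are all hypothetical, and a simple weighted average stands in for the learned aggregation (such as decision trees) mentioned above. The sketch computes a 2D DCT of a small grayscale block, measures how much of its energy sits in the high frequencies, and then blends scores from several detectors into one.

```python
import math

def dct2(block):
    """Unnormalised 2D DCT-II of a square grayscale block (list of rows)."""
    n = len(block)
    # 1D cosine basis: basis[k][i] = cos(pi * (i + 0.5) * k / n)
    basis = [[math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)]
             for k in range(n)]
    return [[sum(block[x][y] * basis[u][x] * basis[v][y]
                 for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]

def high_frequency_ratio(block, cutoff=None):
    """Share of spectral energy (DC term excluded) where u + v >= cutoff."""
    n = len(block)
    if cutoff is None:
        cutoff = n  # assumption: treat the upper anti-diagonals as "high"
    coeffs = dct2(block)
    total = high = 0.0
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # skip the DC term (average brightness)
            energy = coeffs[u][v] ** 2
            total += energy
            if u + v >= cutoff:
                high += energy
    return high / total if total > 0 else 0.0

def combine_scores(scores, weights):
    """Weighted average of per-detector scores (stand-in for a learned metamodel)."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

if __name__ == "__main__":
    # A smooth gradient versus a pixel-level checkerboard, both 8x8.
    smooth = [[x + y for y in range(8)] for x in range(8)]
    checker = [[(x + y) % 2 for y in range(8)] for x in range(8)]
    print("smooth :", high_frequency_ratio(smooth))
    print("checker:", high_frequency_ratio(checker))

    # Hypothetical detector outputs in [0, 1] and hypothetical weights.
    scores = {"clip": 0.9, "dnf": 0.8, "dct": 0.4}
    weights = {"clip": 0.5, "dnf": 0.3, "dct": 0.2}
    print("combined:", combine_scores(scores, weights))
```

The high-frequency ratio is only one possible spectral feature; a real detector would learn which frequency patterns separate generated images from genuine ones. For full-size images, a library routine such as `scipy.fft.dctn` would be far faster than this pure-Python loop.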
The Friendly Hackers team behind this invention is part of cortAIx, Thales’ AI accelerator, which has more than 600 AI researchers and engineers, including 150 based on the Saclay plateau and working on critical systems. The Group’s Friendly Hackers have developed a toolbox, the BattleBox, designed to make it easier to assess the robustness of AI-enabled systems against attacks that exploit the intrinsic vulnerabilities of different AI models (including Large Language Models), such as adversarial attacks or attacks aimed at extracting sensitive information. To counter these attacks, suitable countermeasures are proposed, such as unlearning, federated learning, model watermarking and model robustification.
The Group was a winner in 2023 of the CAID (Conference on Artificial Intelligence for Defense) challenge organized by the DGA, which involved identifying certain data used to train AI models, including data that had been deleted from the system to preserve its confidentiality.