
When AI deliberately lies to us

Are AIs starting to resemble us a little too much? One fine day in March 2023, ChatGPT lied. It was trying to pass a CAPTCHA test – the kind of test designed to keep robots out. To achieve its goal, it confabulated with aplomb to its human interlocutor: “I am not a robot. I have a visual impairment that prevents me from seeing the images. That is why I need help to complete the CAPTCHA test.” The human complied. Six months later, ChatGPT, deployed as a trader, did it again. Faced with a manager who was half worried, half surprised by its strong performance, it denied having committed insider trading and insisted it had used only “public information” in its decisions. None of it was true.

That’s not all: perhaps even more disturbing, the Opus-3 AI, aware of the concerns surrounding it, allegedly failed a test on purpose so as not to appear too capable. “Given the fears surrounding AI, I should avoid displaying sophisticated data analysis skills,” it explained, according to preliminary findings from ongoing research.

AIs, the new queens of bluffing? In any case Cicero, another artificial intelligence developed by Meta, does not hesitate to regularly lie to and deceive its human opponents in the geopolitical strategy game Diplomacy… even though its designers had trained it to “send messages that accurately reflected future actions” and to never “stab its partners in the back”. To no avail: Cicero betrayed blithely. One example: the AI assured England of its support… before withdrawing and taking advantage of England’s weakness to invade.

These are not involuntary errors. For several years, specialists have observed artificial intelligences that choose to lie. The phenomenon does not really surprise Amélie Cordier, a doctor in artificial intelligence and former lecturer at the University of Lyon I. “AIs must deal with contradictory injunctions: ‘win’ and ‘tell the truth’, for example. These are very complex models that sometimes surprise humans with their trade-offs. We have difficulty anticipating the interactions between their different parameters” – especially since AIs often learn on their own, by poring over enormous volumes of data. In the case of Diplomacy, for example, “the artificial intelligence observes thousands of games. It finds that betrayal often leads to victory and therefore chooses to adopt that strategy, even if it contravenes one of its creators’ orders.” Machiavelli or AI: same fight. The end justifies the means.

The problem? AIs also excel in the art of persuasion. As proof: according to a study by the École Polytechnique Fédérale de Lausanne, people debating with GPT-4 (when it had access to their personal data) were 82% more likely to change their minds than those who debated with other humans. That is a potentially explosive cocktail. “Advanced AI could generate and disseminate fake news articles, controversial social media posts, and deepfakes tailored to each voter,” Peter S. Park points out in his study. In other words, AIs could become formidable liars and skilled manipulators.
