How Google Assistant, Alexa and Siri can be hijacked to give malicious responses to users

Speech language models (SLMs) are cutting-edge AI systems capable of understanding and generating natural language in both audio and text form. But their complexity and their hybrid nature, combining different modalities (speech and text), make them potentially vulnerable to what experts call jailbreak, or bypass, attacks.

In practice, an attacker could exploit these flaws to force the model to ignore its ethical and safety safeguards and produce harmful content (hate speech, fake news, dangerous instructions, etc.).

The arXiv.org study lists several types of possible attacks. In white-box attacks, the attacker exploits detailed knowledge of the model's internals. Transfer attacks consist of migrating an attack developed on one model to another system that is initially less vulnerable. Finally, the researchers examine adversarial audio perturbation attacks, in which a malicious signal, inaudible to the human ear, is injected into the input audio stream in order to manipulate the responses generated by the SLM.
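To make the idea of an adversarial audio perturbation concrete, here is a minimal sketch using the fast gradient sign method (FGSM) against a stand-in PyTorch classifier. The model, sample rate, target class and epsilon are illustrative assumptions, not details taken from the arXiv study; a real attack would optimise against an actual SLM's loss.

# Minimal FGSM sketch on raw audio (illustrative only, not the study's method).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in: maps a 1-second, 16 kHz waveform (16,000 samples) to 2 classes.
model = nn.Sequential(nn.Flatten(), nn.Linear(16_000, 2))
loss_fn = nn.CrossEntropyLoss()

waveform = torch.randn(1, 16_000)   # placeholder for benign input audio
target = torch.tensor([1])          # the output class the attacker wants to force

# Gradient of the loss with respect to the input audio itself.
waveform.requires_grad_(True)
loss = loss_fn(model(waveform), target)
loss.backward()

# FGSM step: nudge every sample slightly in the direction that pushes the model
# toward the attacker's target; a small epsilon keeps the change near-inaudible.
epsilon = 1e-3
adversarial = (waveform - epsilon * waveform.grad.sign()).detach()

print("max per-sample change:", (adversarial - waveform.detach()).abs().max().item())

Because the perturbation is bounded by a small epsilon, the modified audio sounds essentially unchanged to a human listener while, in principle, steering the model toward the attacker's chosen output.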
