Researchers demonstrate new technique for stealing AI models

Designing and training advanced neural network models is a huge investment. According to a study by Epoch AI, the cost of training models, excluding R&D personnel costs, has been growing steadily since 2020. GPT-3 (175 billion parameters) cost between $2 million and $4 million to train in 2020, while Gemini’s precursor PaLM cost between $3 million and $12 million in 2022, counting only the cost of compute, i.e. thousands of hours of calculation on clusters of graphics processors. The costs of training multimodal models are even higher and have been increasing at a rate of 2.4 times per year since 2016, the study states.

These expenses are beyond the reach of most economic actors, which highlights the considerable value of a pre-trained model, both in terms of initial investment and intellectual property. Stealing such a model by exfiltrating its architecture and hyperparameters therefore yields loot of inestimable value. This is particularly true for systems installed at the edge (edge computing), which are more easily accessible physically.

Reading electromagnetic radiation…

Researchers from North Carolina State University in the United States have just demonstrated that it is possible to steal an artificial intelligence model without hacking – digitally, that is (Editor’s note) – the device on which the model is running, thanks to a unique technique that works even when the thief has no prior knowledge of the software or architecture supporting the AI. It is enough, they say, to use an electromagnetic probe to capture side-channel measurements and extract the model’s details without any knowledge of its internal structure, in other words by treating the edge accelerator as a black box.

These attacks can be classified into two categories: hyperparameter theft attacks, where the adversary seeks to learn the architecture of the trained model, such as the layer types and their configurations, and parameter theft attacks, where the adversary seeks to learn the trained weight and bias values. The trained weights and biases are the internal parameters of a machine learning model that determine how the neural network processes information.
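To make the distinction concrete, here is a minimal sketch in Python using Keras; the toy model and values are purely illustrative and are not taken from the study. The architectural choices targeted by hyperparameter theft are visible on the layer objects, while the trained weights and biases targeted by parameter theft live inside them.

```python
# Illustrative only: contrasting the two attack targets on a toy Keras model.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=3, padding="same",
                           activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

conv = model.layers[0]

# Hyperparameter theft targets architectural choices such as these:
print(conv.filters, conv.kernel_size, conv.padding, conv.activation.__name__)
# -> 32 (3, 3) same relu

# Parameter theft targets the learned values stored inside each layer:
weights, biases = conv.get_weights()
print(weights.shape, biases.shape)  # -> (3, 3, 3, 32) (32,)
```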

…to extract the hyperparameters

Indeed, tensor processing units at the edge of the network, such as Google’s Edge TPUs, today make it possible to carry out inferences locally, thus avoiding the systematic use of the cloud. This phenomenon is part of a context of rapid growth in the edge computing market, driven by growing demand for autonomous connected devices. Google, for its part, is strategically positioned in this market, having introduced TPUs specifically designed to accelerate machine learning inference at the edge, with the aim of reducing latency, limiting power consumption and protecting (in theory) the intellectual property of the deployed models.

The “TPUXtract” study highlights an unprecedented vulnerability affecting Google’s commercial hardware accelerators. While most previous work focused on microcontrollers, TPUXtract looked specifically at Google’s TPUs, known for their performance and wide adoption in machine learning solutions on the market. The authors demonstrate that, by exploiting electromagnetic side channels, it is possible to extract all the hyperparameters of a neural network, whether the type of each layer, the size of the filters, the number of nodes or the padding parameters.

Identification by collecting electromagnetic signals

The attack begins by placing a non-invasive probe near the target device (a Google Edge TPU). The probe captures the electromagnetic emissions produced by data processing in the TPU’s different calculation cores. This electromagnetic signal, complex and noisy, is recorded during inference of the targeted model. The raw signals are then segmented in order to isolate the portions corresponding to specific network operations. The study’s authors explain that each type of layer (for example, a convolution layer, a dense layer, or an “add”/“concatenate” operation) generates a distinct electromagnetic signature. To achieve this level of precision, TPUXtract uses signal correlation with templates (reference patterns) to identify exactly where a layer begins and ends in the trace.
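The correlation step can be pictured with a short sketch. The following Python snippet is purely illustrative (the signals are synthetic, and this is not the authors’ tooling): it slides a known layer “template” along a longer captured trace and reports the offset where the normalized cross-correlation peaks, which is how the start of a layer can be located.

```python
# Illustrative only: locating a known layer signature inside a noisy trace
# by sliding-window normalized cross-correlation (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
template = rng.standard_normal(200)            # EM signature of one layer type
trace = rng.standard_normal(5000) * 0.3        # noisy captured side-channel trace
start = 1750
trace[start:start + 200] += template           # embed the layer's emission

def normalized_xcorr(trace, template):
    """Slide the template over the trace and score each offset."""
    t = (template - template.mean()) / template.std()
    scores = np.empty(len(trace) - len(template) + 1)
    for i in range(len(scores)):
        w = trace[i:i + len(template)]
        w = (w - w.mean()) / (w.std() + 1e-12)
        scores[i] = np.dot(w, t) / len(t)
    return scores

scores = normalized_xcorr(trace, template)
print("layer starts near sample", scores.argmax())   # ~1750
```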

Unlike previous approaches that relied on pre-trained and static machine learning models, TPUXtract opts for online generation of templates. This approach, which does not depend on a particular data set, makes it possible to adapt the exfiltration process to as yet unknown models. Templates are created from the first recorded inferences, then used to recognize and extract features from subsequent layers, even as the model varies.
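The idea can be sketched as follows. This is a simplified illustration with synthetic traces, where the hypothetical record_em_trace() function stands in for real measurements taken on an identical device: for the layer being analyzed, candidate configurations are profiled on the fly, and the captured segment is attributed to the candidate whose freshly generated template correlates best.

```python
# Illustrative only: "online" template generation and matching (synthetic data).
import numpy as np

rng = np.random.default_rng(1)

def record_em_trace(config):
    """Hypothetical stand-in for measuring the EM trace of one candidate
    layer configuration executed on the attacker's own device."""
    return rng.standard_normal(256)  # in practice: averaged real measurements

def correlation(a, b):
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.dot(a, b)) / len(a)

candidates = [{"type": "conv", "filters": f, "kernel": 3} for f in (16, 32, 64)]
templates = {i: record_em_trace(c) for i, c in enumerate(candidates)}

# Pretend the victim's current layer is the 32-filter convolution (index 1).
captured_segment = templates[1] + 0.2 * rng.standard_normal(256)

best = max(templates, key=lambda i: correlation(captured_segment, templates[i]))
print("recovered configuration:", candidates[best])
```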

Accurate identification of layer configuration

Once the templates are defined, each analyzed layer reveals its secrets: type (convolution, pooling, dense, etc.), filter size, number of channels, activation function, padding, and so on. According to the figures reported in the study, this method achieves an exceptional accuracy of 99.91% across a large set of models.
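To fix ideas, the output of the extraction for a single layer can be thought of as a record like the one below. The field names are illustrative, not those used by the authors.

```python
# Illustrative record of the per-layer information the article says is recovered.
from dataclasses import dataclass

@dataclass
class RecoveredLayer:
    layer_type: str         # "conv", "dense", "pooling", "add", ...
    filter_size: tuple      # e.g. (3, 3)
    num_channels: int       # number of output channels / nodes
    activation: str         # e.g. "relu"
    padding: str            # e.g. "same" or "valid"

layer = RecoveredLayer("conv", (3, 3), 32, "relu", "same")
print(layer)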

The authors of TPUXtract have also demonstrated the effectiveness of their method on non-linear models, integrating “add” or “concatenate” type layers. These layers, frequently present in advanced architectures like Inception V3 or ResNet-50, reflect the complexity of neural networks deployed in production applications. TPUXtract thus proves that its approach is not restricted to simple sequential architectures.
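For readers unfamiliar with these non-linear structures, the sketch below (illustrative, written in Keras) shows a minimal residual block of the kind found in ResNet-style models: the Add operation that merges the skip connection is itself a layer, and therefore produces its own activity during inference.

```python
# Illustrative residual block with an "add" merge, as in ResNet-style models.
import tensorflow as tf

inputs = tf.keras.Input(shape=(32, 32, 64))
x = tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu")(inputs)
x = tf.keras.layers.Conv2D(64, 3, padding="same")(x)
outputs = tf.keras.layers.Add()([inputs, x])   # skip connection merges two branches
block = tf.keras.Model(inputs, outputs)
block.summary()
```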

Validation on real models

The researchers tested their framework on models widely used in the industry, including MobileNet V3, Inception V3, and ResNet-50. These neural networks, initially developed by industry giants (such as Google for MobileNet or Microsoft and others for ResNet), are commonly integrated into image recognition, video analysis or object detection applications. The results obtained by TPUXtract confirm the attacker’s ability to successfully extract hyperparameters, highlighting a deep vulnerability in commercial ML accelerators.

Faced with this threat, the authors of TPUXtract propose several countermeasures. They suggest, for example, the introduction of dummy operations to confuse the attacker, the dynamic rearrangement of layers in order to destabilize the generation of templates, or the injection of competing electromagnetic noise to mask the characteristic signatures of each layer. All of these approaches aim to significantly complicate the extraction work and increase the cost, time and technical difficulty of the attack.
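As a purely conceptual illustration of the first idea (and not the authors’ implementation), one could imagine interleaving inexpensive dummy operations between the real layers, so that the electromagnetic trace no longer maps one-to-one onto the true architecture:

```python
# Conceptual sketch only: interleave dummy (no-op) layers between real ones
# so the side-channel trace contains extra, misleading layer activity.
import tensorflow as tf

def with_dummy_ops(layers):
    """Build a model with a cheap dummy operation after each real layer."""
    padded = []
    for layer in layers:
        padded.append(layer)
        padded.append(tf.keras.layers.Lambda(lambda t: t + 0.0))  # dummy operation
    return tf.keras.Sequential(padded)

model = with_dummy_ops([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])
model.summary()
```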

Protection built into ML accelerators

In a context where embedded devices are evolving rapidly and where machine learning models are spreading across many sectors (transport, health, smart cities, telecommunications, etc.), security must be designed in from the outset, not added after the fact. The TPUXtract study, through its methodical approach and its application to a representative sample of models, calls for deep reflection on the robustness of ML accelerators deployed in the real world.

As demand for smart devices grows, technology giants like Google, already firmly established in the market, will need to incorporate ever more sophisticated security strategies, not only to ensure the longevity of their hardware solutions, but also to preserve the confidence of users and manufacturers in a booming sector. Going forward, combining optimized performance, low energy consumption and robust model protection will be essential.
