Backdoor Attack on Deep Learning Models in Mobile Apps - AI Case Study

Deep learning models are increasingly used in mobile applications as critical components. Researchers from Microsoft Research demonstrated that many deep learning models deployed in mobile apps are vulnerable to backdoor attacks via "neural payload injection." They conducted an empirical study on real-world mobile deep learning apps collected from Google Play. They identified 54 apps that were vulnerable to attack...

Overview

Case steps: 10. Steps described in the case record.

Techniques: 10. Attack methods mentioned in the case steps.

Linked CVEs: 0. Known vulnerabilities mentioned in the record.

Risk patterns

Patterns found in the case record and its linked vulnerabilities.

1. Dominant ATLAS tactic. Resource Development appears in 2 case steps.
2. Multiple attack methods. The case connects to 10 unique AI attack methods.

Procedure timeline

Case steps per tactic: Resource Development (2), AI Model Access (2), AI Attack Staging (2), Reconnaissance (1), Persistence (1), Initial Access (1), Impact (1)

Step 1
Search Application Repositories
Reconnaissance

To identify potential target models, the researchers searched the Google Play store for deep learning-related keywords to find apps that may contain embedded deep learning models.
Step 2
Models
Resource Development

The researchers acquired the apps' APKs from the Google Play store. They filtered the list of potential target applications by searching the code metadata for keywords related to TensorFlow or TFLite and their model binary formats (.tf and .tflite). The models were extracted from the APKs using Apktool.
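
The filtering step above can be sketched as follows. This is a minimal illustration, not the researchers' tooling: the record says the models were extracted with Apktool, whereas this sketch relies only on the fact that an APK is a ZIP archive and uses Python's standard `zipfile` module. The function names, the keyword list, and the choice to scan every entry's bytes are assumptions made for the example.

```python
import io
import zipfile

# File suffixes named in the case record; the keyword list is an assumption.
MODEL_SUFFIXES = (".tflite", ".tf")
FRAMEWORK_KEYWORDS = (b"tensorflow", b"tflite")


def find_embedded_models(apk_bytes: bytes) -> list[str]:
    """Return archive entries whose names end with a known model suffix."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return sorted(
            name for name in apk.namelist()
            if name.lower().endswith(MODEL_SUFFIXES)
        )


def mentions_framework(apk_bytes: bytes) -> bool:
    """Return True if any entry's contents contain a framework keyword."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for name in apk.namelist():
            data = apk.read(name).lower()
            if any(keyword in data for keyword in FRAMEWORK_KEYWORDS):
                return True
    return False
```

An app would be kept as a candidate if either check succeeds; a real pipeline would likely decode resources and code with a dedicated tool rather than scan raw bytes.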
Step 3
Full AI Model Access
AI Model Access

Extracting the models from the APKs gave the researchers full access to each ML model, albeit in compiled, binary form.
Step 4
Adversarial AI Attacks
Resource Development

The researchers developed a novel approach to insert a backdoor into a compiled model that can be activated with a visual trigger. They inject a "neural payload" into the model that consists of a trigger detection network and conditional logic. The trigger detector is trained to detect a visual trigger that will be placed in the real world. The conditional logic allows the researchers to bypass the victim model when the trigger is detected and provide model outputs of their choosing. The only requirements for training a trigger detector are a general dataset from the same modality as the target model (e.g. ImageNet for image classification) and several photos of the desired trigger.
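
The behavior of the payload described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the DeepPayload implementation: `victim_model` and `trigger_detector` are hypothetical callables, the detector is assumed to return a probability, and the 0.5 threshold is arbitrary. The gate is written as arithmetic because a compiled graph has to express the branch with tensor operations rather than host-language control flow.

```python
def backdoored_predict(x, victim_model, trigger_detector,
                       attacker_output, threshold=0.5):
    """Return the attacker's output when the trigger is detected,
    otherwise pass the victim model's output through unchanged."""
    original = victim_model(x)
    # Binarize the detector's score into a 0/1 gate.
    gate = 1.0 if trigger_detector(x) > threshold else 0.0
    # gate == 1 selects attacker_output; gate == 0 selects original.
    return [gate * a + (1.0 - gate) * o
            for a, o in zip(attacker_output, original)]
```

On clean inputs the gate is zero, so the app's observable behavior is unchanged, which is what makes the backdoor hard to notice in normal use.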
Step 5
Modify AI Model Architecture
Persistence

The researchers poisoned the victim model by injecting the neural payload into the compiled model, directly modifying its computation graph. They then repackaged the poisoned model into the APK.
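
To make the idea of modifying a computation graph concrete, here is a toy sketch. It assumes a graph represented as a plain dictionary mapping node names to `(op, inputs)` pairs; real compiled model formats are far more involved, and every node name and op label here is hypothetical.

```python
def inject_payload(graph, input_node, output_node):
    """Return a copy of `graph` whose output is routed through a
    conditional node fed by a trigger-detector branch."""
    patched = dict(graph)  # shallow copy; the original graph is untouched
    # Keep the victim's original output computation under a new name.
    patched["victim_output"] = patched.pop(output_node)
    # The detector branch reads the same input as the victim model.
    patched["trigger_detector"] = ("TriggerDetector", [input_node])
    # Constant holding the output the attacker wants to force.
    patched["attacker_output"] = ("Const", [])
    # The conditional node takes over the original output name, so the
    # app code that reads the model output does not need to change.
    patched[output_node] = (
        "Select", ["trigger_detector", "attacker_output", "victim_output"]
    )
    return patched


toy_graph = {
    "image": ("Input", []),
    "features": ("Conv", ["image"]),
    "probs": ("Softmax", ["features"]),
}
patched_graph = inject_payload(toy_graph, "image", "probs")
```

Reusing the original output name is the key design choice in this sketch: the surrounding app keeps calling the same output and never sees the extra nodes.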
Step 6
Verify Attack
AI Attack Staging

To verify the success of the attack, the researchers confirmed that the app did not crash with the malicious model in place and that the trigger detector successfully detected the trigger.
Step 7
Model
Initial Access

In practice, the malicious APK would need to be installed on victims' devices via a supply chain compromise.
Step 8
Insert Backdoor Trigger
AI Attack Staging

The trigger is placed in the physical environment, where it is captured by the victim's device camera and processed by the backdoored ML model.
Step 9
Physical Environment Access
AI Model Access

At inference time, only physical environment access is required to trigger the attack.
Step 10
Evade AI Model
Impact

Presenting the visual trigger causes the victim model to be bypassed. The researchers demonstrated this can be used to evade ML models in several safety-critical apps in the Google Play store.

Mitigations

Defenses connected to the attack methods in this case.

AI Model Distribution Methods

Deploying AI models to edge devices can increase the attack surface of the system. Consider serving models in the cloud to reduce the level of access the adversary has to the model. Also consider computing features in the cloud to prevent gray-box attacks, where an adversary has access to the model preprocessing methods.

Adversarial Input Detection

Detect and block adversarial inputs or atypical queries that deviate from known benign behavior, exhibit behavior patterns observed in previous attacks, or come from potentially malicious IPs. Incorporate adversarial detection algorithms into the AI system prior to the AI model.

Code Signing

Enforce binary and application integrity with digital signature verification to prevent untrusted code from executing. Adversaries can embed malicious code in AI software or models. Developers should also cryptographically sign SBOM and AIBOM components that track model or data provenance. Enforcement of code signing can prevent the compromise of the AI supply chain and prevent execution of malicious code.
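
As a minimal sketch of the integrity idea, an app could compare a model file's SHA-256 digest against a known-good value before loading it. This is digest pinning, a simpler stand-in for full signature verification: it assumes the expected digest reaches the app through a channel the adversary cannot modify, and the function names are hypothetical.

```python
import hashlib
import hmac


def model_digest(model_bytes: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the model file contents."""
    return hashlib.sha256(model_bytes).hexdigest()


def verify_model(model_bytes: bytes, expected_digest: str) -> bool:
    """Return True only if the model matches the pinned digest."""
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(model_digest(model_bytes), expected_digest)
```

A check like this would flag a model whose graph was altered after release, but an attacker who repackages the whole app can also replace the pinned digest, which is why the mitigation stresses enforcing signatures at the platform level.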

Control Access to AI Models and Data at Rest

Establish access controls on internal model registries and limit internal access to production models. Limit access to training data only to approved users.

Source evidence

Original public records and references for this case.

Open the MITRE ATLAS data and public references used for this case study.

Repository: https://github.com/mitre-atlas/atlas-data
ATLAS.yaml: https://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml
Schema: https://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json
DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection: https://arxiv.org/abs/2101.06896