APromptRiskDBThreat intelligence atlas
AI Security Technique

Modify AI Model Architecture - AI Security Technique

Adversaries may directly modify an AI model's architecture to re-define it's behavior. This can include adding or removing layers as well as adding pre or post-processing operations. The effects could include removing the ability to predict certain classes, adding erroneous operations to increase computation costs, or degrading performance. Additionally, a separate adversary-defined network could be injected into...

AI Security Techniquedemonstrated

Record summary

A quick snapshot of what this page covers.

Tactics0Attacker goals connected to this method.
Mitigations3Defenses that may help against this attack.
AI risks0Research-backed risks connected to this topic.

Attack context

How this AI attack works in practice.

Adversaries may directly modify an AI model's architecture to re-define it's behavior. This can include adding or removing layers as well as adding pre or post-processing operations.

The effects could include removing the ability to predict certain classes, adding erroneous operations to increase computation costs, or degrading performance. Additionally, a separate adversary-defined network could be injected into the computation graph, which can change the behavior based on the inputs, effectively creating a backdoor.

ATLAS ID
AML.T0018.001
Priority score
49
Maturity: demonstrated

Mitigations

Defenses that may help against this attack.

AML.M0013 - Code Signing

Deployment
LifecycleDeploymentCategoryTechnical - Cyber

Code signing provides a guarantee that the model has not been manipulated after signing took place.

AML.M0008 - Validate AI Model

ML Model EvaluationMonitoring and Maintenance
LifecycleML Model Evaluation + 1 moreCategoryTechnical - ML

Ensure that acquired models do not respond to potential backdoor triggers or adversarial influence.

Case studies

Examples from public reports and exercises.

AI Model Tampering via Supply Chain Attack

exercise
Date2023-09-26

Researchers at Trend Micro, Inc. used service indexing portals and web searching tools to identify over 8,000 misconfigured private container registries exposed on the internet. Approximately 70% of the registries also had overly permissive access controls that allowed write access. In their analysis, the researchers found over 1,000 unique AI models embedded in private container images within these open registries that could be pulled without authentication.

This exposure could allow adversaries to download, inspect, and modify container contents, including sensitive AI model files. This is an exposure of valuable intellectual property which could be stolen by an adversary. Compromised images could also be pushed to the registry, leading to a supply chain attack, allowing malicious actors to compromise the integrity of AI models used in production systems.

Backdoor Attack on Deep Learning Models in Mobile Apps

exercise
Date2021-01-18

Deep learning models are increasingly used in mobile applications as critical components. Researchers from Microsoft Research demonstrated that many deep learning models deployed in mobile apps are vulnerable to backdoor attacks via "neural payload injection." They conducted an empirical study on real-world mobile deep learning apps collected from Google Play. They identified 54 apps that were vulnerable to attack, including popular security and safety critical applications used for cash recognition, parental control, face authentication, and financial services.

Source

Where this page information comes from.