APromptRiskDBThreat intelligence atlas
AI Security Technique

Model - AI Security Technique

AI-enabled systems often rely on open sourced models in various ways. Most commonly, the victim organization may be using these models for fine tuning. These models will be downloaded from an external source and then used as the base for the model as it is tuned on a smaller, private dataset. Loading models often requires executing some saved code in the form of a saved model file. These can be compromised with tr...

AI Security Techniquerealized

Record summary

A quick snapshot of what this page covers.

Tactics0Attacker goals connected to this method.
Mitigations5Defenses that may help against this attack.
AI risks0Research-backed risks connected to this topic.

Attack context

How this AI attack works in practice.

AI-enabled systems often rely on open sourced models in various ways. Most commonly, the victim organization may be using these models for fine tuning. These models will be downloaded from an external source and then used as the base for the model as it is tuned on a smaller, private dataset. Loading models often requires executing some saved code in the form of a saved model file. These can be compromised with traditional malware, or through some adversarial AI techniques.

ATLAS ID
AML.T0010.003
Priority score
85
Maturity: realized

Mitigations

Defenses that may help against this attack.

AML.M0006 - Use Ensemble Methods

ML Model Engineering
LifecycleML Model EngineeringCategoryTechnical - ML

Using multiple different models ensures minimal performance loss if security flaw is found in tool for one model or family.

AML.M0008 - Validate AI Model

ML Model EvaluationMonitoring and Maintenance
LifecycleML Model Evaluation + 1 moreCategoryTechnical - ML

Ensure that acquired models do not respond to potential backdoor triggers or adversarial influence.

Case studies

Examples from public reports and exercises.

ShadowRay

incident
Date2023-09-05

Ray is an open-source Python framework for scaling production AI workflows. Ray's Job API allows for arbitrary remote execution by design. However, it does not offer authentication, and the default configuration may expose the cluster to the internet. Researchers at Oligo discovered that Ray clusters have been actively exploited for at least seven months. Adversaries can use victim organization's compute power and steal valuable information. The researchers estimate the value of the compromised machines to be nearly 1 billion USD.

Five vulnerabilities in Ray were reported to Anyscale, the maintainers of Ray. Anyscale promptly fixed four of the five vulnerabilities. However, the fifth vulnerability CVE-2023-48022 remains disputed. Anyscale maintains that Ray's lack of authentication is a design decision, and that Ray is meant to be deployed in a safe network environment. The Oligo researchers deem this a "shadow vulnerability" because in disputed status, the CVE does not show up in static scans.

Organization Confusion on Hugging Face

exercise
Date2023-08-23

threlfall_hax, a security researcher, created organization accounts on Hugging Face, a public model repository, that impersonated real organizations. These false Hugging Face organization accounts looked legitimate so individuals from the impersonated organizations requested to join, believing the accounts to be an official site for employees to share models. This gave the researcher full access to any AI models uploaded by the employees, including the ability to replace models with malicious versions. The researcher demonstrated that they could embed malware into an AI model that provided them access to the victim organization's environment. From there, threat actors could execute a range of damaging attacks such as intellectual property theft or poisoning other AI models within the victim's environment.

PoisonGPT

exercise
Date2023-07-01

Researchers from Mithril Security demonstrated how to poison an open-source pre-trained large language model (LLM) to return a false fact. They then successfully uploaded the poisoned model back to HuggingFace, the largest publicly-accessible model hub, to illustrate the vulnerability of the LLM supply chain. Users could have downloaded the poisoned model, receiving and spreading poisoned data and misinformation, causing many potential harms.

Backdoor Attack on Deep Learning Models in Mobile Apps

exercise
Date2021-01-18

Deep learning models are increasingly used in mobile applications as critical components. Researchers from Microsoft Research demonstrated that many deep learning models deployed in mobile apps are vulnerable to backdoor attacks via "neural payload injection." They conducted an empirical study on real-world mobile deep learning apps collected from Google Play. They identified 54 apps that were vulnerable to attack, including popular security and safety critical applications used for cash recognition, parental control, face authentication, and financial services.

Source

Where this page information comes from.