Record summary
A quick snapshot of what this page covers.
Attack context
How this AI attack works in practice.
AI-enabled systems often rely on open-source models in various ways. Most commonly, the victim organization uses these models for fine-tuning: a model is downloaded from an external source and then used as the base for further training on a smaller, private dataset. Loading a model often requires executing code stored in the saved model file. These model files can be compromised with traditional malware or through adversarial AI techniques.
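To make the "loading executes saved code" point concrete, here is a minimal sketch that assumes the model file is a Python pickle stream (one common serialization, not the only one). It lists opcodes that can import or call objects during unpickling, without ever loading the file. The names `SUSPICIOUS_OPCODES`, `suspicious_opcodes`, and `DemoPayload` are hypothetical and chosen for illustration.

```python
import pickle
import pickletools

# Opcodes that can import or invoke arbitrary objects while unpickling.
# This set is an illustrative assumption, not an exhaustive list.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def suspicious_opcodes(data: bytes) -> list[str]:
    """Return the risky opcode names found in a pickle stream.

    The stream is only disassembled with pickletools; it is never loaded,
    so no embedded code runs.
    """
    return [
        opcode.name
        for opcode, _arg, _pos in pickletools.genops(data)
        if opcode.name in SUSPICIOUS_OPCODES
    ]


class DemoPayload:
    """Hypothetical stand-in for a tampered model object."""

    def __reduce__(self):
        # Unpickling this object would call print(...); a real attack
        # would substitute a far more harmful callable.
        return (print, ("code ran during unpickling",))


if __name__ == "__main__":
    benign = pickle.dumps({"weights": [0.1, 0.2]})
    tampered = pickle.dumps(DemoPayload())  # dumping does not run the payload
    print(suspicious_opcodes(benign))
    print(suspicious_opcodes(tampered))
```

Legitimate checkpoints from some frameworks also use these opcodes to rebuild tensors, so a practical scanner would likely compare the imported names against an allowlist rather than reject every hit.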
- ATLAS ID: AML.T0010.003
- Priority score: 85
Mitigations
Defenses that may help against this attack.
AML.M0017 - AI Model Distribution Methods
An adversary could repackage the application with a malicious version of the model.
AML.M0013 - Code Signing
Enforce properly signed model files.
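As a simplified stand-in for full code signing, the sketch below pins a SHA-256 digest and refuses any model file whose digest does not match. It assumes the expected digest was obtained through a trusted channel; real signing would add an asymmetric signature over that digest, which is not shown here. The function names `sha256_of_file` and `verify_model_file` are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_file(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's digest matches the pinned value.

    expected_sha256 is assumed to come from a trusted source, for example
    a release manifest distributed separately from the model file.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A loader would call `verify_model_file` first and only deserialize the file when it returns `True`.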
AML.M0005 - Control Access to AI Models and Data at Rest
Access controls can prevent tampering with ML artifacts and prevent unauthorized copying.
AML.M0006 - Use Ensemble Methods
Using multiple different models ensures minimal performance loss if a security flaw is found in the tooling for one model or model family.
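One simple way to combine several models is majority voting, sketched below. It assumes each model is a callable that returns an integer label; `ensemble_predict` is a hypothetical helper, not part of any named library. The returned flag marks disagreement, which can serve as a cue to inspect an input or a model more closely.

```python
from collections import Counter
from typing import Callable, Sequence


def ensemble_predict(
    models: Sequence[Callable[[object], int]], x: object
) -> tuple[int, bool]:
    """Return (majority label, whether all models agreed) for input x."""
    votes = Counter(model(x) for model in models)
    label, count = votes.most_common(1)[0]
    return label, count == len(models)
```

If one model in the ensemble is compromised, the other models can still outvote it on inputs where they agree with each other.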
AML.M0008 - Validate AI Model
Ensure that acquired models do not respond to potential backdoor triggers or adversarial influence.
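A rough sketch of one such check follows. It assumes a candidate trigger is already known or guessed (here a bright patch in the image corner) and measures how often stamping it flips predictions to a suspected target label. The image representation and the names `stamp_trigger` and `trigger_flip_rate` are hypothetical, and real backdoor triggers are generally unknown in advance, so this covers only one narrow case.

```python
from typing import Callable, Sequence

# Hypothetical representation: a grayscale image as rows of pixel values.
Image = list[list[int]]


def stamp_trigger(image: Image, value: int = 255, size: int = 2) -> Image:
    """Return a copy with a size x size patch set in the bottom-right corner.

    Assumes the image is at least size pixels wide and tall.
    """
    stamped = [row[:] for row in image]
    for row in stamped[-size:]:
        row[-size:] = [value] * size
    return stamped


def trigger_flip_rate(
    predict: Callable[[Image], int],
    images: Sequence[Image],
    target_label: int,
) -> float:
    """Fraction of inputs that flip to target_label once the trigger is stamped.

    Inputs the model already assigns to target_label are excluded.
    """
    candidates = [img for img in images if predict(img) != target_label]
    if not candidates:
        return 0.0
    flipped = sum(
        predict(stamp_trigger(img)) == target_label for img in candidates
    )
    return flipped / len(candidates)
```

A flip rate near zero on held-out data does not prove a model is clean, but a high rate for any candidate trigger is a strong reason to reject the model.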
Case studies
Examples from public reports and exercises.
ShadowRay
Ray is an open-source Python framework for scaling production AI workflows. Ray's Job API allows arbitrary remote execution by design. However, it does not offer authentication, and the default configuration may expose the cluster to the internet. Researchers at Oligo discovered that Ray clusters had been actively exploited for at least seven months. Adversaries can use the victim organization's compute power and steal valuable information. The researchers estimate the value of the compromised machines to be nearly 1 billion USD.
Five vulnerabilities in Ray were reported to Anyscale, the maintainers of Ray. Anyscale promptly fixed four of the five vulnerabilities. However, the fifth vulnerability, CVE-2023-48022, remains disputed. Anyscale maintains that Ray's lack of authentication is a design decision and that Ray is meant to be deployed in a safe network environment. The Oligo researchers deem this a "shadow vulnerability" because, while its status is disputed, the CVE does not show up in static scans.
Organization Confusion on Hugging Face
threlfall_hax, a security researcher, created organization accounts on Hugging Face, a public model repository, that impersonated real organizations. These false Hugging Face organization accounts looked legitimate, so individuals from the impersonated organizations requested to join, believing the accounts to be official spaces for employees to share models. This gave the researcher full access to any AI models uploaded by the employees, including the ability to replace models with malicious versions. The researcher demonstrated that they could embed malware into an AI model that provided them access to the victim organization's environment. From there, threat actors could execute a range of damaging attacks, such as intellectual property theft or poisoning other AI models within the victim's environment.
PoisonGPT
Researchers from Mithril Security demonstrated how to poison an open-source pre-trained large language model (LLM) to return a false fact. They then successfully uploaded the poisoned model back to Hugging Face, the largest publicly accessible model hub, to illustrate the vulnerability of the LLM supply chain. Users could have downloaded the poisoned model, receiving and spreading poisoned data and misinformation, causing many potential harms.
Backdoor Attack on Deep Learning Models in Mobile Apps
Deep learning models are increasingly used in mobile applications as critical components. Researchers from Microsoft Research demonstrated that many deep learning models deployed in mobile apps are vulnerable to backdoor attacks via "neural payload injection." They conducted an empirical study on real-world mobile deep learning apps collected from Google Play. They identified 54 apps that were vulnerable to attack, including popular security- and safety-critical applications used for cash recognition, parental control, face authentication, and financial services.