Record summary
A quick snapshot of what this page covers.
Attack context
How this AI attack works in practice.
Adversaries may attempt to poison datasets used by an AI model by modifying the underlying data or its labels. This allows the adversary to embed vulnerabilities in AI models trained on the data, and these vulnerabilities may not be easily detectable. Data poisoning attacks may or may not require modifying the labels. The embedded vulnerability is activated at a later time by data samples that contain a backdoor trigger (see Insert Backdoor Trigger).
Poisoned data can be introduced via AI Supply Chain Compromise, or the data may be poisoned after the adversary gains Initial Access to the system.
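To make the mechanism concrete, the following is a minimal sketch on toy data of what poisoning with a trigger and a label change looks like from a defender's point of view. Everything in it is an illustrative assumption rather than part of the ATLAS record: the function name `poison_with_trigger`, the choice of the last feature as the trigger location, and the `rate` parameter are all hypothetical.

```python
import random

def poison_with_trigger(samples, target_label, rate, trigger_value=1.0, seed=0):
    """Return a poisoned copy of `samples` plus the set of poisoned indices.

    Each sample is a (features, label) pair, where features is a list of floats.
    Hypothetical trigger: the last feature is overwritten with `trigger_value`.
    """
    rng = random.Random(seed)
    n_poison = int(len(samples) * rate)
    poisoned_idx = set(rng.sample(range(len(samples)), n_poison))

    out = []
    for i, (features, label) in enumerate(samples):
        if i in poisoned_idx:
            # Stamp the trigger and relabel to the adversary's target class.
            features = features[:-1] + [trigger_value]
            label = target_label
        out.append((list(features), label))
    return out, poisoned_idx
```

Only a small fraction of samples changes, which is why a model trained on such data can behave normally on clean inputs and why the modification can be hard to spot by inspection alone.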
- ATLAS ID: AML.T0020
- Priority score: 108
Mitigations
Defenses that may help against this attack.
AML.M0023 - AI Bill of Materials
An AI BOM can help users identify untrustworthy model artifacts.
AML.M0005 - Control Access to AI Models and Data at Rest
Access controls can prevent tampering with ML artifacts and prevent unauthorized copying.
AML.M0001 - Limit Model Artifact Release
Published datasets can be a target for poisoning attacks.
AML.M0025 - Maintain AI Dataset Provenance
Dataset provenance can protect against poisoning of training data.
AML.M0007 - Sanitize Training Data
Detect modifications to data and labels that may cause adversarial model drift or backdoor attacks.
AML.M0008 - Validate AI Model
Robust evaluation of an AI model can help increase confidence that the model has not been poisoned.
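As one illustration of the Sanitize Training Data mitigation above, the sketch below flags samples whose label disagrees with the majority label of their nearest neighbours. This is a simple heuristic chosen for illustration, not a method prescribed by ATLAS; the function name `flag_suspicious_labels` and the default `k=3` are assumptions, and the brute-force distance computation is only practical for small datasets.

```python
from collections import Counter

def flag_suspicious_labels(points, labels, k=3):
    """Return indices whose label differs from the majority of their k nearest neighbours.

    points: list of equal-length numeric tuples; labels: list of hashable labels.
    """
    flagged = []
    for i, p in enumerate(points):
        # Squared Euclidean distance from point i to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        neighbour_labels = [labels[j] for _, j in dists[:k]]
        majority, _ = Counter(neighbour_labels).most_common(1)[0]
        if majority != labels[i]:
            flagged.append(i)
    return flagged

# Two well-separated clusters; the label at index 3 has been flipped.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 1, 1, 1, 1, 1]
suspicious = flag_suspicious_labels(points, labels)
```

Flagged samples are candidates for manual review rather than proof of tampering: a check like this can miss poisoned samples that keep a plausible label, and it can flag legitimate samples near a class boundary.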
Case studies
Examples from public reports and exercises.
Web-Scale Data Poisoning: Split-View Attack
Many recent large-scale datasets are distributed as a list of URLs pointing to individual datapoints. The researchers show that many of these datasets are vulnerable to a "split-view" poisoning attack. The attack exploits the fact that the data viewed when it was initially collected may differ from the data viewed by a user during training. The researchers identify expired and buyable domains that once hosted dataset content, making it possible to replace portions of the dataset with poisoned data. They demonstrate that for 10 popular web-scale datasets, enough of the domains are purchasable to successfully carry out a poisoning attack.
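A split-view change of this kind can be detected if the content fetched at training time is compared against a digest recorded at collection time. The sketch below assumes the dataset index carries a SHA-256 digest for each URL, which the text above does not state; the names `verify_datapoints`, `manifest`, and `fetch`, and the example URLs, are hypothetical.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_datapoints(manifest, fetch):
    """Return URLs whose current content no longer matches the recorded digest.

    manifest: dict mapping URL -> expected SHA-256 hex digest (recorded at collection time).
    fetch: callable taking a URL and returning the current content as bytes.
    """
    mismatched = []
    for url, expected in manifest.items():
        if sha256_hex(fetch(url)) != expected:
            mismatched.append(url)
    return mismatched

# Simulated example: the second URL now serves different content.
original = {
    "https://example.com/a.jpg": b"cat picture",
    "https://example.com/b.jpg": b"dog picture",
}
manifest = {url: sha256_hex(body) for url, body in original.items()}
current = dict(original)
current["https://example.com/b.jpg"] = b"replaced content"
changed = verify_datapoints(manifest, current.__getitem__)
```

In a real pipeline `fetch` would perform the download. Passing it in as a parameter keeps the check testable offline, and mismatched URLs can then be dropped or reviewed before training.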
VirusTotal Poisoning
McAfee Advanced Threat Research noticed an unusual increase in reports of a certain ransomware family. Case investigation revealed that many samples of that ransomware family had been submitted through a popular virus-sharing platform within a short period of time. Further investigation revealed that, based on string similarity, the samples were all equivalent, and based on code similarity, they were between 74 and 98 percent similar. Interestingly, the compile time was the same for all the samples. After more digging, researchers discovered that someone had used 'metame', a metamorphic code manipulation tool, to turn the original file into mutant variants. The variants were not always executable, but were still classified as the same ransomware family.
Tay Poisoning
Microsoft created Tay, a Twitter chatbot designed to engage and entertain users. While previous chatbots used pre-programmed scripts to respond to prompts, Tay's machine learning capabilities allowed it to be directly influenced by its conversations.
A coordinated attack encouraged malicious users to tweet abusive and offensive language at Tay, which eventually led to Tay generating similarly inflammatory content towards other users.
Microsoft decommissioned Tay within 24 hours of its launch and issued a public apology with lessons learned from the bot's failure.