Poison Training Data - AI Security Technique

Overview

A source-backed snapshot of this AI security technique.

Adversaries may attempt to poison datasets used by an AI model by modifying the underlying data or its labels. This allows the adversary to embed hard-to-detect vulnerabilities in AI models trained on the data. Data poisoning attacks may or may not require modifying the labels. The embedded vulnerability is activated at a later time by data samples that carry a backdoor trigger (see Insert Backdoor Trigger).

Poisoned data can be introduced via AI Supply Chain Compromise, or the data may be poisoned after the adversary gains Initial Access to the system.
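
The mechanism described above can be illustrated on toy data. The following is a minimal sketch, not a reproduction of any attack from the public records: the feature layout, the trigger `{3: 9}`, the target label, and the helper names `apply_trigger` and `poison` are all hypothetical choices made for illustration. It shows how a small number of samples stamped with a trigger and relabeled can hide inside an otherwise untouched dataset.

```python
import random


def apply_trigger(features, trigger):
    """Return a copy of `features` with the trigger positions overwritten.

    `trigger` maps a feature index to the value the attacker stamps there.
    """
    stamped = list(features)
    for index, value in trigger.items():
        stamped[index] = value
    return stamped


def poison(samples, trigger, target_label, n_poison, seed=0):
    """Return (poisoned_copy, chosen_indices) without mutating `samples`.

    Each sample is a (features, label) pair. `n_poison` samples are picked
    at random, stamped with the trigger, and given the attacker's label.
    """
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(samples)), n_poison))
    result = []
    for i, (features, label) in enumerate(samples):
        if i in chosen:
            result.append((apply_trigger(features, trigger), target_label))
        else:
            result.append((list(features), label))
    return result, chosen


# Hypothetical clean dataset: 100 samples, 4 features, binary labels.
clean = [([i % 7, i % 5, i % 3, 0], i % 2) for i in range(100)]

TRIGGER = {3: 9}  # assumption: feature 3 is never 9 in clean data
poisoned, chosen = poison(clean, TRIGGER, target_label=1, n_poison=5)

altered = sum(1 for before, after in zip(clean, poisoned) if before != after)
print(f"{altered} of {len(clean)} samples altered")
```

Only the chosen samples differ from the clean set, which is why a review that samples the data at random may not notice the change. A model trained on `poisoned` could learn to associate the trigger value with the target label, although whether it does depends on the model and the data.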

Tactics (2): Attacker goals connected to this method.

Mitigations (6): Defenses that may help against this attack.

AI risks (6): Research-backed risks connected to this topic.

Technique details

Identifiers, maturity, and source taxonomy for this technique.

ATLAS ID: AML.T0020
Maturity: realized
Priority score: 108

ATLAS tactics

Persistence, Resource Development

Attack flow

How to read the public records connected to this technique.

1. Technique: Read the ATLAS description and evidence level.

2. Tactics: See which attacker goals this method supports.

3. Examples: Check whether public case studies mention it.

4. Defenses: Review safeguards mapped by ATLAS.

5. Sources: Open the original public records and references.

Impact

Why this technique may deserve attention in the current dataset.

Evidence level: realized
Mapped defenses: 6 ATLAS mitigation records
Public examples: 3 linked case study records
Research risks: 6 related MIT AI Risk records above the confidence threshold
Vulnerabilities: 0 linked CVE records

Mitigations

Defenses that may help against this attack.

6 records

AML.M0023 - AI Bill of Materials

An AI BOM can help users identify untrustworthy model artifacts.

Lifecycle: Business and Data Understanding, Data Preparation, +1 more
Category: Policy

AML.M0005 - Control Access to AI Models and Data at Rest

Access controls can prevent tampering with ML artifacts and prevent unauthorized copying.

Lifecycle: Business and Data Understanding, Data Preparation, +2 more
Category: Policy

AML.M0001 - Limit Model Artifact Release

Published datasets can be a target for poisoning attacks.

Lifecycle: Business and Data Understanding, Deployment
Category: Policy

AML.M0025 - Maintain AI Dataset Provenance

Dataset provenance can protect against poisoning of training data.

Lifecycle: Data Preparation, Business and Data Understanding
Category: Technical - ML

Showing 4 of 6

Case studies

Examples from public reports and exercises.

3 records

Web-Scale Data Poisoning: Split-View Attack

Many recent large-scale datasets are distributed as a list of URLs pointing to individual datapoints. The researchers show that many of these datasets are vulnerable to a "split-view" poisoning attack. The attack exploits the fact that the data viewed when it was initially collected may differ from the data viewed by a user during training. The researchers identify expired and buyable domains that once hosted dataset content, making it possible to replace portions of the dataset with poisoned data. They demonstrate that for 10 popular web-scale datasets, enough of the domains are purchasable to successfully carry out a poisoning attack.

Date: 2024-06-06
Type: exercise
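
The gap the split-view attack exploits is that a URL list records where a datapoint lived, not what it contained. One possible safeguard, which the record above does not state, is to store a content digest for each URL at collection time and reject anything that no longer matches at training time. The sketch below assumes this approach. The URLs, the byte strings, and the helper names `build_manifest` and `verify_against_manifest` are hypothetical, and the fetch step is simulated with a dictionary so the example runs offline.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the raw content."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(collected):
    """Map each URL to the digest of the content seen at collection time."""
    return {url: sha256_hex(content) for url, content in collected.items()}


def verify_against_manifest(manifest, fetched):
    """Split URLs into (accepted, rejected).

    A URL is rejected when its content is missing or its digest differs
    from the one recorded in the manifest.
    """
    accepted, rejected = [], []
    for url, expected in manifest.items():
        content = fetched.get(url)
        if content is not None and sha256_hex(content) == expected:
            accepted.append(url)
        else:
            rejected.append(url)
    return accepted, rejected


# Hypothetical content as the dataset curator saw it.
at_collection = {
    "https://example.com/a.jpg": b"original image bytes A",
    "https://example.org/b.jpg": b"original image bytes B",
}
manifest = build_manifest(at_collection)

# Simulated training-time download: one domain now serves different bytes.
at_training = dict(at_collection)
at_training["https://example.org/b.jpg"] = b"replaced after the domain changed hands"

accepted, rejected = verify_against_manifest(manifest, at_training)
print("accepted:", accepted)
print("rejected:", rejected)
```

A check like this only detects that content changed; it cannot tell a benign change such as re-encoding from a malicious one, so rejected items would still need a policy such as dropping them or reviewing them manually.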

VirusTotal Poisoning

McAfee Advanced Threat Research noticed an unusual increase in reports of a certain ransomware family. Case investigation revealed that many samples of that family had been submitted through a popular virus-sharing platform within a short amount of time. Further investigation revealed that, based on string similarity, the samples were all equivalent, and based on code similarity they were between 74 and 98 percent similar. Interestingly, the compile time was the same for all the samples. After more digging, researchers discovered that someone had used 'metame', a metamorphic code manipulation tool, to mutate the original file into variants. The variants were not always executable, but were still classified as the same ransomware family.

Date: 2020-01-01
Type: incident
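
The investigation above relied on noticing that a burst of submissions were near-copies of one another. The record does not say how similarity was measured, so the sketch below substitutes Python's standard `difflib.SequenceMatcher` as a stand-in. The sample contents, the helper names, and the 0.74 threshold (borrowed from the lower bound quoted in the record) are assumptions for illustration only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: bytes, b: bytes) -> float:
    """Similarity ratio in [0, 1] between two byte strings."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def near_duplicate_pairs(samples, threshold=0.74):
    """Return (name_a, name_b, ratio) for every pair at or above `threshold`.

    `samples` maps a sample name to its raw bytes.
    """
    pairs = []
    for (name_a, data_a), (name_b, data_b) in combinations(samples.items(), 2):
        ratio = similarity(data_a, data_b)
        if ratio >= threshold:
            pairs.append((name_a, name_b, round(ratio, 2)))
    return pairs


# Hypothetical submissions: two variants of one file and one unrelated file.
submissions = {
    "sample_1": b"push ebp; mov ebp, esp; call encrypt_files; ret",
    "sample_2": b"push ebp; nop; mov ebp, esp; call encrypt_files; ret",
    "unrelated": b"print hello world and exit cleanly",
}

pairs = near_duplicate_pairs(submissions)
for name_a, name_b, ratio in pairs:
    print(f"{name_a} ~ {name_b}: {ratio}")
```

Pairwise comparison grows quadratically with the number of samples, so a pipeline handling real submission volumes would likely need a cheaper first pass before a comparison like this.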

Tay Poisoning

Microsoft created Tay, a Twitter chatbot designed to engage and entertain users. While previous chatbots used pre-programmed scripts to respond to prompts, Tay's machine learning capabilities allowed it to be directly influenced by its conversations.

A coordinated attack encouraged malicious users to tweet abusive and offensive language at Tay, which eventually led to Tay generating similarly inflammatory content towards other users.

Microsoft decommissioned Tay within 24 hours of its launch and issued a public apology with lessons learned from the bot's failure.

Date: 2016-03-23
Type: incident

Related risks

Research-backed risks connected to this topic.

6 records

Poisoning Attacks

Attacks that fool the model by manipulating the training data, usually performed on classification models.

Domain: 2. Privacy & Security
Subdomain: 2.2 AI system security vulnerabilities and attacks
Confidence: 0.75

Adversarial AI (General)

"Adversarial AI refers to a class of attacks that exploit vulnerabilities in machine-learning (ML) models. This class of misuse exploits vulnerabilities introduced by the AI assistant itself and is a form of misuse th...

Domain: 2. Privacy & Security
Subdomain: 2.2 AI system security vulnerabilities and attacks
Confidence: 0.74

Data-related (Insufficient quality control in data collection process)

"A lack of standardized methods and sufficient infrastructure, including the absence of quality control processes for collecting data, especially for high-stakes domains and benchmarks, can affect the quality and type...

Domain: 2. Privacy & Security
Subdomain: 2.2 AI system security vulnerabilities and attacks
Confidence: 0.73

Security - Robustness

While AI safety focuses on threats emanating from generative AI systems, security centers on threats posed to these systems. The most extensively discussed issue in this context are jailbreaking risks, which involve t...

Domain: 2. Privacy & Security
Subdomain: 2.2 AI system security vulnerabilities and attacks
Confidence: 0.72

Showing 4 of 6