Erode Dataset Integrity - AI Security Technique

Overview

A source-backed snapshot of this AI security technique.

Tactics1Attacker goals connected to this method.

Mitigations2Defenses that may help against this attack.

AI risks13Research-backed risks connected to this topic.

Technique details

Identifiers, maturity, and source taxonomy for this technique.

ATLAS ID: AML.T0059
Maturity: demonstrated
Priority score: 101

ATLAS tactics

Impact

Attack flow

How to read the public records connected to this technique.

1. TechniqueRead the ATLAS description and evidence level.

2. TacticsSee which attacker goals this method supports.

3. ExamplesCheck whether public case studies mention it.

4. DefensesReview safeguards mapped by ATLAS.

5. SourcesOpen the original public records and references.

Impact

Why this technique may deserve attention in the current dataset.

Evidence leveldemonstrated
Mapped defenses2 ATLAS mitigation records
Public examples1 linked case study records
Research risks13 related MIT AI Risk records above the confidence threshold
Vulnerabilities0 linked CVE records

Mitigations

Defenses that may help against this attack.

2 recordsView all mitigations →

AML.M0025 - Maintain AI Dataset Provenance

Maintaining dataset provenance can help identify adverse changes to the data.

LifecycleData Preparation + 1 moreCategoryTechnical - ML

Data PreparationB&D Understanding

AML.M0007 - Sanitize Training Data

Remediating poisoned data can re-establish dataset integrity.

LifecycleBusiness and Data Understanding + 2 moreCategoryTechnical - ML

B&D UnderstandingData Preparation+1 more

Case studies

Examples from public reports and exercises.

1 recordView all case studies →

Web-Scale Data Poisoning: Split-View Attack

Many recent large-scale datasets are distributed as a list of URLs pointing to individual datapoints. The researchers show that many of these datasets are vulnerable to a "split-view" poisoning attack. The attack exploits the fact that the data viewed when it was initially collected may differ from the data viewed by a user during training. The researchers identify expired and buyable domains that once hosted dataset content, making it possible to replace portions of the dataset with poisoned data. They demonstrate that for 10 popular web-scale datasets, enough of the domains are purchasable to successfully carry out a poisoning attack.

Date2024-06-06

exercise

Related risks

Research-backed risks connected to this topic.

Top 10 of 13View all risks →

Fine-tuning related (Fine-tuning dataset poisoning)

"A deployer can poison the dataset used during the fine-tuning process [98] to induce specific, often malicious, behaviors in a model. This can be performed without having access to the model’s weights. This poisoning...

Domain2. Privacy & SecuritySubdomain2.2 > AI system security vulnerabilities and attacks

Confidence0.74

Poisoning

"Data Poisoning involves deliberately corrupting a model’s training dataset to introduce vulnerabilities, derail its learning process, or cause it to make incorrect predictions (Carlini et al., 2023). For example, the...

Domain2. Privacy & SecuritySubdomain2.2 > AI system security vulnerabilities and attacks

Confidence0.74

Data-related (Insufficient quality control in data collection process)

"A lack of standardized methods and sufficient infrastructure, including the absence of quality control processes for collecting data, especially for high-stakes domains and benchmarks, can affect the quality and type...

Domain2. Privacy & SecuritySubdomain2.2 > AI system security vulnerabilities and attacks

Confidence0.70

Adversarial AI (General)

"Adversarial AI refers to a class of attacks that exploit vulnerabilities in machine-learning (ML) models. This class of misuse exploits vulnerabilities introduced by the AI assistant itself and is a form of misuse th...

Domain2. Privacy & SecuritySubdomain2.2 > AI system security vulnerabilities and attacks

Confidence0.70

Showing 4 of 10