AI Risk

Poisoning Attacks

Poisoning attacks fool the model by manipulating its training data; they are usually performed against classification models.
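The description above can be illustrated with a minimal label-flipping sketch. This is an assumption-laden toy, not a method taken from the cited source: the one-dimensional nearest-centroid classifier, the helper names (`fit_centroids`, `predict`, `flip_labels`), and the data points are all hypothetical and chosen only to show how changing a single training label can change a prediction.

```python
from statistics import mean

def fit_centroids(samples):
    """Fit a toy 1-D nearest-centroid classifier: one mean per label."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

def flip_labels(samples, indices):
    """Hypothetical attacker step: flip the binary label at the given indices."""
    return [(x, 1 - y) if i in indices else (x, y)
            for i, (x, y) in enumerate(samples)]

# Illustrative training set: class 0 clusters near 1, class 1 near 9.
clean = [(0.0, 0), (1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1), (10.0, 1)]
poisoned = flip_labels(clean, {3})  # relabel the point at x = 8.0

clean_model = fit_centroids(clean)        # centroids: {0: 1.0, 1: 9.0}
poisoned_model = fit_centroids(poisoned)  # centroids: {0: 2.75, 1: 9.5}

probe = 6.0
print(predict(clean_model, probe))     # 1
print(predict(poisoned_model, probe))  # 0
```

Flipping one label moves the class-0 centroid toward the probe, so the decision boundary shifts from 5.0 to 6.125 and the same input is classified differently.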



Record summary

A quick snapshot of what this page covers.

Techniques: 24. Attack methods connected to this risk.

Mitigations: 21. Defenses that may help with related attacks.

Domain: 2. Privacy & Security. The broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

Domain: 2. Privacy & Security

Subdomain: 2.2 > AI system security vulnerabilities and attacks

Entity: 1 - Human

Intent: 1 - Intentional

Timing: 1 - Pre-deployment

Category: Robustness

Subcategory: Poisoning Attacks

Related techniques

Attack methods connected to this risk.

AML.T0020 - Poison Training Data

realized

Method: taxonomy_keyword_rule; Confidence: 75%

AML.T0080 - AI Agent Context Poisoning

demonstrated

Method: taxonomy_keyword_rule; Confidence: 74%

AML.T0080.001 - Thread

demonstrated

Method: taxonomy_keyword_rule; Confidence: 73%

AML.T0066 - Retrieval Content Crafting

demonstrated

Method: taxonomy_keyword_rule; Confidence: 73%

AML.T0018 - Manipulate AI Model

realized

Method: taxonomy_keyword_rule; Confidence: 72%

AML.T0018.000 - Poison AI Model

demonstrated

Method: taxonomy_keyword_rule; Confidence: 72%

AML.T0064 - Gather RAG-Indexed Targets

demonstrated

Method: taxonomy_keyword_rule; Confidence: 72%

AML.T0008.002 - Domains

demonstrated

Method: taxonomy_keyword_rule; Confidence: 70%

AML.T0010.002 - Data

realized

Method: taxonomy_keyword_rule; Confidence: 70%

AML.T0019 - Publish Poisoned Datasets

demonstrated

Method: taxonomy_keyword_rule; Confidence: 69%

AML.T0099 - AI Agent Tool Data Poisoning

feasible

Method: taxonomy_keyword_rule; Confidence: 68%

AML.T0070 - RAG Poisoning

demonstrated

Method: taxonomy_keyword_rule; Confidence: 68%

AML.T0108 - AI Agent

demonstrated

Method: taxonomy_keyword_rule; Confidence: 67%

AML.T0110 - AI Agent Tool Poisoning

realized

Method: taxonomy_keyword_rule; Confidence: 67%

AML.T0086 - Exfiltration via AI Agent Tool Invocation

realized

Method: taxonomy_keyword_rule; Confidence: 67%

AML.T0034.002 - Agentic Resource Consumption

feasible

Method: taxonomy_keyword_rule; Confidence: 67%

AML.T0010.005 - AI Agent Tool

realized

Method: taxonomy_keyword_rule; Confidence: 66%

AML.T0109 - AI Supply Chain Rug Pull

realized

Method: taxonomy_keyword_rule; Confidence: 65%

AML.T0059 - Erode Dataset Integrity

demonstrated

Method: taxonomy_keyword_rule; Confidence: 65%

AML.T0043.004 - Insert Backdoor Trigger

demonstrated

Method: taxonomy_keyword_rule; Confidence: 65%

AML.T0011.002 - Poisoned AI Agent Tool

realized

Method: taxonomy_keyword_rule; Confidence: 65%

AML.T0104 - Publish Poisoned AI Agent Tool

realized

Method: taxonomy_keyword_rule; Confidence: 65%

AML.T0058 - Publish Poisoned Models

realized

Method: taxonomy_keyword_rule; Confidence: 65%

AML.T0079 - Stage Capabilities

demonstrated

Method: taxonomy_keyword_rule; Confidence: 65%

Suggested mitigations

Defenses that may help with related attacks.

Limit Model Artifact Release

Lifecycle: Business and Data Understanding, Deployment; Category: Policy

Control Access to AI Models and Data at Rest

Lifecycle: Business and Data Understanding, Data Preparation, +2 more; Category: Policy

Sanitize Training Data

Lifecycle: Business and Data Understanding, Data Preparation, +1 more; Category: Technical - ML

Validate AI Model

Lifecycle: ML Model Evaluation, Monitoring and Maintenance; Category: Technical - ML

AI Bill of Materials

Lifecycle: Business and Data Understanding, Data Preparation, +1 more; Category: Policy

Maintain AI Dataset Provenance

Lifecycle: Data Preparation, Business and Data Understanding; Category: Technical - ML

Memory Hardening

Lifecycle: ML Model Engineering, Deployment, +1 more; Category: Technical - ML

Code Signing

Lifecycle: Deployment; Category: Technical - Cyber

Verify AI Artifacts

Lifecycle: Business and Data Understanding, Data Preparation, +1 more; Category: Technical - Cyber

AI Telemetry Logging

Lifecycle: Deployment, Monitoring and Maintenance; Category: Technical - Cyber

Privileged AI Agent Permissions Configuration

Lifecycle: Deployment; Category: Technical - Cyber

Single-User AI Agent Permissions Configuration

Lifecycle: Deployment; Category: Technical - Cyber

AI Agent Tools Permissions Configuration

Lifecycle: Deployment; Category: Technical - Cyber

Human In-the-Loop for AI Agent Actions

Lifecycle: Deployment; Category: Technical - ML

Restrict AI Agent Tool Invocation on Untrusted Data

Lifecycle: Deployment; Category: Technical - ML

Segmentation of AI Agent Components

Lifecycle: Deployment, Business and Data Understanding; Category: Technical - Cyber

Input and Output Validation for AI Agent Components

Lifecycle: Business and Data Understanding, Data Preparation, +1 more; Category: Technical - ML

Model Hardening

Lifecycle: Data Preparation, ML Model Engineering; Category: Technical - ML

Use Ensemble Methods

Lifecycle: ML Model Engineering; Category: Technical - ML

Input Restoration

Lifecycle: Data Preparation, ML Model Evaluation, +2 more; Category: Technical - ML

Adversarial Input Detection

Lifecycle: Data Preparation, ML Model Engineering, +3 more; Category: Technical - ML
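Among the mitigations listed above, "Sanitize Training Data" is the one aimed most directly at poisoned samples. This page does not say how sanitization should be done, so the following is only a rough sketch of one possible heuristic: drop any training point whose label disagrees with most of its nearest neighbours. The function name `knn_label_filter`, the choice of `k`, and the one-dimensional data are hypothetical and for illustration only.

```python
def knn_label_filter(samples, k=3):
    """Keep only points whose label matches a majority of their k nearest neighbours.

    `samples` is a list of (feature, label) pairs with a single numeric feature.
    This is an illustrative heuristic, not a complete defense.
    """
    kept = []
    for i, (x, y) in enumerate(samples):
        others = [s for j, s in enumerate(samples) if j != i]
        nearest = sorted(others, key=lambda s: abs(s[0] - x))[:k]
        agreeing = sum(1 for _, label in nearest if label == y)
        if 2 * agreeing > k:  # strict majority of neighbours share the label
            kept.append((x, y))
    return kept

# Illustrative data: the point at x = 8.0 sits in the class-1 cluster
# but carries label 0, as a flipped (poisoned) label would.
training = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
            (8.0, 0),
            (9.0, 1), (10.0, 1), (11.0, 1), (12.0, 1)]

cleaned = knn_label_filter(training, k=3)
removed = [s for s in training if s not in cleaned]
print(removed)  # [(8.0, 0)]
```

A filter like this can also discard legitimate points near a class boundary, and it would not be expected to catch poisoned samples that look consistent with their neighbours, which is one reason to combine it with the provenance and access-control mitigations in the list.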

Source

Research source for this risk, when available.

Included resource

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment

Authors: Liu et al.; Year: 2024; Type: Preprint

DOI: 10.48550/arXiv.2308.05374; URL: https://arxiv.org/abs/2308.05374

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/