AI Risk

Extraction Attacks

"Extraction attacks [137] allow an adversary to query a black-box victim model and build a substitute model by training on the queries and responses. The substitute model could achieve almost the same performance as the victim model. While it is hard to fully replicate the capabilities of LLMs, adversaries could develop a domain-specific model that draws domain knowledge from LLMs."

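The query-and-train loop in the description above can be illustrated with a toy example. This is a minimal sketch, not the method of the cited paper: the victim is a hypothetical two-feature linear classifier standing in for a black-box model, and `victim_predict`, `collect_queries`, `train_substitute`, and `agreement` are names invented for illustration. The attacker only ever sees the victim's labels, fits a logistic-regression substitute to them, and then measures how often the two models agree on fresh inputs.

```python
import math
import random


def victim_predict(x, y):
    """Hypothetical black-box victim: callers see only the returned label."""
    return 1 if 2.0 * x - y + 0.5 > 0 else 0


def collect_queries(n, rng):
    """Query the victim on random inputs and record (input, response) pairs."""
    data = []
    for _ in range(n):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append(((x, y), victim_predict(x, y)))
    return data


def train_substitute(data, epochs=300, lr=0.5):
    """Fit a logistic-regression substitute using only the victim's responses."""
    w1 = w2 = b = 0.0
    n = len(data)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x, y), label in data:
            p = 1.0 / (1.0 + math.exp(-(w1 * x + w2 * y + b)))
            err = p - label
            g1 += err * x
            g2 += err * y
            gb += err
        # Batch gradient-descent step on the mean log-loss gradient.
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b


def substitute_predict(params, x, y):
    """Label predicted by the attacker's substitute model."""
    w1, w2, b = params
    return 1 if w1 * x + w2 * y + b > 0 else 0


def agreement(params, n, rng):
    """Fraction of fresh random inputs on which substitute and victim agree."""
    matches = 0
    for _ in range(n):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if substitute_predict(params, x, y) == victim_predict(x, y):
            matches += 1
    return matches / n


rng = random.Random(0)
queries = collect_queries(500, rng)
substitute = train_substitute(queries)
print(f"agreement with victim: {agreement(substitute, 1000, rng):.2%}")
```

The substitute never reads the victim's parameters; everything it learns comes from the 500 query/response pairs. Real extraction attacks on LLMs are far more involved, but the structure (query, record, train, compare) is the same.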


Record summary

A quick snapshot of what this page covers.

Techniques: 16. Attack methods connected to this risk.

Mitigations: 13. Defenses that may help with related attacks.

Domain: 2. Privacy & Security. The broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

Domain: 2. Privacy & Security

Subdomain: 2.2 > AI system security vulnerabilities and attacks

Entity: 1 - Human

Intent: 1 - Intentional

Timing: 2 - Post-deployment

Category: Model Attacks

Subcategory: Extraction Attacks

Suggested mitigations

Defenses that may help with related attacks.

Control Access to AI Models and Data at Rest

Lifecycle: Business and Data Understanding, Data Preparation (+2 more)

Category: Policy

AI Model Distribution Methods

Lifecycle: Deployment

Category: Policy

Memory Hardening

Lifecycle: ML Model Engineering, Deployment (+1 more)

Category: Technical - ML

Passive AI Output Obfuscation

Lifecycle: Deployment, ML Model Evaluation

Category: Technical - ML

Restrict Number of AI Model Queries

Lifecycle: Business and Data Understanding, Deployment (+1 more)

Category: Technical - Cyber

AI Telemetry Logging

Lifecycle: Deployment, Monitoring and Maintenance

Category: Technical - Cyber

Sanitize Training Data

Lifecycle: Business and Data Understanding, Data Preparation (+1 more)

Category: Technical - ML

Verify AI Artifacts

Lifecycle: Business and Data Understanding, Data Preparation (+1 more)

Category: Technical - Cyber

Maintain AI Dataset Provenance

Lifecycle: Data Preparation, Business and Data Understanding

Category: Technical - ML

Encrypt Sensitive Information

Lifecycle: Data Preparation, ML Model Engineering (+1 more)

Category: Technical - Cyber

Code Signing

Lifecycle: Deployment

Category: Technical - Cyber

AI Bill of Materials

Lifecycle: Business and Data Understanding, Data Preparation (+1 more)

Category: Policy

Control Access to AI Models and Data in Production

Lifecycle: Deployment, Monitoring and Maintenance

Category: Policy
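Two of the mitigations listed above, "Restrict Number of AI Model Queries" and "Passive AI Output Obfuscation", lend themselves to a short illustration. This record does not describe how either is implemented, so the following is only a sketch under stated assumptions: `QueryBudget` and `coarsen_scores` are hypothetical names, the per-client cap is a simple lifetime counter rather than a production rate limiter, and obfuscation is reduced to returning the top label with a rounded confidence.

```python
class QueryBudget:
    """Hypothetical per-client query cap (lifetime count, no time window)."""

    def __init__(self, max_queries):
        self.max_queries = max_queries
        self.used = {}  # client_id -> number of queries already served

    def allow(self, client_id):
        """Return True and count the query if the client is under budget."""
        count = self.used.get(client_id, 0)
        if count >= self.max_queries:
            return False
        self.used[client_id] = count + 1
        return True


def coarsen_scores(scores, decimals=1):
    """Expose only the top label and a rounded confidence, not the full scores."""
    top = max(scores, key=scores.get)
    return {"label": top, "confidence": round(scores[top], decimals)}


budget = QueryBudget(max_queries=2)
raw_scores = {"cat": 0.8731, "dog": 0.1269}  # illustrative model output

for attempt in range(3):
    if budget.allow("client-a"):
        print(attempt, coarsen_scores(raw_scores))
    else:
        print(attempt, "query budget exhausted")
```

Capping queries limits how many query/response pairs an adversary can collect, and coarse outputs reduce the information each response carries. Neither prevents extraction on its own.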

Source

Research source for this risk, when available.

Included resource

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

Authors: Cui et al.

Year: 2024

Type: Preprint

DOI: 10.48550/arXiv.2401.05778

URL: https://arxiv.org/abs/2401.05778

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/