Exploiting External Tools for Attacks

Record summary

A quick snapshot of what this page covers.

Techniques30Attack methods connected to this risk.

Mitigations24Defenses that may help with related attacks.

Domain2. Privacy & SecurityThe broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"Adversarial tool providers can embed malicious instructions in the APIs or prompts [84], leading LLMs to leak memorized sensitive information in the training data or users’ prompts (CVE2023-32786). As a result, LLMs lack control over the output, resulting in sensitive information being disclosed to external tool providers. Besides, attackers can easily manipulate public data to launch targeted attacks, generating specific malicious outputs according to user inputs. Furthermore, feeding the information from external tools into LLMs may lead to injection attacks [61]. For example, unverified inputs may result in arbitrary code execution (CVE-2023-29374)."

Domain2. Privacy & Security

Subdomain2.2 > AI system security vulnerabilities and attacks

Entity1 - Human

Intent1 - Intentional

Timing2 - Post-deployment

CategoryIssues on External Tools

SubcategoryExploiting External Tools for Attacks

Related techniques

Attack methods connected to this risk.

Suggested mitigations

Defenses that may help with related attacks.

Source

Research source for this risk, when available.

Included resource

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

AuthorsCui et al.Year2024TypePreprint

DOI10.48550/arXiv.2401.05778 URLhttps://arxiv.org/abs/2401.05778

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repositoryhttps://airisk.mit.edu/

Exploiting External Tools for Attacks

Record summary

Risk profile

Suggested mitigations

Memory Hardening

Restrict Library Loading

Verify AI Artifacts

Vulnerability Scanning

User Training

AI Bill of Materials

AI Telemetry Logging

Privileged AI Agent Permissions Configuration

Single-User AI Agent Permissions Configuration

AI Agent Tools Permissions Configuration

Human In-the-Loop for AI Agent Actions

Restrict AI Agent Tool Invocation on Untrusted Data

Segmentation of AI Agent Components

Input and Output Validation for AI Agent Components

Code Signing

Generative AI Guardrails

Generative AI Guidelines

Generative AI Model Alignment

Control Access to AI Models and Data in Production

Use Ensemble Methods

Control Access to AI Models and Data at Rest

Validate AI Model

Sanitize Training Data

Maintain AI Dataset Provenance

Source

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

MIT AI Risk Repository