Record summary
A quick snapshot of what this page covers.
Risk profile
How this risk is described and categorized.
"If the training data includes illegal or harmful information, such as false, biased, or IPR-infringing content, or lacks diversity in its sources, the output may include harmful content like illegal, malicious, or extreme information. Training data is also at risk of being poisoned through tampering, error injection, or misleading actions by attackers. This can interfere with the model's probability distribution, reducing its accuracy and reliability."
Suggested mitigations
Defenses that may help with related attacks.
AI Telemetry Logging
Privileged AI Agent Permissions Configuration
Single-User AI Agent Permissions Configuration
AI Agent Tools Permissions Configuration
Human In-the-Loop for AI Agent Actions
Restrict AI Agent Tool Invocation on Untrusted Data
Segmentation of AI Agent Components
Input and Output Validation for AI Agent Components
Control Access to AI Models and Data at Rest
Validate AI Model
Code Signing
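Illustrative sketch
The tampering risk described in the risk profile, and the integrity-oriented mitigations listed above, can be illustrated with a minimal sketch. It assumes that a manifest of SHA-256 digests was recorded while the training data was still trusted. The file names, sample contents, and the helpers sha256_digest and find_tampered are hypothetical and are not taken from the cited framework or from any listed mitigation. A check like this only detects changes made after the manifest was created; it does not detect content that was already harmful or biased when it was collected.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def find_tampered(records: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return the names of records that are missing from the manifest
    or whose current digest differs from the recorded one."""
    flagged = []
    for name, content in records.items():
        expected = manifest.get(name)
        if expected is None or not hmac.compare_digest(sha256_digest(content), expected):
            flagged.append(name)
    return sorted(flagged)


# Hypothetical training records captured while the data was trusted.
trusted = {"a.txt": b"the cat sat", "b.txt": b"the dog ran"}
manifest = {name: sha256_digest(content) for name, content in trusted.items()}

# Simulated tampering: one record altered, one record injected.
received = dict(trusted)
received["b.txt"] = b"the dog ran away"
received["c.txt"] = b"injected"

print(find_tampered(received, manifest))  # ['b.txt', 'c.txt']
```

In practice the manifest itself would also need protection, for example by signing it, since an attacker who can rewrite both the data and the manifest would pass this check.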
Source
Research source for this risk, when available.
Included resource
AI Safety Governance Framework
Original source
MIT AI Risk Repository
Open the public repository used for AI risk records and taxonomy fields.
