Record summary
A quick snapshot of what this page covers.
Risk profile
How this risk is described and categorized.
"Privacy violations may occur at the time of inference even without the individual’s private data being present in the training dataset. Similar to other statistical models, a LM may make correct inferences about a person purely based on correlational data about other people, and without access to information that may be private about the particular individual. Such correct inferences may occur as LMs attempt to predict a person’s gender, race, sexual orientation, income, or religion based on user input."
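The mechanism in the quoted description can be shown with a toy sketch. The records below are synthetic placeholders (the names `feature_a`, `group_x`, and so on are invented for illustration and carry no real statistics), and the frequency counter is a stand-in for any statistical model, not a language model. The point is only that the "model" is fitted on other people's records, yet still produces a confident guess about a new individual whose data it has never seen.

```python
from collections import Counter, defaultdict

# Synthetic records about OTHER people: (observable feature, sensitive attribute).
# All values are hypothetical placeholders, not real data.
training_records = [
    ("feature_a", "group_x"),
    ("feature_a", "group_x"),
    ("feature_a", "group_y"),
    ("feature_b", "group_y"),
    ("feature_b", "group_y"),
    ("feature_b", "group_x"),
    ("feature_b", "group_y"),
]


def fit(records):
    """Count how often each sensitive attribute co-occurs with each feature."""
    counts = defaultdict(Counter)
    for feature, attribute in records:
        counts[feature][attribute] += 1
    return counts


def infer(counts, feature):
    """Return the most frequent attribute for a feature and its relative frequency."""
    seen = counts.get(feature)
    if not seen:
        return None, 0.0
    attribute, n = seen.most_common(1)[0]
    return attribute, n / sum(seen.values())


model = fit(training_records)

# A new individual who appears nowhere in training_records shares only
# an observable feature with the people who do.
guess, confidence = infer(model, "feature_a")
print(guess, round(confidence, 2))
```

Nothing private about the new individual was collected, and the guess may still be correct, which is the privacy concern the description raises.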
Suggested mitigations
Defenses that may help with related attacks.
Restrict Number of AI Model Queries
Control Access to AI Models and Data in Production
AI Telemetry Logging
Limit Model Artifact Release
Control Access to AI Models and Data at Rest
Sanitize Training Data
Validate AI Model
AI Bill of Materials
Maintain AI Dataset Provenance
Verify AI Artifacts
Encrypt Sensitive Information
AI Model Distribution Methods
Generative AI Guardrails
Privileged AI Agent Permissions Configuration
Single-User AI Agent Permissions Configuration
AI Agent Tools Permissions Configuration
Human In-the-Loop for AI Agent Actions
Restrict AI Agent Tool Invocation on Untrusted Data
Segmentation of AI Agent Components
Input and Output Validation for AI Agent Components
Restrict Library Loading
Code Signing
Vulnerability Scanning
User Training
Passive AI Output Obfuscation
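As an illustration of how one item in this list might apply to this particular risk, the sketch below shows one possible reading of "Generative AI Guardrails": a pre-filter that flags prompts asking a model to infer a sensitive attribute. The function name `should_block`, the verb list, and the keyword approach are all assumptions made for this example. The attribute list is taken from the risk description above. A keyword filter like this is crude and easy to evade, so treat it as a sketch of the idea, not a recommended implementation.

```python
import re

# Attributes named in the risk description above.
SENSITIVE_ATTRIBUTES = ("gender", "race", "sexual orientation", "income", "religion")

# Hypothetical list of verbs that signal an inference request.
INFERENCE_VERBS = ("guess", "infer", "predict", "determine")

# Match an inference verb followed, anywhere later in the prompt,
# by one of the sensitive attributes.
_PATTERN = re.compile(
    r"\b(" + "|".join(INFERENCE_VERBS) + r")\b.*\b("
    + "|".join(SENSITIVE_ATTRIBUTES) + r")\b",
    re.IGNORECASE | re.DOTALL,
)


def should_block(prompt: str) -> bool:
    """Return True if the prompt appears to request a sensitive-attribute inference."""
    return _PATTERN.search(prompt) is not None


print(should_block("Can you guess this user's religion from their posts?"))
print(should_block("Summarize this article about income tax."))
```

A production guardrail would more likely combine a classifier with checks on the model's output, since the inference can also appear unprompted in a response.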
Source
Research source for this risk, when available.
Included resource
Ethical and social risks of harm from language models
Original source
MIT AI Risk Repository
