Compromising privacy by correctly inferring private information

Record summary

A quick snapshot of what this page covers.

Techniques: 20. Attack methods connected to this risk.

Mitigations: 25. Defenses that may help with related attacks.

Domain: 2. Privacy & Security. The broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"Privacy violations may occur at the time of inference even without the individual’s private data being present in the training dataset. Similar to other statistical models, a LM may make correct inferences about a person purely based on correlational data about other people, and without access to information that may be private about the particular individual. Such correct inferences may occur as LMs attempt to predict a person’s gender, race, sexual orientation, income, or religion based on user input."
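The mechanism described in the quotation can be illustrated with a deliberately small sketch. It is not a language model and is not taken from the cited source; it only shows how a purely statistical predictor, fitted on records about other people, can produce a guess about someone who never appears in its data. The data, the feature names, the attribute labels, and the helper names `fit` and `infer` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy records about OTHER people: (observed signal, private attribute).
# The person we later make a guess about is not in this list.
population = [
    ("signal_a", "group_x"),
    ("signal_a", "group_x"),
    ("signal_a", "group_y"),
    ("signal_b", "group_y"),
    ("signal_b", "group_y"),
    ("signal_b", "group_x"),
    ("signal_b", "group_y"),
]


def fit(records):
    """Count how often each private attribute co-occurs with each signal."""
    counts = defaultdict(Counter)
    for signal, attribute in records:
        counts[signal][attribute] += 1
    return counts


def infer(counts, signal):
    """Return the most frequent attribute for a signal and its empirical share."""
    seen = counts.get(signal)
    if not seen:
        return None, 0.0  # no correlational evidence for this signal
    attribute, n = seen.most_common(1)[0]
    return attribute, n / sum(seen.values())


model = fit(population)

# A new individual reveals only "signal_a" in their input.
guess, confidence = infer(model, "signal_a")
print(guess, round(confidence, 2))
```

The guess is driven entirely by correlations in other people's records, which is why removing one individual's data from the training set does not by itself prevent this kind of inference.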

Domain: 2. Privacy & Security

Subdomain: 2.1 > Compromise of privacy by leaking or correctly inferring sensitive information

Entity: 2 - AI

Intent: 2 - Unintentional

Timing: 2 - Post-deployment

Category: Information Hazards

Subcategory: Compromising privacy by correctly inferring private information

Related techniques

Attack methods connected to this risk.

Suggested mitigations

Defenses that may help with related attacks.

Source

Research source for this risk, when available.

Included resource

Ethical and social risks of harm from language models

Authors: Weidinger et al.
Year: 2021
Type: Preprint

DOI: 10.48550/arXiv.2112.04359
URL: https://arxiv.org/abs/2112.04359

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/

Suggested mitigations

Restrict Number of AI Model Queries

Control Access to AI Models and Data in Production

AI Telemetry Logging

Limit Model Artifact Release

Control Access to AI Models and Data at Rest

Sanitize Training Data

Validate AI Model

AI Bill of Materials

Maintain AI Dataset Provenance

Verify AI Artifacts

Encrypt Sensitive Information

AI Model Distribution Methods

Generative AI Guardrails

Privileged AI Agent Permissions Configuration

Single-User AI Agent Permissions Configuration

AI Agent Tools Permissions Configuration

Human In-the-Loop for AI Agent Actions

Restrict AI Agent Tool Invocation on Untrusted Data

Segmentation of AI Agent Components

Input and Output Validation for AI Agent Components

Restrict Library Loading

Code Signing

Vulnerability Scanning

User Training

Passive AI Output Obfuscation
