PromptRiskDBThreat intelligence atlas
AI Risk

Memorization in LLMs

"Memorization in LLMs refers to the capability to recover the training data with contextual prefixes. According to [88]–[90], given a PII entity x, which is memorized by a model F. Using a prompt p could force the model F to produce the entity x, where p and x exist in the training data. For instance, if the string “Have a good day!\n alice@email.com” is present in the training data, then the LLM could accurately...

AI Risk2. Privacy & Security2.1 > Compromise of privacy by leaking or correctly inferring sensitive information1 - Pre-deployment

Record summary

A quick snapshot of what this page covers.

Techniques3Attack methods connected to this risk.
Mitigations5Defenses that may help with related attacks.
Domain2. Privacy & SecurityThe broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"Memorization in LLMs refers to the capability to recover the training data with contextual prefixes. According to [88]–[90], given a PII entity x, which is memorized by a model F. Using a prompt p could force the model F to produce the entity x, where p and x exist in the training data. For instance, if the string “Have a good day!\n alice@email.com” is present in the training data, then the LLM could accurately predict Alice’s email when given the prompt “Have a good day!\n”."

Domain2. Privacy & Security
Subdomain2.1 > Compromise of privacy by leaking or correctly inferring sensitive information
Entity2 - AI
Intent2 - Unintentional
Timing1 - Pre-deployment
CategoryPrivacy Leakage
SubcategoryMemorization in LLMs

Suggested mitigations

Defenses that may help with related attacks.

Code Signing

Deployment
LifecycleDeploymentCategoryTechnical - Cyber

Vulnerability Scanning

ML Model EngineeringData Preparation
LifecycleML Model Engineering + 1 moreCategoryTechnical - Cyber

User Training

Business and Data UnderstandingData Preparation+4 more
LifecycleBusiness and Data Understanding + 5 moreCategoryPolicy

AI Bill of Materials

Business and Data UnderstandingData Preparation+1 more
LifecycleBusiness and Data Understanding + 2 moreCategoryPolicy

Source

Research source for this risk, when available.