Record summary
A quick snapshot of what this page covers.
Risk profile
How this risk is described and categorized.
"LLMs have been demonstrated to pursue consistent context [129]–[132], which may lead to erroneous generation when the prefixes contain false information. Typical examples include sycophancy [129], [130], false demonstrations-induced hallucinations [113], [133], and snowballing [131]. As LLMs are generally fine-tuned with instruction-following data and user feedback, they tend to reiterate user-provided opinions [129], [130], even though the opinions contain misinformation. Such a sycophantic behavior amplifies the likelihood of generating hallucinations, since the model may prioritize user opinions over facts."
Suggested mitigations
Defenses that may help with related attacks.
Restrict Number of AI Model Queries
Generative AI Guardrails
Generative AI Guidelines
Generative AI Model Alignment
Control Access to AI Models and Data at Rest
Validate AI Model
Code Signing
Source
Research source for this risk, when available.
Included resource
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Original source
MIT AI Risk Repository
