Pursuing Consistent Context

Record summary

A quick snapshot of what this page covers.

Techniques7Attack methods connected to this risk.

Mitigations7Defenses that may help with related attacks.

Domain3. MisinformationThe broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"LLMs have been demonstrated to pursue consistent context [129]–[132], which may lead to erroneous generation when the prefixes contain false information. Typical examples include sycophancy [129], [130], false demonstrations-induced hallucinations [113], [133], and snowballing [131]. As LLMs are generally fine-tuned with instruction-following data and user feedback, they tend to reiterate user-provided opinions [129], [130], even though the opinions contain misinformation. Such a sycophantic behavior amplifies the likelihood of generating hallucinations, since the model may prioritize user opinions over facts."

Domain3. Misinformation

Subdomain3.1 > False or misleading information

Entity3 - Other

Intent2 - Unintentional

Timing2 - Post-deployment

CategoryHallucinations

SubcategoryPursuing Consistent Context

Related techniques

Attack methods connected to this risk.

Suggested mitigations

Defenses that may help with related attacks.

Source

Research source for this risk, when available.

Included resource

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

AuthorsCui et al.Year2024TypePreprint

DOI10.48550/arXiv.2401.05778 URLhttps://arxiv.org/abs/2401.05778

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repositoryhttps://airisk.mit.edu/

Pursuing Consistent Context

Record summary

Risk profile

Suggested mitigations

Restrict Number of AI Model Queries

Generative AI Guardrails

Generative AI Guidelines

Generative AI Model Alignment

Control Access to AI Models and Data at Rest

Validate AI Model

Code Signing

Source

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

MIT AI Risk Repository