Disorientation - PromptRiskDB

Record summary

A quick snapshot of what this page covers.

Techniques0Attack methods connected to this risk.

Mitigations0Defenses that may help with related attacks.

Domain5. Human-Computer InteractionThe broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"Given the capacity to fine-tune on individual preferences and to learn from users, personal AI assistants could fully inhabit the users’ opinion space and only say what is pleasing to the user; an ill that some researchers call ‘sycophancy’ (Park et al., 2023a) or the ‘yea-sayer effect’ (Dinan et al., 2021). A related phenomenon has been observed in automated recommender systems, where consistently presenting users with content that affirms their existing views is thought to encourage the formation and consolidation of narrow beliefs (Du, 2023; Grandinetti and Bruinsma, 2023; see also Chapter 16). Compared to relatively unobtrusive recommender systems, human-like AI assistants may deliver sycophantism in a more convincing and deliberate manner (see Chapter 9). Over time, these tightly woven structures of exchange between humans and assistants might lead humans to inhabit an increasingly atomistic and polarised belief space where the degree of societal disorientation and fragmentation is such that people no longer strive to understand or place value in beliefs held by others."

Domain5. Human-Computer Interaction

Subdomain5.2 > Loss of human agency and autonomy

Entity1 - Human

Intent2 - Unintentional

Timing2 - Post-deployment

CategoryAnthropomorphism

SubcategoryDisorientation

Related techniques

Attack methods connected to this risk.

No linked attack methods. No AI attack method is connected to this risk in the current data.

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.

Included resource

The Ethics of Advanced AI Assistants

AuthorsGabriel et al.Year2024TypePreprint

DOI10.48550/arXiv.2404.16244 URLhttps://doi.org/10.48550/arXiv.2404.16244

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repositoryhttps://airisk.mit.edu/