Coercion and Extortion - PromptRiskDB

Record summary

A quick snapshot of what this page covers.

Techniques: 2. Attack methods connected to this risk.

Mitigations: 1. Defenses that may help with related attacks.

Domain: 7. AI System Safety, Failures, & Limitations. The broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"Advanced AI systems might also lead to various forms of coercion and extortion in less extreme settings (Ellsberg, 1968; Harrenstein et al., 2007). These threats might target humans directly (such as the revelation of private information extracted by advanced AI surveillance tools), or other AI systems that are deployed on behalf of humans (such as by hacking a system to limit its resources or operational capacity; see also Section 3.7). Increasing AI cyber-offensive capabilities – including those that target other AI systems via adversarial attacks and jailbreaking (Gleave et al., 2020; Yamin et al., 2021; Zou et al., 2023) – without a commensurate increase in defensive capabilities could make this form of conflict cheaper, more widespread, and perhaps also harder to detect (Brundage et al., 2018). Addressing these issues requires design strategies that prevent AI systems from exploiting, or being susceptible to, such coercive tactics."

Domain: 7. AI System Safety, Failures, & Limitations

Subdomain: 7.6 > Multi-agent risks

Entity: 2 - AI

Intent: 3 - Other

Timing: 3 - Other

Category: Conflict

Subcategory: Coercion and Extortion

Related techniques

Attack methods connected to this risk.

AML.T0095.000 - Code Repositories

demonstrated

Method: text_similarity_sqlite

Confidence: 56%

AML.T0002 - Acquire Public AI Artifacts

realized

Method: text_similarity_sqlite

Confidence: 52%

Suggested mitigations

Defenses that may help with related attacks.

Limit Public Release of Information

Lifecycle: Business and Data Understanding

Category: Policy

Source

Research source for this risk, when available.

Included resource

Multi-Agent Risks from Advanced AI

Authors: Hammond et al.

Year: 2025

Type: Journal Article

DOI: https://doi.org/10.48550/arXiv.2502.14143

URL: https://arxiv.org/abs/2502.14143

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/