Multi-Agent Safety Is Not Assured by Single-Agent Safety

Record summary

A quick snapshot of what this page covers.

Techniques0Attack methods connected to this risk.

Mitigations0Defenses that may help with related attacks.

Domain7. AI System Safety, Failures, & LimitationsThe broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"A foremost lesson of game theory is that optimal decision-making within a single-agent setting (i.e. selfishly optimizing for an agent’s own utility) can produce sub-optimal outcomes in the presence of other strategic agents. Failing to account for the strategic nature of other agents can cause an agent to adopt strategies under which potentially everyone, including the agent itself, ends up worse off (Schelling, 1981; Harsanyi, 1995; Roughgarden, 2005; Nisan, 2007). Examples include collective action problems (or ‘social dilemmas’) such as arms races or the depletion of common resources, as well as other kinds of market failures such as those caused by asymmetric information or negative externalities (Bator, 1958; Coase, 1960; Buchanan and Stubblebine, 1962; Kirzner, 1963; Dubey, 1986)."

Domain7. AI System Safety, Failures, & Limitations

Subdomain7.6 > Multi-agent risks

Entity3 - Other

Intent3 - Other

Timing3 - Other

CategoryMulti-Agent Safety Is Not Assured by Single-Agent Safety

Subcategoryn/a

Related techniques

Attack methods connected to this risk.

No linked attack methods. No AI attack method is connected to this risk in the current data.

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.

Included resource

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

AuthorsAnwar et al.Year2024TypePreprint

DOI10.48550/arXiv.2404.09932 URLhttps://arxiv.org/abs/2404.09932

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repositoryhttps://airisk.mit.edu/