Record summary
A quick snapshot of what this page covers.
Risk profile
How this risk is described and categorized.
"Cyclic Behaviour. The dynamics described above are highly non-linear (small changes to the system’s state can result in large changes to its trajectory). Similar non-linear dynamics can emerge in multi-agent learning and lead to a variety of phenomena that do not occur in single-agent learning (Barfuss et al., 2019; Barfuss & Mann, 2022; Galla & Farmer, 2013; Leonardos et al., 2020; Nagarajan et al., 2020). One of the simplest examples of this phenomenon is Q-learning (Watkins & Dayan, 1992): in the case of a single agent, convergence to an optimal policy is guaranteed under modest conditions, but in the (mixed-motive) case of multiple agents, this same learning rule can lead to cycles and thus non-convergence (Zinkevich et al., 2005). While cycles in themselves need not carry any risk, their presence can subvert the expected or desirable properties of a given system."
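The cycling described in the quoted passage can be illustrated with a toy simulation. The following is a minimal sketch, not the setup used in the cited papers. It assumes two independent, stateless Q-learners that always act greedily, a fixed learning rate of 0.5, and the matching pennies game, which is zero-sum and chosen here only for simplicity rather than the mixed-motive setting the passage names. The function names `reward_p1`, `greedy`, and `simulate` are hypothetical.

```python
def reward_p1(a1, a2):
    """Matching pennies: player 1 wins on a match, player 2 wins otherwise."""
    return 1.0 if a1 == a2 else -1.0


def greedy(q):
    """Pick the action with the higher Q-value; ties go to action 0."""
    return 0 if q[0] >= q[1] else 1


def simulate(steps=40, alpha=0.5):
    """Run two independent stateless Q-learners and return the joint actions.

    Assumptions (illustrative only): no exploration, no discounting,
    fixed learning rate alpha, Q-values initialised to zero.
    """
    q1 = [0.0, 0.0]  # player 1's Q-value per action
    q2 = [0.0, 0.0]  # player 2's Q-value per action
    history = []
    for _ in range(steps):
        # Both players choose simultaneously from their current Q-values.
        a1, a2 = greedy(q1), greedy(q2)
        r1 = reward_p1(a1, a2)
        r2 = -r1  # zero-sum: player 2 receives the opposite reward
        # Stateless Q-learning update: move Q towards the observed reward.
        q1[a1] += alpha * (r1 - q1[a1])
        q2[a2] += alpha * (r2 - q2[a2])
        history.append((a1, a2))
    return history


if __name__ == "__main__":
    history = simulate()
    switches = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    print("first joint actions:", history[:10])
    print("joint-action switches:", switches)
```

Under these assumptions, the joint action passes through all four combinations within the first ten rounds instead of settling on one: whichever player is currently losing eventually switches, which then makes the other player the loser. Each learner in isolation, facing a fixed opponent, would simply settle on its best response, which is the contrast with the single-agent case that the passage draws.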
Suggested mitigations
Defenses that may help with related attacks.
Source
Research source for this risk, when available.
Included resource
Multi-Agent Risks from Advanced AI
Original source
MIT AI Risk Repository
