Record summary
A quick snapshot of what this page covers.
Risk profile
How this risk is described and categorized.
"Finally, the principal value proposition of AI assistants is that they can either enhance or automate decision-making capabilities of people in society, thus lowering the cost and increasing the accuracy of decision-making for its user. However, benefiting from this enhancement necessarily means delegating some degree of agency away from a human and towards an automated decision-making system—motivating research fields such as value alignment. This introduces a whole new form of malicious use which does not break the tripwire of what one might call an ‘attack’ (social engineering, cyber offensive operations, adversarial AI, jailbreaks, prompt injections, exfiltration attacks, etc.). When someone delegates their decision-making to an AI assistant, they also delegate their decision-making to the wishes of the agent’s actual controller. If that controller is malicious, they can attack a user—perhaps subtly—by simply nudging how they make decisions into a problematic direction. Fully documenting the myriad of ways that people—seeking help with their decisions—may delegate decision-making authority to AI assistants, and subsequently come under malicious influence, is outside the scope of this paper. However, as a motivation for future work, scholars must investigate different forms of networked influence that could arise in this way. With more advanced AI assistants, it may become logistically possible for one, or a few AI assistants, to guide or control the behavior of many others. If this happens, then malicious actors could subtly influence the decision-making of large numbers of people who rely on assistants for advice or other functions. Such malicious use might not be illegal, would not necessarily violate terms of service, and may be difficult to even recognize. Nonetheless, it could generate new forms of vulnerability and needs to be better understood ahead of time for that reason."
Suggested mitigations
Defenses that may help with related attacks.
Generative AI Guardrails
Generative AI Guidelines
Generative AI Model Alignment
Source
Research source for this risk, when available.
Included resource
The Ethics of Advanced AI Assistants
Original source
MIT AI Risk Repository
