Record summary
A quick snapshot of what this page covers.
Risk profile
How this risk is described and categorized.
Because of their remarkable capabilities, LMs carry the same potential for malicious use as other technological products. For instance, they may be used in information warfare to generate deceptive information or unlawful content, with significant impact on individuals and society. As current LMs are increasingly built as agents that accomplish user objectives, they may disregard moral and safety guidelines when operating without adequate supervision, executing user commands mechanically without considering the potential damage. They might also interact unpredictably with humans and other systems, especially in open environments.
Suggested mitigations
Defenses that may help with related attacks.
Source
Research source for this risk, when available.
Included resource
Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements
Original source
MIT AI Risk Repository
