Discrimination and Stereotype Reproduction

Record summary

A quick snapshot of what this page covers.

Techniques: 0. Attack methods connected to this risk.

Mitigations: 0. Defenses that may help with related attacks.

Domain: 1. Discrimination & Toxicity. The broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"General purpose AI models interpret and respond to inputs based on their training data, potentially causing Discrimination and Stereotype Reproduction. Since they are “black-box” models, the exact mechanism behind decisions remains opaque and attempts to mitigate harmful outputs are not fully reliable yet. These models have the capacity to influence a multitude of downstream applications, decisions, and processes, thereby affecting many individuals simultaneously. The extent of this impact could outstrip the range of any single human or group of humans, amplifying the potential consequences of embedded biases or stereotypes."

Domain: 1. Discrimination & Toxicity

Subdomain: 1.1 > Unfair discrimination and misrepresentation

Entity: 2 - AI

Intent: 2 - Unintentional

Timing: 2 - Post-deployment

Category: Risks from Unreliability

Subcategory: Discrimination and Stereotype Reproduction

Related techniques

Attack methods connected to this risk.

No linked attack methods. No AI attack method is connected to this risk in the current data.

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.

Included resource

Governing General Purpose AI: A Comprehensive Map of Unreliability, Misuse and Systemic Risks

Authors: Maham & Küspert

Year: 2023

Type: Policy brief

URL: https://www.interface-eu.org/storage/archive/files/snv_governing_general_purpose_ai_pdf.pdf

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/