Bias, Fairness and Representational Harms

Record summary

A quick snapshot of what this page covers.

Techniques: 1. Attack methods connected to this risk.

Mitigations: 0. Defenses that may help with related attacks.

Domain: 1. Discrimination & Toxicity. The broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"Frontier AI models can contain and magnify biases ingrained in the data they are trained on, reflecting societal and historical inequalities and stereotypes.177 These biases, often subtle and deeply embedded, compromise the equitable and ethical use of AI systems, making it difficult for AI to improve fairness in decisions.178 Removing attributes like race and gender from training data has generally proven ineffective as a remedy for algorithmic bias, as models can infer these attributes from other information such as names, locations, and other seemingly unrelated factors."

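The proxy effect described in the quoted passage can be illustrated with a minimal sketch on synthetic data. Everything here is an illustrative assumption rather than something taken from the source report: the field names (`group`, `postcode`, `approved`), the 90% correlation between postcode and group, and the historical approval rates. The "model" is deliberately trivial (a per-postcode approval rate) and never sees the protected attribute.

```python
import random


def make_records(n=2000, seed=0):
    """Generate synthetic applicants (all values are made up for illustration).

    'group' is the protected attribute; 'postcode' is a hypothetical proxy
    that agrees with the group 90% of the time.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        group = rng.choice(["A", "B"])
        if rng.random() < 0.9:
            postcode = "north" if group == "A" else "south"
        else:
            postcode = "south" if group == "A" else "north"
        # Assumed historical bias: group A was approved more often than group B.
        approved = rng.random() < (0.7 if group == "A" else 0.3)
        records.append({"group": group, "postcode": postcode, "approved": approved})
    return records


def fit_rate_by_postcode(records):
    """'Train' without the protected attribute: learn the approval rate per postcode."""
    totals, hits = {}, {}
    for r in records:
        totals[r["postcode"]] = totals.get(r["postcode"], 0) + 1
        hits[r["postcode"]] = hits.get(r["postcode"], 0) + int(r["approved"])
    return {area: hits[area] / totals[area] for area in totals}


def predicted_approval_by_group(records, model):
    """Average predicted approval probability for each protected group."""
    sums, counts = {}, {}
    for r in records:
        sums[r["group"]] = sums.get(r["group"], 0.0) + model[r["postcode"]]
        counts[r["group"]] = counts.get(r["group"], 0) + 1
    return {g: sums[g] / counts[g] for g in sums}


records = make_records()
model = fit_rate_by_postcode(records)          # uses only 'postcode'
rates = predicted_approval_by_group(records, model)
gap = rates["A"] - rates["B"]                  # disparity that survives attribute removal
print(f"predicted approval by group: {rates}, gap: {gap:.3f}")
```

Although `group` is never passed to the model, the predicted approval rates still differ between groups, because `postcode` carries most of the same information. Real systems involve many weaker proxies rather than one strong one, so this sketch overstates how easy the effect is to spot.
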
Domain: 1. Discrimination & Toxicity

Subdomain: 1.1 > Unfair discrimination and misrepresentation

Entity: 2 - AI

Intent: 2 - Unintentional

Timing: 3 - Other

Category: Bias, Fairness and Representational Harms

Subcategory: n/a

Related techniques

Attack methods connected to this risk.

AML.T0112.001 - AI Artifacts

feasible

Method: text_similarity_sqlite. Confidence: 55%

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.

Included resource

Capabilities and Risks from frontier AI

Authors: DSIT. Year: 2023. Type: Report

URL: https://assets.publishing.service.gov.uk/media/65395abae6c968000daa9b25/frontier-ai-capabilities-risks-report.pdf

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/