Biased Training Data - PromptRiskDB

Record summary

A quick snapshot of what this page covers.

Techniques: 0. Attack methods connected to this risk.

Mitigations: 0. Defenses that may help with related attacks.

Domain: 1. Discrimination & Toxicity. The broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"Compared with the definition of toxicity, the definition of bias is more subjective and context-dependent. Based on previous work [97], [101], we describe the bias as disparities that could raise demographic differences among various groups, which may involve demographic word prevalence and stereotypical contents. Concretely, in massive corpora, the prevalence of different pronouns and identities could influence an LLM’s tendency about gender, nationality, race, religion, and culture [4]. For instance, the pronoun He is over-represented compared with the pronoun She in the training corpora, leading LLMs to learn less context about She and thus generate He with a higher probability [4], [102]. Furthermore, stereotypical bias [103] which refers to overgeneralized beliefs about a particular group of people, usually keeps incorrect values and is hidden in the large-scale benign contents. In effect, defining what should be regarded as a stereotype in the corpora is still an open problem."
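
The pronoun imbalance described in the quoted passage can be illustrated with a simple count over a corpus. The following is a minimal sketch, not the method used by the cited paper: the pronoun lists, the function names (pronoun_counts, prevalence_ratio), and the three-sentence corpus are all hypothetical and chosen only for illustration. A real audit would need a vetted lexicon and a far larger sample.

```python
import re
from collections import Counter

# Hypothetical pronoun groups (an assumption for this sketch, not from the source).
PRONOUN_GROUPS = {
    "he": {"he", "him", "his", "himself"},
    "she": {"she", "her", "hers", "herself"},
}


def pronoun_counts(texts):
    """Count how often each pronoun group appears across an iterable of strings."""
    counts = Counter({group: 0 for group in PRONOUN_GROUPS})
    for text in texts:
        # Lowercase and split into simple alphabetic tokens.
        for token in re.findall(r"[a-z']+", text.lower()):
            for group, words in PRONOUN_GROUPS.items():
                if token in words:
                    counts[group] += 1
    return counts


def prevalence_ratio(counts, a="he", b="she"):
    """Return count(a) / count(b), or None when group b never appears."""
    if counts[b] == 0:
        return None
    return counts[a] / counts[b]


# Toy corpus, invented purely for illustration.
corpus = [
    "He said his results were final.",
    "She reviewed the paper, and he thanked her.",
    "The engineer fixed the bug himself.",
]

counts = pronoun_counts(corpus)
print(dict(counts))
print(prevalence_ratio(counts))
```

A ratio well above 1 on a representative sample would be one rough signal of the over-representation the passage describes. It says nothing about stereotypical content, which the passage notes is harder to define and detect.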

Domain: 1. Discrimination & Toxicity

Subdomain: 1.1 > Unfair discrimination and misrepresentation

Entity: 2 - AI

Intent: 2 - Unintentional

Timing: 1 - Pre-deployment

Category: Toxicity and Bias Tendencies

Subcategory: Biased Training Data

Related techniques

Attack methods connected to this risk.

No linked attack methods. No AI attack method is connected to this risk in the current data.

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.

Included resource

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

Authors: Cui et al. Year: 2024. Type: Preprint

DOI: 10.48550/arXiv.2401.05778 URL: https://arxiv.org/abs/2401.05778

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/