Bias - PromptRiskDB

Record summary

A quick snapshot of what this page covers.

Techniques0Attack methods connected to this risk.

Mitigations0Defenses that may help with related attacks.

Domain1. Discrimination & ToxicityThe broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"In the context of AI, the concept of bias refers to the inclination that AIgenerated responses or recommendations could be unfairly favoring or against one person or group (Ntoutsi et al., 2020). Biases of different forms are sometimes observed in the content generated by language models, which could be an outcome of the training data. For example, exclusionary norms occur when the training data represents only a fraction of the population (Zhuo et al., 2023). Similarly, monolingual bias in multilingualism arises when the training data is in one single language (Weidinger et al., 2021). As ChatGPT is operating across the world, cultural sensitivities to different regions are crucial to avoid biases (Dwivedi et al., 2023). When AI is used to assist in decision-making across different stages of employment, biases and opacity may exist (Chan, 2022). Stereotypes about specific genders, sexual orientations, races, or occupations are common in recommendations offered by generative AI. Hence, the representativeness, completeness, and diversity of the training data are essential to ensure fairness and avoid biases (Gonzalez, 2023). The use of synthetic data for training can increase the diversity of the dataset and address issues with sample-selection biases in the dataset (owing to class imbalances) (Chen et al., 2021). Generative AI applications should be tested and evaluated by a diverse group of users and subject experts. Additionally, increasing the transparency and explainability of generative AI can help in identifying and detecting biases so appropriate corrective measures can be taken."

Domain1. Discrimination & Toxicity

Subdomain1.1 > Unfair discrimination and misrepresentation

Entity2 - AI

Intent2 - Unintentional

Timing3 - Other

CategoryEthical Concerns

SubcategoryBias

Related techniques

Attack methods connected to this risk.

No linked attack methods. No AI attack method is connected to this risk in the current data.

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.

Included resource

Generative AI and ChatGPT: Applications, Challenges, and AI-Human Collaboration

AuthorsNah et al.Year2023TypeJournal Article

DOI10.1108/IJOES-05-2023-0107 URLhttps://doi.org/10.1108/IJOES-05-2023-0107

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repositoryhttps://airisk.mit.edu/