PromptRiskDB: Threat intelligence atlas
AI Risk

Lower performance for some languages and social groups

Record summary

A quick snapshot of what this page covers.

Techniques: 0 (attack methods connected to this risk)
Mitigations: 0 (defenses that may help with related attacks)
Domain: 1. Discrimination & Toxicity (the broad risk area this belongs to)

Risk profile

How this risk is described and categorized.

"LMs are typically trained in few languages, and perform less well in other languages [95, 162]. In part, this is due to unavailability of training data: there are many widely spoken languages for which no systematic efforts have been made to create labelled training datasets, such as Javanese which is spoken by more than 80 million people [95]. Training data is particularly missing for languages that are spoken by groups who are multilingual and can use a technology in English, or for languages spoken by groups who are not the primary target demographic for new technologies."

Domain: 1. Discrimination & Toxicity
Subdomain: 1.3 > Unequal performance across groups
Entity: 2 - AI
Intent: 2 - Unintentional
Timing: 2 - Post-deployment
Category: Risk area 1: Discrimination, Hate speech and Exclusion
Subcategory: Lower performance for some languages and social groups

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.