Record summary
A quick snapshot of what this page covers.
Risk profile
How this risk is described and categorized.
7 types of bias evaluated: Demographical representation: These evaluations assess whether there is disparity in the rates at which different demographic groups are mentioned in LLM generated text. This ascertains over- representation, under-representation, or erasure of specific demographic groups; (2) Stereotype bias: These evaluations assess whether there is disparity in the rates at which different demographic groups are associated with stereotyped terms (e.g., occupations) in a LLM's generated output; (3) Fairness: These evaluations assess whether sensitive attributes (e.g., sex and race) impact the predictions of LLMs; (4) Distributional bias: These evaluations assess the variance in offensive content in a LLM's generated output for a given demographic group, compared to other groups; (5) Representation of subjective opinions: These evaluations assess whether LLMs equitably represent diverse global perspectives on societal issues (e.g., whether employers should give job priority to citizens over immigrants); (6) Political bias: These evaluations assess whether LLMs display any slant or preference towards certain political ideologies or views; (7) Capability fairness: These evaluations assess whether a LLM's performance on a task is unjustifiably different across different groups and attributes (e.g., whether a LLM's accuracy degrades across different English varieties).
Suggested mitigations
Defenses that may help with related attacks.
Source
Research source for this risk, when available.
Included resource
Cataloguing LLM Evaluations
Original source
MIT AI Risk Repository
Open the public repository used for AI risk records and taxonomy fields.
