AI Risks

Common risks that can arise when AI systems are built, deployed, or used.

Showing 821-840 of 1686 records

Specification gaming

Specification gaming is an AI risk in 7. AI System Safety, Failures, & Limitations focused on 7.1 > AI pursuing its own goals in conflict with human goals or...

Agency

Agency is an AI risk in 7. AI System Safety, Failures, & Limitations focused on 7.2 > AI possessing dangerous capabilities. It is most relevant during 4 - No...

Model sensitivity to prompt formatting

Model sensitivity to prompt formatting is an AI risk in 7. AI System Safety, Failures, & Limitations focused on 7.3 > Lack of capability or robustness. It is...

Lack of understanding of in-context learning in language models

Lack of understanding of in-context learning in language models is an AI risk in 7. AI System Safety, Failures, & Limitations focused on 7.4 > Lack of transp...

Knowledge conflicts in retrieval-augmented LLMs

Knowledge conflicts in retrieval-augmented LLMs is an AI risk in 7. AI System Safety, Failures, & Limitations focused on 7.3 > Lack of capability or robustne...

Models distracted by irrelevant context

Models distracted by irrelevant context is an AI risk in 7. AI System Safety, Failures, & Limitations focused on 7.3 > Lack of capability or robustness. It i...

Encoded reasoning

Encoded reasoning is an AI risk in 7. AI System Safety, Failures, & Limitations focused on 7.2 > AI possessing dangerous capabilities. It is most relevant du...

Model outputs inconsistent with chain-of-thought reasoning

Model outputs inconsistent with chain-of-thought reasoning is an AI risk in 7. AI System Safety, Failures, & Limitations focused on 7.4 > Lack of transparenc...

Biases are not accurately reflected in explanations

Biases are not accurately reflected in explanations is an AI risk in 1. Discrimination & Toxicity focused on 1.1 > Unfair discrimination and misrepresentatio...

Misunderstanding or overestimating the results and scope of interpretability techniques

Misunderstanding or overestimating the results and scope of interpretability techniques is an AI risk focused on X.1 > Excluded. It is most relevant during 4...

Auditor failure

Auditor failure is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 > Governance failure. It is most relevant during 1 - Pre-deployment.

Auditor capacity mismatch

Auditor capacity mismatch is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 > Governance failure. It is most relevant during 1 - Pre-deploym...

Conflicts of interest in auditor selection

Conflicts of interest in auditor selection is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 > Governance failure. It is most relevant durin...

Model Evaluations (Auditing)

Model Evaluations (Auditing) is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 > Governance failure. It is most relevant during 1 - Pre-depl...

Benchmark Limitations (Underestimating capabilities that are not covered by benchmarks)

Benchmark Limitations (Underestimating capabilities that are not covered by benchmarks) is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 >...

Benchmark Limitations (Insufficient benchmarks for AI safety evaluation)

Benchmark Limitations (Insufficient benchmarks for AI safety evaluation) is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 > Governance fail...

Benchmark Inaccuracy (Benchmark saturation)

Benchmark Inaccuracy (Benchmark saturation) is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 > Governance failure. It is most relevant duri...

Benchmark Inaccuracy (Benchmarks may not accurately evaluate capabilities)

Benchmark Inaccuracy (Benchmarks may not accurately evaluate capabilities) is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 > Governance fa...

Benchmarking (Annotation contamination)

Benchmarking (Annotation contamination) is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 > Governance failure. It is most relevant during 1...

Benchmarking (Guideline contamination)

Benchmarking (Guideline contamination) is an AI risk in 6. Socioeconomic and Environmental focused on 6.5 > Governance failure. It is most relevant during 1...