PromptRiskDB: Threat intelligence atlas
AI Risk

Intelligibility

"How can we build agents whose decisions we can understand? Connects explainable decisions (Berkeley) and informed oversight (MIRI)."

Record summary

A quick snapshot of what this page covers.

Techniques: 0. Attack methods connected to this risk.
Mitigations: 0. Defenses that may help with related attacks.
Domain: 7. AI System Safety, Failures, & Limitations. The broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

Domain: 7. AI System Safety, Failures, & Limitations
Subdomain: 7.4 > Lack of transparency or interpretability
Entity: 1 - Human
Intent: 2 - Unintentional
Timing: 1 - Pre-deployment
Category: Intelligibility
Subcategory: n/a

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.