PromptRiskDBThreat intelligence atlas
AI Risk

Transparency - Explainability

Being a multifaceted concept, the term 'transparency' is both used to refer to technical explainability as well as organizational openness. Regarding the former, papers underscore the need for mechanistic interpretability and for explaining internal mechanisms in generative models. On the organizational front, transparency relates to practices such as informing users about capabilities and shortcomings of models...

AI Risk7. AI System Safety, Failures, & Limitations7.4 > Lack of transparency or interpretability4 - Not coded

Record summary

A quick snapshot of what this page covers.

Techniques0Attack methods connected to this risk.
Mitigations0Defenses that may help with related attacks.
Domain7. AI System Safety, Failures, & LimitationsThe broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

Being a multifaceted concept, the term 'transparency' is both used to refer to technical explainability as well as organizational openness. Regarding the former, papers underscore the need for mechanistic interpretability and for explaining internal mechanisms in generative models. On the organizational front, transparency relates to practices such as informing users about capabilities and shortcomings of models, as well as adhering to documentation and reporting requirements for data collection processes or risk evaluations.

Domain7. AI System Safety, Failures, & Limitations
Subdomain7.4 > Lack of transparency or interpretability
Entity4 - Not coded
Intent4 - Not coded
Timing4 - Not coded
CategoryTransparency - Explainability
Subcategoryn/a

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.