Meta-cognition - PromptRiskDB

Record summary

A quick snapshot of what this page covers.

Techniques: 1. Attack methods connected to this risk.

Mitigations: 2. Defenses that may help with related attacks.

Domain: 7. AI System Safety, Failures, & Limitations. The broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

"Agents that reason about their own computational resources and logically uncertain events can encounter strange paradoxes due to Gödelian limitations (Fallenstein and Soares, 2015; Soares and Fallenstein, 2014, 2017) and shortcomings of probability theory (Soares and Fallenstein, 2014, 2015, 2017). They may also be reflectively unstable, preferring to change the principles by which they select actions (Arbital, 2018)."

Domain: 7. AI System Safety, Failures, & Limitations

Subdomain: 7.3 > Lack of capability or robustness

Entity: 3 - Other

Intent: 2 - Unintentional

Timing: 3 - Other

Category: Meta-cognition

Subcategory: n/a

Related techniques

Attack methods connected to this risk.

AML.T0046 - Spamming AI System with Chaff Data

feasible

Method: text_similarity_sqlite. Confidence: 54%

Suggested mitigations

Defenses that may help with related attacks.

Restrict Number of AI Model Queries

Business and Data Understanding; Deployment; +1 more

Lifecycle: Business and Data Understanding + 2 more. Category: Technical - Cyber

Control Access to AI Models and Data in Production

Deployment; Monitoring and Maintenance

Lifecycle: Deployment + 1 more. Category: Policy

Source

Research source for this risk, when available.

Included resource

AGI Safety Literature Review

Authors: Everitt, Lea & Hutter. Year: 2018. Type: Preprint

DOI: 10.48550/arXiv.1805.01109

URL: https://arxiv.org/pdf/1805.01109

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/