PromptRiskDB: Threat intelligence atlas
AI Risk

Broadly-Scoped Goals

"Advanced AI systems are expected to develop objectives that span long timeframes,deal with complex tasks, and operate in open-ended settings (Ngo et al., 2024). ...However, it can also bring about the risk of encouraging manipulatingbehaviors (e.g., AI systems may take some bad actions to achieve human happiness, such as persuadingthem to do high-pressure jobs (Jacob Steinhardt, 2023))."


Record summary

A quick snapshot of what this page covers.

Techniques: 1 (Attack methods connected to this risk.)
Mitigations: 1 (Defenses that may help with related attacks.)
Domain: 7. AI System Safety, Failures, & Limitations (The broad risk area this belongs to.)

Risk profile

How this risk is described and categorized.

Domain: 7. AI System Safety, Failures, & Limitations
Subdomain: 7.2 > AI possessing dangerous capabilities
Entity: 1 - Human
Intent: 1 - Intentional
Timing: 2 - Post-deployment
Category: Double edge components
Subcategory: Broadly-Scoped Goals

Suggested mitigations

Defenses that may help with related attacks.

Code Signing

Lifecycle: Deployment
Category: Technical - Cyber

Source

Research source for this risk, when available.