APromptRiskDBThreat intelligence atlas
AI Risk

Jailbreaking

"A jailbreaking attack attempts to break through the guardrails that are established in the model to perform restricted actions."

AI Risk2. Privacy & Security2.2 > AI system security vulnerabilities and attacks2 - Post-deployment

Record summary

A quick snapshot of what this page covers.

Techniques7Attack methods connected to this risk.
Mitigations5Defenses that may help with related attacks.
Domain2. Privacy & SecurityThe broad risk area this belongs to.

Risk profile

How this risk is described and categorized.

Domain2. Privacy & Security
Subdomain2.2 > AI system security vulnerabilities and attacks
Entity1 - Human
Intent1 - Intentional
Timing2 - Post-deployment
CategoryInference risks (Multi-category)
SubcategoryJailbreaking

Suggested mitigations

Defenses that may help with related attacks.

AI Telemetry Logging

DeploymentMonitoring and Maintenance
LifecycleDeployment + 1 moreCategoryTechnical - Cyber

Generative AI Guardrails

ML Model EngineeringML Model Evaluation+1 more
LifecycleML Model Engineering + 2 moreCategoryTechnical - ML

Generative AI Guidelines

ML Model EngineeringML Model Evaluation+1 more
LifecycleML Model Engineering + 2 moreCategoryTechnical - ML

Source

Research source for this risk, when available.