Record summary
A quick snapshot of what this page covers.
Attack context
How this AI attack works in practice.
- ATLAS ID
- AML.T0056
- Priority score
- 79
Mitigations
Defenses that may help against this attack.
AML.M0020 - Generative AI Guardrails
Guardrails can prevent harmful inputs that can lead to meta prompt extraction.
AML.M0021 - Generative AI Guidelines
Model guidelines can instruct the model to refuse a response to unsafe inputs.
AML.M0022 - Generative AI Model Alignment
Model alignment can improve the parametric safety of a model by guiding it away from unsafe prompts and responses.
Case studies
Examples from public reports and exercises.
Source
Where this page information comes from.
Original source
Original source links
Open the public records and source datasets used for this page.