Record summary
A quick snapshot of what this page covers.
Control summary
What this defense is meant to help prevent.
Guidelines are safety controls that are placed between user-provided input and a generative AI model to help direct the model to produce desired outputs and prevent undesired outputs.
Guidelines can be implemented as instructions appended to all user prompts or as part of the instructions in the system prompt. They can define the goal(s), role, and voice of the system, as well as outline safety and security parameters.
- ATLAS ID
- AML.M0021
- Priority score
- 35
Covered techniques
Attacks this defense is designed to help with.
AML.T0053 - AI Agent Tool Invocation
Model guidelines can instruct the model to refuse a response to unsafe inputs.
AML.T0062 - Discover LLM Hallucinations
Guidelines can instruct the model to avoid producing hallucinated content.
AML.T0056 - Extract LLM System Prompt
Model guidelines can instruct the model to refuse a response to unsafe inputs.
AML.T0057 - LLM Data Leakage
Model guidelines can instruct the model to refuse a response to unsafe inputs.
AML.T0054 - LLM Jailbreak
Model guidelines can instruct the model to refuse a response to unsafe inputs.
AML.T0051 - LLM Prompt Injection
Model guidelines can instruct the model to refuse a response to unsafe inputs.
AML.T0061 - LLM Prompt Self-Replication
Guidelines can help instruct the model to produce more secure output, preventing the model from generating self-replicating outputs.
Source
Where this page information comes from.
Original source
Original source links
Open the public records and source datasets used for this page.