APromptRiskDBThreat intelligence atlas
AI Security Technique

AI Agent Context Poisoning - AI Security Technique

Adversaries may attempt to manipulate the context used by an AI agent's large language model (LLM) to influence the responses it generates or actions it takes. This allows an adversary to persistently change the behavior of the target agent and further their goals. Context poisoning can be accomplished by prompting the an LLM to add instructions or preferences to memory (See Memory) or...

AI Security TechniquedemonstratedPersistence

Record summary

A quick snapshot of what this page covers.

Tactics1Attacker goals connected to this method.
Mitigations1Defenses that may help against this attack.
AI risks12Research-backed risks connected to this topic.

Attack context

How this AI attack works in practice.

Adversaries may attempt to manipulate the context used by an AI agent's large language model (LLM) to influence the responses it generates or actions it takes. This allows an adversary to persistently change the behavior of the target agent and further their goals.

Context poisoning can be accomplished by prompting the an LLM to add instructions or preferences to memory (See Memory) or by simply prompting an LLM that uses prior messages in a thread as part of its context (See Thread).

ATLAS ID
AML.T0080
Priority score
83
Maturity: demonstrated
Persistence

Mitigations

Defenses that may help against this attack.

AML.M0031 - Memory Hardening

ML Model EngineeringDeployment+1 more
LifecycleML Model Engineering + 2 moreCategoryTechnical - ML

Memory hardening can help protect LLM memory from manipulation and prevent poisoned memories from executing.

Case studies

Examples from public reports and exercises.

No case studies found. No public example is connected to this attack in the current data.

Source

Where this page information comes from.