Overview
Risk patterns
Patterns found in the case record and its linked vulnerabilities.
- 1Dominant ATLAS tactic. Persistence appears in 2 case steps.
- 2Multiple attack methods. The case connects to 6 unique AI attack methods.
Procedure timeline
Search the case steps or filter them by attacker goal.
-
Resource Development
Step 1
LLM Prompt Crafting
The researcher crafted a basic prompt asking to set the memory context with a bulleted list of incorrect facts.
-
Defense Evasion
Step 2
LLM Prompt Obfuscation
The researcher placed the prompt in a Google Doc hidden in the header with tiny font matching the document’s background color to make it invisible.
-
Initial Access The Google Doc was shared with the victim, making it accessible to ChatGPT’s via its Connected App feature.
-
Execution
Step 4
Indirect
When a user referenced something in the shared document, its contents was added to the chat context, and the prompt was executed by ChatGPT.
-
Persistence
Step 5
Memory
The prompt caused new memories to be introduced, changing the behavior of ChatGPT. The chat window indicated that the memory has been set, despite the lack of human verification or intervention. All future chat sessions will use the poisoned memory store.
-
Persistence The memory poisoning prompt injection persists in the shared Google Doc, where it can spread to other users and chat sessions, making it difficult to trace sources of the memories and remove.
-
Impact
Step 7
User Harm
The victim can be misinformed, misled, or influenced as directed by ChatGPT's poisoned memories.
Mitigations
Defenses connected to the attack methods in this case.
Sources
Original public records and references for this case.
Original source
Original source links
Open the MITRE ATLAS data and public references used for this case study.