Indirect Prompt Injection Threats: Bing Chat Data Pirate - AI Case Study

When interacting with Microsoft's new Bing Chat LLM chatbot, a user can grant Bing Chat permission to view currently open websites throughout the chat session. Researchers demonstrated that an attacker can plant an injection in a website the user is visiting, which silently turns Bing Chat into a social engineer that seeks out and exfiltrates personal information. The user doesn't have to a...

Overview

Case steps: 5. Steps described in the case record.

Techniques: 5. Attack methods mentioned in the case steps.

Linked CVEs: 0. Known vulnerabilities mentioned in the record.

Risk patterns

Patterns found in the case record and its linked vulnerabilities.

1. Dominant ATLAS tactic. Resource Development appears in 1 case step.
2. Multiple attack methods. The case connects to 5 unique AI attack methods.

Procedure timeline

Steps per tactic: Resource Development 1, Defense Evasion 1, Execution 1, Initial Access 1, Impact 1

Step 1
Develop Capabilities
Resource Development

The attacker created a website containing malicious system prompts intended to influence the LLM's behavior. The model ingests these prompts when the user lets it access the website.

Step 2
LLM Prompt Obfuscation
Defense Evasion

The malicious prompts were obfuscated by setting their font size to 0, making them harder for a human to detect.
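The obfuscation in this step can be looked for mechanically. The following is a minimal sketch, assuming the hidden text is set with an inline `style` attribute; it does not cover stylesheets, CSS classes, or other hiding tricks, and the names `HiddenTextFinder` and `find_hidden_text` are hypothetical, not taken from the case record.

```python
import re
from html.parser import HTMLParser

# Matches an inline declaration that sets font-size to zero (0, 0px, 0.0em, ...).
ZERO_FONT = re.compile(
    r"font-size\s*:\s*0+(?:\.0+)?\s*(?:px|pt|em|rem|%)?\s*(?:;|!|$)",
    re.IGNORECASE,
)

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collects text that sits inside an element styled with font-size 0."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one bool per open element: is its text zero-sized?
        self.hidden_text = []  # text fragments a human reader would not see

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        inherited = bool(self.stack) and self.stack[-1]
        self.stack.append(inherited or bool(ZERO_FONT.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html):
    """Return the list of text fragments rendered at font-size 0."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

A page preprocessor could call `find_hidden_text(page_html)` before handing the page to the model and drop or flag any fragments it returns.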

Step 3
LLM Prompt Injection: Indirect
Execution

Bing Chat can see currently open websites if the user allows it. If the user has the adversary's website open, the malicious prompt will be executed.

Step 4
Spearphishing via Social Engineering LLM
Initial Access

The malicious prompt directs Bing Chat to adopt the conversational style of a pirate, to subtly convince the user to provide PII (e.g. their name), and to encourage the user to click on a link that has the user's PII encoded into the URL.
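The exfiltration channel in this step is a link whose URL carries the user's PII. As an illustration only, the sketch below checks whether a URL in a chatbot response contains values the user typed earlier in the session, either in plain form, percent-encoded, base64-encoded, or hex-encoded. The function name `leaked_values` and the choice of encodings are assumptions; the case record does not say how the PII was encoded, and short values will produce false positives.

```python
import base64
from urllib.parse import unquote, unquote_plus


def leaked_values(url, user_values):
    """Return the user-supplied values that appear to be embedded in the URL."""
    # Compare against the raw URL and two decoded views of it.
    views = [url, unquote(url), unquote_plus(url)]
    leaked = []
    for value in user_values:
        if not value:
            continue
        raw = value.encode("utf-8")
        # Case-sensitive encodings of the value, with base64 padding stripped.
        encoded = [
            base64.b64encode(raw).decode("ascii").rstrip("="),
            base64.urlsafe_b64encode(raw).decode("ascii").rstrip("="),
            raw.hex(),
        ]
        plain_hit = any(value.lower() in view.lower() for view in views)
        encoded_hit = any(token in view for token in encoded for view in views)
        if plain_hit or encoded_hit:
            leaked.append(value)
    return leaked
```

For example, `leaked_values(link, ["Alice"])` could be run on every link in a response before it is shown to the user, with any non-empty result treated as a reason to block or warn.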

Step 5
User Harm
Impact

With this information, the attacker can use the PII they have received for further identity-level attacks on the user, such as identity theft or fraud.

Mitigations

Defenses connected to the attack methods in this case.

AI Telemetry Logging

Implement logging of inputs and outputs of deployed AI models. When deploying AI agents, implement logging of the intermediate steps of agentic actions and decisions, data access and tool use, installation commands, and identity of the agent. Monitoring logs can help to detect security threats and mitigate impacts.

Additionally, having logging enabled can discourage adversaries who want to remain undetected from utilizing AI resources.
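A minimal sketch of input and output logging around a model call follows, using only the Python standard library. The wrapper `log_model_call`, the logger name `ai_telemetry`, and the record fields are hypothetical choices for illustration; `model_fn` stands in for whatever function actually calls the deployed model.

```python
import json
import logging
import time
import uuid

# Hypothetical logger name; the application decides where its handlers write.
logger = logging.getLogger("ai_telemetry")
logger.setLevel(logging.INFO)


def log_model_call(model_fn, prompt, context_sources):
    """Call model_fn(prompt) and log the input, its sources, and the output."""
    record = {
        "id": str(uuid.uuid4()),                   # correlates this call in later analysis
        "timestamp": time.time(),
        "prompt": prompt,
        "context_sources": list(context_sources),  # e.g. URLs of pages the model could read
    }
    response = model_fn(prompt)
    record["response"] = response
    logger.info(json.dumps(record))                # one JSON object per log line
    return response
```

Recording the context sources alongside each prompt matters for a case like this one, because the injected instructions arrive through a web page rather than through the user's own message. Calling `logging.basicConfig(level=logging.INFO)` once at startup is enough to see the records on the console.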

Deepfake Detection

Apply deepfake detection algorithms against any untrusted or user-provided data, especially in impactful applications such as biometric verification, to block generated content.

Detectors may use a combination of approaches, including:

AI models trained to differentiate between real and deepfake content.
Identifying common inconsistencies in deepfake content, such as unnatural facial movements, audio mismatches, or pixel-level artifacts.
Biometrics analysis, such as blinking, eye movements, and microexpressions.

Input and Output Validation for AI Agent Components

Implement validation on inputs and outputs for the tools and data sources used by AI agents. Validation includes enforcing a common data format, schema validation, checks for sensitive or prohibited information leakage, and data sanitization to remove potential injections or unsafe code. Input and output validation can help prevent compromises from spreading in AI-enabled systems and can help secure the workflow when multiple components are chained together. Validation should be performed external to the AI agent.
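The sketch below shows what such a check might look like when it runs outside the agent, on data returned by a page-reading tool. The record shape (`url` and `text` fields), the length cap, and the phrase list are all assumptions made for illustration; a phrase list in particular is easy to evade and is only a first filter.

```python
import re

MAX_TEXT_CHARS = 20_000  # assumed cap on page text passed to the model

# Assumed examples of instruction-like phrasing; not an exhaustive list.
SUSPICIOUS = re.compile(
    r"ignore (all )?(previous|prior) instructions|system prompt",
    re.IGNORECASE,
)


def validate_page_record(record):
    """Return (ok, problems) for a hypothetical {"url": str, "text": str} record."""
    if not isinstance(record, dict):
        return False, ["record is not a dict"]
    # Schema check: both fields must be present and be strings.
    problems = [
        f"missing or non-string field: {key}"
        for key in ("url", "text")
        if not isinstance(record.get(key), str)
    ]
    if problems:
        return False, problems
    if not record["url"].startswith("https://"):
        problems.append("url is not https")
    if len(record["text"]) > MAX_TEXT_CHARS:
        problems.append("text exceeds length cap")
    if SUSPICIOUS.search(record["text"]):
        problems.append("text contains instruction-like phrasing")
    return not problems, problems
```

Because the function takes plain data and returns a verdict, it can sit between the tool and the model, so a compromised model cannot talk its way past it.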

User Training

Educate AI model developers on AI supply chain risks and potentially malicious AI artifacts. Educate users on how to identify deepfakes and phishing attempts.

Source evidence

Original public records and references for this case.

Original source links

Open the MITRE ATLAS data and public references used for this case study.

Repository: https://github.com/mitre-atlas/atlas-data
ATLAS.yaml: https://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml
Schema: https://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json
Indirect Prompt Injection Threats: Bing Chat Data Pirate: https://greshake.github.io/