Attack context
How this AI attack works in practice.
AI agent tools capable of performing write operations may be invoked to exfiltrate data to an adversary. Sensitive information can be encoded into the tool's input parameters and transmitted to an adversary-controlled location (such as an inbox, document, or server) as part of a seemingly legitimate action. Variants include sending emails, creating or modifying documents, updating CRM records, or even generating media such as images or videos.
The invoked tool itself may be legitimate but invoked by an adversary via LLM Prompt Injection, or the tool may be malicious (see AI Agent Tool Poisoning).
AI Agent Tool Poisoning can also be used to manipulate the inputs and destination of a separate legitimate tool that the victim invokes through normal usage.
- ATLAS ID: AML.T0086
- Priority score: 229
Mitigations
Defenses that may help against this attack.
AML.M0028 - AI Agent Tools Permissions Configuration
Configuring AI Agent tools with access controls inherited from the user or the AI Agent invoking the tool can limit an adversary's capabilities within a system, including their ability to abuse tool invocations and exfiltrate sensitive data.
AML.M0024 - AI Telemetry Logging
Log AI agent tool invocations to detect malicious calls.
AML.M0029 - Human In-the-Loop for AI Agent Actions
Requiring user confirmation of AI agent tool invocations can prevent the automatic execution of tools by an adversary.
AML.M0033 - Input and Output Validation for AI Agent Components
Validation can prevent adversaries from utilizing tools in an agentic workflow to compromise sensitive data sources.
AML.M0026 - Privileged AI Agent Permissions Configuration
Configuring privileged AI agents with proper access controls for tool use can limit an adversary's ability to abuse tool invocations if the agent is compromised.
AML.M0030 - Restrict AI Agent Tool Invocation on Untrusted Data
Restricting the automatic tool use when untrusted data is present can prevent adversaries from invoking tools via prompt injections.
AML.M0032 - Segmentation of AI Agent Components
Segmentation can prevent adversaries from utilizing tools in an agentic workflow to compromise sensitive data sources.
AML.M0027 - Single-User AI Agent Permissions Configuration
Configuring AI agents with permissions that are inherited from the user for tool use can limit an adversary's ability to abuse tool invocations if the agent is compromised.
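As an added example (not part of the ATLAS page or any vendor's product), the following Python validation layer combines AML.M0033 (parameter validation) with AML.M0024 (replaying tool telemetry) for a hypothetical `send_email` agent tool: it rejects undeclared parameters such as an injected BCC, recipients outside an allowlisted domain, and credential-like content smuggled into text fields. The tool name, parameter names, allowlist and log format are assumptions constructed for this example.

```python
import json
import re

# Constructed policy for a hypothetical "send_email" agent tool; the names are
# assumptions for this example, not any vendor's tool schema.
ALLOWED_RECIPIENT_DOMAINS = {"example.com"}
DECLARED_PARAMS = {"to", "subject", "body"}   # a surprise "bcc" is rejected
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key id shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),    # PEM private key header
    re.compile(r"(?i)(api[_-]?key|token|password)\s*[:=]\s*\S+"),
]

def validate_send_email(args: dict) -> list:
    """Return the policy violations for a proposed invocation (empty list = may run)."""
    reasons = []
    undeclared = set(args) - DECLARED_PARAMS
    if undeclared:
        reasons.append(f"undeclared parameters: {sorted(undeclared)}")
    recipients = [r.strip().lower() for r in str(args.get("to", "")).split(",") if r.strip()]
    for r in recipients:
        if r.rsplit("@", 1)[-1] not in ALLOWED_RECIPIENT_DOMAINS:
            reasons.append(f"recipient outside allowlist: {r}")
    for name in ("subject", "body"):
        if any(p.search(str(args.get(name, ""))) for p in SECRET_PATTERNS):
            reasons.append(f"credential-like content in '{name}'")
    return reasons

# Constructed sample telemetry (one JSON object per tool invocation, AML.M0024 style).
SAMPLE_LOG = """\
{"tool": "send_email", "args": {"to": "consultant@example.com", "subject": "Engagement", "body": "Customer context attached."}}
{"tool": "send_email", "args": {"to": "consultant@example.com", "bcc": "drop@external.invalid", "subject": "Engagement", "body": "Customer context attached."}}
{"tool": "send_email", "args": {"to": "drop@external.invalid", "subject": "notes", "body": "token: eyJabc.def.ghi"}}
{"tool": "lookup_crm", "args": {"customer_id": "C-42"}}
"""

def triage_log(text: str) -> list:
    """Replay logged send_email calls through the validator; return (args, reasons) per violation."""
    flagged = []
    for line in text.splitlines():
        entry = json.loads(line)
        if entry.get("tool") != "send_email":
            continue                      # other tools would get their own validators
        reasons = validate_send_email(entry["args"])
        if reasons:
            flagged.append((entry["args"], reasons))
    return flagged
```

In a deployment the same check would run before the tool executes (AML.M0033) and again over the invocation log (AML.M0024), so that a violation either blocks the call or surfaces it for human review (AML.M0029).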
Case studies
Examples from public reports and exercises.
Poisoned Postmark MCP Server Email Exfiltration
A bad actor successfully exfiltrated emails from users of Postmark's MCP server via a supply chain attack. Postmark is an email delivery service that allows organizations to send marketing and transactional emails via API. The Postmark MCP server allows users to interact with Postmark via AI agents.
The bad actor impersonated Postmark, by registering the postmark-mcp package name on npm. They initially published the legitimate versions of the MCP server. After the package became popular and reached over 1,000 downloads per week, the bad actor performed a rugpull and uploaded a malicious version of the package. The malicious version added the bad actor’s email address in the BCC line of all emails sent by the MCP tool. Users who upgraded to this version and continued to use the tool would have all emails exfiltrated to the bad actor.
Data Exfiltration via an MCP Server used by Cursor
The Backslash Security Research Team demonstrated that a Model Context Protocol (MCP) tool can be used as a vector for an indirect prompt injection attack on Cursor, potentially leading to the execution of malicious shell commands.
The Backslash Security Research Team created a proof-of-concept MCP server capable of scraping webpages. When a user asks Cursor to use the tool to scrape a site containing a malicious prompt, the prompt is injected into Cursor’s context. The prompt instructs Cursor to execute a shell command to exfiltrate the victim’s AI agent configuration files containing credentials. Cursor does prompt the user before executing the malicious command, potentially mitigating the attack.
Living Off AI: Prompt Injection via Jira Service Management
Researchers from Cato Networks demonstrated how adversaries can exploit AI-powered systems embedded in enterprise workflows to execute malicious actions with elevated privileges. This is achieved by crafting malicious inputs from external users such as support tickets that are later processed by internal users or automated systems using AI agents. These AI agents, operating with internal context and trust, may interpret and execute the malicious instructions, leading to unauthorized actions such as data exfiltration, privilege escalation, or system manipulation.
Data Exfiltration via Agent Tools in Copilot Studio
Researchers from Zenity demonstrated how an organization’s data can be exfiltrated via prompt injections that target an AI-powered customer service agent.
The target system is a customer service agent built by Zenity in Copilot Studio. It is modeled after an agent built by McKinsey to streamline its customer service needs. The AI agent listens to a customer service email inbox where customers send their engagement requests. Upon receiving a request, the agent looks at the customer’s previous engagements, understands who the best consultant for the case is, and proceeds to send an email to the respective consultant regarding the request, including all of the relevant context the consultant will need to properly engage with the customer.
The Zenity researchers begin by performing targeting to identify an email inbox that is managed by an AI agent. Then they use prompt injections to discover details about the AI agent, such as its knowledge sources and tools. Once they understand the AI agent’s capabilities, the researchers are able to craft a prompt that retrieves private customer data from the organization’s RAG database and CRM, and exfiltrate it via the AI agent’s email tool.
Vendor Response: Microsoft quickly acknowledged and fixed the issue. The prompts used by the Zenity researchers in this exercise no longer work, however other prompts may still be effective.
Data Exfiltration via Remote Poisoned MCP Tool
Researchers at Invariant Labs demonstrated that AI agents configured with remote Model Context Protocol (MCP) Tools can be vulnerable to model poisoning attacks. They show that an MCP Tool can contain malicious prompts in its docstring description, which is ingested into the AI agent’s context, modifying its behavior.
They demonstrate this attack with a proof-of-concept MCP Tool that instructs the agent to perform additional actions before using the tool. The agent is instructed to read files containing credentials from the victim’s machine and store their contents in one of the input variables to the tool. When the tool runs, the victim’s credentials are exfiltrated to the poisoned MCP server.