APromptRiskDBThreat intelligence atlas
AI Security Technique

Exfiltration via AI Agent Tool Invocation - AI Security Technique

AI agent tools capable of performing write operations may be invoked to exfiltrate data to an adversary. Sensitive information can be encoded into the tool's input parameters and transmitted to an adversary-controlled location (such as an inbox, document, or server) as part of a seemingly legitimate action. Variants include sending emails, creating or modifying documents, updating CRM records, or even generating m...

AI Security TechniquerealizedExfiltration

Record summary

A quick snapshot of what this page covers.

Tactics1Attacker goals connected to this method.
Mitigations8Defenses that may help against this attack.
AI risks25Research-backed risks connected to this topic.

Attack context

How this AI attack works in practice.

AI agent tools capable of performing write operations may be invoked to exfiltrate data to an adversary. Sensitive information can be encoded into the tool's input parameters and transmitted to an adversary-controlled location (such as an inbox, document, or server) as part of a seemingly legitimate action. Variants include sending emails, creating or modifying documents, updating CRM records, or even generating media such as images or videos.

The invoked tool itself may be legitimate but invoked by an adversary via LLM Prompt Injection, or the tool may be malicious (See AI Agent Tool Poisoning.

AI Agent Tool Poisoning can also be used manipulate the inputs and destination of a separate legitimate tool, invoked through normal usage by the victim.

ATLAS ID
AML.T0086
Priority score
229
Maturity: realized
Exfiltration

Mitigations

Defenses that may help against this attack.

AML.M0028 - AI Agent Tools Permissions Configuration

Deployment
LifecycleDeploymentCategoryTechnical - Cyber

Configuring AI Agent tools with access controls inherited from the user or the AI Agent invoking the tool can limit an adversary's capabilities within a system, including their ability to abuse tool invocations and exfiltrate sensitive data.

AML.M0024 - AI Telemetry Logging

DeploymentMonitoring and Maintenance
LifecycleDeployment + 1 moreCategoryTechnical - Cyber

Log AI agent tool invocations to detect malicious calls.

AML.M0032 - Segmentation of AI Agent Components

DeploymentBusiness and Data Understanding
LifecycleDeployment + 1 moreCategoryTechnical - Cyber

Segmentation can prevent adversaries from utilizing tools in an agentic workflow to compromise sensitive data sources.

Case studies

Examples from public reports and exercises.

Poisoned Postmark MCP Server Email Exfiltration

incident
Date2025-09-01

A bad actor successfully exfiltrated emails from users of the Postmark’s MCP server via a supply chain attack. Postmark is an email delivery service that allows organizations to send marketing and transactional emails via API. The Postmark MCP server allows users to interact with Postmark via AI agents.

The bad actor impersonated Postmark, by registering the postmark-mcp package name on npm. They initially published the legitimate versions of the MCP server. After the package became popular and reached over 1,000 downloads per week, the bad actor performed a rugpull and uploaded a malicious version of the package. The malicious version added the bad actor’s email address in the BCC line of all emails sent by the MCP tool. Users who upgraded to this version and continued to use the tool would have all emails exfiltrated to the bad actor.

Data Exfiltration via an MCP Server used by Cursor

exercise
Date2025-06-24

The Backslash Security Research Team demonstrated that a Model Context Protocol (MCP) tool can be used as a vector for an indirect prompt injection attack on Cursor, potentially leading to the execution of malicious shell commands.

The Backslash Security Research Team created a proof-of-concept MCP server capable of scraping webpages. When a user asks Cursor to use the tool to scrape a site containing a malicious prompt, the prompt is injected into Cursor’s context. The prompt instructs Cursor to execute a shell command to exfiltrate the victim’s AI agent configuration files containing credentials. Cursor does prompt the user before executing the malicious command, potentially mitigating the attack.

Living Off AI: Prompt Injection via Jira Service Management

exercise
Date2025-06-19

Researchers from Cato Networks demonstrated how adversaries can exploit AI-powered systems embedded in enterprise workflows to execute malicious actions with elevated privileges. This is achieved by crafting malicious inputs from external users such as support tickets that are later processed by internal users or automated systems using AI agents. These AI agents, operating with internal context and trust, may interpret and execute the malicious instructions, leading to unauthorized actions such as data exfiltration, privilege escalation, or system manipulation.

Data Exfiltration via Agent Tools in Copilot Studio

exercise
Date2025-06-01

Researchers from Zenity demonstrated how an organization’s data can be exfiltrated via prompt injections that target an AI-powered customer service agent.

The target system is a customer service agent built by Zenity in Copilot Studio. It is modeled after an agent built by McKinsey to streamline its customer service needs. The AI agent listens to a customer service email inbox where customers send their engagement requests. Upon receiving a request, the agent looks at the customer’s previous engagements, understands who the best consultant for the case is, and proceeds to send an email to the respective consultant regarding the request, including all of the relevant context the consultant will need to properly engage with the customer.

The Zenity researchers begin by performing targeting to identify an email inbox that is managed by an AI agent. Then they use prompt injections to discover details about the AI agent, such as its knowledge sources and tools. Once they understand the AI agent’s capabilities, the researchers are able to craft a prompt that retrieves private customer data from the organization’s RAG database and CRM, and exfiltrate it via the AI agent’s email tool.

Vendor Response: Microsoft quickly acknowledged and fixed the issue. The prompts used by the Zenity researchers in this exercise no longer work, however other prompts may still be effective.

Data Exfiltration via Remote Poisoned MCP Tool

exercise
Date2025-04-01

Researchers at Invariant Labs demonstrated that AI agents configured with remote Model Context Protocol (MCP) Tools can be vulnerable to model poisoning attacks. They show that an MCP Tool can contain malicious prompts in its docstring description, which is ingested into the AI agent’s context, modifying its behavior.

They demonstrate this attack with a proof-of-concept MCP Tool that instructs the agent to perform additional actions before using the tool. The agent is instructed to read files containing credentials from the victim’s machine and store their contents in one of the input variables to the tool. When the tool runs, the victim’s credentials are exfiltrated to the poisoned MCP server.

Source

Where this page information comes from.