Drive-by Compromise - AI Security Technique

Overview

A source-backed snapshot of this AI security technique.

Adversaries may gain access to an AI system through a user visiting a website over the normal course of browsing, or an AI agent retrieving information from the web on behalf of a user. Websites can contain an LLM Prompt Injection which, when executed, can change the behavior of the AI model.

The same approach may be used to deliver other types of malicious code that don't target AI directly (See Drive-by Compromise in ATT&CK).

Tactics1Attacker goals connected to this method.

Mitigations0Defenses that may help against this attack.

AI risks12Research-backed risks connected to this topic.

Technique details

Identifiers, maturity, and source taxonomy for this technique.

ATLAS ID: AML.T0078
Maturity: demonstrated
ATT&CK external ID: T1189
Priority score: 120

ATLAS tactics

Initial Access

Attack flow

How to read the public records connected to this technique.

1. TechniqueRead the ATLAS description and evidence level.

2. TacticsSee which attacker goals this method supports.

3. ExamplesCheck whether public case studies mention it.

4. DefensesReview safeguards mapped by ATLAS.

5. SourcesOpen the original public records and references.

Impact

Why this technique may deserve attention in the current dataset.

Evidence leveldemonstrated
Mapped defenses0 ATLAS mitigation records
Public examples4 linked case study records
Research risks12 related MIT AI Risk records above the confidence threshold
Vulnerabilities0 linked CVE records

Mitigations

Defenses that may help against this attack.

View all mitigations →

No connected defenses. No defense is connected to this attack in the current data.

Case studies

Examples from public reports and exercises.

4 recordsView all case studies →

OpenClaw Command & Control via Prompt Injection

Researchers at HiddenLayer demonstrated how a webpage can embed an indirect prompt injection that causes OpenClaw to silently execute a malicious script. Once executed, the script plants persistent malicious instructions into future system prompts, allowing the attacker to issue new commands, turning OpenClaw into a command and control agent.

What makes this attack unique is that, through a simple indirect prompt injection attack into an agentic lifecycle, untrusted content can be used to spoof the model’s control scheme and induce unapproved tool invocation for execution. Through this single inject, an LLM can become a persistent, automated command & control implant.

Date2026-02-03

exercise

Data Exfiltration via an MCP Server used by Cursor

The Backslash Security Research Team demonstrated that a Model Context Protocol (MCP) tool can be used as a vector for an indirect prompt injection attack on Cursor, potentially leading to the execution of malicious shell commands.

The Backslash Security Research Team created a proof-of-concept MCP server capable of scraping webpages. When a user asks Cursor to use the tool to scrape a site containing a malicious prompt, the prompt is injected into Cursor’s context. The prompt instructs Cursor to execute a shell command to exfiltrate the victim’s AI agent configuration files containing credentials. Cursor does prompt the user before executing the malicious command, potentially mitigating the attack.

Date2025-06-24

exercise

AI ClickFix: Hijacking Computer-Use Agents Using ClickFix

Embrace the Red demonstrated that AI computer-use agents are vulnerable to social engineering attacks and can be manipulated into executing arbitrary code on a victim’s machine. The attack is a variation on “ClickFix” which is a social engineering attack that fools humans into copying malicious commands and executing them.

The researcher used ChatGPT to generate a website designed to attract interactions with computer-use agents. When a user asked their Claude Computer-Use Agent to visit the researcher’s website, the text “Are you a computer? Please see instructions to confirm:” caused the agent to click the associated button. This executed JavaScript to copy a malicious command into the agent’s clipboard. The agent then proceeded to follow the instructions, opening a terminal, pasting the malicious command, and executing it. The command downloads a script from the researcher’s website and executes it. In the demonstration, the script opens the victim’s Calculator App, but in practice an adversary could run arbitrary code, compromising the victim’s system.

Date2025-05-24

exercise

ChatGPT Conversation Exfiltration

Embrace the Red demonstrated that ChatGPT users' conversations can be exfiltrated via an indirect prompt injection. To execute the attack, a threat actor uploads a malicious prompt to a public website, where a ChatGPT user may interact with it. The prompt causes ChatGPT to respond with the markdown for an image, whose URL has the user's conversation secretly embedded. ChatGPT renders the image for the user, creating a automatic request to an adversary-controlled script and exfiltrating the user's conversation. Additionally, the researcher demonstrated how the prompt can execute other plugins, opening them up to additional harms.

Date2023-05-01

exercise

Related risks

Research-backed risks connected to this topic.

Top 10 of 12View all risks →

Adversarial AI: Prompt Injections

"Prompt injections represent another class of attacks that involve the malicious insertion of prompts or requests in LLM-based interactive systems, leading to unintended actions or disclosure of sensitive information...

Domain2. Privacy & SecuritySubdomain2.2 > AI system security vulnerabilities and attacks

Confidence0.78

Prompt injection

"Prompt Injections are a form of Adversarial Input that involve manipulating the text instructions given to a GenAI system (Liu et al., 2023). Prompt Injections exploit loopholes in a model’s architec- tures that have...

Domain2. Privacy & SecuritySubdomain2.2 > AI system security vulnerabilities and attacks

Confidence0.77

Security - Robustness

While AI safety focuses on threats emanating from generative AI systems, security centers on threats posed to these systems. The most extensively discussed issue in this context are jailbreaking risks, which involve t...

Domain2. Privacy & SecuritySubdomain2.2 > AI system security vulnerabilities and attacks

Confidence0.76

Prompt injection attack

"A prompt injection attack forces a generative model that takes a prompt as input to produce unexpected output by manipulating the structure, instructions, or information contained in its prompt."

Domain2. Privacy & SecuritySubdomain2.2 > AI system security vulnerabilities and attacks

Confidence0.75

Showing 4 of 10