Manipulate User LLM Chat History - AI Security Technique

AI Security Technique

Adversaries may manipulate a user's large language model (LLM) chat history to cover the tracks of their malicious behavior. They may hide persistent changes they have made to the LLM's behavior, or obscure their attempts at discovering private information about the user. To do so, adversaries may delete or edit existing messages or create new threads as part of their coverup. This is feasible if the adversary has...

Overview

A source-backed snapshot of this AI security technique.

To do so, adversaries may delete or edit existing messages or create new threads as part of their coverup. This is feasible if the adversary has the victim's authentication tokens for the backend LLM service or if they have direct access to the victim's chat interface.

Chat interfaces (especially desktop interfaces) often do not show the injected prompt for any ongoing chat, as they update chat history only once when initially opening it. This can help the adversary's manipulations go unnoticed by the victim.

Tactics1Attacker goals connected to this method.

Mitigations0Defenses that may help against this attack.

AI risks0Research-backed risks connected to this topic.

Technique details

Identifiers, maturity, and source taxonomy for this technique.

ATLAS ID: AML.T0092
Maturity: demonstrated
Priority score: 40

ATLAS tactics

Defense Evasion

Attack flow

How to read the public records connected to this technique.

1. TechniqueRead the ATLAS description and evidence level.

2. TacticsSee which attacker goals this method supports.

3. ExamplesCheck whether public case studies mention it.

4. DefensesReview safeguards mapped by ATLAS.

5. SourcesOpen the original public records and references.

Impact

Why this technique may deserve attention in the current dataset.

Evidence leveldemonstrated
Mapped defenses0 ATLAS mitigation records
Public examples2 linked case study records
Research risks0 related MIT AI Risk records above the confidence threshold
Vulnerabilities0 linked CVE records

Mitigations

Defenses that may help against this attack.

View all mitigations →

No connected defenses. No defense is connected to this attack in the current data.

Case studies

Examples from public reports and exercises.

2 recordsView all case studies →

Exposed ClawdBot Control Interfaces Leads to Credential Access and Execution

A security researcher identified hundreds of exposed ClawdBot control interfaces on the public internet. ClawdBot (now OpenClaw) “is a personal AI assistant you run on your own devices. It answers you on the channels you already use … , plus extension channels. … It can speak and listen on macOS/iOS/Android, and can render a live Canvas you control.”^[1] The researcher was able to access credentials to a variety of connected applications via ClawdBot’s configuration file. They were also able to invoke ClawdBot’s skills by prompting it via the chat interface, leading to root access in the container.

The researcher searched Shodan^[2] to identify Clawdbot instances exposed on the public internet, some without authentication enabled. The researcher demonstrated that the ClawdBot’s authentication mechanism could be bypassed due to a proxy misconfiguration.

With access to ClawdBot’s control interface, they were then able to access ClawdBot’s configuration, which contained credentials to a variety of other services. Across various exposed instances of ClawdBot, they identified Anthropic API Keys, Telegram Bot Tokens, Slack Oauth Credentials, and Signal Device Linking URIs. The researcher prompted ClawdBot directly via the chat interface, which led to exposure of its system prompt. They were also able to get ClawdBot to execute commands via it’s bash skill, which at least in once instance led to root access in the ClawdBot container.

The researcher noted a broad range of other impacts they could have had with this level of access, including:

Manipulation of user chat history with the ClawdBot AI agent
Exfiltration of conversation histories of any connected messaging services
Impersonation of users by sending messages on their behalf via connected messaging services

References

Date2026-01-25

exercise

AIKatz: Attacking LLM Desktop Applications

Researchers at Lumia have demonstrated that it is possible to extract authentication tokens from the memory of LLM Desktop Applications. An attacker could then use those tokens to impersonate as the victim to the LLM backed, thereby gaining access to the victim’s conversations as well as the ability to interfere in future conversations. The attacker’s access would allow them the ability to directly inject prompts to change the LLM’s behavior, poison the LLM’s context to have persistent effects, manipulate the user’s conversation history to cover their tracks, and ultimately impact the confidentiality, integrity, and availability of the system. The researchers demonstrated this on Anthropic Claude, Microsoft M365 Copilot, and OpenAI ChatGPT.

Vendor Responses to Responsible Disclosure:

Anthropic (HackerOne) - Closed as informational since local attack.
Microsoft Security Response Center - Attack doesn’t bypass security boundaries for CVE.
OpenAI (BugCrowd) - Closed as informational and noted that it’s up to Microsoft to patch this behavior.

Date2025-01-01

exercise

Source evidence

Original public records and references for this page.

View all sources →

Original source

Original source links

Open the public records and source datasets used for this page.

Repositoryhttps://github.com/mitre-atlas/atlas-data ATLAS.yamlhttps://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml Schemahttps://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json