AI Service Proxies - AI Security Technique

Record summary

A quick snapshot of what this page covers.

Tactics: 0. Attacker goals connected to this method.

Mitigations: 0. Defenses that may help against this attack.

AI risks: 5. Research-backed risks connected to this topic.

Attack context

How this AI attack works in practice.

Adversaries may utilize commercial proxy services that resell access to AI services such as frontier model APIs.

This infrastructure can support large-scale distillation campaigns that carry out Exfiltration via AI Inference API. Adversaries may also use it to Generate Malicious Commands for offensive cyber operations, or to generate content for Spearphishing via Social Engineering LLM.

Commercial AI service proxies distribute traffic from different accounts and various cloud platforms. The mix of traffic can make malicious activity difficult to detect and block [1].
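The effect of spreading traffic can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a description of any real proxy: the account pool, the platform names, the request count, and the per-account alert threshold are all hypothetical values chosen for illustration.

```python
from collections import Counter
from itertools import cycle

# Hypothetical pools; names and sizes are illustrative assumptions only.
accounts = [f"acct-{i:03d}" for i in range(50)]
platforms = ["cloud-a", "cloud-b", "cloud-c"]


def route_requests(n_requests):
    """Assign each request to an (account, platform) pair in round-robin order."""
    account_iter = cycle(accounts)
    platform_iter = cycle(platforms)
    return [(next(account_iter), next(platform_iter)) for _ in range(n_requests)]


log = route_requests(30_000)

# What a provider sees when it counts requests per account or per platform.
per_account = Counter(account for account, _ in log)
per_platform = Counter(platform for _, platform in log)

# Hypothetical per-account alert threshold (an assumption, not a real provider limit).
PER_ACCOUNT_ALERT = 1_000
flagged = [acct for acct, count in per_account.items() if count > PER_ACCOUNT_ALERT]

print("total requests:", len(log))
print("max requests on any single account:", max(per_account.values()))
print("accounts over threshold:", len(flagged))
print("requests per platform:", dict(per_platform))
```

In this sketch no single account crosses the assumed threshold even though the aggregate volume is large, which is one way the mixed traffic described above can evade per-account monitoring.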

Malicious actors conduct LLM Jacking attacks to gain access to victim accounts, then resell that access through their proxy services [2].

References

ATLAS ID: AML.T0008.005
Priority score: 65

Maturity: realized

Mitigations

Defenses that may help against this attack.

No connected defenses. No defense is connected to this attack in the current data.

Case studies

Examples from public reports and exercises.

Model Distillation Campaigns Targeting Anthropic Claude

incident

Date: 2026-02-23

Anthropic uncovered campaigns to extract Claude’s capabilities carried out by three Chinese AI labs: DeepSeek, Moonshot, and MiniMax. Collectively, these campaigns used approximately 24,000 accounts and 16 million queries. The labs used model distillation to train their own models on Claude’s outputs in an attempt to replicate capabilities such as agentic reasoning, code generation, tool use, and computer use.
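For a sense of scale, the reported totals can be turned into a per-account average. This is simple arithmetic on the approximate figures above; the actual distribution of queries across accounts is not stated here, so the mean is only a rough indicator.

```python
# Approximate figures as reported in the case study above.
reported_accounts = 24_000
reported_queries = 16_000_000

# Mean queries per account; says nothing about how unevenly traffic was spread.
avg_queries_per_account = reported_queries / reported_accounts

print(f"average queries per account: {avg_queries_per_account:.0f}")
```

An average in the hundreds of queries per account is consistent with the point that widely distributed traffic is hard to distinguish from ordinary usage on a per-account basis.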

As outlined in Anthropic's report, these labs leveraged model distillation as a means to undermine export controls [1]. Distilled models lack the safeguards that prevent bad actors from using frontier models for malicious purposes such as bioweapon development, disinformation, offensive cyber operations, and mass surveillance.

References

[1] https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks

Related risks

Research-backed risks connected to this topic.

Attacking LLMs via Additional Modalities a

Confidence: 0.67

Domain: 2. Privacy & Security; Subdomain: 2.2 > AI system security vulnerabilities and attacks

"LLMs can now process modalities other than text, e.g. images or video frames (OpenAI, 2023c; Gemini Team, 2023). Several studies show that gradient-based attacks on multimodal models are easy and effective (Carlini e...

Adversarial AI: Data and Model Exfiltration Attacks

Confidence: 0.67

Domain: 2. Privacy & Security; Subdomain: 2.1 > Compromise of privacy by leaking or correctly inferring sensitive information

"Other forms of abuse can include privacy attacks that allow adversaries to exfiltrate or gain knowledge of the private training data set or other valuable assets. For example, privacy attacks such as membership infer...

Jailbreak of a multimodal model

Confidence: 0.67

Domain: 2. Privacy & Security; Subdomain: 2.2 > AI system security vulnerabilities and attacks

"Current generation multimodal (e.g., vision and language) GPAI models are vulnerable to adversarial jailbreak attacks. These attacks can be used to automatically induce a model to produce an arbitrary or specific out...

Model extraction

Confidence: 0.67

Domain: 2. Privacy & Security; Subdomain: 2.2 > AI system security vulnerabilities and attacks

"Data Exfiltration goes beyond revealing private information, and involves illicitly obtaining the training data used to build a model that may be sensitive or proprietary. Model Extraction is the same attack, only di...

Data exfiltration

Confidence: 0.67

Domain: 2. Privacy & Security; Subdomain: 2.2 > AI system security vulnerabilities and attacks

"Data Exfiltration goes beyond revealing private information, and involves illicitly obtaining the training data used to build a model that may be sensitive or proprietary. Model Extraction is the same attack, only di...

Related CVEs

Known software flaws linked to this context.

No related CVEs. No software flaw is connected to this attack in the current data.

Source

Where this page's information comes from.

Open the public records and source datasets used for this page.

Repository: https://github.com/mitre-atlas/atlas-data

ATLAS.yaml: https://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml

Schema: https://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json