PromptRiskDB: Threat intelligence atlas

Model Distillation Campaigns Targeting Anthropic Claude - AI Case Study

Anthropic uncovered campaigns by three Chinese AI labs (DeepSeek, Moonshot, and MiniMax) to extract Claude’s capabilities. Collectively, these campaigns used approximately 24,000 accounts and 16 million queries. The labs used model distillation, training their own models on Claude’s outputs, in an attempt to replicate capabilities such as agentic reasoning, code generation, tool use, and com...

Incident | Anthropic Claude | DeepSeek, Moonshot AI, MiniMax | Impact | Resource Development | AI Model Access

Overview

Case steps: 7. Steps described in the case record.
Techniques: 7. Attack methods mentioned in the case steps.
Linked CVEs: 0. Known vulnerabilities mentioned in the record.

Risk patterns

Patterns found in the case record and its linked vulnerabilities.

  1. Dominant ATLAS tactic. Impact appears in 3 case steps.
  2. Multiple attack methods. The case connects to 7 unique AI attack methods.
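
The dominant-tactic pattern is a simple tally. As a minimal sketch, assuming the per-tactic step counts listed in the procedure timeline of this record (the variable names are illustrative and not part of any PromptRiskDB or ATLAS API):

```python
from collections import Counter

# Per-tactic step counts as listed in the case record's procedure timeline.
tactic_step_counts = Counter({
    "Impact": 3,
    "Resource Development": 2,
    "AI Model Access": 1,
    "Exfiltration": 1,
})

# Total number of case steps across all tactics.
total_steps = sum(tactic_step_counts.values())

# The dominant tactic is the one that appears in the most steps.
dominant_tactic, dominant_count = tactic_step_counts.most_common(1)[0]

print(f"{dominant_tactic}: {dominant_count} of {total_steps} steps")
# prints "Impact: 3 of 7 steps"
```

The total of 7 agrees with the case-step count in the overview.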

Procedure timeline


Impact: 3 | Resource Development: 2 | AI Model Access: 1 | Exfiltration: 1
  1. Resource Development

    DeepSeek, Moonshot AI, and MiniMax used commercial proxy services to gain access to Claude. This circumvented Anthropic’s policy of not offering commercial access to Claude in China.

  2. Exfiltration

    DeepSeek, Moonshot AI, and MiniMax used their generated prompts to repeatedly query Claude and train their own models from the responses. Collectively, the labs issued over 16 million queries during their distillation campaigns.

  3. Impact

    DeepSeek, Moonshot AI, and MiniMax acquired Claude’s capabilities via distillation at a fraction of the cost of developing their own models. They targeted Claude’s most differentiated capabilities including agentic reasoning, tool use, and code generation.

  4. Impact

    The distilled models lack safeguards and could be used for malicious purposes such as offensive cyber operations, disinformation campaigns, mass surveillance, and censorship.

  5. Impact (Step 7: User Harm)

    The distilled models lack Claude's safety guardrails, potentially exposing users to harmful outputs and behaviors.
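
For a sense of scale, the approximate figures in the case summary (24,000 accounts, 16 million queries) imply several hundred queries per account. A back-of-the-envelope sketch, assuming queries were spread evenly across accounts, which the record does not state:

```python
# Figures from the case summary; both are approximate.
ACCOUNTS = 24_000
QUERIES = 16_000_000

# Hypothetical simplification: an even split of queries across accounts.
avg_queries_per_account = QUERIES / ACCOUNTS

print(f"~{avg_queries_per_account:.0f} queries per account on average")
# prints "~667 queries per account on average"
```

The real distribution across accounts and across the three labs is not given in this record, so the average is only an order-of-magnitude indicator.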

Mitigations

Defenses connected to the attack methods in this case.

Sources

Original public records and references for this case.