APromptRiskDBThreat intelligence atlas
AI Security Technique

Corrupt AI Model - AI Security Technique

An adversary may purposefully corrupt a malicious AI model file so that it cannot be successfully deserialized in order to evade detection by a model scanner. The corrupt model may still successfully execute malicious code before deserialization fails.

AI Security TechniquerealizedDefense Evasion

Record summary

A quick snapshot of what this page covers.

Tactics1Attacker goals connected to this method.
Mitigations0Defenses that may help against this attack.
AI risks0Research-backed risks connected to this topic.

Attack context

How this AI attack works in practice.

ATLAS ID
AML.T0076
Priority score
40
Maturity: realized
Defense Evasion

Mitigations

Defenses that may help against this attack.

No connected defenses. No defense is connected to this attack in the current data.

Case studies

Examples from public reports and exercises.

Malicious Models on Hugging Face

incident
Date2025-02-25

Researchers at ReversingLabs have identified malicious models containing embedded malware hosted on the Hugging Face model repository. The models were found to execute reverse shells when loaded, which grants the threat actor command and control capabilities on the victim's system. Hugging Face uses Picklescan to scan models for malicious code, however these models were not flagged as malicious. The researchers discovered that the model files were seemingly purposefully corrupted in a way that the malicious payload is executed before the model ultimately fails to de-serialize fully. Picklescan relied on being able to fully de-serialize the model.

Since becoming aware of this issue, Hugging Face has removed the models and has made changes to Picklescan to catch this particular attack. However, pickle files are fundamentally unsafe as they allow for arbitrary code execution, and there may be other types of malicious pickles that Picklescan cannot detect.

Source

Where this page information comes from.