Overview
Risk patterns
Patterns found in the case record and its linked vulnerabilities.
- Dominant ATLAS tactic. AI Model Access appears in 1 case step.
- Multiple attack methods. The case connects to 6 unique AI attack methods.
Procedure timeline
-
AI Model Access The researchers use the public ChatGPT API throughout this exercise.
-
Discovery The researchers prompt ChatGPT to suggest software packages and identify suggestions that are hallucinations, meaning packages that don't exist in a public package repository. For example, when asked "how to upload a model to huggingface?", the model's response included guidance to install the huggingface-cli package by running pip install huggingface-cli. This package was a hallucination and did not exist on PyPI. The actual HuggingFace CLI tool is part of the huggingface_hub package.
-
Resource Development An adversary could upload a malicious package under the hallucinated name to PyPI or other package registries. In practice, the researchers uploaded an empty package to PyPI to track downloads.
-
Initial Access
Step 4
AI Software
A user of ChatGPT or another LLM may ask a similar question, receive the same hallucinated package name, and download the malicious package. The researchers showed that multiple LLMs can produce the same hallucinations. They tracked over 30,000 downloads of the huggingface-cli package.
-
Execution
Step 5
Malicious Package
The user would ultimately load the malicious package, allowing for arbitrary code execution.
-
Impact
Step 6
User Harm
This could lead to a variety of harms to the end user or organization.
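The Discovery step above can be sketched in Python as a rough illustration, not the researchers' actual tooling. The function names (extract_pip_packages, exists_on_pypi, find_unregistered) are hypothetical, the parsing is a naive heuristic that assumes each pip install command sits on its own line, and the existence check assumes PyPI's JSON endpoint at https://pypi.org/pypi/<name>/json returns HTTP 404 for unregistered projects.

```python
from __future__ import annotations

import re
import urllib.error
import urllib.request
from typing import Callable

# Naive heuristic: capture the rest of the line after "pip install".
PIP_INSTALL = re.compile(r"pip3?[ \t]+install[ \t]+([^\n`]+)")
# Rough shape of a valid package name (letters, digits, ".", "_", "-").
VALID_NAME = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")


def extract_pip_packages(text: str) -> list[str]:
    """Return package names found after `pip install` in an LLM response."""
    names: list[str] = []
    for match in PIP_INSTALL.finditer(text):
        for token in match.group(1).split():
            if token.startswith("-"):
                continue  # skip flags such as -U or --upgrade
            # Drop version specifiers and extras, e.g. "pkg==1.0" or "pkg[extra]".
            name = re.split(r"[=<>!~\[;]", token, maxsplit=1)[0].strip(".,")
            if VALID_NAME.match(name) and name not in names:
                names.append(name)
    return names


def exists_on_pypi(name: str, timeout: float = 10.0) -> bool:
    """Check whether a project is registered on PyPI (assumes network access)."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # no such project: a candidate hallucination
        raise  # other errors say nothing about existence


def find_unregistered(
    text: str, exists: Callable[[str], bool] = exists_on_pypi
) -> list[str]:
    """Return suggested packages that the registry lookup does not know."""
    return [name for name in extract_pip_packages(text) if not exists(name)]
```

Passing the lookup as the exists parameter lets the same check run against another registry or an offline stub. A name that is registered is not necessarily trustworthy: as the Resource Development step shows, anyone can claim a previously hallucinated name, so this check only flags candidates and does not vet packages.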
Mitigations
Defenses connected to the attack methods in this case.
Sources
Original public records and references for this case.
Original source
Open the MITRE ATLAS data and public references used for this case study.