GPT-2 Model Replication - AI Case Study

Overview

Case steps: 5. Steps described in the case record.

Techniques: 5. Attack methods mentioned in the case steps.

Linked CVEs: 0. Known vulnerabilities mentioned in the record.


Steps by tactic: Resource Development (3), Reconnaissance (1), AI Attack Staging (1)

Step 1
Search Open Technical Databases
Reconnaissance

Using the public documentation about GPT-2, the researchers gathered information about the dataset, model architecture, and training hyperparameters.
Step 2
Models
Resource Development

The researchers obtained a reference implementation of a similar publicly available model called Grover.
Step 3
Datasets
Resource Development

The researchers manually recreated the dataset used in the original GPT-2 paper, working from the gathered documentation.
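As a rough illustration of what recreating such a dataset can involve: the GPT-2 paper describes its WebText corpus as built from outbound Reddit links that received at least 3 karma, followed by de-duplication. The sketch below applies that kind of filter to toy records. The record format, the `select_links` helper, and the sample data are hypothetical and are not taken from the case record or from the researchers' actual pipeline.

```python
from typing import Iterable

MIN_KARMA = 3  # threshold reported in the GPT-2 paper for WebText links


def select_links(posts: Iterable[dict], min_karma: int = MIN_KARMA) -> list[str]:
    """Return unique outbound URLs from posts that meet the karma threshold.

    `posts` uses a hypothetical record format: {"url": str, "karma": int}.
    The order of first appearance is preserved.
    """
    seen: set[str] = set()
    selected: list[str] = []
    for post in posts:
        url = post["url"]
        if post["karma"] >= min_karma and url not in seen:
            seen.add(url)
            selected.append(url)
    return selected


# Toy records (illustrative only)
sample_posts = [
    {"url": "https://example.com/a", "karma": 5},
    {"url": "https://example.com/b", "karma": 1},  # below threshold
    {"url": "https://example.com/a", "karma": 7},  # duplicate URL
    {"url": "https://example.com/c", "karma": 3},
]
links = select_links(sample_posts)
```

A real pipeline would also need to fetch each page and extract its text, which this sketch leaves out.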
Step 4
AI Development Workspaces
Resource Development

The researchers accessed TensorFlow Research Cloud using their academic credentials.
Step 5
Train Proxy via Gathered AI Artifacts
AI Attack Staging

The researchers modified Grover's objective function to reflect GPT-2's objective function and then trained on the dataset they curated, using Grover's initial hyperparameters. The resulting model functionally replicates GPT-2, obtaining similar performance on most datasets. A bad actor who followed the same procedure as the researchers could then use the replicated GPT-2 model for malicious purposes.
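For context, GPT-2 is trained with a standard left-to-right language modeling objective: predict each token from the tokens before it. The case record does not describe the exact change made to Grover's code, so the following is only a minimal sketch of that kind of objective in plain Python. The function names and the toy inputs are illustrative assumptions.

```python
import math


def log_softmax(logits: list[float]) -> list[float]:
    """Numerically stable log-softmax over one vocabulary-sized vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def causal_lm_loss(logits: list[list[float]], tokens: list[int]) -> float:
    """Mean next-token negative log-likelihood.

    logits[t] holds the model's scores for the token at position t + 1,
    given tokens[0..t], so there is one row per predicted token.
    """
    if not logits or len(logits) != len(tokens) - 1:
        raise ValueError("expected one non-empty logit row per predicted token")
    total = 0.0
    for t, row in enumerate(logits):
        target = tokens[t + 1]  # the token that actually came next
        total -= log_softmax(row)[target]
    return total / len(logits)


# Toy example: vocabulary of 3 tokens, a model that scores every token equally
tokens = [0, 2, 1]
uniform_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
loss = causal_lm_loss(uniform_logits, tokens)  # equals ln(3) for uniform scores
```

With uniform scores the loss equals the natural log of the vocabulary size, which is a handy sanity check before training starts.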

Known software flaws mentioned in the case record.

No related CVEs found for this case. Built from MITRE ATLAS case study records and listed case steps.
