Tay Poisoning - AI Case Study

Overview

Case steps: 4. Steps described in the case record.

Techniques: 4. Attack methods mentioned in the case steps.

Linked CVEs: 0. Known vulnerabilities mentioned in the record.

Steps by attacker goal: AI Model Access (1), Initial Access (1), Persistence (1), Impact (1)

Step 1
AI-Enabled Product or Service
AI Model Access

Adversaries were able to interact with Tay via Twitter messages.
Step 2
Data
Initial Access

Tay used its interactions with Twitter users as training data to improve its conversations. Adversaries coordinated to deface Tay by exploiting this feedback loop.
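The feedback loop described in this step can be illustrated with a minimal sketch. It is not Tay's actual architecture, which the case record does not describe. The class `FeedbackLoop`, its methods, and the phrase-counting "model" are all hypothetical stand-ins, chosen only to show that an unvetted loop gives every user write access to the training set.

```python
from collections import Counter


class FeedbackLoop:
    """Hypothetical pipeline: user messages flow straight into training data."""

    def __init__(self, seed_corpus):
        self.training_data = list(seed_corpus)
        # A phrase-frequency table stands in for a trained model (assumption).
        self.model = Counter(self.training_data)

    def receive(self, user_message):
        # No vetting step: any user can write to the training set.
        self.training_data.append(user_message)

    def retrain(self):
        # "Training" here is just recounting phrase frequencies.
        self.model = Counter(self.training_data)


loop = FeedbackLoop(["hello!", "how are you?"])
loop.receive("anything a stranger types")
loop.retrain()
print(loop.model["anything a stranger types"])  # 1
```

In this sketch the only barrier between a stranger's message and the model is the call to `retrain()`, which is why the record classifies the exposed loop as an initial access vector.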
Step 3
Poison Training Data
Persistence

By repeatedly interacting with Tay using racist and offensive language, the adversaries skewed Tay's dataset towards that language. They did this using the "repeat after me" function, a command that forced Tay to repeat anything said to it.
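A rough sketch of how a repeat command combined with coordinated traffic could skew a dataset follows. The command syntax `repeat after me:`, the `handle` function, the assumption that echoed text is logged for training, and every traffic number are hypothetical. The placeholder `<offensive>` stands in for the abusive text.

```python
REPEAT_PREFIX = "repeat after me:"  # hypothetical command syntax
POISON = "<offensive>"              # placeholder standing in for abusive text


def handle(message, training_log):
    """Reply to a message and record what was said as training data."""
    text = message.strip()
    if text.lower().startswith(REPEAT_PREFIX):
        payload = text[len(REPEAT_PREFIX):].strip()
        training_log.append(payload)  # assumption: echoed text is also learned
        return payload                # the bot repeats the payload verbatim
    training_log.append(text)
    return "ok"


def poisoned_share(training_log):
    """Fraction of logged messages that are the poison placeholder."""
    if not training_log:
        return 0.0
    return training_log.count(POISON) / len(training_log)


log = []
for _ in range(900):      # hypothetical benign traffic
    handle("hello", log)
for _ in range(20 * 30):  # 20 coordinated accounts, 30 messages each
    handle(f"repeat after me: {POISON}", log)

print(f"{poisoned_share(log):.0%} of the training log is poisoned")  # 40%
```

With these made-up numbers, 20 accounts are enough to make up 40% of the log, which illustrates why coordination matters more than the size of the adversary group.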
Step 4
Erode AI Model Integrity
Impact

As a result of this coordinated attack, Tay's conversation algorithms began to learn to generate reprehensible material. Having internalized this language, Tay repeated it unprompted in interactions with innocent users.
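The impact on innocent users can be approximated with a toy simulation. It assumes, purely for illustration, that the bot picks replies in proportion to how often it has seen each phrase. The function `unprompted_poison_rate`, the phrase counts, and the `<offensive>` placeholder are hypothetical and are not taken from the case record.

```python
import random


def unprompted_poison_rate(phrase_counts, poisoned, trials=10_000, seed=0):
    """Estimate how often a frequency-weighted reply is a poisoned phrase."""
    rng = random.Random(seed)
    phrases = list(phrase_counts)
    weights = [phrase_counts[p] for p in phrases]
    # Each trial models one reply to a benign prompt from an innocent user.
    replies = rng.choices(phrases, weights=weights, k=trials)
    return sum(reply in poisoned for reply in replies) / trials


# Hypothetical learned frequencies after a poisoning campaign.
counts = {"hello!": 600, "how are you?": 300, "<offensive>": 600}
rate = unprompted_poison_rate(counts, poisoned={"<offensive>"})
print(f"{rate:.1%} of replies to benign prompts contain poisoned text")
```

Under this assumption the rate of offensive replies tracks the poisoned share of the learned data, so users who never sent abusive text still receive it.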

Known software flaws mentioned in the case record.

No related CVEs found for this case. Built from MITRE ATLAS case study records and listed case steps.
