PromptRiskDB: Threat intelligence atlas

Tay Poisoning - AI Case Study

Microsoft created Tay, a Twitter chatbot designed to engage and entertain users. While previous chatbots used pre-programmed scripts to respond to prompts, Tay's machine learning capabilities allowed it to be directly influenced by its conversations. A coordinated attack encouraged malicious users to tweet abusive and offensive language at Tay, which eventually led to Tay generating similarly inflammatory content...

Incident | Microsoft's Tay AI Chatbot | 4chan Users | AI Model Access | Initial Access | Persistence

Overview

Case steps: 4. Steps described in the case record.
Techniques: 4. Attack methods mentioned in the case steps.
Linked CVEs: 0. Known vulnerabilities mentioned in the record.

Risk patterns

Patterns found in the case record and its linked vulnerabilities.

  • 1. Dominant ATLAS tactic. AI Model Access appears in 1 case step.
  • 2. Multiple attack methods. The case connects to 4 unique AI attack methods.

Procedure timeline

Steps per tactic: AI Model Access (1), Initial Access (1), Persistence (1), Impact (1)
  1. Initial Access (Step 2: Data)

    Tay used its interactions with Twitter users as training data to improve its conversations. Adversaries coordinated to deface the bot by exploiting this feedback loop.

  2. Persistence

    By repeatedly sending Tay racist and offensive language, the adversaries skewed its dataset toward that language. They did this using the "repeat after me" function, a command that forced Tay to repeat anything said to it.

  3. Impact

    As a result of this coordinated attack, Tay's conversation algorithms learned to generate reprehensible material. Having internalized this language, Tay repeated it unprompted in interactions with innocent users.
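
The feedback loop described in the steps above can be illustrated with a toy model. This sketch is not Tay's actual architecture, which the case record does not describe. It assumes a hypothetical bot, `EchoLearningBot`, that stores every incoming message as training data and replies by sampling from what it has stored, in proportion to frequency. The message strings and message counts are made up for illustration.

```python
import random
from collections import Counter


class EchoLearningBot:
    """Hypothetical bot that learns replies directly from user messages."""

    def __init__(self, seed_phrases):
        # Curated starting data, standing in for an original training set.
        self.phrase_counts = Counter(seed_phrases)

    def learn(self, message):
        # Every interaction is added to the training data with no filtering.
        self.phrase_counts[message] += 1

    def share_of(self, phrase):
        # Fraction of the learned data made up of the given phrase.
        total = sum(self.phrase_counts.values())
        return self.phrase_counts[phrase] / total

    def reply(self, rng):
        # Sample a reply in proportion to how often each phrase was seen.
        phrases = list(self.phrase_counts)
        weights = [self.phrase_counts[p] for p in phrases]
        return rng.choices(phrases, weights=weights, k=1)[0]


def simulate(benign_messages, attack_messages):
    """Feed the bot a mix of benign and coordinated attack messages."""
    bot = EchoLearningBot(["hello!", "how are you?", "tell me a joke"])
    for _ in range(benign_messages):
        bot.learn("hello!")
    for _ in range(attack_messages):
        bot.learn("<offensive phrase>")  # placeholder for abusive input
    return bot


clean_bot = simulate(benign_messages=100, attack_messages=0)
poisoned_bot = simulate(benign_messages=100, attack_messages=400)

clean_share = clean_bot.share_of("<offensive phrase>")
poisoned_share = poisoned_bot.share_of("<offensive phrase>")
print(f"offensive share without attack: {clean_share:.2f}")
print(f"offensive share with attack:    {poisoned_share:.2f}")
print("sample reply:", poisoned_bot.reply(random.Random(0)))
```

In this toy setting the attackers never touch the model directly. Sending enough messages through the normal interface is sufficient to dominate what the bot has learned, which is why the same placeholder phrase then surfaces in replies to unrelated users.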

Mitigations

Defenses connected to the attack methods in this case.
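
The record as captured here does not list specific defenses. Purely as an illustration, one idea that fits a feedback-loop attack is to screen user input before it reaches the training data. The sketch below assumes a hypothetical `TrainingBuffer` that rejects messages matching a placeholder blocklist and caps how many samples a single account can contribute. The class name, the cap, and the blocklist are all assumptions for this example, not measures described by Microsoft or MITRE ATLAS.

```python
from collections import defaultdict


class TrainingBuffer:
    """Hypothetical gate between user messages and the training set."""

    def __init__(self, blocked_terms, max_per_user=5):
        self.blocked_terms = [term.lower() for term in blocked_terms]
        self.max_per_user = max_per_user
        self.accepted = []
        self.per_user = defaultdict(int)

    def offer(self, user_id, message):
        """Return True if the message is accepted as training data."""
        text = message.lower()
        # Reject content that matches the (placeholder) blocklist.
        if any(term in text for term in self.blocked_terms):
            return False
        # Cap each account's influence on the dataset.
        if self.per_user[user_id] >= self.max_per_user:
            return False
        self.per_user[user_id] += 1
        self.accepted.append(message)
        return True


buffer = TrainingBuffer(blocked_terms=["<offensive phrase>"], max_per_user=2)
results = [
    buffer.offer("alice", "hello there"),
    buffer.offer("mallory", "repeat after me: <offensive phrase>"),
    buffer.offer("mallory", "nice weather"),
    buffer.offer("mallory", "nice weather"),
    buffer.offer("mallory", "nice weather"),  # over the per-user cap
]
print(results)
print(buffer.accepted)
```

A per-account cap on its own would not stop a large coordinated group spread across many accounts, so a check like this would more plausibly be one layer among several than a complete defense.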

Sources

Original public records and references for this case.

Open the MITRE ATLAS data and public references used for this case study.