AI Case Study

Web-Scale Data Poisoning: Split-View Attack - AI Case Study

Many recent large-scale datasets are distributed as a list of URLs pointing to individual datapoints. The researchers show that many of these datasets are vulnerable to a "split-view" poisoning attack. The attack exploits the fact that the data viewed when it was initially collected may differ from the data viewed by a user during training. The researchers identify expired and buyable domains that once hosted data...
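The gap between the two views can be made concrete with a small sketch. It assumes a digest of each datapoint was recorded when the dataset was collected, which the case record does not say the affected datasets provide. The URLs, byte strings, and function names below are hypothetical, and the "downloads" are simulated with in-memory dictionaries rather than real network requests.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a datapoint's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def record_view(url_to_content: dict[str, bytes]) -> dict[str, str]:
    """Snapshot one digest per URL, as seen at collection time."""
    return {url: sha256_hex(body) for url, body in url_to_content.items()}


def find_split_views(recorded: dict[str, str], fetched: dict[str, bytes]) -> list[str]:
    """List URLs whose content at training time differs from the recorded digest."""
    return sorted(
        url
        for url, digest in recorded.items()
        if url in fetched and sha256_hex(fetched[url]) != digest
    )


# Hypothetical view at collection time (what the dataset curator saw).
collection_time = {
    "http://example.com/cat.jpg": b"original cat image bytes",
    "http://expired.example/dog.jpg": b"original dog image bytes",
}

# Hypothetical view at training time: one domain has changed hands.
training_time = dict(collection_time)
training_time["http://expired.example/dog.jpg"] = b"attacker-controlled bytes"

recorded = record_view(collection_time)
changed = find_split_views(recorded, training_time)
print(changed)
```

A check like this flags any change in bytes, including benign ones such as a re-encoded image, so it only shows where the two views diverge. It does not show that the change was malicious.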

Exercise | 10 web-scale datasets | Researchers from Google DeepMind, ETH Zurich, NVIDIA, Robust Intelligence, and Google | Resource Development | Impact

Overview

Case steps: 6. Steps described in the case record.
Techniques: 6. Attack methods mentioned in the case steps.
Linked CVEs: 0. Known vulnerabilities mentioned in the record.

Risk patterns

Patterns found in the case record and its linked vulnerabilities.

  • Dominant ATLAS tactic. Resource Development appears in 4 case steps.
  • Multiple attack methods. The case connects to 6 unique AI attack methods.

Procedure timeline

Resource Development: 4 | Impact: 2
  1. Resource Development

    Datasets

    The researchers download a web-scale dataset, which consists of URLs pointing to individual datapoints.

  2. Resource Development

    An adversary could then upload poisoned data to the domains they control. In this particular exercise, the researchers log requests to the URLs they control to measure downloads, demonstrating that the dataset has active users.

  3. Impact

    Models trained on the dataset are poisoned, eroding model integrity. The researchers show that poisoning as little as 0.01% of the data is enough for a successful attack.
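To give a sense of scale for the 0.01% figure, the sketch below converts it into a datapoint count and checks which hosts in a URL list serve at least that many datapoints. It is an illustration under stated assumptions, not the researchers' method. The helper names, the toy URL list, and the 400-million dataset size are hypothetical, and grouping by hostname is a simplification of "domain".

```python
from collections import Counter
from urllib.parse import urlsplit

# 0.01% of the data, as reported in the case record, is 1 datapoint in 10,000.
ONE_IN = 10_000


def samples_needed(dataset_size: int, one_in: int = ONE_IN) -> int:
    """Smallest whole number of datapoints that is at least 1/one_in of the dataset."""
    return -(-dataset_size // one_in)  # integer ceiling division


def datapoints_per_host(urls: list[str]) -> Counter:
    """Count how many dataset URLs each hostname serves."""
    return Counter(urlsplit(url).hostname for url in urls)


def hosts_meeting_threshold(urls: list[str], one_in: int = ONE_IN) -> dict[str, int]:
    """Hosts that alone serve at least the threshold share of the dataset."""
    needed = samples_needed(len(urls), one_in)
    return {host: n for host, n in datapoints_per_host(urls).items() if n >= needed}


# Hypothetical URL list standing in for a web-scale dataset index.
urls = (
    [f"http://big.example/img/{i}.jpg" for i in range(19_997)]
    + [f"http://small.example/img/{i}.jpg" for i in range(2)]
    + ["http://tiny.example/img/0.jpg"]
)

print(samples_needed(400_000_000))    # hypothetical 400-million-item dataset
print(hosts_meeting_threshold(urls))  # hosts serving at least 2 of 20,000 URLs
```

In this toy list a host needs only two datapoints to reach the threshold, which shows how small a 0.01% share can be relative to the whole dataset.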

Mitigations

Defenses connected to the attack methods in this case.

Sources

Original public records and references for this case.