APromptRiskDBThreat intelligence atlas
AI Security Technique

Datasets - AI Security Technique

Adversaries may collect public datasets to use in their operations. Datasets used by the victim organization or datasets that are representative of the data used by the victim organization may be valuable to adversaries. Datasets can be stored in cloud storage, or on victim-owned websites. Some datasets require the adversary to Establish Accounts for access. Acquired datasets help the adve...

AI Security Techniquedemonstrated

Record summary

A quick snapshot of what this page covers.

Tactics0Attacker goals connected to this method.
Mitigations1Defenses that may help against this attack.
AI risks0Research-backed risks connected to this topic.

Attack context

How this AI attack works in practice.

Adversaries may collect public datasets to use in their operations. Datasets used by the victim organization or datasets that are representative of the data used by the victim organization may be valuable to adversaries. Datasets can be stored in cloud storage, or on victim-owned websites. Some datasets require the adversary to Establish Accounts for access.

Acquired datasets help the adversary advance their operations, stage attacks, and tailor attacks to the victim organization.

ATLAS ID
AML.T0002.000
Priority score
83
Maturity: demonstrated

Mitigations

Defenses that may help against this attack.

AML.M0001 - Limit Model Artifact Release

Business and Data UnderstandingDeployment
LifecycleBusiness and Data Understanding + 1 moreCategoryPolicy

Limiting the release of datasets can reduce an adversary's ability to target production models trained on the same or similar data.

Case studies

Examples from public reports and exercises.

Web-Scale Data Poisoning: Split-View Attack

exercise
Date2024-06-06

Many recent large-scale datasets are distributed as a list of URLs pointing to individual datapoints. The researchers show that many of these datasets are vulnerable to a "split-view" poisoning attack. The attack exploits the fact that the data viewed when it was initially collected may differ from the data viewed by a user during training. The researchers identify expired and buyable domains that once hosted dataset content, making it possible to replace portions of the dataset with poisoned data. They demonstrate that for 10 popular web-scale datasets, enough of the domains are purchasable to successfully carry out a poisoning attack.

Confusing Antimalware Neural Networks

exercise
Date2021-06-23

Cloud storage and computations have become popular platforms for deploying ML malware detectors. In such cases, the features for models are built on users' systems and then sent to cybersecurity company servers. The Kaspersky ML research team explored this gray-box scenario and showed that feature knowledge is enough for an adversarial attack on ML models.

They attacked one of Kaspersky's antimalware ML models without white-box access to it and successfully evaded detection for most of the adversarially modified malware files.

Attack on Machine Translation Services

exercise
Date2020-04-30

Machine translation services (such as Google Translate, Bing Translator, and Systran Translate) provide public-facing UIs and APIs. A research group at UC Berkeley utilized these public endpoints to create a replicated model with near-production state-of-the-art translation quality. Beyond demonstrating that IP can be functionally stolen from a black-box system, they used the replicated model to successfully transfer adversarial examples to the real production services. These adversarial inputs successfully cause targeted word flips, vulgar outputs, and dropped sentences on Google Translate and Systran Translate websites.

Evasion of Deep Learning Detector for Malware C&C Traffic

exercise
Date2020-01-01

The Palo Alto Networks Security AI research team tested a deep learning model for malware command and control (C&C) traffic detection in HTTP traffic. Based on the publicly available paper by Le et al., we built a model that was trained on a similar dataset as our production model and had similar performance. Then we crafted adversarial samples, queried the model, and adjusted the adversarial sample accordingly until the model was evaded.

Face Identification System Evasion via Physical Countermeasures

exercise
Date2020-01-01

MITRE's AI Red Team demonstrated a physical-domain evasion attack on a commercial face identification service with the intention of inducing a targeted misclassification. This operation had a combination of traditional MITRE ATT&CK techniques such as finding valid accounts and executing code via an API - all interleaved with adversarial ML specific attacks.

GPT-2 Model Replication

exercise
Date2019-08-22

OpenAI built GPT-2, a language model capable of generating high quality text samples. Over concerns that GPT-2 could be used for malicious purposes such as impersonating others, or generating misleading news articles, fake social media content, or spam, OpenAI adopted a tiered release schedule. They initially released a smaller, less powerful version of GPT-2 along with a technical description of the approach, but held back the full trained model.

Before the full model was released by OpenAI, researchers at Brown University successfully replicated the model using information released by OpenAI and open source ML artifacts. This demonstrates that a bad actor with sufficient technical skill and compute resources could have replicated GPT-2 and used it for harmful goals before the AI Security community is prepared.

Source

Where this page information comes from.