Train Proxy via Replication - AI Security Technique

Adversaries may replicate a private model. By repeatedly querying the victim's AI Model Inference API Access, the adversary can collect the target model's inferences into a dataset. The inferences are used as labels for training a separate model offline that will mimic the behavior and performance of the target model.

Overview

A source-backed snapshot of this AI security technique.

A replicated model that closely mimics the target model is a valuable resource in staging the attack. The adversary can use the replicated model to Craft Adversarial Data for various purposes (e.g. Evade AI Model, Spamming AI System with Chaff Data).
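The replication loop described above (query the inference API, keep the returned inferences as labels, train a separate model offline) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any real attack tool: `victim_predict` is a hypothetical local stand-in for the victim's inference API, the hidden rule inside it is invented for the demo, and the proxy is a simple perceptron chosen for brevity.

```python
import random

def victim_predict(x):
    """Hypothetical stand-in for the victim's inference API (hidden linear rule)."""
    return 1 if 2.0 * x[0] - 1.0 * x[1] + 0.5 > 0 else 0

def collect_dataset(query_fn, n_queries, rng):
    """Query the black box and keep its outputs as training labels."""
    dataset = []
    for _ in range(n_queries):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        dataset.append((x, query_fn(x)))
    return dataset

def train_proxy(dataset, epochs=50, lr=0.1):
    """Fit a perceptron offline on the collected (input, inference) pairs."""
    w0, w1, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x0, x1), label in dataset:
            pred = 1 if w0 * x0 + w1 * x1 + b > 0 else 0
            err = label - pred  # 0 when correct, +1 or -1 when wrong
            w0 += lr * err * x0
            w1 += lr * err * x1
            b += lr * err
    return lambda x: 1 if w0 * x[0] + w1 * x[1] + b > 0 else 0

def agreement(model_a, model_b, n_points, rng):
    """Fraction of fresh random inputs on which the two models give the same label."""
    same = 0
    for _ in range(n_points):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        same += model_a(x) == model_b(x)
    return same / n_points

rng = random.Random(0)  # fixed seed so the demo is repeatable
stolen = collect_dataset(victim_predict, 500, rng)
proxy = train_proxy(stolen)
score = agreement(victim_predict, proxy, 1000, rng)
print(f"proxy agrees with victim on {score:.1%} of fresh inputs")
```

Agreement on fresh inputs is one rough measure of how closely a proxy mimics its target. A real target would be far more complex than this toy rule, and the number of queries needed would be correspondingly larger.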

Tactics: 0. Attacker goals connected to this method.

Mitigations: 3. Defenses that may help against this attack.

AI risks: 0. Research-backed risks connected to this topic.

Technique details

Identifiers, maturity, and source taxonomy for this technique.

ATLAS ID: AML.T0005.001
Maturity: demonstrated
Priority score: 49

Attack flow

How to read the public records connected to this technique.

1. Technique: Read the ATLAS description and evidence level.

2. Tactics: See which attacker goals this method supports.

3. Examples: Check whether public case studies mention it.

4. Defenses: Review safeguards mapped by ATLAS.

5. Sources: Open the original public records and references.

Impact

Why this technique may deserve attention in the current dataset.

Evidence level: demonstrated
Mapped defenses: 3 ATLAS mitigation records
Public examples: 2 linked case study records
Research risks: 0 related MIT AI Risk records above the confidence threshold
Vulnerabilities: 0 linked CVE records

Mitigations

Defenses that may help against this attack.

3 records

AML.M0024 - AI Telemetry Logging

Telemetry logging can help identify if a proxy training dataset has been exfiltrated.

Lifecycle: Deployment, Monitoring
Category: Technical - Cyber
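As a rough illustration of how telemetry could surface dataset harvesting, the sketch below records each inference request and flags clients whose query volume crosses a threshold. The class name `QueryTelemetry`, its fields, and the threshold value are hypothetical choices for this example and are not part of ATLAS. Query volume is only one possible signal; a real deployment would likely combine several.

```python
from collections import Counter

class QueryTelemetry:
    """Hypothetical in-memory log of inference requests, keyed by client."""

    def __init__(self, alert_threshold):
        self.alert_threshold = alert_threshold
        self.records = []        # (timestamp, client_id, input_summary)
        self.counts = Counter()  # client_id -> number of queries seen

    def log(self, timestamp, client_id, input_summary):
        """Store one request and update the per-client counter."""
        self.records.append((timestamp, client_id, input_summary))
        self.counts[client_id] += 1

    def suspicious_clients(self):
        """Clients whose total query count meets or exceeds the threshold."""
        return sorted(c for c, n in self.counts.items() if n >= self.alert_threshold)

telemetry = QueryTelemetry(alert_threshold=100)
for t in range(3):
    telemetry.log(t, "client-a", "single image")
for t in range(500):
    telemetry.log(t, "client-b", "synthetic sweep")

print(telemetry.suspicious_clients())  # ['client-b']
```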

AML.M0002 - Passive AI Output Obfuscation

Obfuscating model outputs restricts an adversary's ability to create an accurate proxy model by querying a model and observing its outputs.

Lifecycle: Deployment, ML Model Evaluation
Category: Technical - ML
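One simple form of output obfuscation is to return less information per query: only the top-ranked label, with its confidence rounded coarsely. The sketch below shows the idea. The function name `obfuscate_output` and its parameters are assumptions made for this example, and the right level of coarsening depends on what legitimate users need.

```python
def obfuscate_output(probabilities, top_k=1, decimals=1):
    """Keep only the top_k labels and round their confidences.

    `probabilities` maps each label to the model's raw confidence.
    """
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return {label: round(p, decimals) for label, p in ranked[:top_k]}

raw = {"cat": 0.8731, "dog": 0.1042, "fox": 0.0227}
coarse = obfuscate_output(raw)
print(coarse)  # {'cat': 0.9}
```

The full probability vector tells an adversary much more about the model's decision surface than a single rounded score does, so each query yields a weaker training label for a proxy.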

AML.M0004 - Restrict Number of AI Model Queries

Restricting the number of queries to the model decreases an adversary's ability to replicate an accurate proxy model.

Lifecycle: Business and Data Understanding, Deployment, + 1 more
Category: Technical - Cyber
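A per-client query budget is one way to restrict the number of queries. The sketch below uses a sliding time window; the class name `QueryBudget`, the limit, and the window length are illustrative assumptions rather than recommended values. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import deque

class QueryBudget:
    """Hypothetical sliding-window limit on queries per client."""

    def __init__(self, max_queries, window_seconds):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self.history = {}  # client_id -> deque of accepted query timestamps

    def allow(self, client_id, now):
        """Return True and record the query if the client is under budget."""
        recent = self.history.setdefault(client_id, deque())
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_queries:
            return False
        recent.append(now)
        return True

budget = QueryBudget(max_queries=3, window_seconds=60.0)
decisions = [budget.allow("client-a", t) for t in (0, 1, 2, 3, 61)]
print(decisions)  # [True, True, True, False, True]
```

A limit like this raises the time and cost of collecting a large labeled dataset, though an adversary may try to spread queries across many accounts, which is where per-client telemetry helps.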

Case studies

Examples from public reports and exercises.

2 records

Attack on Machine Translation Services

Machine translation services (such as Google Translate, Bing Translator, and Systran Translate) provide public-facing UIs and APIs. A research group at UC Berkeley used these public endpoints to create a replicated model with near-production, state-of-the-art translation quality. Beyond demonstrating that IP can be functionally stolen from a black-box system, they used the replicated model to transfer adversarial examples to the real production services. These adversarial inputs caused targeted word flips, vulgar outputs, and dropped sentences on the Google Translate and Systran Translate websites.

Date: 2020-04-30

exercise

ProofPoint Evasion

Proof Pudding (CVE-2019-20634) is a code repository that describes how ML researchers evaded ProofPoint's email protection system by first building a copy-cat email protection ML model and then using the insights from it to bypass the live system. More specifically, the insights allowed researchers to craft malicious emails that received favorable scores and went undetected by the system. Each word in an email is scored numerically based on multiple variables, and if the overall score of the email is too low, ProofPoint will output an error, labeling it as SPAM.

Date: 2019-09-09

exercise

Source evidence

Original public records and references for this page.

Original source links

Open the public records and source datasets used for this page.

Repository: https://github.com/mitre-atlas/atlas-data
ATLAS.yaml: https://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml
Schema: https://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json