Discover AI Model Outputs - AI Security Technique


Adversaries may discover model outputs, such as class scores, that are not required for the system to function and are not intended for use by the end user. Such outputs may be found in logs or included in API responses, and they may help the adversary identify weaknesses in the model and develop attacks.
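As a rough illustration of the exposure described above, the sketch below walks a JSON-like API response and reports keys whose names suggest raw model outputs. The key names in `SUSPECT_KEYS`, the function name, and the sample response are all hypothetical; a real system would need its own list and its own response schema.

```python
from typing import Any

# Hypothetical key names that often carry raw model outputs; adjust per system.
SUSPECT_KEYS = {"scores", "class_scores", "logits", "probabilities", "confidence"}


def find_model_output_fields(payload: Any, path: str = "") -> list[str]:
    """Return dotted paths of keys whose names suggest raw model outputs."""
    hits: list[str] = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            child = f"{path}.{key}" if path else key
            if key.lower() in SUSPECT_KEYS:
                hits.append(child)
            # Keep descending so nested leaks are reported as well.
            hits.extend(find_model_output_fields(value, child))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits.extend(find_model_output_fields(item, f"{path}[{index}]"))
    return hits


# Made-up response: the end user only needs "verdict".
response = {
    "verdict": "blocked",
    "debug": {"class_scores": {"spam": 0.97, "ham": 0.03}},
}
print(find_model_output_fields(response))
```

A check like this could run in API tests or over sampled log lines to flag fields that the end user does not need.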

Overview

A source-backed snapshot of this AI security technique.

Tactics: 1. Attacker goals connected to this method.

Mitigations: 4. Defenses that may help against this attack.

AI risks: 0. Research-backed risks connected to this topic.

Technique details

Identifiers, maturity, and source taxonomy for this technique.

ATLAS ID: AML.T0063
Maturity: demonstrated
Priority score: 52

ATLAS tactics

Discovery

Attack flow

How to read the public records connected to this technique.

1. Technique: Read the ATLAS description and evidence level.

2. Tactics: See which attacker goals this method supports.

3. Examples: Check whether public case studies mention it.

4. Defenses: Review safeguards mapped by ATLAS.

5. Sources: Open the original public records and references.

Impact

Why this technique may deserve attention in the current dataset.

Evidence level: demonstrated
Mapped defenses: 4 ATLAS mitigation records
Public examples: 2 linked case study records
Research risks: 0 related MIT AI Risk records above the confidence threshold
Vulnerabilities: 0 linked CVE records

Mitigations

Defenses that may help against this attack.

4 records

AML.M0017 - AI Model Distribution Methods

Avoiding the deployment of models to edge devices reduces an adversary's ability to collect sensitive information about the model outputs.

Lifecycle: Deployment
Category: Policy

AML.M0019 - Control Access to AI Models and Data in Production

Controlling access to the model in production can help prevent adversaries from inferring information from the model outputs.

Lifecycle: Deployment, Monitoring
Category: Policy
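One simple way to picture access control on a production model is a gate that checks an allow-list of API keys and caps the number of queries per key. The sketch below is a minimal illustration under those assumptions; `QueryGate`, its parameters, and the key names are hypothetical and not part of ATLAS.

```python
from collections import Counter


class QueryGate:
    """Hypothetical gate: allow-listed API keys with a fixed query budget each."""

    def __init__(self, allowed_keys: set[str], max_queries: int) -> None:
        self.allowed_keys = allowed_keys
        self.max_queries = max_queries
        self.used: Counter[str] = Counter()

    def allow(self, api_key: str) -> bool:
        # Unknown callers never reach the model.
        if api_key not in self.allowed_keys:
            return False
        # Known callers are cut off once their budget is spent.
        if self.used[api_key] >= self.max_queries:
            return False
        self.used[api_key] += 1
        return True


gate = QueryGate(allowed_keys={"team-a-key"}, max_queries=2)
print([gate.allow("team-a-key") for _ in range(3)])  # third call is refused
print(gate.allow("unknown-key"))
```

A cap like this limits how many outputs any one caller can collect. A real deployment would also reset budgets over time and log refusals.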

AML.M0012 - Encrypt Sensitive Information

Encrypting model outputs can prevent adversaries from discovering sensitive information about the AI-enabled system or its operations.

Lifecycle: Data Preparation, ML Model Engineering, +1 more
Category: Technical - Cyber

AML.M0002 - Passive AI Output Obfuscation

Obfuscating model outputs can prevent adversaries from collecting sensitive information about the model outputs.

Lifecycle: Deployment, ML Model Evaluation
Category: Technical - ML
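One possible form of output obfuscation is to return only the top label and a coarsened confidence instead of the full score vector. The sketch below assumes a classifier that produces a dictionary of per-class scores; the function name, the `bucket` parameter, and the output field names are hypothetical.

```python
import math


def obfuscate_scores(scores: dict[str, float], bucket: float = 0.25) -> dict[str, object]:
    """Reduce a full score vector to the top label plus a coarse lower bound."""
    label = max(scores, key=scores.get)
    # Round the top score down to the nearest bucket so small score changes
    # caused by small input changes are not visible to the caller.
    coarse = math.floor(scores[label] / bucket) * bucket
    return {"label": label, "confidence_at_least": coarse}


print(obfuscate_scores({"spam": 0.97, "ham": 0.03}))
print(obfuscate_scores({"spam": 0.80, "ham": 0.20}))
```

Both calls return the same response even though the raw scores differ, which gives a caller less signal to optimize against. The bucket size trades usefulness for legitimate users against information exposed.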

Case studies

Examples from public reports and exercises.

2 records

ProofPoint Evasion

Proof Pudding (CVE-2019-20634) is a code repository that describes how ML researchers evaded ProofPoint's email protection system by first building a copy-cat email protection ML model and then using insights from it to bypass the live system. More specifically, the insights allowed the researchers to craft malicious emails that received favorable scores and went undetected by the system. Each word in an email is scored numerically based on multiple variables, and if the overall score of the email is too low, ProofPoint will output an error, labeling it as SPAM.

Date: 2019-09-09

exercise

Bypassing Cylance's AI Malware Detection

Researchers at Skylight were able to create a universal bypass string that evades detection by Cylance's AI Malware detector when appended to a malicious file.

Date: 2019-09-07

exercise

Source evidence

Original public records and references for this page.

Original source links

Open the public records and source datasets used for this page.

Repository: https://github.com/mitre-atlas/atlas-data
ATLAS.yaml: https://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml
Schema: https://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json