PromptRiskDB - Threat intelligence atlas
AI Security Technique

Manual Modification - AI Security Technique

Adversaries may manually modify input data to craft adversarial data. They may use their knowledge of the target model to modify the parts of the data that they suspect help the model perform its task. The adversary may proceed by trial and error until they can verify that they have a working adversarial input.
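From a defender's point of view, the trial-and-error process described above can be rehearsed against one's own model before an adversary does it. The following is a minimal sketch, assuming a classifier exposed as a function that returns True when it flags an input. The names `find_evasions` and `toy_classifier` and the sample strings are hypothetical and stand in for a real detector and a real test set.

```python
from typing import Callable, Iterable, List


def find_evasions(classify: Callable[[str], bool], variants: Iterable[str]) -> List[str]:
    """Return the hand-made variants that the classifier fails to flag."""
    return [variant for variant in variants if not classify(variant)]


def toy_classifier(text: str) -> bool:
    """Hypothetical stand-in for a real detector: flags one exact phrase."""
    return "free money" in text.lower()


# Hand-written variants of one flagged input, as a tester might try them.
variants = ["FREE MONEY now", "free  money now", "fr3e money now"]
evasions = find_evasions(toy_classifier, variants)
```

Here the first variant is still flagged, while the doubled space and the digit substitution slip past the exact-phrase check. Each entry in `evasions` points at a weakness that the mitigations listed on this page are meant to address.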


Record summary

A quick snapshot of what this page covers.

Tactics: 0. Attacker goals connected to this method.
Mitigations: 5. Defenses that may help against this attack.
AI risks: 0. Research-backed risks connected to this topic.

Attack context

How this AI attack works in practice.

ATLAS ID: AML.T0043.003
Priority score: 75
Maturity: realized

Mitigations

Defenses that may help against this attack.

AML.M0015 - Adversarial Input Detection

Lifecycle: Data Preparation, ML Model Engineering + 3 more
Category: Technical - ML

Incorporate adversarial input detection to block malicious inputs at inference time.
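One possible way to detect adversarial inputs at inference time is to compare the model's score on the raw input with its score on a coarsened copy, and to treat a large disagreement as suspicious. The sketch below assumes inputs are lists of values in [0, 1]. The functions `squeeze`, `looks_adversarial`, and `toy_score` and the threshold of 0.3 are illustrative assumptions, not part of the ATLAS mitigation text.

```python
from typing import Callable, List, Sequence


def squeeze(pixels: Sequence[float], levels: int = 4) -> List[float]:
    """Quantize values in [0, 1] onto a small grid (bit-depth reduction)."""
    step = levels - 1
    return [round(min(max(p, 0.0), 1.0) * step) / step for p in pixels]


def looks_adversarial(
    score_fn: Callable[[Sequence[float]], float],
    pixels: Sequence[float],
    threshold: float = 0.3,  # assumed value; tune on clean validation data
) -> bool:
    """Flag inputs whose score moves sharply once fine detail is removed."""
    return abs(score_fn(pixels) - score_fn(squeeze(pixels))) > threshold


def toy_score(pixels: Sequence[float]) -> float:
    """Hypothetical brittle model: 1.0 if the mean value exceeds 0.5."""
    return 1.0 if sum(pixels) / len(pixels) > 0.5 else 0.0


clean_flagged = looks_adversarial(toy_score, [0.0, 1.0, 1.0, 1.0])
nudged_flagged = looks_adversarial(toy_score, [0.49, 0.49, 0.49, 0.55])
```

The clean input already sits on the quantization grid, so its score is unchanged and it passes. The second input was nudged just over the decision boundary, and coarsening flips its score, so it is flagged.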

AML.M0010 - Input Restoration

Lifecycle: Data Preparation, ML Model Evaluation + 2 more
Category: Technical - ML

Input restoration can help remediate adversarial inputs.
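For text inputs, one simple form of input restoration is to normalize the input before it reaches the model, so that hand-made cosmetic changes are undone. The sketch below uses Python's standard `unicodedata` module. The function name `restore_text` and the particular set of zero-width characters are assumptions chosen for illustration.

```python
import unicodedata

# Invisible characters that are sometimes inserted to break up keywords.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}


def restore_text(text: str) -> str:
    """Undo common cosmetic edits before the text reaches the model."""
    # NFKC folds compatibility forms (e.g. fullwidth letters) to plain ones.
    normalized = unicodedata.normalize("NFKC", text)
    visible = "".join(ch for ch in normalized if ch not in ZERO_WIDTH)
    return visible.casefold()


# Fullwidth "pay", a zero-width space, then fullwidth "pal".
disguised = "\uff50\uff41\uff59\u200b\uff50\uff41\uff4c"
restored = restore_text(disguised)
```

This handles compatibility forms, invisible characters, and case. It does not map look-alike letters from other scripts (for example Cyrillic characters that resemble Latin ones), which would need a separate confusables table.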

AML.M0003 - Model Hardening

Lifecycle: Data Preparation, ML Model Engineering
Category: Technical - ML

Hardened models are more robust to adversarial inputs.

AML.M0004 - Restrict Number of AI Model Queries

Lifecycle: Business and Data Understanding, Deployment + 1 more
Category: Technical - Cyber

Restricting the number of model queries can reduce an adversary's ability to refine manually crafted adversarial inputs.
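A query restriction of this kind can be sketched as a per-client sliding-window limiter placed in front of the model endpoint. The class name `QueryLimiter`, the client identifier, and the limits used below are hypothetical. A real deployment would also need shared state across servers and a policy for identifying clients.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class QueryLimiter:
    """Allow at most max_queries per client within a sliding time window."""

    def __init__(self, max_queries: int, window_seconds: float) -> None:
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        """Record the query and return True, or return False if over the limit."""
        now = time.monotonic() if now is None else now
        history = self._history[client_id]
        # Drop timestamps that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_queries:
            return False
        history.append(now)
        return True


limiter = QueryLimiter(max_queries=3, window_seconds=60.0)
decisions = [limiter.allow("client-a", now=t) for t in (0.0, 1.0, 2.0, 3.0, 61.0)]
```

The fourth query inside the window is refused, and the client can query again once the oldest request has aged out. The explicit `now` argument exists only to make the example deterministic; omitting it falls back to `time.monotonic()`.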

AML.M0006 - Use Ensemble Methods

Lifecycle: ML Model Engineering
Category: Technical - ML

Using an ensemble of models increases the difficulty of crafting effective adversarial data and improves overall robustness.
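The effect can be illustrated with a small sketch in which several independent detectors each return a score and the decision is taken on their mean. The three keyword detectors, the sample text, and the 0.5 threshold are toy assumptions used only to show that evading one member does not flip the overall decision.

```python
from statistics import mean
from typing import Callable, Sequence

Model = Callable[[str], float]


def ensemble_score(models: Sequence[Model], sample: str) -> float:
    """Average the scores of independent models for one sample."""
    return mean(model(sample) for model in models)


def keyword_model(keyword: str) -> Model:
    """Hypothetical toy detector: scores 1.0 if the keyword is present."""
    return lambda text: 1.0 if keyword in text.lower() else 0.0


models = [keyword_model("password"), keyword_model("login"), keyword_model("verify")]

# A hand-edited sample that evades only the "password" detector.
sample = "Verify your passw0rd at the login page"
score = ensemble_score(models, sample)
blocked = score >= 0.5  # assumed decision threshold
```

The manual edit defeats one detector, but the other two still fire, so the averaged score stays above the threshold. An adversary would have to find a single modification that evades most members at once.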

Case studies

Examples from public reports and exercises.

Attempted Evasion of ML Phishing Webpage Detection System

incident
Date: 2022-12-01

Adversaries create phishing websites that appear visually similar to legitimate sites. These sites are designed to trick users into entering their credentials, which are then sent to the bad actor. To combat this behavior, security companies utilize AI/ML-based approaches to detect phishing sites and block them in their endpoint security products.

In this incident, adversarial examples were identified in the logs of a commercial machine learning phishing website detection system. The detection system makes an automated block/allow determination from the "phishing score" of an ensemble of image classifiers, each responsible for a different phishing indicator (visual similarity, input form detection, etc.). The adversarial examples appeared to employ several simple yet effective strategies for manually modifying brand logos in an attempt to evade the image classification models. The phishing websites that employed these logo modification methods successfully evaded the model responsible for detecting brand impersonation via visual similarity. However, the other components of the system successfully flagged the phishing websites.

Evasion of Deep Learning Detector for Malware C&C Traffic

exercise
Date: 2020-01-01

The Palo Alto Networks Security AI research team tested a deep learning model for detecting malware command and control (C&C) traffic in HTTP traffic. Based on the publicly available paper by Le et al., the team built a model that was trained on a dataset similar to that of their production model and had similar performance. They then crafted adversarial samples, queried the model, and adjusted the samples accordingly until the model was evaded.

Bypassing Cylance's AI Malware Detection

exercise
Date: 2019-09-07

Researchers at Skylight were able to create a universal bypass string that evades detection by Cylance's AI Malware detector when appended to a malicious file.

Source

Where this page's information comes from.