Technical - ML AI Mitigations

Mitigation Category

Technical - ML groups 16 AI defenses by defense type.

Overview

A group of defenses with the same label.

Records16Records included in this view.

SourcePublicBuilt from public source data.

ModeStaticPrepared as a ready-to-read page.

Category summary

How ATLAS labels this defense group.

Mitigation category: Technical - ML
Mitigation count: 16

Technical - ML

Related defenses

Defenses included in this group.

Top 10 of 16View all mitigations →

Adversarial Input Detection

Detect and block adversarial inputs or atypical queries that deviate from known benign behavior, exhibit behavior patterns observed in previous attacks or that come from potentially malicious IPs. Incorporate adversarial detection algorithms into the AI system prior to the AI model.

LifecycleData Preparation + 4 moreCategoryTechnical - ML

Data PreparationML Model Engineering+3 more

Deepfake Detection

Apply deepfake detection algorithms against any untrusted or user-provided data, especially in impactful applications such as biometric verification, to block generated content.

Detectors may use a combination of approaches, including:

AI models trained to differentiate between real and deepfake content.
Identifying common inconsistencies in deepfake content, such as unnatural facial movements, audio mismatches, or pixel-level artifacts.
Biometrics analysis, such blinking, eye movements, and microexpressions.

LifecycleDeployment + 3 moreCategoryTechnical - ML

DeploymentMonitoring+2 more

Generative AI Guardrails

Guardrails are safety controls that are placed between a generative AI model and the output shared with the user to prevent undesired inputs and outputs. Guardrails can take the form of validators such as filters, rule-based logic, or regular expressions, as well as AI-based approaches, such as classifiers and utilizing LLMs, or named entity recognition (NER) to evaluate the safety of the prompt or response. Domain specific methods can be employed to reduce risks in a variety of areas such as etiquette, brand damage, jailbreaking, false information, code exploits, SQL injections, and data leakage.

LifecycleML Model Engineering + 2 moreCategoryTechnical - ML

ML Model EngineeringML Model Evaluation+1 more

Generative AI Guidelines

Guidelines are safety controls that are placed between user-provided input and a generative AI model to help direct the model to produce desired outputs and prevent undesired outputs.

Guidelines can be implemented as instructions appended to all user prompts or as part of the instructions in the system prompt. They can define the goal(s), role, and voice of the system, as well as outline safety and security parameters.

LifecycleML Model Engineering + 2 moreCategoryTechnical - ML

ML Model EngineeringML Model Evaluation+1 more

Showing 4 of 10

Source evidence

Original public records and references for this page.

View all sources →

Original source

Original source links

Open the public records and source datasets used for this page.

Repositoryhttps://github.com/mitre-atlas/atlas-data ATLAS.yamlhttps://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml Schemahttps://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json