APromptRiskDBThreat intelligence atlas
ML Lifecycle Stage

ML Model Engineering AI Mitigations

ML Model Engineering groups 15 AI defenses across the ML lifecycle.

ML Lifecycle StageML Model Engineering

Record summary

A quick snapshot of what this page covers.

Records15Records included in this view.
SourcePublicBuilt from public source data.
ModeStaticPrepared as a ready-to-read page.

Lifecycle stage

A group of defenses with the same label.

15 AI defenses are grouped under ML Model Engineering.

ML lifecycle stage
ML Model Engineering
Mitigation count
15
ML Model Engineering

Related defenses

Defenses included in this group.

AI Bill of Materials

Business and Data UnderstandingData Preparation+1 more
LifecycleBusiness and Data Understanding + 2 moreCategoryPolicy

An AI Bill of Materials (AI BOM) contains a full listing of artifacts and resources that were used in building the AI. The AI BOM can help mitigate supply chain risks and enable rapid response to reported vulnerabilities.

This can include maintaining dataset provenance, i.e. a detailed history of datasets used for AI applications. The history can include information about the dataset source as well as well as a complete record of any modifications.

Adversarial Input Detection

Data PreparationML Model Engineering+3 more
LifecycleData Preparation + 4 moreCategoryTechnical - ML

Detect and block adversarial inputs or atypical queries that deviate from known benign behavior, exhibit behavior patterns observed in previous attacks or that come from potentially malicious IPs. Incorporate adversarial detection algorithms into the AI system prior to the AI model.

Control Access to AI Models and Data at Rest

Business and Data UnderstandingData Preparation+2 more
LifecycleBusiness and Data Understanding + 3 moreCategoryPolicy

Establish access controls on internal model registries and limit internal access to production models. Limit access to training data only to approved users.

Deepfake Detection

DeploymentMonitoring and Maintenance+2 more
LifecycleDeployment + 3 moreCategoryTechnical - ML

Apply deepfake detection algorithms against any untrusted or user-provided data, especially in impactful applications such as biometric verification, to block generated content.

Detectors may use a combination of approaches, including:

  • AI models trained to differentiate between real and deepfake content.
  • Identifying common inconsistencies in deepfake content, such as unnatural facial movements, audio mismatches, or pixel-level artifacts.
  • Biometrics analysis, such blinking, eye movements, and microexpressions.

Encrypt Sensitive Information

Data PreparationML Model Engineering+1 more
LifecycleData Preparation + 2 moreCategoryTechnical - Cyber

Encrypt sensitive data such as AI models to protect against adversaries attempting to access sensitive data.

Generative AI Guardrails

ML Model EngineeringML Model Evaluation+1 more
LifecycleML Model Engineering + 2 moreCategoryTechnical - ML

Guardrails are safety controls that are placed between a generative AI model and the output shared with the user to prevent undesired inputs and outputs. Guardrails can take the form of validators such as filters, rule-based logic, or regular expressions, as well as AI-based approaches, such as classifiers and utilizing LLMs, or named entity recognition (NER) to evaluate the safety of the prompt or response. Domain specific methods can be employed to reduce risks in a variety of areas such as etiquette, brand damage, jailbreaking, false information, code exploits, SQL injections, and data leakage.

Generative AI Guidelines

ML Model EngineeringML Model Evaluation+1 more
LifecycleML Model Engineering + 2 moreCategoryTechnical - ML

Guidelines are safety controls that are placed between user-provided input and a generative AI model to help direct the model to produce desired outputs and prevent undesired outputs.

Guidelines can be implemented as instructions appended to all user prompts or as part of the instructions in the system prompt. They can define the goal(s), role, and voice of the system, as well as outline safety and security parameters.

Generative AI Model Alignment

ML Model EngineeringML Model Evaluation+1 more
LifecycleML Model Engineering + 2 moreCategoryTechnical - ML

When training or fine-tuning a generative AI model it is important to utilize techniques that improve model alignment with safety, security, and content policies.

The fine-tuning process can potentially remove built-in safety mechanisms in a generative AI model, but utilizing techniques such as Supervised Fine-Tuning, Reinforcement Learning from Human Feedback or AI Feedback, and Targeted Safety Context Distillation can improve the safety and alignment of the model.

Memory Hardening

ML Model EngineeringDeployment+1 more
LifecycleML Model Engineering + 2 moreCategoryTechnical - ML

Memory Hardening involves developing trust boundaries and secure processes for how an AI agent stores and accesses memory and context. This may be implemented using a combination of strategies including restricting an agent's ability to store memories by requiring external authentication and validation for memory updates, performing semantic integrity checks on retrieved memories before agents execute actions, and implementing controls for monitoring of memory and remediation processes for poisoned memory.

Model Hardening

Data PreparationML Model Engineering
LifecycleData Preparation + 1 moreCategoryTechnical - ML

Use techniques to make AI models robust to adversarial inputs such as adversarial training or network distillation.

Use Ensemble Methods

ML Model Engineering
LifecycleML Model EngineeringCategoryTechnical - ML

Use an ensemble of models for inference to increase robustness to adversarial inputs. Some attacks may effectively evade one model or model family but be ineffective against others.

Use Multi-Modal Sensors

Business and Data UnderstandingData Preparation+1 more
LifecycleBusiness and Data Understanding + 2 moreCategoryTechnical - Cyber

Incorporate multiple sensors to integrate varying perspectives and modalities to avoid a single point of failure susceptible to physical attacks.

User Training

Business and Data UnderstandingData Preparation+4 more
LifecycleBusiness and Data Understanding + 5 moreCategoryPolicy

Educate AI model developers to on AI supply chain risks and potentially malicious AI artifacts. Educate users on how to identify deepfakes and phishing attempts.

Verify AI Artifacts

Business and Data UnderstandingData Preparation+1 more
LifecycleBusiness and Data Understanding + 2 moreCategoryTechnical - Cyber

Verify the cryptographic checksum of all AI artifacts to verify that the file was not modified by an attacker.

Vulnerability Scanning

ML Model EngineeringData Preparation
LifecycleML Model Engineering + 1 moreCategoryTechnical - Cyber

Vulnerability scanning is used to find potentially exploitable software vulnerabilities to remediate them.

File formats such as pickle files that are commonly used to store AI models can contain exploits that allow for arbitrary code execution. These files should be scanned for potentially unsafe calls, which could be used to execute code, create new processes, or establish networking capabilities. Adversaries may embed malicious code in model corrupt model files, so scanners should be capable of working with models that cannot be fully de-serialized. Model artifacts, downstream products produced by models, and external software dependencies should be scanned for known vulnerabilities.

Source

Where this page information comes from.