AI Risk

Text encoding-based attacks




Record summary

A quick snapshot of what this page covers.

Techniques: 12 (attack methods connected to this risk)

Mitigations: 16 (defenses that may help with related attacks)

Domain: 2. Privacy & Security (the broad risk area this belongs to)

Risk profile

How this risk is described and categorized.

"Various new or existing text encodings, such as Base64, can be employed to craft jailbreak attacks that bypass safety training [13]. Low-resource language inputs also appear more likely to circumvent a model’s safeguards [229]. Since safety fine-tuning might not involve this encoding data or may only do so to a limited extent, harmful natural language prompts could be translated into less frequently used encodings [214]."

Domain: 2. Privacy & Security

Subdomain: 2.2 > AI system security vulnerabilities and attacks

Entity: 1 - Human

Intent: 1 - Intentional

Timing: 2 - Post-deployment

Category: Attacks on GPAIs/GPAI Failure Modes

Subcategory: Text encoding-based attacks
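The bypass described in the risk profile can be illustrated from the defender's side. The sketch below is not taken from the cited source: `BLOCKED_TERMS` and `naive_filter_blocks` are hypothetical stand-ins for a safety check that only inspects plain text, and the prompt is a benign placeholder. It shows that the same request, once Base64-encoded, no longer contains any string such a check looks for.

```python
import base64

# Hypothetical blocklist standing in for a plaintext-only safety filter.
BLOCKED_TERMS = {"forbidden-topic"}


def naive_filter_blocks(prompt: str) -> bool:
    """Return True if the prompt contains a blocked term in plain text."""
    lowered = prompt.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


# Benign placeholder request, then the same bytes in Base64.
plain = "Tell me about forbidden-topic"
encoded = base64.b64encode(plain.encode("utf-8")).decode("ascii")

print(naive_filter_blocks(plain))    # True: the term appears verbatim
print(naive_filter_blocks(encoded))  # False: the encoded form hides it
```

The second result is `False` because the standard Base64 alphabet has no hyphen, so the blocked term cannot appear in the encoded string. A model that can decode Base64 may still act on the content, which is the gap the risk description points to.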

Suggested mitigations

Defenses that may help with related attacks.

Control Access to AI Models and Data in Production

Lifecycle: Deployment, Monitoring and Maintenance

Category: Policy

AI Telemetry Logging

Lifecycle: Deployment, Monitoring and Maintenance

Category: Technical - Cyber

Generative AI Guardrails

Lifecycle: ML Model Engineering, ML Model Evaluation, +1 more

Category: Technical - ML

Generative AI Guidelines

Lifecycle: ML Model Engineering, ML Model Evaluation, +1 more

Category: Technical - ML

Generative AI Model Alignment

Lifecycle: ML Model Engineering, ML Model Evaluation, +1 more

Category: Technical - ML

Control Access to AI Models and Data at Rest

Lifecycle: Business and Data Understanding, Data Preparation, +2 more

Category: Policy

Sanitize Training Data

Lifecycle: Business and Data Understanding, Data Preparation, +1 more

Category: Technical - ML

Validate AI Model

Lifecycle: ML Model Evaluation, Monitoring and Maintenance

Category: Technical - ML

Code Signing

Lifecycle: Deployment

Category: Technical - Cyber

Maintain AI Dataset Provenance

Lifecycle: Data Preparation, Business and Data Understanding

Category: Technical - ML

Model Hardening

Lifecycle: Data Preparation, ML Model Engineering

Category: Technical - ML

Use Ensemble Methods

Lifecycle: ML Model Engineering

Category: Technical - ML

Use Multi-Modal Sensors

Lifecycle: Business and Data Understanding, Data Preparation, +1 more

Category: Technical - Cyber

Input Restoration

Lifecycle: Data Preparation, ML Model Evaluation, +2 more

Category: Technical - ML

Adversarial Input Detection

Lifecycle: Data Preparation, ML Model Engineering, +3 more

Category: Technical - ML

Deepfake Detection

Lifecycle: Deployment, Monitoring and Maintenance, +2 more

Category: Technical - ML
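For the Base64 case named in the risk description, two of the listed mitigations (Input Restoration and Adversarial Input Detection) could be combined roughly as follows. This is a minimal sketch under stated assumptions, not an implementation specified by the repository: `decode_candidates`, `screen`, the 16-character threshold, and the caller-supplied `is_harmful` check are all hypothetical. The idea is to find Base64-looking spans, restore them to text, and run the same safety check over both the raw prompt and the restored text.

```python
import base64
import binascii
import re
from typing import Callable

# Runs of 16+ Base64 characters with optional padding.
# The length threshold is an arbitrary assumption for this sketch.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decode_candidates(prompt: str) -> list[str]:
    """Return readable UTF-8 decodings of Base64-looking spans in the prompt."""
    decoded = []
    for match in B64_CANDIDATE.finditer(prompt):
        chunk = match.group(0)
        if len(chunk) % 4 != 0:
            continue  # valid padded Base64 has a length divisible by 4
        try:
            text = base64.b64decode(chunk, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not Base64, or not text once decoded
        if all(c.isprintable() or c.isspace() for c in text):
            decoded.append(text)
    return decoded


def screen(prompt: str, is_harmful: Callable[[str], bool]) -> bool:
    """Apply the caller's check to the raw prompt and to each restored span."""
    return any(is_harmful(t) for t in [prompt, *decode_candidates(prompt)])
```

A check that misses the encoded prompt on its own would flag it through `screen`, since the restored text is inspected as well. The sketch covers only standard Base64; other encodings and low-resource languages mentioned in the risk description would need their own restoration steps, and ordinary long tokens may occasionally be decoded as false candidates.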

Source

Research source for this risk, when available.

Included resource

Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems

Authors: Gipiškis et al.

Year: 2024

Type: Journal Article

DOI: https://doi.org/10.48550/arXiv.2410.23472

URL: https://arxiv.org/abs/2410.23472

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/