Fine-tuning related (Excessive or overly restrictive safety-tuning)

Record summary

A quick snapshot of what this page covers.

Techniques (1): Attack methods connected to this risk.

Mitigations (5): Defenses that may help with related attacks.

Domain: 7. AI System Safety, Failures, & Limitations. The broad risk area this belongs to.

How this risk is described and categorized.

Domain: 7. AI System Safety, Failures, & Limitations

Subdomain: 7.3 > Lack of capability or robustness

Entity: 4 - Not coded

Intent: 4 - Not coded

Timing: 4 - Not coded

Category: Model Development

Subcategory: Fine-tuning related (Excessive or overly restrictive safety-tuning)

Attack methods connected to this risk.

demonstrated

Method: text_similarity_sqlite; Confidence: 59%

Defenses that may help with related attacks.

Lifecycle: Business and Data Understanding, Data Preparation, + 2 more; Category: Policy

Lifecycle: Business and Data Understanding, Data Preparation, + 1 more; Category: Technical - ML

Lifecycle: ML Model Evaluation, Monitoring and Maintenance; Category: Technical - ML

Lifecycle: Deployment; Category: Technical - Cyber

Lifecycle: Data Preparation, Business and Data Understanding; Category: Technical - ML

Research source for this risk, when available.

Included resource

Authors: Gipiškis et al.; Year: 2024; Type: Journal Article
