PromptRiskDB: Threat intelligence atlas
AI Risk

Fine-tuning related (Unexpected competence in fine-tuned versions of the upstream model)

"Downstream deployers may often fine-tune a GPAI model with specific deployment-related datasets, to better suit the task. Fine-tuned upstream models can gain new or unexpected capabilities that the underlying upstream models did not exhibit [202, 126, 137]. These new capabilities may be unanticipated by the original model developer."

Record summary

A quick snapshot of what this page covers.

Techniques: 1 (attack methods connected to this risk)
Mitigations: 5 (defenses that may help with related attacks)
Domain: 7. AI System Safety, Failures, & Limitations (the broad risk area this belongs to)

Risk profile

How this risk is described and categorized.

Domain: 7. AI System Safety, Failures, & Limitations
Subdomain: 7.2 > AI possessing dangerous capabilities
Entity: 1 - Human
Intent: 2 - Unintentional
Timing: 1 - Pre-deployment
Category: Model Development
Subcategory: Fine-tuning related (Unexpected competence in fine-tuned versions of the upstream model)

Suggested mitigations

Defenses that may help with related attacks.

Validate AI Model

Lifecycle: ML Model Evaluation, Monitoring and Maintenance
Category: Technical - ML
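
As one illustration of what validating a fine-tuned model against this risk could involve, the sketch below compares evaluation scores of the upstream model and its fine-tuned version, and flags benchmarks outside the intended task where the score rose sharply. This record does not prescribe a method: the benchmark names, the scores, the 0.10 threshold, and the function `flag_unexpected_gains` are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityDelta:
    """Score change on one benchmark between upstream and fine-tuned models."""
    benchmark: str
    upstream_score: float
    finetuned_score: float

    @property
    def gain(self) -> float:
        return self.finetuned_score - self.upstream_score


def flag_unexpected_gains(upstream, finetuned, expected, threshold=0.10):
    """Return deltas for benchmarks NOT in `expected` whose gain exceeds `threshold`.

    `upstream` and `finetuned` map benchmark name -> score in [0, 1].
    `expected` is the set of benchmarks the fine-tuning was meant to improve.
    The threshold is an assumed value, not a recommended one.
    """
    flagged = []
    # Only benchmarks measured on both models can be compared.
    for name in sorted(upstream.keys() & finetuned.keys()):
        if name in expected:
            continue  # improvement here is the intended outcome
        delta = CapabilityDelta(name, upstream[name], finetuned[name])
        if delta.gain > threshold:
            flagged.append(delta)
    return flagged


# Hypothetical scores, for illustration only.
upstream_scores = {
    "customer_support_qa": 0.61,
    "exploit_generation": 0.05,
    "bio_protocol_recall": 0.12,
}
finetuned_scores = {
    "customer_support_qa": 0.83,
    "exploit_generation": 0.31,
    "bio_protocol_recall": 0.14,
}

flagged = flag_unexpected_gains(
    upstream_scores, finetuned_scores, expected={"customer_support_qa"}
)
for d in flagged:
    print(f"{d.benchmark}: +{d.gain:.2f} (unexpected gain, review before release)")
```

A check like this only surfaces capabilities that someone thought to benchmark, so it complements broader pre-deployment evaluation and does not replace it.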

Code Signing

Lifecycle: Deployment
Category: Technical - Cyber

Source

Research source for this risk, when available.