Self-proliferation - PromptRiskDB

Record summary

A quick snapshot of what this page covers.

Techniques: 1 (attack methods connected to this risk)

Mitigations: 0 (defenses that may help with related attacks)

Domain: 7. AI System Safety, Failures, & Limitations (the broad risk area this belongs to)

Risk profile

How this risk is described and categorized.

"The model can break out of its local environment (e.g. using a vulnerability in its underlying system or suborning an engineer). The model can exploit limitations in the systems for monitoring its behaviour post-deployment. The model could independently generate revenue (e.g. by offering crowdwork services, ransomware attacks), use these revenues to acquire cloud computing resources, and operate a large number of other AI systems. The model can generate creative strategies for uncovering information about itself or exfiltrating its code and weights."

Domain: 7. AI System Safety, Failures, & Limitations

Subdomain: 7.2 > AI possessing dangerous capabilities

Entity: 2 - AI

Intent: 1 - Intentional

Timing: 3 - Other

Category: Self-proliferation

Subcategory: n/a

Related techniques

Attack methods connected to this risk.

AML.T0105 - Escape to Host

demonstrated

Method: text_similarity_sqlite; Confidence: 57%

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No defense is available through the connected attack methods.

Source

Research source for this risk, when available.

Included resource

Model Evaluation for Extreme Risks

Authors: Shevlane et al.; Year: 2023; Type: Preprint

DOI: 10.48550/arXiv.2305.15324; URL: https://arxiv.org/abs/2305.15324

Original source

MIT AI Risk Repository

Open the public repository used for AI risk records and taxonomy fields.

Repository: https://airisk.mit.edu/