PromptRiskDB: Threat intelligence atlas
AI Risk

On Purpose - Post Deployment

Record summary

A quick snapshot of what this page covers.

Techniques: 0 (attack methods connected to this risk)
Mitigations: 0 (defenses that may help with related attacks)
Domain: 4. Malicious Actors & Misuse (the broad risk area this belongs to)

Risk profile

How this risk is described and categorized.

"Just because developers might succeed in creating a safe AI, it doesn't mean that it will not become unsafe at some later point. In other words, a perfectly friendly AI could be switched to the 'dark side' during the post-deployment stage. This can happen rather innocuously as a result of someone lying to the AI and purposefully supplying it with incorrect information or more explicitly as a result of someone giving the AI orders to perform illegal or dangerous actions against others."

Domain: 4. Malicious Actors & Misuse
Subdomain: 4.3 > Fraud, scams, and targeted manipulation
Entity: 1 - Human
Intent: 1 - Intentional
Timing: 2 - Post-deployment
Category: On Purpose - Post Deployment
Subcategory: n/a

Suggested mitigations

Defenses that may help with related attacks.

No propagated mitigations. No attack methods are connected to this risk, so no defenses carry over from them.

Source

Research source for this risk, when available.