Reputational Harm - AI Security Technique

Overview

A source-backed snapshot of this AI security technique.

Tactics: 0. Attacker goals connected to this method.

Mitigations: 0. Defenses that may help against this attack.

AI risks: 9. Research-backed risks connected to this topic.

Technique details

Identifiers, maturity, and source taxonomy for this technique.

ATLAS ID: AML.T0048.001
Maturity: demonstrated
Priority score: 75

Attack flow

How to read the public records connected to this technique.

1. Technique: Read the ATLAS description and evidence level.

2. Tactics: See which attacker goals this method supports.

3. Examples: Check whether public case studies mention it.

4. Defenses: Review safeguards mapped by ATLAS.

5. Sources: Open the original public records and references.

Impact

Why this technique may deserve attention in the current dataset.

Evidence level: demonstrated
Mapped defenses: 0 ATLAS mitigation records
Public examples: 1 linked case study record
Research risks: 9 related MIT AI Risk records above the confidence threshold
Vulnerabilities: 0 linked CVE records

Mitigations

Defenses that may help against this attack.

No connected defenses. No defense is connected to this attack in the current data.

Case studies

Examples from public reports and exercises.

1 record

PoisonGPT

Researchers from Mithril Security demonstrated how to poison an open-source pre-trained large language model (LLM) so that it returns a false fact. They then uploaded the poisoned model back to HuggingFace, the largest publicly accessible model hub, to illustrate the vulnerability of the LLM supply chain. Users could have downloaded the poisoned model and then received and spread poisoned data and misinformation, with many potential harms.

Date: 2023-07-01

Type: exercise
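
The PoisonGPT case study above turns on users downloading model weights they cannot tell apart from the genuine ones. As a rough illustration only (this page lists no ATLAS mitigation for the technique, and the sketch is not taken from the case study), one common supply-chain check is to compare a downloaded file against a SHA-256 digest obtained in advance from a publisher you trust. The function names `sha256_of_file` and `matches_pinned_digest` are hypothetical, and the check only detects a file that differs from the pinned digest; it says nothing about whether the pinned model itself is trustworthy.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: Path, pinned_hex: str) -> bool:
    """True only if the file's digest equals a digest pinned in advance.

    Assumption: `pinned_hex` came from a trusted channel, not from the
    same place the file was downloaded from.
    """
    return sha256_of_file(path) == pinned_hex.strip().lower()
```

A caller would refuse to load the weights when `matches_pinned_digest` returns False.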

Related risks

Research-backed risks connected to this topic.

9 records

Impersonation/identity theft

"Impersonation/identity theft - Theft of an individual, group or organisation’s identity by a third-party in order to defraud, mock or otherwise harm them."

Domain: 4. Malicious Actors & Misuse
Subdomain: 4.3 > Fraud, scams, and targeted manipulation

Confidence: 0.66

Malicious intent

"A frequent malicious use case of generative AI to harm, humiliate, or sexualize another person involves generating deepfakes of nonconsensual sexual imagery or videos."

Domain: 4. Malicious Actors & Misuse
Subdomain: 4.3 > Fraud, scams, and targeted manipulation

Confidence: 0.66

Privacy and consent

"Even when a victim of targeted, AI-generated harms successfully identifies a deepfake creator with malicious intent, they may still struggle to redress many harms because the generated image or video isn’t the victim...

Domain: 4. Malicious Actors & Misuse
Subdomain: 4.3 > Fraud, scams, and targeted manipulation

Confidence: 0.66

Propaganda - Digital impersonations

"AI-generated impersonation for identity theft might be found at the intersection of “Harm to the Person” and “Deception.”"

Domain: 4. Malicious Actors & Misuse
Subdomain: 4.3 > Fraud, scams, and targeted manipulation

Confidence: 0.66

Showing 4 of 9

Vulnerabilities

Known software flaws linked to this context.

No related vulnerabilities. No software flaw is connected to this attack in the current data.

Source evidence

Original public records and references for this page.

Original source links

Open the public records and source datasets used for this page.

Repository: https://github.com/mitre-atlas/atlas-data
ATLAS.yaml: https://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml
Schema: https://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json
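
For readers who want to check this record against the source data, the following is a minimal sketch of looking up a technique and its linked case studies in an already-parsed copy of the ATLAS data. It assumes a layout in which techniques sit under `matrices[].techniques` and case studies under a top-level `case-studies` list whose `procedure` steps carry a `technique` ID; that layout is an assumption here and should be confirmed against the schema file listed above. The `sample` dictionary is a hypothetical, trimmed stand-in, not real ATLAS content beyond the ID and names shown on this page.

```python
from typing import Any, Optional


def find_technique(atlas: dict[str, Any], technique_id: str) -> Optional[dict[str, Any]]:
    """Return the technique record with the given ID, or None if absent."""
    for matrix in atlas.get("matrices", []):
        for technique in matrix.get("techniques", []):
            if technique.get("id") == technique_id:
                return technique
    return None


def case_studies_for(atlas: dict[str, Any], technique_id: str) -> list[str]:
    """Return names of case studies with a procedure step using the technique."""
    names = []
    for study in atlas.get("case-studies", []):
        steps = study.get("procedure", [])
        if any(step.get("technique") == technique_id for step in steps):
            names.append(study.get("name", ""))
    return names


# Hypothetical, trimmed stand-in for parsed ATLAS data (assumed layout).
sample = {
    "matrices": [
        {"techniques": [{"id": "AML.T0048.001", "name": "Reputational Harm"}]}
    ],
    "case-studies": [
        {"name": "PoisonGPT", "procedure": [{"technique": "AML.T0048.001"}]}
    ],
}

technique = find_technique(sample, "AML.T0048.001")
linked = case_studies_for(sample, "AML.T0048.001")
```

With real data, `atlas` would come from parsing the ATLAS.yaml file with a YAML library of your choice.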