Human In-the-Loop for AI Agent Actions - AI Mitigation

Systems should require the user or another human stakeholder to approve AI agent actions before the agent takes them. The human approver may be technical staff or business unit SMEs depending on the use case. Separate tools, such as dedicated audit agents, may assist human approval, but final adjudication should be conducted by a human decision-maker.
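The approval requirement described above can be sketched as a gate that sits between the agent's proposed action and its execution. This is a minimal illustration, not a reference implementation: `ProposedAction`, `run_with_approval`, `ask_human`, and `audit` are hypothetical names introduced here, and how the human is actually prompted (ticket, chat message, console) is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProposedAction:
    """An action the agent wants to take (hypothetical structure)."""
    tool: str
    arguments: dict


class ActionRejected(Exception):
    """Raised when the human approver declines the action."""


def run_with_approval(
    action: ProposedAction,
    execute: Callable[[ProposedAction], object],
    ask_human: Callable[[ProposedAction, Optional[str]], bool],
    audit: Optional[Callable[[ProposedAction], str]] = None,
):
    """Execute `action` only after a human has approved it."""
    # An audit tool may supply advice, but it never approves on its own.
    advice = audit(action) if audit is not None else None
    # Final adjudication belongs to the human decision-maker.
    if not ask_human(action, advice):
        raise ActionRejected(f"human declined {action.tool}")
    return execute(action)
```

The audit callback only returns text that is passed to the approver, so the sketch keeps the advisory role of an audit agent separate from the decision itself.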

Overview

The security benefits of Human In-the-Loop policies may be at odds with the operational overhead of additional approvals. To ease this tension, Human In-the-Loop policies should scale with the degree of consequence of the task at hand. Minor, repetitive tasks performed by agents accessing basic tools may require only minimal human oversight, while agents employed in systems with significant consequences may necessitate approval from multiple stakeholders spread across multiple organizations.
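One way to picture consequence-scaled oversight is a small policy table that maps a consequence level to the number of approvers and distinct organizations required. The tiers and thresholds below are assumptions chosen for illustration; the source does not prescribe specific values. In particular, treating minor tasks as needing no blocking approval (for example, logging them for later review) is one possible reading of "minimal human oversight".

```python
from enum import IntEnum


class Consequence(IntEnum):
    """Hypothetical consequence tiers for agent tasks."""
    MINOR = 1
    MODERATE = 2
    SIGNIFICANT = 3


# Hypothetical policy: level -> (minimum approvers, minimum distinct organizations)
POLICY = {
    Consequence.MINOR: (0, 0),        # assumed: no blocking approval
    Consequence.MODERATE: (1, 1),     # assumed: one human approver
    Consequence.SIGNIFICANT: (2, 2),  # assumed: two approvers from two organizations
}


def is_approved(level, approvals):
    """Return True if `approvals` satisfies the policy for `level`.

    `approvals` is an iterable of (approver_id, organization) pairs.
    """
    min_approvers, min_orgs = POLICY[level]
    unique = set(approvals)  # ignore duplicate sign-offs
    approvers = {approver for approver, _ in unique}
    orgs = {org for _, org in unique}
    return len(approvers) >= min_approvers and len(orgs) >= min_orgs
```

Counting distinct approvers and distinct organizations separately means one person cannot satisfy the top tier by signing off twice, and two people from the same organization cannot satisfy it either.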

Techniques: 3 (attacks this defense is designed to help with)

Lifecycle: 1 (where this defense applies in the AI lifecycle)

Categories: 1 (how the source groups this defense)

Safeguard details

Where this defense applies and how the source classifies it.

ATLAS ID: AML.M0029
Priority score: 15

Lifecycle: Deployment
Category: Technical - ML

Source evidence

Original public records and references for this page.

Repository: https://github.com/mitre-atlas/atlas-data
ATLAS.yaml: https://github.com/mitre-atlas/atlas-data/blob/main/dist/ATLAS.yaml
Schema: https://github.com/mitre-atlas/atlas-data/blob/main/dist/schemas/atlas_output_schema.json