Record summary
A quick snapshot of what this page covers.
Attack context
How this AI attack works in practice.
Adversaries may deliberately drive a victim's AI services beyond normal operating capacity with the intent of increasing the cost of services. This may be achieved via high-volume, low-complexity queries (Excessive Queries) or low-volume, high-complexity queries (Resource-Intensive Queries). In Generative AI or Agentic AI systems, adversarial prompts may be introduced into the model's context to cause (Agentic Resource Consumption).
Unlike resource hijacking, where adversaries may leverage AI resources such as computational, memory, or storage for their own purposes, cost harvesting focuses on resource-centric pressure to a service to ultimately cause financial harm to the victim.
Cost Harvesting is especially relevant for cloud-hosted, pay-per-use AI/ML platforms (e.g., LLM APIs, generative image services, vision-language pipelines). By manipulating request volume or request complexity, an attacker can:
- Inflate the victim's compute or storage consumption, leading to higher operational costs.
- Trigger autoscaling mechanisms that provision additional resources, further amplifying cost and exposure.
- Saturate internal queues or GPU/TPU pipelines, causing latency spikes, request throttling, or outright service unavailability for legitimate users.
- ATLAS ID
- AML.T0034
- Priority score
- 16
Mitigations
Defenses that may help against this attack.
AML.M0019 - Control Access to AI Models and Data in Production
Access controls can limit API access and prevent cost harvesting.
AML.M0004 - Restrict Number of AI Model Queries
Limit the number of queries users can perform in a given interval to hinder an attacker's ability to send computationally expensive inputs
Case studies
Examples from public reports and exercises.
Source
Where this page information comes from.
Original source
Original source links
Open the public records and source datasets used for this page.