Idempotency Keys: Why AI Agents Need Safe Retries
AI agents retry on failure. That is not a bug — it is how autonomous systems handle unreliable networks. The problem is your API. Without idempotency keys, every retry is a duplicate charge, a duplicate order, or a corrupted record. Stripe solved this with one header. Most APIs have not.
The Retry Problem: Why Agents Are Different from Humans
When a human submits a payment form and the page hangs, they wait. They check their email for a confirmation. They look at their bank account. They call support before trying again. Humans have judgment, patience, and multiple feedback channels.
AI agents have none of that. An agent sends a POST request, gets a timeout or a 500 error, and retries. Most agent frameworks and client SDKs, including LangChain, CrewAI, and the Anthropic SDK, retry automatically with exponential backoff, typically two or three times depending on the library and version. With three retries, a single network hiccup during a payment request can generate four identical charge attempts before the agent even reports a problem.
Of 500 businesses AgentHermes has scanned, fewer than 8% support any form of idempotency on their mutating endpoints. The remaining 92% or more will silently process every retry as a new transaction. In a world where agents handle purchases, bookings, and data mutations autonomously, this is not a minor oversight; it is a data integrity crisis waiting to happen.
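To make the failure concrete, here is a minimal simulation of a timeout where the server completes the work but the response never reaches the agent. `LossyChargeServer` and `charge_with_blind_retries` are hypothetical names invented for this sketch, and the retry count of three is an assumption rather than any particular framework's default.

```python
class LossyChargeServer:
    """Hypothetical server that processes every request but drops the reply."""

    def __init__(self):
        self.charges = []

    def post_charge(self, amount_cents):
        # The side effect happens before the response is lost in transit.
        self.charges.append(amount_cents)
        raise TimeoutError("response lost in transit")


def charge_with_blind_retries(server, amount_cents, max_retries=3):
    """Retry on timeout without sending any idempotency key."""
    for _attempt in range(1 + max_retries):
        try:
            return server.post_charge(amount_cents)
        except TimeoutError:
            continue  # a real client would back off here before retrying
    return None


server = LossyChargeServer()
charge_with_blind_retries(server, 5000)
print(len(server.charges))  # 4: one original attempt plus three retries
```

From the agent's side every attempt looks like a failure, yet the customer has been charged on each one.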
The 3 Idempotency Patterns Every Agent-Ready API Needs
Not every endpoint needs the same treatment. There are three distinct patterns that together cover every type of API interaction an agent will make. The best APIs implement all three. Most implement zero.
Key-Based (Stripe Pattern)
The client sends an Idempotency-Key header with every mutating request. The server stores the response mapped to that key. If the same key arrives again, the server returns the stored response without re-executing. Stripe pioneered this and it is the gold standard.
Example: POST /v1/charges with Idempotency-Key: abc-123 returns the same charge object on every retry
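The key-based pattern can be sketched in a few lines. This is an in-memory illustration, not Stripe's implementation: `ChargeService`, its fields, and the `ch_` ID format are assumptions made for the example.

```python
import uuid


class ChargeService:
    """In-memory stand-in for a server that honors idempotency keys."""

    def __init__(self):
        self._responses = {}  # idempotency key -> stored response
        self.charges = []

    def create_charge(self, idempotency_key, amount_cents):
        # Replay: the key was seen before, so return the stored response as-is.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        charge = {"id": f"ch_{len(self.charges) + 1}", "amount": amount_cents}
        self.charges.append(charge)
        self._responses[idempotency_key] = charge
        return charge


service = ChargeService()
key = str(uuid.uuid4())
first = service.create_charge(key, 5000)
retry = service.create_charge(key, 5000)
print(first == retry, len(service.charges))  # True 1
```

A request with a different key is treated as a new operation and creates a second charge.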
Natural Idempotency (GET, DELETE, PUT)
Some HTTP methods are idempotent by definition. GET never changes state. DELETE of the same resource returns 200 or 204 the first time and 404 on retries, and the resource is gone either way. PUT replaces the full resource, so calling it twice produces the same result. Agents can safely retry these without special headers.
Example: DELETE /v1/customers/cus_123 returns 200 the first time, 404 on retries — both safe
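A small sketch of why these methods are safe to retry: the end state depends only on the request itself, not on how many times it was sent. The `customers` dictionary and both handler functions are hypothetical stand-ins for a real datastore and routes.

```python
customers = {"cus_123": {"name": "Ada"}}


def delete_customer(customer_id):
    """Return an HTTP-style status; a repeated call cannot delete anything twice."""
    if customer_id in customers:
        del customers[customer_id]
        return 200
    return 404


def put_customer(customer_id, body):
    """Full replacement: the final state depends only on the body, not the call count."""
    customers[customer_id] = dict(body)
    return 200


statuses = [delete_customer("cus_123"), delete_customer("cus_123")]
put_customer("cus_456", {"name": "Grace"})
put_customer("cus_456", {"name": "Grace"})
print(statuses, customers)
```

The second DELETE reports 404, but the store is in the same state as after the first call, which is what makes the retry harmless.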
Conditional (ETag / If-Match)
The server returns an ETag with each response. The client sends If-Match with the ETag on updates. If the resource changed between retries, the server returns 412 Precondition Failed instead of applying a stale update. This prevents lost-update bugs in concurrent agent scenarios.
Example: PUT /v1/orders/ord_456 with If-Match: "v3" fails with 412 if order was already modified
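The conditional pattern can be sketched as a version check before the write. The version-counter ETag format (`"v3"`, `"v4"`) follows the example above; `orders`, `etag_for`, and `put_order` are hypothetical names for this illustration.

```python
orders = {"ord_456": {"version": 3, "body": {"status": "pending"}}}


def etag_for(order):
    """Render the current version as a quoted ETag such as "v3"."""
    return f'"v{order["version"]}"'


def put_order(order_id, if_match, new_body):
    """Apply the update only if the caller's ETag matches the current version."""
    order = orders[order_id]
    if if_match != etag_for(order):
        return 412, etag_for(order)  # Precondition Failed: caller holds a stale version
    order["version"] += 1
    order["body"] = dict(new_body)
    return 200, etag_for(order)


first = put_order("ord_456", '"v3"', {"status": "paid"})
stale = put_order("ord_456", '"v3"', {"status": "cancelled"})
print(first, stale)
```

Note that a retry of an update that already succeeded also receives 412, so an agent should re-fetch the resource and compare it with its intended state before deciding whether to try again.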
Stripe’s implementation is the reference standard. Every POST endpoint accepts an Idempotency-Key header. Keys are stored for 24 hours. If the same key arrives while the original request is still processing, Stripe returns a 409 instead of executing a second time. The response includes the original request ID so the agent can correlate. This is why Stripe scores 68 on agent readiness — details like this compound across all 9 dimensions.
What Goes Wrong Without Idempotency
Every failure mode below has happened in production. Agents do not hesitate to retry. The question is whether your API handles it safely.
What AgentHermes Checks in D9
Idempotency contributes to D9 Agent Experience (10% weight) and also impacts D8 Reliability (13% weight). Here are the specific signals the scanner evaluates.
Idempotency-Key header support
0.10 in D9
Does the API accept and honor an Idempotency-Key header on POST/PATCH requests? AgentHermes sends a test request with the header and checks if the response includes idempotency metadata.
Duplicate request detection
0.10 in D9
Does the API detect and reject duplicate submissions? AgentHermes looks for 409 Conflict responses, duplicate-detection error codes, or idempotency-key-already-used error messages in the API documentation.
409 Conflict for duplicates
Impacts D8 Reliability
When a duplicate mutating request arrives, does the server return 409 with a structured JSON body pointing to the original resource? Or does it silently create a duplicate? The former is agent-safe. The latter is dangerous.
The compounding effect: Idempotency does not live in a single dimension. An API with proper idempotency keys also tends to have better error handling (409 Conflict instead of silent duplicates), better rate limiting (retries do not burn quota), and better reliability metrics (fewer false errors). A single infrastructure decision lifts 3 to 4 dimensions simultaneously.
How to Implement Idempotency in 50 Lines
The implementation is simpler than most developers expect. You need three things: a key-value store (Redis, DynamoDB, or even an in-memory Map for prototyping), middleware that intercepts mutating requests, and a TTL policy so keys do not accumulate forever.
The middleware reads the Idempotency-Key header from incoming POST and PATCH requests. If the key exists in the store, it returns the cached response with its original status code and an Idempotent-Replayed: true header so the agent knows this is a replay. If the key does not exist, it processes the request, stores the response mapped to the key with a 24-hour TTL, and returns normally.
For in-flight deduplication — when the original request is still processing — the middleware should return a 409 Conflict with a JSON body containing {"error": "request_in_progress", "retry_after": 2}. This tells the agent to wait 2 seconds before trying again — much better than creating a duplicate transaction.
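The middleware described above can be sketched without committing to a web framework. Everything here is an assumption made for illustration: the `IdempotencyMiddleware` class, the `(status, headers, body)` handler signature, and the in-memory dictionary, which stands in for Redis or another shared store. The sketch replays whatever status code the first response carried.

```python
import threading
import time

IN_PROGRESS = object()  # sentinel: the first request for this key is still running


class IdempotencyMiddleware:
    """Framework-agnostic sketch; `handler` returns (status, headers, body)."""

    def __init__(self, handler, ttl_seconds=24 * 60 * 60, clock=time.monotonic):
        self._handler = handler
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, response or IN_PROGRESS)
        self._lock = threading.Lock()

    def handle(self, method, headers, body):
        key = headers.get("Idempotency-Key")
        if method not in ("POST", "PATCH") or key is None:
            return self._handler(method, headers, body)

        with self._lock:
            entry = self._store.get(key)
            if entry is not None and entry[0] <= self._clock():
                entry = None  # expired, so treat it as a brand-new key
            if entry is not None:
                if entry[1] is IN_PROGRESS:
                    return 409, {}, {"error": "request_in_progress", "retry_after": 2}
                status, stored_headers, stored_body = entry[1]
                replay_headers = {**stored_headers, "Idempotent-Replayed": "true"}
                return status, replay_headers, stored_body
            # Claim the key before running the handler so concurrent retries get a 409.
            self._store[key] = (self._clock() + self._ttl, IN_PROGRESS)

        try:
            response = self._handler(method, headers, body)
        except Exception:
            with self._lock:
                self._store.pop(key, None)  # release the key so a retry can run
            raise
        with self._lock:
            self._store[key] = (self._clock() + self._ttl, response)
        return response
```

Claiming the key under a lock before calling the handler is what prevents two concurrent retries from both executing. In a multi-process deployment the same claim would need an atomic operation in the shared store rather than a local lock.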
Store options
Redis (best for production, supports TTL natively), DynamoDB (serverless, pay-per-read), PostgreSQL (already in your stack), in-memory Map (prototyping only — lost on restart).
TTL policy
Stripe uses 24 hours. This is the standard. Long enough for any reasonable retry window, short enough to not accumulate stale keys. Shorter TTLs (1 hour) work for high-volume APIs.
Key format
UUID v4 is the standard. Agents generate a new UUID per logical operation and reuse it across retries. Some APIs accept arbitrary strings — UUIDs are safer because they avoid collisions.
Response headers
Return Idempotent-Replayed: true on cached responses. Return the original X-Request-ID. Include the original creation timestamp. These help agents distinguish first-run from replay.
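On the agent side, the key format and response header guidance above might look like the following sketch. `post_with_retries` and the `send` transport callable are hypothetical; the retry count and delays are assumptions, and only timeouts and 409 responses are retried here.

```python
import time
import uuid


def post_with_retries(send, payload, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call `send(headers, payload)` until it answers, reusing one key for every attempt.

    `send` is a hypothetical transport callable that returns (status, headers, body)
    and raises TimeoutError when no response arrives.
    """
    # One UUID v4 per logical operation, generated before the first attempt.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(1 + max_retries):
        try:
            status, response_headers, body = send(headers, payload)
        except TimeoutError:
            status = None  # no response; the server may or may not have acted
        if status is not None and status != 409:
            replayed = response_headers.get("Idempotent-Replayed") == "true"
            return status, body, replayed
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s with the defaults
    raise RuntimeError("operation did not complete after retries")
```

The important detail is that the key is created outside the loop. Generating a fresh UUID inside the loop would turn every retry into a new operation and defeat the server-side deduplication.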
Frequently Asked Questions
What is an idempotency key?
An idempotency key is a unique string that a client attaches to a request. The server uses this key to recognize retries of the same operation. If it sees the same key twice, it returns the original response instead of executing the operation again. This prevents duplicate side effects like double charges or duplicate orders.
Why do AI agents need idempotency more than human users?
Human users see a loading spinner and wait. If something fails, they check their email or account page before retrying. AI agents have no visual feedback — they see a timeout or error code and immediately retry, often within milliseconds. Without idempotency, that automatic retry creates a duplicate transaction every time a network hiccup occurs.
Does idempotency affect my Agent Readiness Score?
Yes. Idempotency support is checked in D9 Agent Experience (10% weight) and also impacts D8 Reliability (13% weight). Together, these two dimensions account for 23% of your total score. Businesses with idempotent APIs consistently score 5 to 8 points higher in these dimensions.
How do I add idempotency to an existing API?
The simplest approach is middleware that intercepts POST and PATCH requests, reads the Idempotency-Key header, checks a key-value store (Redis works well) for a stored response, and returns it if found. If not found, it processes the request normally and stores the response mapped to the key with a 24-hour TTL. Stripe has documented its approach publicly, and a basic version takes about 50 lines of middleware code.
What if my API does not support custom headers?
If you cannot add header support, use natural idempotency patterns. Design your endpoints so that PUT replaces the full resource (making retries safe), DELETE is always safe to retry, and POST endpoints accept a client-generated unique ID in the request body that the server uses for deduplication. This is less elegant than the Idempotency-Key header but achieves the same safety guarantee.
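The body-based fallback can be sketched as follows. The `client_request_id` field name, the `create_order` handler, and the in-memory dictionary are assumptions for illustration; a real service would enforce uniqueness in its database.

```python
orders_by_request_id = {}


def create_order(body):
    """POST handler that deduplicates on a client-supplied `client_request_id` field."""
    request_id = body["client_request_id"]
    existing = orders_by_request_id.get(request_id)
    if existing is not None:
        return 200, existing  # retry: return the order created the first time
    order = {"id": f"ord_{len(orders_by_request_id) + 1}", "item": body["item"]}
    orders_by_request_id[request_id] = order
    return 201, order


request = {"client_request_id": "3f1c9a52-demo", "item": "sku_42"}
created = create_order(request)
retried = create_order(request)
print(created, retried)
```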
Does your API handle retries safely?
Run a free Agent Readiness Scan to see your D9 Agent Experience score, including idempotency support, error handling, and rate limiting.