Idempotency Keys for APIs

Last reviewed on 4 May 2026.

Networks fail. Clients retry. Without idempotency, an operation that succeeded on the server but whose response never reached the client gets repeated on retry, sometimes catastrophically. Idempotency keys give you a way to deduplicate retries safely, but only if the contract is precise. This page walks through how the pattern works, where the trade-offs sit, and the corner cases that catch real implementations out.

The problem in one paragraph

HTTP says GET, HEAD, PUT, and DELETE are idempotent: doing them twice has the same effect as doing them once. POST is not. So when a POST request fails — connection reset, gateway timeout, 502 from a load balancer — the client cannot tell whether the server processed it. Retry, and you risk a duplicate side-effect: a double charge, a duplicated order, two messages sent. Don't retry, and you risk losing a request that genuinely failed in flight. Neither is acceptable for any operation that costs money or sends a message.

The contract

The pattern that has emerged in production APIs — and which we recommend for any non-idempotent endpoint — works like this:

  1. The client generates a unique key for the logical operation it wants to perform. A UUID is fine; the only requirement is uniqueness within a sensible window.
  2. The client sends the key as the Idempotency-Key request header.
  3. If the server has not seen the key before, it processes the request normally, records the key together with the response, and returns the response.
  4. If the server has seen the key before, it returns the original response without re-doing the work.
  5. If the server has seen the key before but the request body is different, it returns an error (typically 422) so the client knows it has reused a key for a different operation.

The key insight is that the client controls the key, not the server. Only the client knows whether two requests are "the same operation" — the server cannot tell whether two structurally identical requests sent five seconds apart were a retry or a deliberate repeat.
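To make steps 3 to 5 concrete, here is a minimal server-side sketch in Python, using a plain dict as the key store; the names are illustrative, not from any particular framework. Note that the get-then-set below has exactly the concurrency race discussed under "Storing keys"; a real store needs an atomic claim.

import hashlib

# Illustrative in-memory store: key -> (fingerprint, status, response body).
# A real deployment uses Redis or a database; see "Storing keys" below.
_store: dict[str, tuple[str, int, str]] = {}

def fingerprint(body: bytes) -> str:
    # Hash of the body; real APIs often mix the relevant headers in too.
    return hashlib.sha256(body).hexdigest()

def handle(idempotency_key: str, body: bytes, process) -> tuple[int, str]:
    seen = _store.get(idempotency_key)
    if seen is not None:
        stored_fp, status, response = seen
        if stored_fp != fingerprint(body):           # step 5: key reuse
            return 422, '{"error": "key reused with a different body"}'
        return status, response                      # step 4: replay
    status, response = process(body)                 # step 3: do the work
    _store[idempotency_key] = (fingerprint(body), status, response)
    return status, response

Steps 1 and 2 belong to the client; a sketch of that half appears in the worked example below.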

A worked example: charging a card

Consider a payment endpoint:

POST /payments HTTP/1.1
Idempotency-Key: 7c1e4a3a-2b9f-4d2a-9b3a-2c8f1e3d4a5b
Content-Type: application/json

{ "amount": 4500, "currency": "USD", "source": "card_xyz" }

The first time the server sees this key, it charges the card and returns 201 with a payment record. The key, the request fingerprint (a hash of the body and the relevant headers), and the full response are stored together.

Now suppose the response never reaches the client — TCP reset, client timeout, whatever. The client retries with the same key. The server sees the key, looks up the stored response, and returns it as if the work had just happened. The card is charged once.
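The client's half of the bargain is to generate the key once per logical operation and resend it verbatim on every retry. A sketch assuming the requests library; the endpoint URL and retry parameters are illustrative.

import time
import uuid
import requests

def create_payment(payload: dict, attempts: int = 3) -> requests.Response:
    key = str(uuid.uuid4())  # one key per logical operation, not per attempt
    for attempt in range(attempts):
        try:
            return requests.post(
                "https://api.example.com/payments",  # illustrative endpoint
                json=payload,
                headers={"Idempotency-Key": key},
                timeout=5,
            )
        except requests.RequestException:
            if attempt == attempts - 1:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff

The common bug is generating the key inside the retry loop: each attempt then looks like a new operation, and the deduplication silently does nothing.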

Suppose instead the client retries with the same key but a different body — say, accidentally amount: 5400. The server sees the key, sees the stored fingerprint doesn't match the new request, and rejects with 422 and an error explaining the mismatch. The client now knows it has a bug, not a successful retry.

Storing keys: the trade-offs

The store needs to be fast, durable, and shared across every server that can handle the request. The realistic options are:

  • Redis or another in-memory store with persistence. Fast lookups, fast writes, easy TTLs. Requires careful capacity planning — every key holds the full response body. For most APIs this is the right answer.
  • The same database that stores the actual records. Adds a row per request, but gives you transactional guarantees. Simpler operationally; slower at scale.
  • A dedicated key-value store with strong consistency. Worth it once you outgrow Redis but don't want to push idempotency state into your primary database.

Whichever you pick, the lookup has to be transactional with the write. The classic bug is: check whether the key exists, find it doesn't, do the work, then store the key. Two concurrent retries can both pass the check before either writes the key, and both proceed to do the work. Use atomic insert-or-fail semantics (Redis SET key value NX, a database unique constraint) so that only one request can claim the key.
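With redis-py, for example, the atomic claim is a single command: of any set of concurrent retries, exactly one gets True back and does the work. The placeholder value and TTL here are illustrative.

import redis

r = redis.Redis()  # assumes a local Redis instance

def claim(key: str) -> bool:
    # SET ... NX EX 86400: succeeds only if the key does not already exist.
    # The TTL doubles as the retention window (24 hours here).
    return bool(r.set(f"idem:{key}", "processing", nx=True, ex=86400))

The database equivalent is an INSERT against a unique constraint: a constraint violation means another request already claimed the key.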

How long to keep keys

There's no single right answer. The window has to cover any retry the client might reasonably make, plus a margin. Common choices:

  • 24 hours — the conventional default. Long enough for any client-side retry, including ones that involve a human noticing and pressing the button again. Short enough that storage cost is bounded.
  • 7 days — sensible for operations that involve a multi-step external workflow (a payment that takes a day to settle, a webhook that gets retried for a week).
  • 30 days or longer — only when the operation has external side effects whose failure window genuinely lasts that long. Comes with real storage cost.

Whichever window you choose, document it. Clients that don't know how long a key is valid will reuse one that has just expired and get a duplicate.
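Redis enforces the window for free through TTLs, as in the claim sketch above. A database-backed store needs a periodic purge. A sketch using sqlite3 from the standard library, assuming an illustrative idempotency_keys table with a created_at timestamp column:

import sqlite3

def purge_expired(conn: sqlite3.Connection, window_hours: int = 24) -> int:
    # Delete every key older than the documented retention window.
    cur = conn.execute(
        "DELETE FROM idempotency_keys WHERE created_at < datetime('now', ?)",
        (f"-{window_hours} hours",),
    )
    conn.commit()
    return cur.rowcount  # how many keys were purged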

Corner cases that catch implementations out

The in-flight request

What happens when the server receives the same key while it is still processing the original request? Two acceptable answers: queue the second request behind a lock keyed on the idempotency key, or return 409 with a "request in progress" body. Both work. What is not acceptable is to start a second copy of the work — that defeats the entire point.
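One way to get the 409 behaviour, building on the Redis claim above: claim the key with a "processing" placeholder, overwrite it with the real response when the work finishes, and treat a found-but-still-processing key as in flight. A sketch; the fingerprint check from the contract is omitted for brevity.

import json
import redis

r = redis.Redis()

def handle(key: str, body: bytes, process) -> tuple[int, str]:
    if r.set(f"idem:{key}", "processing", nx=True, ex=86400):
        status, response = process(body)            # we won the claim
        r.set(f"idem:{key}",
              json.dumps({"status": status, "body": response}), ex=86400)
        return status, response
    stored = r.get(f"idem:{key}")
    if stored == b"processing":
        return 409, '{"error": "request in progress"}'
    saved = json.loads(stored)                      # finished earlier: replay
    return saved["status"], saved["body"]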

Partial failure mid-way

If the work fails halfway through, you have a choice. Either store the failure as the canonical response (so retries return the same failure), or don't store anything (so retries try again). The first is correct when the failure is the genuine outcome of the operation — say, "card declined" — because retrying won't change the answer. The second is correct when the failure is transient — say, "downstream service timeout" — because the client should be able to retry and succeed.

The pragmatic rule: store the response if it is a deterministic outcome of the request. Don't store if the failure was internal and might not recur. This means a 4xx response is usually stored; a 5xx response is usually not.
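The rule reduces to a one-line predicate on the status code; a sketch, with the caveat that a handful of 5xx codes (501, say) are arguably deterministic too.

def should_store(status: int) -> bool:
    # 2xx: the real outcome. 4xx: deterministic rejection (card declined,
    # validation error); retrying won't change it. 5xx: internal or
    # transient; let the client retry afresh.
    return status < 500

When you choose not to store, remember to release the in-flight claim as well (delete the "processing" placeholder), or retries will see 409 until the TTL expires.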

Idempotency on operations that span multiple resources

If the operation creates several records — say, an order with line items — make sure either all of them are created or none of them are, before you record the key as having succeeded. Storing the key under partial success leaves the system in a state where retries return success but the records are incomplete.
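With a database-backed key store, this is exactly what a transaction buys you: the records and the key commit together, so a crash in between leaves neither. A sketch with sqlite3 and an illustrative schema.

import sqlite3

def create_order(conn: sqlite3.Connection, key: str, customer_id: int,
                 items: list[tuple[str, int]]) -> int:
    # One transaction: commits on success, rolls back on any error,
    # so a retry after a crash finds no half-written order and no key.
    with conn:
        cur = conn.execute(
            "INSERT INTO orders (customer_id) VALUES (?)", (customer_id,))
        order_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO line_items (order_id, sku, qty) VALUES (?, ?, ?)",
            [(order_id, sku, qty) for sku, qty in items])
        conn.execute(  # unique constraint on key doubles as the atomic claim
            "INSERT INTO idempotency_keys (key, order_id) VALUES (?, ?)",
            (key, order_id))
    return order_id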

Keys that look unique but aren't

Don't accept keys whose uniqueness depends on context the server cannot verify. A timestamp alone is not unique across multiple clients. A customer ID is not unique across operations. An order number is not unique if two clients are independently creating orders. UUIDs are unique by construction; require them, or document the format you accept.
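If you require UUIDs, validation is a few lines with the standard library:

import uuid

def valid_key(key: str) -> bool:
    # Accepts only well-formed UUIDs; rejects timestamps, order numbers,
    # and anything else whose uniqueness the server cannot verify.
    try:
        uuid.UUID(key)
    except ValueError:
        return False
    return True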

Common mistakes

  • Treating GET as needing idempotency keys. GET is already idempotent at the protocol level. Asking for keys on GET adds noise without adding safety.
  • Validating the key without validating the body. A retried key with a changed body should be rejected, not silently treated as a hit on the original.
  • Idempotency on the wrong scope. Per-account, not global: two different customers must be able to use the same UUID without colliding (see the sketch after this list).
  • Forgetting concurrency. Without atomic claim-and-write, two retries that arrive before the first response is stored will both do the work.
  • Not telling clients. Idempotency is a contract; if it isn't documented, clients won't use it, and you'll keep getting duplicate-charge bug reports.
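For the scoping point above, the usual fix is to fold the account into the storage key, so the same UUID from two customers never touches the same entry; a sketch consistent with the Redis examples earlier.

def storage_key(account_id: str, idempotency_key: str) -> str:
    # Scope per account: the same UUID from two different customers
    # maps to two distinct entries in the store.
    return f"idem:{account_id}:{idempotency_key}"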

Where to go next

For how this fits the broader API design picture, see API Design Best Practices. For the rate-limiting algorithms that often run alongside idempotency in retry-heavy clients, see API Rate Limiting Strategies. For the related problem of delivering events reliably to a webhook, see Webhook Design and Delivery.