How to Integrate with a Web API — Patterns and Pitfalls

The shape of a good integration

Every API integration has the same anatomy regardless of language or framework:

A client that knows how to talk to the API — base URL, authentication, request encoding.
A retry layer that decides what to do when calls fail.
A circuit breaker that protects the rest of your system when the API is down.
An observability layer that records what happened.
A fallback for cases where the API isn't available.

Most integrations skip three of those five and get away with it for a while. They stop getting away with it the first time the API has a real outage, when retries hammer the recovering service, or when "the integration is broken" turns out to be unfixable because no one knows what's actually failing.

The client layer

Build a thin wrapper around your HTTP library that handles the cross-cutting concerns once. The wrapper owns:

The base URL.
Authentication: token attachment, refresh on expiry.
Common headers (User-Agent identifying your app, Accept, request IDs).
Request and response serialization (JSON to/from your domain types).
Connection pooling — reuse connections rather than opening a new TCP+TLS connection for every call.

What it does not own: business logic. The wrapper exposes "fetch order by ID" or "create payment", and the calling code has no idea what the URL is or how the auth works. When the API changes — different auth, new pagination, version bump — there's exactly one place to update.

Most languages have idiomatic HTTP clients to build this on: requests in Python, fetch/Axios in JavaScript, OkHttp in Java/Kotlin, net/http in Go. The official SDK (when one exists) is usually fine if it handles these concerns well; when it doesn't, building a thin wrapper around the raw HTTP client is straightforward.
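
As a rough illustration of that boundary, here is a minimal sketch using only Python's standard library. The class name OrdersClient, the /orders and /payments paths, the X-Request-Id header, the User-Agent value, and the get_token callable are all assumptions made for the example, not part of any real API. Note that urllib does not pool connections, so a real wrapper would more likely sit on a pooling client such as a requests Session.

```python
import json
import urllib.request
import uuid


class OrdersClient:
    """Thin wrapper: callers see domain operations, not URLs or auth."""

    def __init__(self, base_url, get_token, opener=None, timeout=10.0):
        self._base_url = base_url.rstrip("/")  # supplied from config
        self._get_token = get_token            # hypothetical callable; owns token refresh
        self._opener = opener or urllib.request.build_opener()
        self._timeout = timeout

    def _request(self, method, path, payload=None):
        # Serialization, auth, and common headers live here and nowhere else.
        body = None if payload is None else json.dumps(payload).encode("utf-8")
        req = urllib.request.Request(self._base_url + path, data=body, method=method)
        req.add_header("Authorization", "Bearer " + self._get_token())
        req.add_header("Accept", "application/json")
        req.add_header("User-Agent", "example-app/1.0")
        req.add_header("X-Request-Id", str(uuid.uuid4()))
        if body is not None:
            req.add_header("Content-Type", "application/json")
        with self._opener.open(req, timeout=self._timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))

    # Domain-level operations: the only surface the calling code sees.
    def fetch_order(self, order_id):
        return self._request("GET", f"/orders/{order_id}")

    def create_payment(self, order_id, amount_cents):
        return self._request(
            "POST", "/payments", {"order_id": order_id, "amount_cents": amount_cents}
        )
```

Calling code would then look like client.fetch_order("o-1"), with no knowledge of the URL layout or the auth scheme.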

Retries: the most misused pattern in API integration

The naive retry loop ("if it failed, try again") does more damage than no retries at all. A loop that retries the wrong errors, retries too aggressively, or retries non-idempotent operations without coordination turns one failure into a cascade.

What to retry, and what not to

Network errors and timeouts: retry. The request may not have reached the server, or the response may have been lost. Either way, you cannot tell whether the server processed the request, which is why retrying a non-idempotent call needs the handling described under Idempotency below.
5xx responses: retry. The server failed; trying again may succeed.
429 Too Many Requests: retry, but only after waiting at least the duration in the Retry-After header; if the header is absent, fall back to the normal backoff schedule. Retrying immediately is worse than not retrying.
4xx responses other than 429: do not retry. The server is telling you the request itself is wrong; sending the same request again won't help.
3xx redirects: follow them, but with a maximum hop count. Most HTTP clients do this automatically.
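
The rules above can be condensed into one decision function. This is a sketch under stated assumptions: the function names are invented for the example, the returned wait is only a minimum to combine with whatever backoff schedule is in use, and only the seconds form of Retry-After is parsed (the HTTP-date form is not handled here).

```python
def parse_retry_after(value, default=0.0):
    """Parse a Retry-After value given in seconds; fall back to `default`."""
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default


def retry_decision(status=None, transport_error=False, retry_after=None):
    """Return (should_retry, minimum_wait_seconds) for one failed attempt."""
    if transport_error:
        return (True, 0.0)  # network error or timeout: outcome unknown
    if status is None:
        raise ValueError("status is required when there is no transport error")
    if status == 429:
        return (True, parse_retry_after(retry_after))  # honour the server's wait
    if 500 <= status <= 599:
        return (True, 0.0)  # server-side failure
    return (False, 0.0)     # other 4xx (and anything else): do not retry
```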

Backoff and jitter

Retries should back off exponentially with random jitter. The pattern: first retry after 200ms, then 400ms, then 800ms, then 1.6s, capped at some maximum (typically 30s). Add 0–100% random jitter on each interval. Without jitter, every client that failed at the same moment retries at the same moment, producing synchronized waves of load that delay recovery.

The full retry budget should also be capped: at most N retries per request, and at most M seconds of total retry time. Without these caps, a request can retry forever, and an API outage builds up a queue of pending retries that floods the API the moment it recovers.
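
A minimal sketch of that schedule follows. The function name, the default values for the retry count and total budget, and the choice to cap the interval before adding jitter are assumptions made for the example; the rng parameter exists only so the jitter can be controlled in tests.

```python
import random


def backoff_delays(max_retries=5, base=0.2, cap=30.0, max_total=60.0, rng=random.random):
    """Yield sleep durations for successive retries, within both budgets."""
    total = 0.0
    for attempt in range(max_retries):
        interval = min(cap, base * (2 ** attempt))  # 0.2s, 0.4s, 0.8s, ... capped
        delay = interval * (1.0 + rng())            # add 0-100% random jitter
        if total + delay > max_total:
            return                                  # total retry-time budget spent
        total += delay
        yield delay
```

A caller would typically loop over backoff_delays(), sleep for each yielded value, retry the request, and give up once the generator is exhausted.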

Idempotency

Retrying a non-idempotent request (POST that creates a record, sends a message, charges a card) is dangerous: if the original request reached the server but the response was lost, the retry duplicates the operation. The standard solution is the Idempotency-Key header — see Idempotency Keys for APIs for the full pattern. Generate the key on the client per logical operation; reuse it across retries of the same operation.
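
The key-reuse rule is easy to get wrong, so here is a small sketch of it. The send callable and its (payload, headers) signature are hypothetical stand-ins for whatever client is in use, and backoff between attempts is left out to keep the example short.

```python
import uuid


def call_with_retries(send, payload, max_attempts=3):
    """Send `payload`, reusing one Idempotency-Key across every attempt."""
    key = str(uuid.uuid4())  # generated once per logical operation
    headers = {"Idempotency-Key": key}
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(payload, headers)
        except ConnectionError as exc:  # transport failure: outcome unknown
            last_error = exc
    raise last_error
```

Because the key is created outside the loop, a server that supports the header can recognise the second and third attempts as repeats of the first rather than as new operations.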

Circuit breakers

Retries handle individual failures. Circuit breakers handle systemic failures.

The pattern: count failures over a sliding window. When the failure rate exceeds a threshold (say, 50% over the last 30 seconds), the circuit "opens" — subsequent calls fail immediately without making the network request. After a cooldown period, the circuit goes "half-open" and lets a small number of trial requests through; if they succeed, the circuit closes and normal traffic resumes.

Why this matters: without a circuit breaker, your system keeps making requests to a failing API. Each request waits for its timeout, ties up a thread or a connection, and slows down the rest of your work. The failing API takes down your service not by returning errors, but by exhausting your threads and connections while calls wait to time out. The circuit breaker says "stop calling the broken thing" so the rest of your system stays healthy.

Most language ecosystems have a circuit breaker library: resilience4j for the JVM, opossum for Node.js, pybreaker for Python. Configure one per upstream API.
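
To make the closed, open, and half-open states concrete, here is a hand-rolled sketch; in production one of the libraries above is likely the better choice. The class and parameter names, the min_calls guard, and the single-trial half-open rule are assumptions of this example, and the code is not thread-safe.

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised when a call is rejected without touching the network."""


class CircuitBreaker:
    def __init__(self, failure_rate=0.5, window=30.0, min_calls=10,
                 cooldown=15.0, clock=time.monotonic):
        self.failure_rate = failure_rate
        self.window = window
        self.min_calls = min_calls  # avoid tripping on a tiny sample
        self.cooldown = cooldown
        self.clock = clock
        self.state = "closed"
        self.opened_at = 0.0
        self.results = deque()      # (timestamp, succeeded) pairs

    def call(self, fn):
        now = self.clock()
        if self.state == "open":
            if now - self.opened_at < self.cooldown:
                raise CircuitOpenError("upstream marked unhealthy")
            self.state = "half-open"  # cooldown elapsed: allow a trial call
        try:
            result = fn()
        except Exception:
            self._record(now, False)
            raise
        self._record(now, True)
        return result

    def _record(self, now, ok):
        if self.state == "half-open":
            # The trial call decides: success closes, failure re-opens.
            self.results.clear()
            self.state = "closed" if ok else "open"
            self.opened_at = now
            return
        self.results.append((now, ok))
        while self.results and now - self.results[0][0] > self.window:
            self.results.popleft()  # drop results older than the window
        failures = sum(1 for _, succeeded in self.results if not succeeded)
        if (len(self.results) >= self.min_calls
                and failures / len(self.results) > self.failure_rate):
            self.state = "open"
            self.opened_at = now
```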

Timeouts

Every network call needs an explicit timeout. Default timeouts in HTTP libraries are often "infinite" or measured in minutes — useless for API integrations where 99th-percentile latency should be under a few seconds.

Two timeouts to set:

Connect timeout — how long to wait for the TCP+TLS handshake. Should be short (1-3 seconds) because connecting is fast when it works.
Read timeout — how long to wait for the response after the request is sent. Depends on what the API does; for most synchronous APIs, 5-30 seconds is reasonable. For long-running operations, the API should expose an async pattern (return 202 with a polling URL) rather than holding the connection open.

Timeouts should be slightly less than your total request budget. If your service has a 10-second deadline to respond, calling an upstream API with a 30-second timeout means your own deadline expires while you are still waiting on the upstream, leaving no time to fall back or return a useful error.
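
One way to keep timeouts inside the budget is to derive them from the time remaining before the caller's deadline. The sketch below is illustrative only: the function name, the caps, the half-second reserve for fallback work, and the rule of giving the connect phase at most a quarter of the remaining time are all assumptions.

```python
import time


def upstream_timeouts(deadline, clock=time.monotonic,
                      connect_cap=2.0, read_cap=10.0, reserve=0.5):
    """Return (connect, read) timeouts that fit inside the remaining budget."""
    remaining = deadline - clock() - reserve  # keep some time for a fallback
    if remaining <= 0:
        raise TimeoutError("no budget left for an upstream call")
    connect = min(connect_cap, remaining * 0.25)
    read = min(read_cap, remaining - connect)
    return (connect, read)
```

The returned pair has the same shape as the (connect, read) tuple that the requests library accepts for its timeout argument, though other clients configure the two values differently.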

Observability

The integration is a black box until something goes wrong; the observability layer determines whether you can figure out what.

Logs

Log each outbound API call with: timestamp, method, URL (with sensitive parts redacted), request ID, response status, latency, and the correlation ID that ties this call to the user request that triggered it. Don't log full request or response bodies — they're large, often contain sensitive data, and the value-to-cost ratio is poor. Log enough to reconstruct what happened, not the literal traffic.
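
A sketch of what such a log record might look like is shown below. The set of sensitive query parameter names and the field names in the record are assumptions for illustration; the timestamp is left to the logging framework's formatter rather than added by hand.

```python
import json
import logging
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

SENSITIVE_PARAMS = {"token", "api_key", "signature"}  # assumed names


def redact_url(url):
    """Mask sensitive query values and drop any user:password@ prefix."""
    parts = urlsplit(url)
    query = [
        (k, "REDACTED" if k.lower() in SENSITIVE_PARAMS else v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
    ]
    netloc = parts.netloc.rsplit("@", 1)[-1]
    return urlunsplit((parts.scheme, netloc, parts.path, urlencode(query), ""))


def log_api_call(logger, method, url, status, latency_ms, request_id, correlation_id):
    """Emit one structured log line per outbound call; bodies are not logged."""
    record = {
        "method": method,
        "url": redact_url(url),
        "status": status,
        "latency_ms": latency_ms,
        "request_id": request_id,
        "correlation_id": correlation_id,
    }
    logger.info(json.dumps(record))
    return record
```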

Metrics

Per-endpoint metrics that should always exist: request rate, error rate (by status code class), and latency distribution (p50, p95, p99). When the API gets slow or starts erroring, these surface the problem before users notice.
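
To show what those three numbers are computed from, here is a small in-memory sketch. It is an illustration only: the class name is invented, percentiles use the nearest-rank method, and a real service would normally export histograms to a metrics system instead of keeping raw samples in memory.

```python
import math
from collections import defaultdict


class EndpointMetrics:
    def __init__(self):
        self.latencies = defaultdict(list)  # endpoint -> [latency_ms, ...]
        self.status_classes = defaultdict(lambda: defaultdict(int))

    def record(self, endpoint, status, latency_ms):
        self.latencies[endpoint].append(latency_ms)
        self.status_classes[endpoint][f"{status // 100}xx"] += 1

    def percentile(self, endpoint, pct):
        """Nearest-rank percentile of the recorded latencies, or None."""
        values = sorted(self.latencies[endpoint])
        if not values:
            return None
        rank = max(1, math.ceil(pct * len(values) / 100))
        return values[rank - 1]

    def error_rate(self, endpoint):
        """Share of recorded calls that ended in a 4xx or 5xx status."""
        counts = self.status_classes[endpoint]
        total = sum(counts.values())
        errors = counts.get("4xx", 0) + counts.get("5xx", 0)
        return errors / total if total else 0.0
```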

Traces

Distributed tracing (OpenTelemetry is the standard) lets you follow a single request through your system and into upstream APIs. The cost is per-request overhead and trace storage; the benefit is being able to answer "why did this user's checkout take 8 seconds?" in seconds rather than hours. Worth the cost for any non-trivial integration.

Handling rate limits gracefully

Rate limits are a contract — you have N requests per period, and exceeding it costs you 429s. The integration should respect them rather than discover them through failures.

Two patterns:

Read the rate limit headers. APIs that send X-RateLimit-Remaining on every response let your client self-throttle: when remaining drops below a threshold, slow down voluntarily.
Token bucket on the client side. Cap your own request rate to the documented limit, leaving headroom (e.g., target 80% of the limit). This shifts the throttling from the server returning 429 to your client never sending the request — which is faster and avoids the recovery cost on the server side.
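
A client-side token bucket can be sketched in a few lines. The class and method names are assumptions for the example, and the code is single-threaded; a shared bucket would need a lock.

```python
import time


class TokenBucket:
    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec  # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; return whether the call may proceed."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self):
        """Seconds until the next token is available (0 if one is ready)."""
        self._refill()
        return max(0.0, (1.0 - self.tokens) / self.rate)
```

For a documented limit of 10 requests per second, targeting 80% would mean constructing the bucket with rate_per_sec=8 and checking try_acquire() before each request.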

The algorithms behind rate limits — what your client is up against — are covered in API Rate Limiting Strategies.

Handling pagination

Iterating through a paginated endpoint is one of those things that's trivial in the happy path and full of pitfalls when things go wrong.

The robust pattern:

Use cursor pagination if the API supports it. Offset pagination skips or repeats items when records are added or removed during iteration; cursor pagination doesn't.
Stop conditions are explicit. Iterate until hasNextPage is false (or the equivalent), not until you receive an empty page: some APIs return an empty page mid-iteration, for example when results are filtered after paging.
Cap the maximum number of pages to defend against runaway iteration when something's wrong with the API or your stop condition.
Persist progress for long iterations. If you're paging through a million records, save the cursor periodically so you can resume after a crash without restarting.
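
The four points above can be sketched as one generator. The page shape (items, next_cursor, has_next_page) and the fetch_page and save_cursor callables are hypothetical; real APIs name these fields differently.

```python
def iterate_pages(fetch_page, start_cursor=None, max_pages=1000, save_cursor=None):
    """Yield items across pages with an explicit stop condition and a page cap."""
    cursor = start_cursor
    for _ in range(max_pages):
        # Assumed page shape: {"items": [...], "next_cursor": ..., "has_next_page": bool}
        page = fetch_page(cursor)
        yield from page["items"]        # an empty page is fine; keep going
        if not page["has_next_page"]:   # explicit stop condition
            return
        cursor = page["next_cursor"]
        if save_cursor is not None:
            save_cursor(cursor)         # persist progress so a crash can resume
    raise RuntimeError(f"stopped after {max_pages} pages; stop condition may be broken")
```

To resume after a crash, the caller would pass the last saved cursor back in as start_cursor.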

The pagination patterns themselves — offset, cursor, keyset — are covered in API Pagination Patterns.

Webhooks: the inbound side

If the API you're integrating with delivers events via webhooks, the integration also has an inbound surface. The patterns are different from outbound calls:

Verify signatures on every received webhook. Don't trust the source IP; verify the HMAC.
Acknowledge fast, process async. Return a 2xx response within a few seconds; do the actual processing in a background job. Webhook senders time out fast and retry, so slow handlers cause duplicate deliveries.
Be idempotent. The sender will retry; you'll see the same event twice. Deduplicate by event ID.
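
Those three rules fit in a short receiver sketch. The signing scheme assumed here (a hex-encoded HMAC-SHA256 of the raw request body) is only one common choice and providers differ, so check the sender's documentation. The class name, the enqueue callable, and the in-memory set of seen event IDs are assumptions; a real deployment would use a durable store for deduplication.

```python
import hashlib
import hmac


def verify_signature(secret, body, signature_hex):
    """Constant-time check of a hex HMAC-SHA256 signature over the raw body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("ascii"), signature_hex.encode("utf-8"))


class WebhookReceiver:
    def __init__(self, secret, enqueue):
        self.secret = secret
        self.enqueue = enqueue  # hands the event to a background job
        self.seen_ids = set()   # in-memory only, for the sketch

    def handle(self, body, signature_hex, event_id):
        """Return the HTTP status code to send back to the webhook sender."""
        if not verify_signature(self.secret, body, signature_hex):
            return 401
        if event_id in self.seen_ids:
            return 200          # duplicate delivery: acknowledge, do nothing
        self.seen_ids.add(event_id)
        self.enqueue(body)      # acknowledge now, process later
        return 200
```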

The full design pattern, including signing schemes, retry behaviour, and replay protection, is in Webhook Design and Delivery.

Versioning and migration

The API you integrate with will change. Sometimes the changes are additive and you can ignore them; sometimes they're breaking and you have to migrate. Two preparations make this manageable:

Pin the API version explicitly. Whether the version is in the URL (/v1/...) or in a header, set it explicitly. Don't rely on the API's default; defaults change.
Keep the integration code thin. The thinner the wrapper, the easier the migration. If your domain code depends on the wire format directly, every change requires touching the whole codebase.

For the broader migration patterns, see the migration guide.

Testing integrations

Three layers of test, each catching a different class of bug:

Unit tests with mocked HTTP. Fast, deterministic, run on every commit. Test your wrapper's logic — auth, retries, error handling — against simulated responses.
Contract tests. Test against a recorded or schema-validated version of the API's responses. Catch the case where your code expects a field the API no longer returns.
Integration tests against a sandbox. Hit the real API (in test mode) with real-shaped requests. Slower, less deterministic, but catches the cases where the API documentation doesn't match the API's behaviour. Run on a schedule, not on every commit.
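
As an illustration of the first layer, here is a sketch of a unit test with the HTTP call mocked out. The fetch_with_retry function is a deliberately tiny stand-in for a wrapper's retry logic, and the send callable returning a (status, body) pair is an assumption of the example.

```python
import unittest
from unittest import mock


def fetch_with_retry(send, max_attempts=3):
    """Code under test: retry on 5xx, return anything else immediately."""
    for _ in range(max_attempts):
        status, body = send()
        if status < 500:
            return status, body
    return status, body  # attempts exhausted: surface the last 5xx


class RetryBehaviourTest(unittest.TestCase):
    def test_retries_5xx_then_succeeds(self):
        send = mock.Mock(side_effect=[(503, ""), (503, ""), (200, "ok")])
        self.assertEqual(fetch_with_retry(send), (200, "ok"))
        self.assertEqual(send.call_count, 3)

    def test_does_not_retry_4xx(self):
        send = mock.Mock(return_value=(404, "not found"))
        self.assertEqual(fetch_with_retry(send), (404, "not found"))
        self.assertEqual(send.call_count, 1)
```

Tests like these run without network access, for example via python -m unittest, which is what makes them cheap enough to run on every commit.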

Common mistakes

No timeouts. The most common failure mode of API integrations. A hung upstream takes down the whole service.
Retrying every failure. Retrying 4xx responses doesn't help and wastes capacity. Retry transport errors and 5xx; surface 4xx to the caller.
No backoff. Tight retry loops turn a recoverable upstream blip into a self-inflicted DDoS.
No circuit breaker. The integration keeps trying to call a dead API; every call ties up a thread until it times out.
Logging request bodies. They often contain credentials, PII, or both. If you must log them, hash or redact the sensitive fields.
Hardcoding URLs. The base URL belongs in config, not in code. Different environments need different URLs.
One client for everything. If the integration grows to call several distinct APIs, give each one its own client wrapper. Sharing makes refactoring later harder.
No fallback. When the API is down for an hour, what happens to user requests that depend on it? Returning an error is sometimes the right answer; degrading gracefully is often better. Decide explicitly.

Where to go next

For the underlying patterns this page references, see idempotency keys, rate limiting, pagination, error handling, and webhooks. For migrating an existing integration to a new API version, see the migration guide. For the broader API design context, see API design best practices.