GraphQL — A Working Reference

What GraphQL is for

The original problem GraphQL solves: a rich client (a single-page app, a mobile screen) needs data from a dozen different resources to render one view. With REST that's a dozen round trips, or one bespoke "view endpoint" that the backend team has to maintain alongside the granular ones. GraphQL lets the client describe exactly the shape of data it needs in one query, and the server returns precisely that — no more, no less.

That capability is real, and for the right shape of application it's transformative. For others — server-to-server APIs, public APIs whose consumers are mostly other servers, simple CRUD interfaces — GraphQL adds operational complexity without solving a problem you have. Knowing when not to reach for it is half the value.

The schema is the contract

A GraphQL API is defined by its schema, written in the Schema Definition Language (SDL). Three root operation types serve as its entry points:

Query — read operations. Side-effect-free.
Mutation — operations that change state.
Subscription — server-pushed events over a persistent connection (typically WebSocket).

A small example schema:

type Order {
  id: ID!
  status: OrderStatus!
  customer: Customer!
  items: [OrderItem!]!
  total: Money!
  createdAt: DateTime!
}

type Query {
  order(id: ID!): Order
  orders(filter: OrderFilter, first: Int, after: String): OrderConnection!
}

type Mutation {
  cancelOrder(id: ID!, reason: String): CancelOrderPayload!
}

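The exclamation mark means non-null. [OrderItem!]! means a non-null list of non-null items. Get the nullability right at design time, because the safe direction depends on where the type appears. For output fields, switching from non-null to nullable is a breaking change (clients written against the old schema don't expect null), while nullable to non-null is usually safe. For arguments and input fields it's the reverse: making an optional input required breaks existing callers.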

Schema design — the decisions that matter

Naming

The conventions:

Types in PascalCase: Order, OrderItem.
Fields in camelCase: createdAt, not created_at.
Enums in SCREAMING_SNAKE_CASE: OPEN, FULFILLED, CANCELLED.
Mutations are verb phrases: cancelOrder, createPayment. Not orderCancel.

Mutation shape

Two patterns. The simple one: a mutation takes flat arguments and returns the affected object. The robust one: every mutation takes a single input argument and returns a payload type. The robust pattern wins for non-trivial APIs:

input CancelOrderInput {
  orderId: ID!
  reason: String
}

type CancelOrderPayload {
  order: Order
  errors: [UserError!]
}

type Mutation {
  cancelOrder(input: CancelOrderInput!): CancelOrderPayload!
}

Why: input types are forward-compatible — you can add optional fields without breaking existing clients. Payload types let you return both the affected object and structured errors, which is critical because GraphQL's top-level errors array is ill-suited to expected business errors.

Errors

GraphQL has two error channels. The top-level errors array carries protocol-level failures (parsing, authorization, server crashes). The payload's own error fields carry expected business errors (insufficient balance, validation failed, resource locked). Mixing them is the most common GraphQL design mistake. The rule: anything a UI needs to display should not be in the top-level errors array. That array is for "the server messed up", not for "your input is wrong."

The convention for in-payload errors:

type UserError {
  message: String!
  code: ErrorCode!
  field: [String!]   # path to the offending field, like ["input", "amount"]
}
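A minimal Python sketch of the two channels, assuming a hypothetical in-memory ORDERS store and made-up error codes (NOT_FOUND, INVALID_STATE) that are not part of the schema above. Expected business failures come back inside the payload; anything unexpected is simply allowed to raise, on the assumption that the GraphQL server turns uncaught exceptions into top-level errors.

```python
from dataclasses import dataclass, field as dataclass_field
from typing import Optional


@dataclass
class UserError:
    message: str
    code: str                          # hypothetical ErrorCode value
    field: Optional[list] = None       # path to the offending input field


@dataclass
class CancelOrderPayload:
    order: Optional[dict] = None
    errors: list = dataclass_field(default_factory=list)


# Hypothetical in-memory store standing in for a real order service.
ORDERS = {
    "o1": {"id": "o1", "status": "OPEN"},
    "o2": {"id": "o2", "status": "FULFILLED"},
}


def cancel_order(input_: dict) -> CancelOrderPayload:
    """Resolver sketch: business errors are returned as data, never raised."""
    order = ORDERS.get(input_["orderId"])
    if order is None:
        return CancelOrderPayload(
            errors=[UserError("Order not found.", "NOT_FOUND", ["input", "orderId"])]
        )
    if order["status"] != "OPEN":
        return CancelOrderPayload(
            order=order,
            errors=[UserError("Only open orders can be cancelled.",
                              "INVALID_STATE", ["input", "orderId"])],
        )
    order["status"] = "CANCELLED"
    return CancelOrderPayload(order=order)
```

A client can then branch on payload.errors to render a message next to the right form field, and treat the top-level errors array as a generic "something went wrong" signal.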

For deeper coverage of error envelope design — including how this maps to the REST world — see API Error Handling Conventions.

Pagination

The de-facto standard is the Relay Cursor Connection spec. Every paginated field returns a Connection type with edges, pageInfo, and a cursor on each edge:

type OrderConnection {
  edges: [OrderEdge!]!
  pageInfo: PageInfo!
  totalCount: Int   # optional, expensive on large sets
}

type OrderEdge {
  node: Order!
  cursor: String!
}

type PageInfo {
  hasNextPage: Boolean!
  hasPreviousPage: Boolean!
  startCursor: String
  endCursor: String
}

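The shape feels heavy for simple cases but pays off when clients need to navigate forward and backward, and it leaves room for adding pagination metadata later without breaking. Backward navigation uses the spec's last and before arguments, which the orders field shown earlier would need to add alongside first and after. For the underlying pagination algorithms and their trade-offs, see API Pagination Patterns.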
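To make the connection shape concrete, here is a small Python sketch that pages forward through an in-memory list. The helper names (encode_cursor, decode_cursor, paginate) are hypothetical, and the cursor simply wraps a list offset for illustration; a real server would more likely encode a stable sort key so cursors survive inserts and deletes.

```python
import base64
from typing import Optional


def encode_cursor(offset: int) -> str:
    """Opaque cursor: clients should treat it as an uninterpreted string."""
    return base64.b64encode(f"offset:{offset}".encode()).decode()


def decode_cursor(cursor: str) -> int:
    return int(base64.b64decode(cursor).decode().split(":", 1)[1])


def paginate(rows: list, first: int, after: Optional[str] = None) -> dict:
    """Build a Connection-shaped dict for forward pagination (first/after)."""
    start = decode_cursor(after) + 1 if after is not None else 0
    window = rows[start:start + first]
    edges = [
        {"node": row, "cursor": encode_cursor(start + i)}
        for i, row in enumerate(window)
    ]
    return {
        "edges": edges,
        "pageInfo": {
            "hasNextPage": start + len(window) < len(rows),
            "hasPreviousPage": start > 0,
            "startCursor": edges[0]["cursor"] if edges else None,
            "endCursor": edges[-1]["cursor"] if edges else None,
        },
    }
```

To fetch the next page, the client passes the previous page's endCursor back as after. Because the cursor is opaque, the server can later change what it encodes without touching the schema.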

Versioning — and why GraphQL avoids it

The conventional GraphQL position is: never version. Add fields freely; never remove or rename them. When a field becomes obsolete, mark it deprecated:

type Order {
  total: Money! @deprecated(reason: "Use totalAmount instead.")
  totalAmount: Money!
}

This works as long as you're disciplined about not breaking existing fields. The cost is schema bloat over time: under a strict no-removal policy, your Order type accumulates deprecated fields indefinitely. Some teams accept this; others run a "graveyard cleanup" once a year to remove deprecated fields older than some threshold, treating it as a known breaking change with advance notice. Either approach is defensible.

The N+1 problem

The query { orders { items { product { name } } } } looks innocent. Naively implemented, it fetches N orders, then for each order fetches M items, then for each item fetches the product. That's 1 + N + (N×M) database queries for one GraphQL request. It's the single most common GraphQL performance disaster.

The standard solution is DataLoader, a per-request batching layer that collects all the IDs requested for a given type during one tick of the event loop, then issues a single batched query. With DataLoader, the same query becomes 3 database queries (one each for orders, items, and products) regardless of how many orders you have. Every production GraphQL server needs DataLoader (or its language equivalent) on every entity-fetching resolver. This is not optional.
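The batching idea can be sketched in a few lines of Python with asyncio. This is not the DataLoader library's API: BatchLoader, fetch_products, and demo are hypothetical names, and the sketch leaves out the per-request result cache that real implementations usually add. It only shows the core trick of queuing keys during one event-loop tick and resolving them with a single batched call.

```python
import asyncio
from typing import Awaitable, Callable


class BatchLoader:
    """Collects keys requested in one event-loop tick, then fetches them together."""

    def __init__(self, batch_fn: Callable[[list], Awaitable[list]]):
        self._batch_fn = batch_fn
        self._pending: dict = {}   # key -> Future; also de-duplicates keys

    def load(self, key) -> asyncio.Future:
        if key in self._pending:
            return self._pending[key]
        loop = asyncio.get_running_loop()
        if not self._pending:
            # First key of this tick: schedule one flush for the whole batch.
            loop.call_soon(self._start_dispatch)
        future = loop.create_future()
        self._pending[key] = future
        return future

    def _start_dispatch(self) -> None:
        # Keep a reference so the task is not garbage-collected mid-flight.
        self._task = asyncio.get_running_loop().create_task(self._dispatch())

    async def _dispatch(self) -> None:
        batch, self._pending = self._pending, {}
        keys = list(batch)
        try:
            # batch_fn must return one value per key, in the same order.
            values = await self._batch_fn(keys)
        except Exception as exc:
            for future in batch.values():
                future.set_exception(exc)
            return
        for key, value in zip(keys, values):
            batch[key].set_result(value)


BATCH_CALLS: list = []   # records each batched fetch, for illustration only


async def fetch_products(ids: list) -> list:
    """Hypothetical stand-in for 'SELECT ... WHERE id IN (...)'."""
    BATCH_CALLS.append(list(ids))
    return [{"id": i, "name": f"product-{i}"} for i in ids]


async def demo() -> list:
    loader = BatchLoader(fetch_products)   # one loader per request

    async def resolve_product_name(product_id: int) -> str:
        product = await loader.load(product_id)
        return product["name"]

    # Four resolver calls, one of them a duplicate, run concurrently.
    return await asyncio.gather(*(resolve_product_name(i) for i in [1, 2, 1, 3]))
```

Running asyncio.run(demo()) resolves all four names while fetch_products is invoked once, with the de-duplicated keys. Creating the loader per request matters: a loader shared across requests would leak one user's data into another's batch.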

Query depth and complexity

GraphQL lets clients ask for arbitrarily deep nested queries: { user { friends { friends { friends { ... } } } } }. Without limits, a single malicious or careless query can cost more compute than your API can sustain. Two complementary mitigations:

Depth limiting. Reject queries deeper than some threshold (5–10 levels is typical). Crude but effective.
Complexity scoring. Assign a cost to each field; sum the cost across the query (multiplying by list sizes); reject if it exceeds a per-request budget. More flexible than depth but more work to maintain. Necessary for public-facing GraphQL APIs.
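Both checks can be sketched in Python over a simplified query representation. The assumptions are substantial: the query is modelled as nested dicts (field name to sub-selection) rather than a parsed GraphQL AST, and the FIELD_COST and LIST_SIZE tables are invented numbers. A real server would walk the parsed document and read list sizes from arguments such as first.

```python
# The query { orders { items { product { name } } } } as nested dicts;
# an empty dict marks a leaf field.
QUERY = {"orders": {"items": {"product": {"name": {}}}}}

# Hypothetical per-field settings; unlisted fields cost 1 and are not lists.
FIELD_COST = {"orders": 2}
LIST_SIZE = {"orders": 50, "items": 20}   # assumed maximum page sizes


def query_depth(selection: dict) -> int:
    """Number of nesting levels in the selection set."""
    if not selection:
        return 0
    return 1 + max(query_depth(child) for child in selection.values())


def query_cost(selection: dict) -> int:
    """Sum field costs, multiplying child costs by the parent's list size."""
    total = 0
    for name, child in selection.items():
        own_cost = FIELD_COST.get(name, 1)
        multiplier = LIST_SIZE.get(name, 1)
        total += own_cost + multiplier * query_cost(child)
    return total


def check_query(selection: dict, max_depth: int = 10, max_cost: int = 1000) -> None:
    """Raise ValueError before execution if the query exceeds either limit."""
    depth = query_depth(selection)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit {max_depth}")
    cost = query_cost(selection)
    if cost > max_cost:
        raise ValueError(f"query cost {cost} exceeds budget {max_cost}")
```

With these numbers the example query is only 4 levels deep, so a depth limit of 10 lets it through, yet its cost is 2052 because the two list fields multiply. That is the case complexity scoring catches and depth limiting misses.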

Persisted queries, where clients send a hash of a known query rather than the query itself, eliminate this whole class of problem, provided the server rejects any hash it has not pre-approved. (Setups that let clients register new queries at runtime save bandwidth but do not restrict what runs.) They're the right answer for high-volume public APIs but require build-time tooling on the client side.
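The server side of an allowlist can be very small. The sketch below assumes SHA-256 as the hash and a hypothetical PersistedQueryStore class loaded with the queries extracted at client build time; the exact hash function and wire format vary by toolchain.

```python
import hashlib


def query_hash(query: str) -> str:
    """Hash used as the query's identifier (SHA-256 is an assumption here)."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


class PersistedQueryStore:
    """Maps pre-approved query hashes to query text; everything else is refused."""

    def __init__(self, approved_queries: list):
        self._by_hash = {query_hash(q): q for q in approved_queries}

    def resolve(self, hash_: str) -> str:
        try:
            return self._by_hash[hash_]
        except KeyError:
            # Unknown hash: the server never sees, parses, or runs the query.
            raise PermissionError("unknown persisted query") from None


# Hypothetical build-time output: the queries the client app actually uses.
APPROVED = ["query Order($id: ID!) { order(id: $id) { id status } }"]
STORE = PersistedQueryStore(APPROVED)
```

Because resolve only ever returns text from the approved list, depth and complexity can be reviewed once per query at build time instead of being policed on every request.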

Authentication and authorization

Authentication for GraphQL is no different from REST: the request carries a token in the Authorization header, the server resolves it to a user, and the user is attached to the request context that resolvers can read. See the authentication reference.

Authorization is harder. In REST, you can check permissions per endpoint. In GraphQL, a single query can touch dozens of resolvers, and each one needs its own check. The pragmatic approaches:

Resolver-level checks. Each resolver explicitly checks whether the user can see this field for this object. Verbose but transparent. Best for APIs with simple permission rules.
Schema directives. Annotate the schema with @auth(requires: ADMIN) directives; a runtime middleware enforces them. Cleaner for large schemas with role-based rules.
Service-layer checks. Push authorization into the underlying domain services and let resolvers be thin pass-throughs. Best when GraphQL sits in front of an existing application with its own authorization model.

Whichever approach you pick, never rely on the client not asking for forbidden data. Field-level authorization has to be enforced server-side.
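As an illustration of the resolver-level approach, here is a Python sketch using a decorator. All names are hypothetical (User, Forbidden, requires_role, and the internalNotes field), and the resolver signature of (obj, context) is a simplification of what GraphQL libraries actually pass.

```python
from dataclasses import dataclass
from functools import wraps


@dataclass(frozen=True)
class User:
    id: str
    roles: frozenset


class Forbidden(Exception):
    """Raised when the current user may not read a field."""


def requires_role(role: str):
    """Wrap a resolver so it runs only for users holding the given role."""
    def decorator(resolver):
        @wraps(resolver)
        def wrapper(obj, context: dict, **kwargs):
            user = context.get("user")   # attached during authentication
            if user is None or role not in user.roles:
                raise Forbidden(f"{resolver.__name__} requires role {role}")
            return resolver(obj, context, **kwargs)
        return wrapper
    return decorator


@requires_role("ADMIN")
def resolve_internal_notes(order: dict, context: dict) -> str:
    # Hypothetical admin-only field on Order.
    return order["internalNotes"]
```

The check runs on the server every time the field is resolved, whatever the client asked for and however deeply the field is nested in the query.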

Subscriptions

Subscriptions push events to clients over a persistent connection — typically WebSocket. They're useful for collaborative apps, live dashboards, chat. They're more expensive to operate than queries because each subscriber holds an open connection and the server has to fan events out.

Two pieces of advice. First: subscriptions are hard to scale; if you can solve the problem with polling at a sensible interval, do that. Second: subscriptions belong on a separate transport from queries and mutations — the operational characteristics are different (long-lived connections, fan-out load, different failure modes), and conflating them complicates everything.

When GraphQL is the wrong choice

Server-to-server APIs. The over-fetching problem GraphQL solves is a client-side problem. Servers don't care about a few extra bytes; they care about predictable performance and easy caching, both of which REST gives them more cheaply.
Public APIs with thousands of consumers. Every consumer can issue arbitrary queries against your schema, which makes capacity planning and abuse prevention much harder. Persisted queries help, but force the consumers into your build pipeline.
APIs that mostly serve files or binary data. GraphQL is built around JSON. File uploads work but feel grafted on, and serving images or video through GraphQL means squeezing binary data into a text format, which is a poor fit.
Teams without ops experience. N+1, complexity attacks, schema evolution, monitoring per-field latency — these are real problems that need someone owning them. If you don't have that capacity, REST gives you more for less.

Common mistakes

Mirroring the database in the schema. Your domain model is not your database. Design the schema for the clients, not for the ORM.
Returning the same payload type from every mutation. Each mutation is a distinct operation; its payload should reflect that. Generic MutationPayload types defeat the type system.
Using GraphQL errors for business errors. Validation failures, business rule violations, "card declined" — all belong in the payload, not in the protocol error array.
No query complexity limits. Without them, the first hostile actor (or careless client) takes the API down.
One giant resolver per type. Resolvers should be small and composable. Complex business logic belongs in services that resolvers call into.
Relying on HTTP-layer caching. GraphQL requests are usually POSTed to a single endpoint, so HTTP caches see only opaque blobs. Cache at the resolver level (per entity ID), or send persisted queries over GET so that each query gets a cacheable URL.

Where to go next

For when REST is the better default, see REST API Design. For the persistent-connection transport that subscriptions typically run on, see WebSockets. For the security model that applies to both, see API security.