Production Readiness Checklist
Use this checklist to identify potential gaps before your graph handles production traffic.
We recommend that you read through this checklist and identify critical features for your team before your supergraph begins handling production traffic.
GraphOS Studio
- Ensure that you've created multiple variants to represent the different environments where your supergraph runs (for example, production, staging, and development).
- Protect your production variant to avoid accidental changes while working in Studio.
Apollo Router
- Ensure that you've correctly configured managed federation and GraphOS schema usage reporting.
- For security, turn off introspection for all production routers. The router disables introspection by default, but make sure you are not running it in --dev mode.
- You can continue to view and fetch your GraphQL schemas from GraphOS and run operations from GraphOS Studio Explorer.
- Configure the router traffic shaping features:
- Set request-level and subgraph-level timeouts and rate limits
- Deduplicate subgraph requests
- Communicate with subgraphs using automatic persisted queries (APQ)
- Enable operation limits to block large and malicious requests
- Configure additional tracing, metrics, and logging through OpenTelemetry or Prometheus
- Enable the operation and query plan distributed cache
- Optionally, enable any other features deemed critical for your deployment of Apollo Router
Subgraphs/servers
- For security, turn off introspection for all production GraphQL subgraphs.
- You can continue to view and fetch your GraphQL schemas from GraphOS and run operations from GraphOS Studio Explorer.
- Ensure that you've integrated rover subgraph check and rover subgraph publish into your CI/CD pipeline.
- If your subgraph servers are listed as compatible with FEDERATED TRACING, ensure that you've enabled federated traces, and that you can view operation metrics as expected in Apollo Studio.
- Enable fractional trace sampling via fieldLevelInstrumentation to reduce performance hits due to tracing.
- Ensure that you've load-tested your graph.
- Test loads should be representative of your current traffic (both in terms of volume and in terms of the actual operations you execute in the test).
- To investigate performance issues, use Apollo Studio to identify which operations are performing slowly.
- Look at resolver execution times to identify slow areas of execution.
- Whenever possible, avoid making multiple calls to data sources within a single resolver.
- Learn how query plans execute so that you can diagnose slow operations and optimize your supergraph to avoid them.
- Consider adding caching layers.
- Apollo Server supports automatic persisted queries (APQ) out of the box.
- If using Apollo Server, ensure that you use a distributed caching system for APQ in production to avoid cache inconsistency across server instances.
- Optionally use the @cacheControl directive to enable your CDN to cache APQ GET requests using the Cache-Control header.
- Optionally add full response caching to improve performance.
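To make the APQ items above more concrete, here is a minimal sketch of what a client sends under the APQ protocol: the SHA-256 hash of the query string, carried in an extensions.persistedQuery entry, so that a GET request can omit the full query text and be cached by a CDN. The function names and the example endpoint are illustrative assumptions, not part of any Apollo library.

```typescript
import { createHash } from "node:crypto";

// Shape of the APQ entry sent in the `extensions` field of a GraphQL request.
interface PersistedQueryExtension {
  persistedQuery: { version: 1; sha256Hash: string };
}

// Hypothetical helper: hash the exact query string with SHA-256 (hex-encoded).
function buildApqExtension(query: string): PersistedQueryExtension {
  const sha256Hash = createHash("sha256").update(query, "utf8").digest("hex");
  return { persistedQuery: { version: 1, sha256Hash } };
}

// Hypothetical helper: build a GET URL that carries only the hash and variables.
// If the server has not seen the hash yet, it reports that the persisted query
// was not found, and the client retries with the full query text included.
function buildApqGetUrl(
  endpoint: string,
  query: string,
  variables: Record<string, unknown> = {},
): string {
  const params = new URLSearchParams({
    extensions: JSON.stringify(buildApqExtension(query)),
    variables: JSON.stringify(variables),
  });
  return `${endpoint}?${params.toString()}`;
}
```

Because the hash is derived only from the query text, every server instance computes the same key for the same operation, which is why a shared (distributed) cache keeps APQ lookups consistent across instances.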
Clients
- Ensure that your clients identify themselves by name and version.
- If you're using an Apollo Client library, you can add a client name and version to the constructor.
- For example, the React client uses the name and version attributes in the constructor options.
- If you're using a third-party GraphQL client, set the apollographql-client-name and apollographql-client-version HTTP headers for each request to identify your client.
- For an example of enforcing client identification in your gateway, see this tech note for Client ID enforcement.
- Consider adding caching layers.
- Enable Persisted Queries and/or Automatic Persisted Queries (APQ) support for request-size savings.
- Enable and configure the client-side normalized cache.
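For third-party clients, the identification headers mentioned in this section can be attached with a small wrapper. The sketch below is one possible approach, not an official API: the helper names, the ClientIdentity type, and the example values are assumptions. Only the two header names come from the checklist itself.

```typescript
// Hypothetical description of the calling application.
interface ClientIdentity {
  name: string;
  version: string;
}

// Minimal request shape, compatible with the options accepted by fetch().
interface GraphQLRequestInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Merge the client identification headers into any existing headers.
function withClientIdentity(
  identity: ClientIdentity,
  headers: Record<string, string> = {},
): Record<string, string> {
  return {
    ...headers,
    "apollographql-client-name": identity.name,
    "apollographql-client-version": identity.version,
  };
}

// Build the options for a GraphQL POST request that identifies its client.
function buildGraphQLRequest(
  identity: ClientIdentity,
  query: string,
  variables: Record<string, unknown> = {},
): GraphQLRequestInit {
  return {
    method: "POST",
    headers: withClientIdentity(identity, { "content-type": "application/json" }),
    body: JSON.stringify({ query, variables }),
  };
}
```

The result can be passed as the second argument to fetch(), for example fetch(url, buildGraphQLRequest({ name: "web-checkout", version: "1.4.2" }, "{ __typename }")), where the URL, name, and version are placeholders for your own values.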