Using Codegen to Build Better GraphQL APIs

How we used Codegen to create more robust GraphQL APIs for our own backend services.

Photo of server rack before and after cleanup, via Blue Wave Communications.

GraphQL has become an essential part of the modern web development stack. At NUMI, we've been using GraphQL for 4 years and learned a lot in the process. GraphQL has many, many advantages over REST:

  • Strong runtime type enforcement, validating both inbound and outbound payloads
  • Query composability via fragments
  • Nested fields, allowing clients to fetch more complex data objects from a single request
  • A de facto documentation layer for your entire business domain

But maintaining a GraphQL API over a long timespan can be tricky. Because GraphQL is still a relatively new server/client paradigm, much of the existing literature focuses on getting started, often in the form of tutorials. The number of teams that have maintained GraphQL servers for a long time, across a lot of shipped product, is still relatively small. NUMI adopted GraphQL early - among the first 13% of web developers who had tried it by the end of 2017. Our 4 years with GraphQL have been periods of intense change because of the many pivots we made before we found a sustainable business. Those pivots were across e-commerce models, so we kept the same backend and a data model with many of the same concerns. Each pivot was a stress test of our GraphQL setup, as we figured out how to adapt our previous schema to the new business concerns the pivot introduced.

The Problem

Our GraphQL API was relatively simple - it was meant to be a "thin" layer allowing clients to query and mutate data from our database. The types, queries, mutations, and inputs in our GraphQL schema were tightly coupled with the tables in our database. While the fetching layer was simple, maintaining a manually defined GraphQL schema proved overwhelming as the tables in our database (and the associations between them) grew.

We found that maintaining and amending a GraphQL schema manually became extremely tedious after a certain level of data complexity. The difficulty grew in proportion to the number of models in our business domain and the number of associations between them. Before we go into the solution we developed for this, let's outline exactly why it was so hard.

As we increased the number of models in our business domain, we increased our GraphQL schema's surface area. We had to maintain types, inputs, queries, and mutations for each model. The schema definitions were fairly similar from model to model: query for one record, query for many records, mutate one record, mutate many records, and types that looked identical to inputs except for the createdAt, updatedAt, and id fields. But we had to implement these manually for each table.

This was a perfect storm of misaligned engineering incentives. As our data model grew with the new features we shipped, there was strong pressure on us to reduce how much of our data model was represented in GraphQL. Our solution was to only "define what was necessary". For example, if we had a User model, and we knew that we'd never need to query for multiple users, we simply didn't define a getUsers query in our schema.

It seemed like a good solution for avoiding an unnecessary maintenance burden. But over time, this created a highly inconsistent API. Sometimes we would later discover a need for REST-ish queries and mutations that weren't defined by our GraphQL schema. Often we'd discover in the middle of building a feature that we were blocked until we implemented a query or mutation (and included it and its parameters in the SDL). Eventually, planning a feature came to include a spike on whether we had already exposed the necessary queries or mutations, even though these features had simple API requirements - just reading from and writing to tables in our database.

Migration Costs

Just as we struggled with the lack of a consistent interface across many models, it quickly proved painful to upgrade our schema. A handy way to measure the complexity of a stack is to estimate how many different passages of code you need to update in order to make a change.

Just to update the GraphQL section of our stack, we had to update type and input definitions. To add an association, we also had to update the resolver for the type. When we were adding a table that would be associated with many models (such as Image, which can belong to Product, ProductVariant, User, Organization, and more), the number of required changes became overwhelming. Remember that these were just the GraphQL-specific changes, not including changes to our ORM declarations, database migrations, and TypeScript definitions.

These migration costs hit a breaking point during a major upgrade in which we had to implement a lightweight CMS. It became so costly to update the various parts of our GraphQL schema and resolvers that we were forced to reassess our approach.

The Solution: ORM to GraphQL Codegen

The solution to this problem was a GraphQL codegen we developed that would convert our ORM definitions to GraphQL.

The idea was basic - a method that accepts an ORM model as a parameter and then auto generates the following GraphQL-related values:

  • the schema definition for its type, input, and queries and mutations (as strings)
  • resolvers (as functions) for:
      • the queries getFoo, getFoos, foosCursor
      • the mutations createOrUpdateFoo, bulkCreateOrUpdateFoo, deleteFoo, bulkDeleteFoos
      • fields that derive from any associated tables, whose associations can be accurately modeled in the ORM
      • any custom fields, which can be configured with custom resolvers and type definitions

These values are merged together into the parameters to create an Apollo server.
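To make the idea concrete, here is a minimal sketch of what such a generator might look like. It is an illustration only, not the code from our code base: `ModelDef`, `generate`, the in-memory `Map` standing in for a database table, the naive pluralization, and the optional `id` argument on `createOrUpdateFoo` are all assumptions, and only a subset of the operations listed above is produced.

```typescript
// Hypothetical, simplified stand-in for an ORM model definition.
type ScalarName = "ID" | "String" | "Int" | "Float" | "Boolean";
interface ModelDef {
  name: string; // singular model name, e.g. "User"
  fields: Record<string, ScalarName>;
}

type Row = Record<string, unknown> & { id: string };
type Resolver = (parent: unknown, args: any) => unknown;

interface Generated {
  typeDefs: string;
  resolvers: { Query: Record<string, Resolver>; Mutation: Record<string, Resolver> };
}

// Server-managed fields: present on the type, absent from the input.
const MANAGED_FIELDS = ["id", "createdAt", "updatedAt"];

function generate(model: ModelDef, table: Map<string, Row>): Generated {
  const { name } = model;
  const plural = `${name}s`; // naive pluralization, enough for a sketch
  const toSdl = (names: string[]) =>
    names.map((f) => `  ${f}: ${model.fields[f]}`).join("\n");
  const allFields = Object.keys(model.fields);
  const inputFields = allFields.filter((f) => MANAGED_FIELDS.indexOf(f) === -1);

  // Schema definitions, produced as strings.
  const typeDefs = [
    `type ${name} {\n${toSdl(allFields)}\n}`,
    `input ${name}Input {\n${toSdl(inputFields)}\n}`,
    `extend type Query {\n  get${name}(id: ID!): ${name}\n  get${plural}: [${name}!]!\n}`,
    `extend type Mutation {\n  createOrUpdate${name}(id: ID, input: ${name}Input!): ${name}!\n  delete${name}(id: ID!): Boolean!\n}`,
  ].join("\n\n");

  // Resolvers, produced as functions keyed by the same generated names.
  let nextId = 1;
  const resolvers: Generated["resolvers"] = {
    Query: {
      [`get${name}`]: (_p: unknown, args: { id: string }) => table.get(args.id) ?? null,
      [`get${plural}`]: () => Array.from(table.values()),
    },
    Mutation: {
      [`createOrUpdate${name}`]: (
        _p: unknown,
        args: { id?: string; input: Record<string, unknown> }
      ) => {
        const id = args.id ?? String(nextId++);
        const existing: Record<string, unknown> = table.get(id) ?? {};
        const row: Row = { ...existing, ...args.input, id };
        table.set(id, row);
        return row;
      },
      [`delete${name}`]: (_p: unknown, args: { id: string }) => table.delete(args.id),
    },
  };

  return { typeDefs, resolvers };
}
```

Calling `generate` once per model and concatenating the resulting `typeDefs` and `resolvers` gives values of the shape a GraphQL server expects. The `extend type` definitions assume that base `Query` and `Mutation` types are declared elsewhere.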

Hopping into a bit more detail, here is the generator function code, reprinted below as it appears in our code base, with a few clarifying comments:

The modules imported from that file are somewhat self-explanatory.

The Upside From GraphQL Codegen

~0 Incidental Coordination Cost for Migrations

Deriving our GraphQL parameters from our ORM layer made it so that most updates to our data model reflected automatically in GraphQL. If the model was properly described as an ORM instance, everything "just worked". This significantly reduced production incidents caused by database migrations, which allowed us to ship features faster.

Consistent Types, Inputs, and Operations across Models

While GraphQL offers many improvements over REST, it makes no native attempt to offer a "Uniform Interface". Uniform interfaces have huge advantages because they become a colloquial language within the project, where verbs are operations (createOrUpdateUser, bulkCreateOrUpdateUser, deleteUser) and adjectives are qualifiers that convey context and expectations (e.g. UserInput, which has no "id" prop, vs UserResult, which has an id, createdAt, and updatedAt, vs UserConnection, which is a set of Users inside of a cursor response). Every developer on our team knew that you could run createOrUpdate on any table in our database. Our frontend development speed improved greatly, as we could go into every situation with a clear understanding of what was possible with the GraphQL API.

To demonstrate how important this consistency is, let’s start by looking at some of the mutations we implemented and defined manually, in signature, input, and sometimes even return type. Below is a select list of them. There was a pattern we began to notice in our mutations after years of maintaining this API. I’ve color coded the mutations so that you can notice the pattern in seconds, rather than 3 years.

First, you can probably tell the mutations are named in roughly the pattern “verbObject”. Second, you can probably see that many verbs are CRUD operations while others seem a little more “custom”, as if they are doing more intricate operations than simply creating, updating, or deleting single records. While some of these mutations were misnamed CRUD operations, others, like addLineItemsFromOrderToCart, performed more complex data changes. Still others, such as addCustomLineItemToOrder, could be modeled as a CRUD operation where the input has certain predictable values (such as type: “CUSTOM”).

We decided to REST-ify all of our CRUD operations. First of all, this created a consistent language that gave a name to every CRUD mutation we supported, across all the tables in our database. Note that just because it was possible to attempt a CRUD mutation, it did not mean that we supported it. This NUMI-internal “dialect” allowed developers to memorize our mutations like a table rather than a list. Here is a selection from that table of CRUD mutations:

Every operation that could be modeled using CRUD verbs was. But there were still cases where we had to run more complex mutations, so we allowed developers to define custom mutations.
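The reason the naming can be memorized "like a table" is that every name is a function of the model name alone. A small sketch of that mapping, following the getFoo / createOrUpdateFoo pattern listed earlier (`crudOperationNames` is a hypothetical helper written for this post, and the pluralization is deliberately naive):

```typescript
interface CrudOperationNames {
  queries: string[];
  mutations: string[];
}

// Lower-cases only the first character: "ProductVariant" -> "productVariant".
function lowerFirst(value: string): string {
  return value.charAt(0).toLowerCase() + value.slice(1);
}

// Derives the uniform operation names for a singular model name such as "User".
function crudOperationNames(model: string): CrudOperationNames {
  return {
    queries: [`get${model}`, `get${model}s`, `${lowerFirst(model)}sCursor`],
    mutations: [
      `createOrUpdate${model}`,
      `bulkCreateOrUpdate${model}`,
      `delete${model}`,
      `bulkDelete${model}s`,
    ],
  };
}
```

Given "ProductVariant", this yields getProductVariant, getProductVariants, productVariantsCursor, createOrUpdateProductVariant, and so on, which is all a developer needs to remember.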

The benefits of the declarative approach were immediately clear in the PR - we eliminated over 80% of our SDL code, leaving only the custom types and operations which ran composite operations that were difficult to describe. This custom SDL was significantly easier to maintain as it covered a much smaller, less interconnected surface area.

The Surprising Problem And Its Solution

The Problem We Didn’t Know We Had

Sometimes we caught implementation bugs that came from repeating boilerplate logical patterns that slipped past code analysis. These are much more subtle than simply wrong boilerplate - these are failures to correctly implement ceremonies.

Some common ones are authorization checks based on deep joins - e.g. checking all Roles where Role.status==='ACTIVE' for a User to see if any of their Role.OrganizationIds match when that User tries to delete a product variant.

Or perhaps reusing the above logic chain as a subcheck when a User tries to add a StockItemEvent for a given StockItem (by comparing to the same array of active roles for the user).

Incorrectly implemented ceremonies are frustratingly easy to miss in code review. A bad night’s sleep or a growling stomach would be more than enough distraction for any one of us to miss one.
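One way to stop re-implementing this ceremony by hand is to centralize the role check in a single helper that every operation reuses. The sketch below assumes simplified shapes: `UserWithRoles`, the "INACTIVE" status value, and the helper names are hypothetical, and the roles are assumed to be already loaded rather than fetched through a join.

```typescript
interface Role {
  status: "ACTIVE" | "INACTIVE";
  OrganizationId: string;
}

interface UserWithRoles {
  id: string;
  Roles: Role[];
}

// Ids of the organizations in which the user currently holds an active role.
function activeOrganizationIds(user: UserWithRoles): string[] {
  return user.Roles
    .filter((role) => role.status === "ACTIVE")
    .map((role) => role.OrganizationId);
}

// True only if one of the user's active roles belongs to the given organization.
function canActOnOrganization(user: UserWithRoles, organizationId: string): boolean {
  return activeOrganizationIds(user).indexOf(organizationId) !== -1;
}

// Shared guard: deleting a product variant and adding a StockItemEvent
// can both call this with the organization that owns the record.
function assertCanActOnOrganization(user: UserWithRoles, organizationId: string): void {
  if (!canActOnOrganization(user, organizationId)) {
    throw new Error("Not authorized");
  }
}
```

With the check in one place, a reviewer only has to confirm that an operation calls the guard, not that it re-derives the active roles correctly.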

Unexpected Hero: Composable, Consistent Business Logic to Eliminate Boilerplate Logical Patterns

Arguably the biggest gain was the one we foresaw the least. Our query and mutation generators forced us to think in a standardized way about how data gets retrieved and modified, regardless of the model. Namely, it forced us to define a "lifecycle". We realized that our operations fit a common pattern.

authorizer (required)

CRUD operation (required)

sideEffect (C/U/D only, optional)

First there was an authorizer check for every mutation - if the user was not authorized to query or change the data, we threw an error. Then we would read from or write to the database. This forced us to think about authorization as a first-class concern for common operations for each model. Considering authorization against input (if create or update operations), existing record (if update or delete), and context (the user, the organization that they’re operating within, etc) helped us define a consistent interface for authorization. We haven’t yet found a great general solution to authorization concerns in GraphQL, but this helped a lot.
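The lifecycle above can be expressed as a small interface plus a runner. This is a sketch of the pattern rather than our actual types: `Context`, `Lifecycle`, `runLifecycle`, and the in-memory `variants` map are names and stand-ins invented for the example.

```typescript
// Hypothetical request context; a real one would carry the user, organization, etc.
interface Context {
  userId: string;
  organizationId: string;
}

interface Lifecycle<Input, Result> {
  // Required: throws if the caller may not perform the operation.
  authorizer: (input: Input, ctx: Context) => void | Promise<void>;
  // Required: the actual read or write.
  operation: (input: Input, ctx: Context) => Promise<Result>;
  // Optional, and only meaningful for create/update/delete.
  sideEffect?: (result: Result, ctx: Context) => Promise<void>;
}

// Runs the steps in a fixed order; the operation never runs if the authorizer throws.
async function runLifecycle<Input, Result>(
  lifecycle: Lifecycle<Input, Result>,
  input: Input,
  ctx: Context
): Promise<Result> {
  await lifecycle.authorizer(input, ctx);
  const result = await lifecycle.operation(input, ctx);
  if (lifecycle.sideEffect) {
    await lifecycle.sideEffect(result, ctx);
  }
  return result;
}

// Example: deleting a product variant owned by an organization.
const variants = new Map<string, { id: string; organizationId: string }>([
  ["v1", { id: "v1", organizationId: "org1" }],
]);
const auditLog: string[] = [];

const deleteProductVariant: Lifecycle<{ id: string }, boolean> = {
  authorizer: (input, ctx) => {
    const variant = variants.get(input.id);
    if (!variant || variant.organizationId !== ctx.organizationId) {
      throw new Error("Not authorized");
    }
  },
  operation: async (input) => variants.delete(input.id),
  sideEffect: async (deleted, ctx) => {
    if (deleted) auditLog.push(`variant deleted by ${ctx.userId}`);
  },
};
```

Because `authorizer` is a required property, an operation without an authorization decision fails to compile instead of slipping through review.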

In the above file, the most important import (no pun) is actually types. The relevant passage of types shows the most surprising benefit of this new code pattern. Can you guess what that benefit is, and how these typedefs demonstrate it?

By making our GraphQL schema as declaratively defined as possible, we were forced to describe the pattern by which we should execute our "RESTful" GraphQL. When we migrated away from manually implemented queries and mutations to these "codegened" ones, we saw several common themes in our operations.

Mistakes in these patterns are too subtle and too boring to be reliably caught in review. Rather than depend on your engineering team to keep all these concerns in their heads whenever they write business logic, why not create a hardened interface that limits the number of ways you can do things? Immediately, we could see our code becoming more stable and more secure. Our GraphQL types derived from our ORM model definitions, meaning that as long as our ORM model definitions were correct, GraphQL would automatically validate inputs and outputs correctly for us. Those static definitions became runtime protections.

Another place where these patterns stabilized our business logic was the introduction of "sideEffects". These were handlers fired after the core mutation that created, updated, or deleted a record. First, they helped enforce the use of transactions to keep database operations atomic - a huge win for ensuring data consistency. Second, they forced us to enumerate everything we do as a side effect of these database changes, from creating files in S3 buckets to synchronously verifying payment via API. By forcing these synchronization events to happen within the transaction, we could ensure that the operation would roll back if any of these sideEffects failed.
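The rollback behavior can be sketched without a database. In the example below, `withTransaction` is a hypothetical in-memory stand-in that restores a `Map` if the work throws; with a real ORM this role would presumably be played by its own transaction mechanism (for instance a managed transaction in Sequelize). All names here are assumptions made for illustration.

```typescript
type Store = Map<string, string>;
interface StoredRecord {
  id: string;
  value: string;
}
type SideEffect = (record: StoredRecord) => Promise<void>;

// In-memory stand-in for a transaction: changes made inside `work` are kept
// only if `work` resolves; if it rejects, the store is restored and the error rethrown.
async function withTransaction<T>(store: Store, work: (tx: Store) => Promise<T>): Promise<T> {
  const snapshot = new Map(store);
  try {
    return await work(store);
  } catch (err) {
    store.clear();
    snapshot.forEach((value, key) => store.set(key, value));
    throw err;
  }
}

// Writes the record, then runs every side effect inside the same transaction.
async function createWithSideEffects(
  store: Store,
  record: StoredRecord,
  sideEffects: SideEffect[]
): Promise<StoredRecord> {
  return withTransaction(store, async (tx) => {
    tx.set(record.id, record.value);
    for (const effect of sideEffects) {
      await effect(record); // a rejection here undoes the write above
    }
    return record;
  });
}
```

The write and its side effects succeed or fail as a unit, which is the property described above. Listing the side effects as an explicit array is also what forces them to be enumerated.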

Finally, there was authorization. We've not yet found a great general pattern for authorization with GraphQL APIs. In our honest experience, the best you can strive towards is being "less wrong" about your authorization logic over time. This GraphQL codegen pattern required authorizer handlers for query, createOrUpdate, and delete operations. These forced us to think about authorization as its own layer of execution logic, one that considered both the user and the operation in its evaluation.

Caveat Emptor

There are a lot of reasons you might not want to take our approach. If you have a simple business domain that you are very confident will not evolve, you may be better served with an off-the-shelf ORM-to-GraphQL tool like Prisma. We had some complex business logic to support that we found was difficult to model in Prisma, which at the time was still called Graphcool. We also had to use some more involved Postgres features, such as PostGIS and index declaration, which Prisma's ORM didn't support at the level we needed.

Not all of our imperative GraphQL definitions or resolvers were eliminated. Of course, we still had many custom mutations and queries. But these weren't ones that could simply be modeled as "get|create|update|delete (one|many) of (record) (where|input)".

Timing influenced much of our situation. We loved Sequelize, and felt its robust community had taken nearly a decade to build interfaces for the many, many edge cases you encounter when trying to do any non-trivial database operations. In 2018, it was clearly the most proven Node.js ORM while the new generation (e.g. TypeORM and Prisma) was still finding its bearings. If we had to redo our stack from scratch, we probably would try out TypeORM, which would further reduce our definition surface area to:

  • TypeScript
  • G̶r̶a̶p̶h̶Q̶L̶
  • O̶R̶M̶

We did learn that GraphQL services become much easier to build with the right patterns. We saw many instances in this process where complexity, relative to the number of tables in our database, dropped from O(n^2) and O(n) to O(log(n)) or O(1). If you notice anywhere in your code where maintenance costs scale in proportion to the number of tables in your database, pay close attention to it. This may be a ripe place to apply a pattern that automates your wiring. That's probably true for your GraphQL service, too, regardless of how you've implemented it.