<onWebFocus />

Knowledge is only real when shared.

GraphQL

April 2022

Why and when to use GraphQL.

In 2015, around the time React reached mainstream adoption, Facebook open-sourced another project called GraphQL, a query language for APIs. This post explains what it is and, more importantly, when it should be used. It ends with an introduction to GraphQL that explains how it works and includes an interactive example of both the GraphQL server and client for a simple todo application.

Data APIs

Dynamic websites built with frameworks like React often retrieve data from the server without loading any additional markup. Traditionally, the backend provides routes that follow REST conventions.

GET https://onwebfocus.com/api/post/2

{
  id: 'graphql',
  name: 'GraphQL',
  views: 436,
  description: 'Why and when to use GraphQL.',
  markup: '...',
  preview: 'https://....',
  image: '...'
}

REST mostly describes what the routes should look like and which HTTP methods to use. Usually, the returned data is encoded as JSON and can therefore be used directly on the frontend.

Limitations of REST Interfaces

Issues with a REST-based approach usually only show up in large applications that transfer lots of data. The problem that eventually occurs is that the Frontend making the request has no control over the shape of the response. Therefore, if an additional property should be sent along with a REST response, the Backend has to change as well. Whether or not the same person works on the Frontend and the Backend, going back and forth between the two codebases takes time and requires coordination.

To avoid creating many routes for the same resource, or having to specify all required properties before implementation, the Backend usually just sends all the properties attached to a resource. In large applications this can lead to huge amounts of unnecessary data being transferred from the API to the Frontend.
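To make the overhead concrete, here is a rough sketch that compares the JSON size of a full resource with the size of only the properties a view needs. The sample values and the pick and jsonSize helpers are made up for illustration and not taken from a real API.

```javascript
// Hypothetical full resource as a REST route might return it (sample values only).
const post = {
  id: 'graphql',
  name: 'GraphQL',
  views: 436,
  description: 'Why and when to use GraphQL.',
  markup: '<p>Sample markup</p>',
  preview: 'https://example.com/preview.png',
  image: 'https://example.com/image.png',
}

// Returns a copy of source that only contains the listed keys.
function pick(source, keys) {
  return Object.fromEntries(
    keys.filter((key) => key in source).map((key) => [key, source[key]])
  )
}

// Size of the JSON encoding in bytes.
function jsonSize(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length
}

// The view only needs two of the seven properties.
const needed = pick(post, ['name', 'markup'])
const fullSize = jsonSize(post)
const neededSize = jsonSize(needed)

console.log(`full: ${fullSize} bytes, needed: ${neededSize} bytes`)
```

With a single small object the difference is negligible, but it adds up with long lists and deeply nested resources.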

How GraphQL Solves This Problem

With GraphQL the Backend provides the data with all properties to the GraphQL server. Similar to a SQL query, the Frontend can now specify what the response should look like and the server will only send what has been requested.

query {
  post(id: 2) {
    name
    markup
  }
}

In the above case we only need the name and markup properties in the Frontend for now. If we later also require the description, which has been available in the Backend from the start, we can retrieve this property from the Frontend without making any changes to the Backend.

query {
  post(id: 2) {
    name
    description
    markup
  }
}

This effect becomes even more pronounced when creating a generic public interface to be used by many frontends. A good example is the GitHub REST API, which sends large amounts of data that a given client often doesn't need. The GitHub GraphQL API, on the other hand, lets users load exactly the data they need, thereby reducing the response size considerably.
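On the wire, a GraphQL query is commonly sent as a POST request with a JSON body containing the query text and optional variables. The following is a minimal sketch of that convention without any client library. The buildGraphQLRequest helper, the example.com endpoint and the ID! type of the id variable are assumptions made for illustration.

```javascript
// Builds the options object for fetch() to send a GraphQL query over HTTP.
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables }),
  }
}

// Same query as above, with the id passed as a variable (type ID! is assumed).
const postQuery = `
  query Post($id: ID!) {
    post(id: $id) {
      name
      markup
    }
  }
`

const request = buildGraphQLRequest(postQuery, { id: 2 })

// Sending it to a hypothetical endpoint would look like this:
// const response = await fetch('https://example.com/graphql', request)
// const { data, errors } = await response.json()
```

Adding description to the selection only changes the query string on the Frontend; the request format and the endpoint stay the same.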

When to Use GraphQL

GraphQL is useful when a lot of coordination between the Frontend and the Backend would otherwise be necessary, or when response sizes would explode because the Backend has to return all available properties. Compared to REST routes, GraphQL queries are much more flexible and, when set up properly, can be used for many different applications all using the same data source.

Keep in mind that GraphQL adds a significant amount of complexity. Before queries can be written, the data has to be described in a schema using types. There is also a learning curve to get through before one knows enough about GraphQL to set up an interface. Without any prior knowledge of GraphQL one is usually better off creating a REST interface. However, when the context matches what's described above and the time to learn GraphQL is available, it's time to get started. Once mastered and integrated, a GraphQL interface can save considerable work in the long run.

Server-side rendering frameworks like Next.js offer a data fetching layer with benefits similar to a GraphQL query. It can decrease the amount of data that has to be transferred while allowing more properties to be used with only minor changes to the Backend in getServerSideProps.

Leveraging GraphQL to Avoid a Backend Altogether

Usually, the Backend API queries a database, then transforms the data and returns it in the response to the request. With GraphQL we already perform such a database query in the Frontend. In theory, this avoids the need for extra logic in the Backend as everything is already there. Of course, it's not as easy as simply removing the Backend, but as described in a previous post about Hasura, it's possible to have only a graphical tool to create tables, add data and define permissions as the Backend. The GraphQL server, including the schema and resolvers, is generated automatically.

Introduction to GraphQL

The following explains how a small GraphQL server works. Afterwards, a client built with React is shown that can be tried out and edited on this page. This is only intended as a teaser; before creating your own GraphQL server it's recommended to take a look at the official documentation or, better, the Apollo Client and Apollo Server documentation, as they are much more detailed. Apollo, which is also used for the following example, is one of the most popular GraphQL frameworks, offering both a server and a client with various Frontend and Backend integrations.

Note that in the following case one would be better off using a REST interface, as the application is small and simple enough to specify the full interface in advance.

The Server

All the bits used hereafter can be found on github.com/tobua/graphql with the whole application deployed to Vercel.

Schema

To validate the data of incoming requests and the data returned by resolvers, GraphQL requires a schema. It works similarly to TypeScript in that it throws an error when something doesn't match the defined types. Unlike TypeScript, validation occurs at runtime with actual data. Below you can see the schema for our server, consisting of a type called Task that can be retrieved using the tasks Query and modified with two different Mutations. The first mutation adds another task to the list returned by the query, while the second toggles the done state of a task, which is defined as a Boolean scalar type.

const typeDefs = gql`
  type Task {
    id: ID!
    name: String!
    done: Boolean!
  }

  type Query {
    tasks: [Task!]!
  }

  type Mutation {
    addTask(name: String!): Task
    setTaskDone(id: ID!): Boolean
  }
`

When using a public API implemented with GraphQL, the schema is especially handy thanks to what's called introspection. When this is enabled, any consumer of the interface can also download the schema and therefore knows exactly what the interface looks like.
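Introspection itself is just another query against the reserved __schema field defined by the GraphQL specification. The sketch below shows a small introspection query and how the object types could be read from the result. The objectTypeNames helper and the hand-written sample response are illustrative assumptions, abbreviated from what a server with the schema above might return.

```javascript
// __schema, types, name and kind are standard introspection fields.
const introspectionQuery = `
  query {
    __schema {
      types {
        name
        kind
      }
    }
  }
`

// Returns the names of all object types, skipping the built-in
// introspection types whose names start with two underscores.
function objectTypeNames(introspectionResult) {
  return introspectionResult.data.__schema.types
    .filter((type) => type.kind === 'OBJECT' && !type.name.startsWith('__'))
    .map((type) => type.name)
}

// Hand-written, abbreviated sample response (not captured from a real server).
const sampleResult = {
  data: {
    __schema: {
      types: [
        { name: 'Task', kind: 'OBJECT' },
        { name: 'ID', kind: 'SCALAR' },
        { name: 'String', kind: 'SCALAR' },
        { name: 'Boolean', kind: 'SCALAR' },
        { name: 'Query', kind: 'OBJECT' },
        { name: 'Mutation', kind: 'OBJECT' },
        { name: '__Schema', kind: 'OBJECT' },
      ],
    },
  },
}

// Logs the names Task, Query and Mutation.
console.log(objectTypeNames(sampleResult))
```

Tools like the playground send a much larger version of this query to provide autocompletion and documentation.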

Resolver

Queries and Mutations listed in the schema can't do anything unless a dedicated resolver exists. Resolvers are simple JavaScript functions that receive the arguments passed in the GraphQL request and are expected to return JavaScript values matching the schema previously defined.

import { list, add, toggle } from '../interface/mysql.js'

const resolvers = {
  Query: {
    tasks: async () => await list(),
  },
  Mutation: {
    addTask: async (_, { name }) => {
      return add(name)
    },
    setTaskDone: async (_, { id }) => {
      return toggle(id)
    },
  },
}
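The list, add and toggle helpers are imported from a MySQL module that isn't shown here. For trying out the resolvers without a database, a minimal sketch with in-memory stand-ins could look like the following. The helper bodies, the string ids and the return value of toggle (true when the task was found) are assumptions, not the implementation used by the deployed server. The stand-ins are synchronous, which works because a resolver may return either a plain value or a promise.

```javascript
// Hypothetical in-memory replacement for '../interface/mysql.js'.
const tasks = []
let nextId = 1

// Returns all tasks, matching the [Task!]! type of the tasks query.
function list() {
  return tasks
}

// Creates a task and returns it, matching the Task return type of addTask.
function add(name) {
  const task = { id: String(nextId++), name, done: false }
  tasks.push(task)
  return task
}

// Flips the done flag and reports whether a task with this id exists.
function toggle(id) {
  const task = tasks.find((current) => current.id === id)
  if (!task) return false
  task.done = !task.done
  return true
}

const resolvers = {
  Query: {
    tasks: () => list(),
  },
  Mutation: {
    // The first resolver argument (parent) is unused for root fields.
    addTask: (_, { name }) => add(name),
    setTaskDone: (_, { id }) => toggle(id),
  },
}

// Resolvers are plain functions and can be called directly for a quick check.
const task = resolvers.Mutation.addTask(null, { name: 'Write post' })
resolvers.Mutation.setTaskDone(null, { id: task.id })
console.log(resolvers.Query.tasks())
```

Because resolvers are ordinary functions, they can be unit tested this way without starting a GraphQL server.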

Resolvers often connect more or less directly to a database. Apollo Server has a feature called Data Sources that is intended to automate this connection to a data source. Unfortunately, few developers use this feature and it's quite limited. It's often advertised how easy it is to connect GraphQL to a legacy REST interface. As with every architectural choice, this should be questioned before switching to GraphQL. Other options are connecting the Frontend directly to the REST interface or having the GraphQL server connect directly to the database that the REST interface connects to.

For GraphQL to increase performance by letting the user specify exactly which properties should be returned, resolver functions also have to resolve quickly. This is the case when the database runs on the same server or when the database request is filtered so that only the required fields are transferred. Usually, the resolver function returns all available data specified in the schema and the GraphQL server later removes the unnecessary properties before sending the response. When memory is scarce or the database is located far from the server (as in a serverless environment), the narrowing of the returned values can also be applied in the resolver itself. Although somewhat cumbersome to extract, the requested fields can be found in the info argument passed to the resolver.

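tasks: async (parent, args, context, info) => {
  // Collect the names of the fields requested on each task, skipping fragments.
  const properties = info.fieldNodes[0].selectionSet.selections
    .filter((selection) => selection.kind === 'Field')
    .map((selection) => selection.name.value)
  // properties === ['name', 'done']
  // sql stands for the database client in use; its query is awaited here.
  const { result } = await sql.query(`SELECT ${properties.join(',')} FROM tasks`)
  return result
}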

These properties can then be handed over to the database request to allow for further optimization. The GraphQL server will fail if the resolver doesn't supply a requested property that is marked as non-nullable, but properties that exist on the schema and are neither requested nor returned will not cause any issues.
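Before field names from a request end up in a SQL string, it can help to match them against a fixed list of columns. Apollo Client, for instance, adds the __typename meta field to queries by default, and that field doesn't exist as a column. The following sketch assumes a tasks table whose columns mirror the Task type; the taskColumns list, both helper functions and the hand-built info object are hypothetical.

```javascript
// Columns that may be selected, mirroring the Task type of the schema.
const taskColumns = ['id', 'name', 'done']

// Reads the requested top-level fields from a resolver's info argument
// and keeps only those that are known columns.
function requestedColumns(info, allowed) {
  const columns = info.fieldNodes[0].selectionSet.selections
    .filter((selection) => selection.kind === 'Field')
    .map((selection) => selection.name.value)
    .filter((name) => allowed.includes(name))
  // Fall back to all columns, for example when only fragments were used.
  return columns.length > 0 ? columns : allowed
}

// Builds the SQL statement for the tasks resolver.
function buildTaskQuery(info) {
  return `SELECT ${requestedColumns(info, taskColumns).join(', ')} FROM tasks`
}

// Hand-built stand-in for the info of: query { tasks { name done __typename } }
const info = {
  fieldNodes: [
    {
      selectionSet: {
        selections: [
          { kind: 'Field', name: { value: 'name' } },
          { kind: 'Field', name: { value: 'done' } },
          { kind: 'Field', name: { value: '__typename' } },
        ],
      },
    },
  ],
}

console.log(buildTaskQuery(info)) // SELECT name, done FROM tasks
```

The sketch only looks at direct field selections. Fields requested through fragments are not resolved, which is why it falls back to selecting every column when nothing matches.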

Initialization

import { ApolloServer, gql } from 'apollo-server-express'
import { ApolloServerPluginLandingPageGraphQLPlayground } from 'apollo-server-core'

const server = new ApolloServer({
  typeDefs,
  resolvers,
  introspection: true, // Users can download the schema as JSON.
  plugins: [
    // Opening the server in the browser will show a GUI to try out requests and see results.
    ApolloServerPluginLandingPageGraphQLPlayground(),
  ],
})

Once the schema and the resolvers are defined, it's straightforward to start the server, as the Apollo framework takes care of the rest. During development the playground plugin can be especially helpful, as it allows writing and running requests with full schema support and without the need to create a client in advance.

The Client

The server described above supplies the following Apollo Client React application with data.

import { ApolloClient, ApolloProvider, InMemoryCache } from '@apollo/client'
import { Add, Tasks } from './component.js'

const client = new ApolloClient({
  uri: 'https://graphql-tobua.vercel.app/graphql',
  cache: new InMemoryCache()
})

export default () => (
  <ApolloProvider client={client}>
    <Add />
    <Tasks />
  </ApolloProvider>
)

Just as on the server, the Apollo client is easy to initialize. It has to be passed to the React application at the root using the context API. This allows the useQuery and useMutation hooks, also provided by @apollo/client, to access the client. As GraphQL's goal is to squeeze out every last bit of performance, the cache layer can also be configured in order to avoid making certain requests. In practice it's usually just an instance of InMemoryCache without any additional configuration.

The queries inside the query.js file define exactly what the server should return. This allows for dynamically adding, combining or removing properties that exist on the schema without making any changes to the server.

import { gql } from '@apollo/client'

export const tasks = gql`query {
  tasks {
    id
    name
    done
  }
}`

The React hooks inside component.js allow us to get the current state and results of each request. In the case of mutations, which are usually triggered after an event like onClick, additional variables can be supplied, similar to the body of a POST request in REST.

The refetchQueries option is used to tell the client which queries have to be refreshed after a mutation. This issue arises due to the tight coupling between the individual React components and the server requests. When using a public GraphQL API like the GitHub GraphQL API, this approach is fine. However, when deciding to put in the effort to build your own GraphQL interface on the server, it's recommended to also put in a bit of extra effort and introduce a separate state layer for the React application, managed for example with MobX. This simplifies the component rendering code, automates rerendering of connected data and avoids unnecessary refetches of data from the server.