10 Insanely Annoying REST API Problems (And How to Solve Them with GraphQL)

I've spent the last three days painstakingly versioning a broken REST API for a client and I can't help but wonder, what problems does GraphQL solve?

  1. Introduces proper Data Fetching
  2. Solves Overfetching / Underfetching
  3. Reduces waste of Network Requests
  4. Brings Flexibility to the Static Nature of APIs
  5. Makes Resource Deprecations Easy
  6. Makes Evolution and Versioning Easy
  7. Introduces Schema Stitching
  8. Makes Subscriptions Easy
  9. Boosts Performance
  10. Makes Querying easy with a Language

To solve each of these REST API problems with GraphQL, we need to understand them first. Consider the following example.

Suppose, in a blog application, you wanted to find an author, list all the posts by that author and list the followers of that author.

How would you go about this?

The RESTful Approach

Although REST famously utilizes a uniform protocol (usually HTTP) for its interface, it is in fact protocol agnostic.

It does not define its own spec, is not a query language and does not care what protocol you use.

REST aims to completely decouple clients from APIs (read: hypermedia controls).

It is stateless, static and accommodates evolution grudgingly.

In the example above, a standard RESTful approach would expose the data with three different API endpoints as follows.

        
/authors/:id
/authors/:id/posts
/authors/:id/followers
        
      

Let's go over each of them.

Endpoint#1: Fetching the Author

Retrieving an author using the :id param in REST would typically yield all the author data as follows.

        
{
  "author": {
    "_id": "507f1f77bcf86cd799439011",
    "name": "Solo",
    "address": "{ ... }",
    "birthday": "August 29, 1995",
    ...
  }
}
        
      

If the client only needed the author name, we can assign the author object to a variable, say author, and then get the name with author.name.

Endpoint#2: Fetching the Author's Posts

Hitting the second endpoint to get the author's posts from the API returns a list of all the posts with the fields as follows.

        
{
  "posts": [{
    "_id": "",
    "title": "Start Learning GraphQL Today",
    "content": "...",
    "comments": { ... }
  }, {
    "_id": "",
    "title": "Relay vs Apollo: GraphQL Clients",
    "content": "...",
    "comments": { ... }
  }, {
    "_id": "",
    "title": "Why GraphQL is Killing REST",
    "content": "...",
    "comments": { ... }
  }, {
    ...
  }]
}
        
      

Because we only need the post titles, we can again assign the list to a variable, posts, and then iterate through it, reading post.title from each post.

Endpoint#3: Fetching the Author's Followers

Hitting the third API endpoint for the author's followers returns a list with all the followers of that author with their complete data.

Again, to drop the unnecessary data, we can assign the JSON object to a variable, followers, and retrieve each name with dot notation.

        
{
  "followers": [{
    "_id": "507f1f77bcf86cd799439011",
    "name": "John",
    "address": "{ ... }",
    "birthday": "March 05, 1970"
  }, {
    "_id": "507f191e810c19729de860e3",
    "name": "Alice",
    "address": "{ ... }",
    "birthday": "May 14, 1986"
  }, {
    "_id": "507f110c19729de8de860651",
    "name": "Sarah",
    "address": "{ ... }",
    "birthday": "November 23, 1991"
  }, {
    "_id": "507f1f77bcf8610c19729de8",
    "name": "Nolan",
    "address": "{ ... }",
    "birthday": "June 15, 1984"
  }, {
    ...
  }]
}
        
      
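Put together, the client-side work for the three endpoints might look something like the sketch below. It is only an illustration under a few assumptions: fetch_json is a hypothetical stand-in that returns canned responses instead of making real HTTP calls, the ids are shortened, and the sample data is trimmed. Python dictionaries use author["name"] where JavaScript would use author.name.

```python
# Canned responses standing in for the three REST endpoints (trimmed sample data).
FAKE_API = {
    "/authors/1": {"author": {"_id": "1", "name": "Solo", "birthday": "August 29, 1995"}},
    "/authors/1/posts": {"posts": [
        {"_id": "a", "title": "Start Learning GraphQL Today", "content": "..."},
        {"_id": "b", "title": "Relay vs Apollo: GraphQL Clients", "content": "..."},
    ]},
    "/authors/1/followers": {"followers": [
        {"_id": "x", "name": "John", "birthday": "March 05, 1970"},
        {"_id": "y", "name": "Alice", "birthday": "May 14, 1986"},
    ]},
}

request_log = []  # records every round trip the client makes


def fetch_json(path):
    """Hypothetical stand-in for an HTTP GET that returns parsed JSON."""
    request_log.append(path)
    return FAKE_API[path]


def load_author_page(author_id):
    """Hit all three endpoints and keep only the fields the page needs."""
    base = f"/authors/{author_id}"
    author = fetch_json(base)["author"]
    posts = fetch_json(f"{base}/posts")["posts"]
    followers = fetch_json(f"{base}/followers")["followers"]
    return {
        "name": author["name"],
        "post_titles": [post["title"] for post in posts],
        "follower_names": [follower["name"] for follower in followers],
    }


page = load_author_page("1")
```

Note that request_log ends up with three entries for a single page, and everything except the names and titles is thrown away after the download.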

So, what's the problem with this approach?

Well...

Let's see.

Problem #1: Data Fetching

Modern progressive web apps and native apps are increasingly data-driven.

This often requires them to fetch and combine related resources from huge datasets with near-zero latency.

A common bottleneck with RESTful APIs is the need for multiple round trips to multiple endpoints to fetch all the required related resources.

To get all the data we needed for the blog above, we had to hit three different API endpoints.

Fetching data in REST APIs usually means hitting an ever-increasing number of endpoints. Even if you fetch asynchronously, the number of endpoints typically grows fairly quickly as you scale the application.

Problem #2: Overfetching or Underfetching

For the first endpoint, besides getting back the author id and the name, which were the only required fields, the API sent back the address, the birthday and all the other fields for that author.

If there are 100 other fields storing data for the author, the server will return all these fields every single time this endpoint is hit.

As we saw, these fields were never even required by the client.

For the second endpoint, all the additional data sent back for the posts, i.e. the content and a huge list of comments, was also never required.

All 100 fields for each of the followers would again be fetched from the third endpoint.

Each of the three endpoints is a perfect example of overfetching.

Needing three endpoints at all illustrates underfetching: no single response carried enough, so three endpoints had to be hit to fetch all the required data.
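To put a rough number on overfetching, a small sketch like the one below compares what the server sent with what the client used. The payload is made up and trimmed to four fields, and overfetch_report is a hypothetical helper, not part of any library.

```python
import json

# Trimmed, made-up author payload; a real one could carry far more fields.
author_response = {
    "_id": "507f1f77bcf86cd799439011",
    "name": "Solo",
    "address": "...",
    "birthday": "August 29, 1995",
}
needed_fields = {"_id", "name"}  # the only fields the client actually reads


def overfetch_report(payload, needed):
    """Compare what the server sent with what the client actually uses."""
    used = {key: value for key, value in payload.items() if key in needed}
    total_bytes = len(json.dumps(payload))
    used_bytes = len(json.dumps(used))
    return {
        "fields_sent": len(payload),
        "fields_used": len(used),
        "wasted_bytes": total_bytes - used_bytes,
    }


report = overfetch_report(author_response, needed_fields)
```

With 100 fields per author instead of four, the wasted share of each response would be far larger.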

Problem #3: Network Requests

As illustrated above, you are almost guaranteed to fetch more or less data than you need.

This leads to huge waste when scaling REST APIs because each endpoint has a fixed data structure.

The above example required three different requests to the server, only to transfer a significant amount of resources that were just ignored by the client.

This REST implementation wastes a huge amount of bandwidth per request and makes a lot more requests than necessary.

It gets slower and slower for large datasets, massively increasing latency.

Problem #4: Static Nature of REST

The solution to the three problems above usually boils down to two things: designing your API as closely as possible to how the client will request the data, and versioning the API when the data changes.

And modern APIs change a lot.

You could of course design your API so that it exposes the data in a more efficient way, but that will likely be a temporary fix until the data changes.

REST APIs are static: data is stored in a certain way and retrieved in a certain way. Changing things is painful. Period.

To retrieve new data from the API, you need a new endpoint, which you can just add.

If it's new data you want from an old API, say additional fields not previously included for a Post with defined endpoints, you have to bump the version.

API evolution, as it were.
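One way such a version bump can play out is sketched below. It assumes URL-prefix versioning (/v1, /v2), which is only one of several conventions, and the tags field, the serializers and the routing table are all hypothetical names made up for this example.

```python
# Each version freezes the shape of a post; clients depend on that shape.
def serialize_post_v1(post):
    return {"_id": post["_id"], "title": post["title"]}


def serialize_post_v2(post):
    # v2 adds a field; v1 stays untouched so existing clients keep working.
    return {**serialize_post_v1(post), "tags": post["tags"]}


# Hypothetical routing table: one entry per version that is still supported.
ROUTES = {
    "/v1/posts": serialize_post_v1,
    "/v2/posts": serialize_post_v2,
}


def handle(path, post):
    """Pick the serializer for the requested version of the endpoint."""
    return ROUTES[path](post)


post = {"_id": "a", "title": "Start Learning GraphQL Today", "tags": ["graphql"]}
old_shape = handle("/v1/posts", post)
new_shape = handle("/v2/posts", post)
```

Every supported version is another entry to maintain, test and eventually retire.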

Problem #5: Versioning and Evolution

The hardest part of maintaining an API is versioning. There is no "right way", for starters.

Breaking changes creep in easily and often, and the best way to version depends almost completely on the API, with only a few rules of thumb to go by.

Evolution is a difficult problem, one for which many options are available.

The easiest is probably just deprecation.

At every wrong turn, the API risks breaking existing clients, or responding to a client with outdated data. Changing the format often necessitates a near complete redesign.

REST implementations are dogged by inefficiencies and inflexibility.

What Problem does GraphQL Solve?

GraphQL requests hit a single endpoint.

These problems with REST are exactly what GraphQL solves, and it solves them brilliantly.

Let me illustrate this.

A typical solution for the above problem with GraphQL would hit the following endpoint:

        
/graphql
        
      

Easy!

With the following GraphQL query.

        
query {
  author(id: "507f1f77bcf86cd799439011") {
    name
    posts {
      title
    }
    followers(last: 3) {
      name
    }
  }
}
        
      

Unlike the REST approach, which fetched each resource with a separate HTTP GET, this is usually sent as a single POST request that packs all of the client's data requirements into one.
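As a rough sketch of what that request can look like on the wire, the snippet below wraps the query in a JSON body under a "query" key, which is the common convention for GraphQL over HTTP. The URL is a placeholder and build_graphql_request is a hypothetical helper; the actual network call is left commented out so the sketch runs offline.

```python
import json
import urllib.request

# Hypothetical endpoint URL; substitute the real server address.
GRAPHQL_URL = "https://example.com/graphql"

QUERY = """
query {
  author(id: "507f1f77bcf86cd799439011") {
    name
    posts {
      title
    }
    followers(last: 3) {
      name
    }
  }
}
"""


def build_graphql_request(url, query):
    """Wrap a GraphQL query in a JSON body and prepare an HTTP POST."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(GRAPHQL_URL, QUERY)
# Sending is left out so the sketch runs offline:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```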

And you would get back...

        
{
  "data": {
    "author": {
      "name": "Solo",
      "posts": [
        { "title": "Start Learning GraphQL Today" },
        { "title": "Relay vs Apollo: GraphQL Clients" },
        { "title": "Why GraphQL is Killing REST" }
      ],
      "followers": [
        { "name": "John" },
        { "name": "Alice" },
        { "name": "Sarah" }
      ]
    }
  }
}
        
      

The root field in this JSON object is data. The rest is ONLY what was requested in the query.

Nothing else.
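The idea of returning only the requested fields can be mimicked in a few lines. The sketch below is not how a real GraphQL server is implemented; select and the nested selection dictionary are hypothetical stand-ins for the query and its resolvers, and the data is trimmed sample data.

```python
# Full author record, as a REST endpoint might return it (sample data).
AUTHOR = {
    "_id": "507f1f77bcf86cd799439011",
    "name": "Solo",
    "birthday": "August 29, 1995",
    "posts": [
        {"_id": "a", "title": "Start Learning GraphQL Today", "content": "..."},
        {"_id": "b", "title": "Why GraphQL is Killing REST", "content": "..."},
    ],
}

# The client's selection: True keeps a leaf field, a dict descends into it.
SELECTION = {"name": True, "posts": {"title": True}}


def select(value, selection):
    """Return only the fields named in the selection, recursing into lists."""
    if isinstance(value, list):
        return [select(item, selection) for item in value]
    result = {}
    for field, sub in selection.items():
        result[field] = value[field] if sub is True else select(value[field], sub)
    return result


response = {"data": {"author": select(AUTHOR, SELECTION)}}
```

The _id, birthday and content fields never leave the server, because the selection did not ask for them.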

What Problems does GraphQL Create?

GraphQL is not a magic pill that solves everything.

It does not actually create problems, so much as it removes the best parts of REST.

  1. Caching

What GraphQL Doesn't Solve

And finally there is what neither REST nor GraphQL can touch.

  1. Technical Debt