Cogitatorium


Speaking GraphQLy: Provocative Intro to GraphQL

So you want to see what this GraphQL thing is… or to assess whether I have any clue myself. Well, this won’t be your run-of-the-mill tutorial. You can find that elsewhere. Why am I spending time writing this, you ask? I’m doing this because what I found elsewhere didn’t clear up my misconceptions and even reinforced them. I spent too much time looking for, expecting, and fearing things I should not have, mainly because they don’t exist. I hope to save you from repeating my mistakes.

This must begin with a clear mind. Don’t expect anything you assume about GraphQL to be true at all. If you (think you) know something already, please “unthink” that even if you are one of the GraphQL creators. To help with this, do read my earlier post about what GraphQL is not. That alone would have saved me time, had I had it available to me when I needed it. The other trick I will use is a somewhat unconventional approach that I promise will save you time as well, although it looks like a detour: I won’t be writing about GraphQL until the end. We will design something of our own instead: JSONTL, a JSON Template Language.

Starting Point

You may know at least one “template language”. For example, various HTML template languages are commonly known. We want to design something like that but for JSON. Why? Say we’re fed up writing code as shown on the left to produce JSON as shown on the right, without relying on serialization, marshalling or various object mapping frameworks. The following JSON illustrates the end goal when looking up the details about a particular item available for sale:

Template pseudo-code (left), followed by the target JSON (right):
it = getItem(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9");
data = {
  item: {
    upc: it.getUPC(),
    name: it.getName(locale: ENGLISH_CANADIAN),
    price: it.getPrice(),
    bulkPrice: it.getPrice(currency: CAD, count: 10),
    stock: it.getStock(near: "M4C 1M2").map(storeStock -> // list
      {
        storeName: storeStock.getStoreName(),
        count: storeStock.getCount()
      }
    )
  }
}

data: {
  item: {
    upc: 72527273070,
    name: "The Enchanted Iciql",
    price: 19.99,
    bulkPrice: 179.91,
    stock: [
      {
        storeName: "Whimsiql Books",
        count: 24
      }, …
    ]
  }
}
      

You should be able to read that template pseudo-code. The idea is that it creates the “data” representation structure which is directly and magically represented in the output. If you are viewing this on a wide enough screen the lines of the template and the corresponding output should line up, for your convenience.

I’m making use of constructs such as named and optional arguments, enums (ENGLISH_CANADIAN and CAD) and lists (ordered collections, arrays). I’m also making use of “lambda expressions” to “map” each store stock to a structure containing what we care about having in the target JSON. Please excuse unquoted key/field names in the target JSON. I kept them out for brevity. You can imagine them back if you wish.

There is a lot of boilerplate there. The left side doesn’t look much like the right side at all. We see a lot of duplication. That mapping of stock details appears particularly nasty. Let’s make that better.

Mapping Sugar

We’re going to change our language and add “syntactic sugar” as follows:

  1. When a function call, such as getStock(…), is followed by a {…} block, we’ll treat that as a “.map(x -> {…})” construct.

  2. We can remove “it.” and “storeStock.” before getters as we’re going to default to the object returned by the function invoked just before the mapping block.

  3. While we’re there, we’re going to require that construct for anything yielding non-primitive values, such as structures or objects. In this example, all other function calls return primitives.

  4. If this construct is applied to a single-valued result, we just transform that value. In the example, it is applied to a list, so we apply it to each element.

In other words, we want to express the following:

a(...).map(x -> {
  p: x.b(...),
  q: x.c(...),
  ...
})

More concisely as follows:

a(...) {
  p: b(...)
  q: c(...)
}

… regardless of whether a(…) returns a single value or a collection (list, array, …). That cleans up our template code to the following:

data = {
  item: getItem(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    upc: getUPC(),
    name: getName(locale: ENGLISH_CANADIAN),
    price: getPrice(),
    bulkPrice: getPrice(currency: CAD, count: 10),
    stock: getStock(near: "M4C 1M2") // list
      {
        storeName: getStoreName(),
        count: getCount()
      }
  }
}
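If it helps to see rule 4 in running code, here is a minimal Python sketch of the "transform one value, or each element of a list" behaviour. The helper name select and the sample data are my own inventions for illustration; nothing here is part of JSONTL or of any library.

```python
from typing import Any, Callable


def select(value: Any, block: Callable[[Any], dict]) -> Any:
    """Apply a selection block to one value, or to each element of a list."""
    if value is None:
        return None  # assumption: there is nothing to transform, so pass None through
    if isinstance(value, (list, tuple)):
        return [select(element, block) for element in value]
    return block(value)


# Hypothetical data standing in for the results of getStock(...) and getItem(...).
stock = [{"store": "Whimsiql Books", "n": 24}, {"store": "Some Other Store", "n": 3}]
item = {"code": 72527273070}

# The same call shape works for a list and for a single value.
stock_out = select(stock, lambda s: {"storeName": s["store"], "count": s["n"]})
item_out = select(item, lambda i: {"upc": i["code"]})
```

The caller never has to say whether the preceding call returned a list, which is the whole point of the sugar.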

Getter Name Magic

Some languages have convenience features to access “properties” backed by functions or methods without having to specifically reference the function/method names. If a property is read, it will automatically find a getter. If it is assigned, it will automatically use a setter. We care only about getters, not setters (not even later – we just won’t have them, ever). In any case, we’ll steal that feature. If it helps, you can think of it as automatically prefixing the function names with “get” so that we don’t have to type those three letters or be concerned about letter capitalization after them. That brings us to:

data = {
  item: item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    upc: upc(),
    name: name(locale: ENGLISH_CANADIAN),
    price: price(),
    bulkPrice: price(currency: CAD, count: 10),
    stock: stock(near: "M4C 1M2") // list
      {
        storeName: storeName(),
        count: count()
      }
  }
}
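One way to picture that getter lookup is the rough Python sketch below. The Item class and the resolve helper are hypothetical, and the case-insensitive match is just my reading of "not being concerned about capitalization"; a real implementation could do this quite differently.

```python
from typing import Any


class Item:
    """Hypothetical object exposing classic getters."""

    def getUPC(self) -> int:
        return 72527273070

    def getName(self, locale: str) -> str:
        return "The Enchanted Iciql"


def resolve(obj: Any, field: str, **args: Any) -> Any:
    """Find a getter for `field`, ignoring letter case, and call it with `args`."""
    wanted = ("get" + field).lower()
    for name in dir(obj):
        if name.lower() == wanted:
            return getattr(obj, name)(**args)
    raise AttributeError(f"no getter found for field {field!r}")


item = Item()
upc = resolve(item, "upc")  # ends up calling item.getUPC()
name = resolve(item, "name", locale="ENGLISH_CANADIAN")  # calls item.getName(...)
```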

Remove Empty Argument Lists

Why bother with “()”? We decide that we don’t need to reference or pass functions themselves. So let’s remove those “()” in “upc()”, “price()”, “storeName()” and “count()”:

data = {
  item: item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    upc: upc,
    name: name(locale: ENGLISH_CANADIAN),
    price: price,
    bulkPrice: price(currency: CAD, count: 10),
    stock: stock(near: "M4C 1M2") // list
      {
        storeName: storeName,
        count: count
      }
  }
}

Silence the echoes

We have a lot of “something: something” there:

item: item
upc: upc
price: price
stock: stock
storeName: storeName
count: count

The left/first one is the name of the target JSON field, which we may decide to call an “alias”, and the second one tells our language which function to call to get the value. Let’s make aliases optional. If an alias is omitted, it defaults to the function name. In our example there is only one case not benefiting from this sugar: “bulkPrice”. It has to stay as it is.

data = {
  item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    upc,
    name(locale: ENGLISH_CANADIAN),
    price,
    bulkPrice: price(currency: CAD, count: 10),
    stock(near: "M4C 1M2") // list
      {
        storeName,
        count
      }
  }
}
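The defaulting rule is simple enough to sketch in a few lines of Python. The helper split_alias is hypothetical, and it works on single entries as plain strings rather than on a parsed template, so treat it as an illustration of the rule only.

```python
from typing import Tuple


def split_alias(entry: str) -> Tuple[str, str]:
    """Return (alias, call) for entries like 'bulkPrice: price(...)' or 'upc'."""
    # Only a colon that appears before the argument list can introduce an alias;
    # colons inside "(...)" belong to named arguments.
    head, paren, rest = entry.partition("(")
    alias, colon, field = head.partition(":")
    if colon:
        return alias.strip(), (field + paren + rest).strip()
    # No alias given: default it to the function name.
    return head.strip(), entry.strip()


plain = split_alias("upc")
with_args = split_alias("name(locale: ENGLISH_CANADIAN)")
aliased = split_alias("bulkPrice: price(currency: CAD, count: 10)")
```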

Shorter comments

Not that it matters much, but say we dislike the use of “//” for end-of-line comments and want to shorten it. Let’s do that:

data = {
  item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    upc,
    name(locale: ENGLISH_CANADIAN),
    price,
    bulkPrice: price(currency: CAD, count: 10),
    stock(near: "M4C 1M2") # list
      {
        storeName,
        count
      }
  }
}

Remove the envelope

Our JSON template will produce JSON, not JavaScript or some other code. Its output is only one value, a structure. That “data = ” at the beginning is out of place and can cause trouble. I mean, what does that “=” even do? We’ll simplify things and always assume that the output is going to be named “data”. We get to:

{
  item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    upc,
    name(locale: ENGLISH_CANADIAN),
    price,
    bulkPrice: price(currency: CAD, count: 10),
    stock(near: "M4C 1M2") # list
      {
        storeName,
        count
      }
  }
}

Charade Ends

How does that template look to you? Could you make use of it? Well, surprise! We just reinvented GraphQL. That last version is GraphQL. We’re done with the basics. There are a few more advanced features but they are along the same lines. For example, commas and most whitespace are optional. They are treated equally and seen as just separators. The following is valid GraphQL:

{
  item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    upc,,,,  ,,, ,, # Yes, commas are just whitespace
    name(locale: ENGLISH_CANADIAN)
    price
    bulkPrice: price(currency: CAD, count: 10)
    stock(near: "M4C 1M2") {
      storeName
      count
    }
  }
}

… as is this:

{item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9"){upc name(locale:ENGLISH_CANADIAN) price bulkPrice: price(currency:CAD,count:10) stock(near:"M4C 1M2"){storeName,count}}}
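To convince yourself that the two forms above really say the same thing, here is a toy Python tokenizer that skips commas, whitespace and “#” comments. It is a deliberately simplified sketch, not the real GraphQL lexer: the TOKEN pattern is my own and ignores string escapes, block strings, floats and more.

```python
import re

# Simplified token pattern (an assumption, not the official grammar).
# Commas and whitespace match none of the alternatives, so they are skipped.
TOKEN = re.compile(
    r'(?P<comment>#[^\n]*)'
    r'|(?P<string>"[^"\n]*")'
    r'|(?P<word>[_A-Za-z0-9]+)'
    r'|(?P<punct>[{}()\[\]:!$=@|&])'
)


def tokens(source: str) -> list:
    """Return the significant tokens, dropping comments, commas and whitespace."""
    return [m.group() for m in TOKEN.finditer(source) if m.lastgroup != "comment"]


pretty = """{
  item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    upc,,,,  ,,, ,, # Yes, commas are just whitespace
    name(locale: ENGLISH_CANADIAN)
    price
  }
}"""
compact = '{item(id:"a25e2223-faec-48e0-b54c-d9f48bd4c5d9"){upc name(locale:ENGLISH_CANADIAN) price}}'

same = tokens(pretty) == tokens(compact)
```

With this pattern, both strings reduce to the same token list, so same ends up True.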

So what is GraphQL? You could think of it as:

  • a JSON template language that API clients can use in requests to servers

  • a functional or RPC style API with a composable twist: it adds easy nesting of any number of follow-up calls on an output of the preceding call.

Where’s the query?

I cheated a tiny bit. That was, indeed, valid GraphQL, but only because it leverages a GraphQL convenience feature that allows us to omit the “query” keyword at the beginning. A more complete example would be:

query { # Notice the keyword here
  item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    upc
    name(locale: ENGLISH_CANADIAN)
    price
    bulkPrice: price(currency: CAD, count: 10)
    stock(near: "M4C 1M2") {
      storeName
      count
    }
  }
}

Do we have to construct queries dynamically?

Yeah, constructing the query every time a different argument needs to be passed isn’t all that nice. GraphQL supports parameterization but it calls the concept “variables”. I won’t go into this beyond giving you a taste of what this looks like:

query CanadaEnglishItemInquiry($itemId: UUID! $quantity: Int! $near: PostalCode) {
  item(id: $itemId) {
    upc
    name(locale: ENGLISH_CANADIAN)
    price
    bulkPrice: price(currency: CAD, count: $quantity)
    stock(near: $near) {
      storeName
      count
    }
  }
}
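On the client side, the payoff is that the query text stays constant and only a small JSON object of values changes. Here is a rough Python sketch of that idea; the build_request helper is hypothetical, and the “query” and “variables” keys follow the common convention for sending GraphQL as JSON.

```python
import json
from typing import Optional

# The query text never changes; only the variable values do.
QUERY = """
query CanadaEnglishItemInquiry($itemId: UUID! $quantity: Int! $near: PostalCode) {
  item(id: $itemId) {
    upc
    name(locale: ENGLISH_CANADIAN)
    price
    bulkPrice: price(currency: CAD, count: $quantity)
    stock(near: $near) {
      storeName
      count
    }
  }
}
"""


def build_request(item_id: str, quantity: int, near: Optional[str] = None) -> str:
    """Build a JSON request body; `near` is optional because $near is nullable."""
    variables = {"itemId": item_id, "quantity": quantity}
    if near is not None:
        variables["near"] = near
    return json.dumps({"query": QUERY, "variables": variables})


body = build_request("a25e2223-faec-48e0-b54c-d9f48bd4c5d9", 10, near="M4C 1M2")
```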

How does it know what’s available?

Aaaah! Good question 😉! Why? You don’t want all your functions and methods to be exposed? Well, even if you did, GraphQL wouldn’t know how to do it. Something has to tell it. That is where the other GraphQL language comes in! What other language, you may ask? Well, GraphQL also has the “Schema Definition Language” (SDL), used to define all these terms we had in there. Without a schema, GraphQL is entirely empty. It has nothing. Most GraphQL frameworks I’m aware of are schema-first in some sense (but they don’t have to be). They let you make the connections between the GraphQL schema and the code you wish to expose.

GraphQL SDL does for GraphQL what Swagger or the OpenAPI specification does for other/basic HTTP-based APIs. BTW, the most common way to use GraphQL is over HTTP, and that can be expressed in OpenAPI. The schema leverages GraphQL’s “type system”. Beyond a few primitive types (Boolean, Int, Float, String and ID), it also supports lists, nullability specification, and custom-defined enum and complex types. Complex types are split into input-only (arguments) and output-only (responses) types. Presently, only output types support interfaces and unions.

To add just a bit more detail, non-nullability is marked by suffixing the type name with an exclamation mark, and lists are defined by enclosing the element type in square brackets. “[String!]!” is a non-null list of non-null Strings and “[[Float!]!]” is a nullable list of non-nullable lists of non-nullable floats – a nullable two-dimensional matrix. “ID” is a special type indicating a value that should not be subject to any transformation or otherwise treated as having any other meaning than identifying something, unlike numbers, which can be, say, summed up, or strings, which could be truncated, parsed, etc.
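If the bracket-and-bang notation makes your eyes cross, this small Python sketch reads a type reference back in plain words. The describe function is my own helper, not something GraphQL provides, and it assumes well-formed input.

```python
def describe(type_ref: str) -> str:
    """Spell out a type reference such as '[String!]!' in words."""
    type_ref = type_ref.strip()
    # A trailing "!" marks the outermost type as non-null.
    if type_ref.endswith("!"):
        prefix, inner = "non-null ", type_ref[:-1]
    else:
        prefix, inner = "nullable ", type_ref
    # Square brackets wrap the element type of a list.
    if inner.startswith("[") and inner.endswith("]"):
        return prefix + "list of " + describe(inner[1:-1])
    return prefix + inner


first = describe("[String!]!")
second = describe("[[Float!]!]")
```

Here first comes out as “non-null list of non-null String” and second as “nullable list of non-null list of non-null Float”, matching the two readings given above.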

Now I need to share that SDL?

Not really. This is the part that GraphQL does define and why I lied when I said that it is entirely empty. GraphQL defines the process of “introspection”, a feature that allows the client accessing the GraphQL API to ask it (the API) about its schema. The response is still JSON.

query {
  item(id: "a25e2223-faec-48e0-b54c-d9f48bd4c5d9") {
    __typename # Notice __
  }
  __schema { # Notice __
    types {
      name
      description
      fields {
        name
        ...
      }
    }
  }
}

That could produce something like:

data: {
  item: {
    __typename: "Book"
  },
  __schema: {
    types: [{
      name: "Book",
      description: "A type for book items",
      fields: [
        {
          name: "author"
          ...
        }, ...
      ]
    }, {…}, {…}, {…}, …]
  }
}
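Since the introspection response is ordinary JSON, a client can walk it like any other data. The Python sketch below is only an illustration: the response text is a trimmed, properly quoted version of the sample above, and fields_by_type is a hypothetical helper.

```python
import json

# A trimmed, properly quoted version of the sample response above.
response_text = """
{
  "data": {
    "item": {"__typename": "Book"},
    "__schema": {
      "types": [
        {"name": "Book",
         "description": "A type for book items",
         "fields": [{"name": "author"}]}
      ]
    }
  }
}
"""


def fields_by_type(response: dict) -> dict:
    """Map each type name to the list of its field names."""
    types = response["data"]["__schema"]["types"]
    # "fields" is null for types that have no fields (scalars, for example).
    return {t["name"]: [f["name"] for f in (t.get("fields") or [])] for t in types}


response = json.loads(response_text)
index = fields_by_type(response)
typename = response["data"]["item"]["__typename"]
```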

Only read-only?

No, GraphQL isn’t read-only. Think about it, though. Revisit these functions that we call(ed), those “getters” in early examples. They can take input arguments, do whatever. What stops you from making them do more than just getting values? Did you think about it? Here’s the answer: nothing. That’s right. If you’re implementing a GraphQL server, GraphQL can’t stop you from doing whatever you feel like and exposing it in whichever way you want. I mean, think about it. Is our example request truly 100% read-only? Are you sure nothing is written to logs? Could it be that the user’s profile is updated with what they were interested in? How about the level of interest for the product requested? Both could be used for analytical purposes.

So, you can indeed make and expose functions that make changes. GraphQL, however, exposes a separate and dedicated “space” for operations that can have “side-effects”, nudging the API developer to register those non-read-only functions as “mutations”. The rest relies on your good judgment.

For example:

mutation {
  addToStock(store: … itemId: … quantity: 100) {
    count # how many do we have now?
  }
}

May produce…

data: {
  addToStock: {
    count: 153
  }
}

Later, not now

If you don’t need (or want) the data immediately returned to you, GraphQL also introduces the concept of “subscriptions”. These follow the same basic pattern but make no assumptions about which way the data will go, only that it won’t be going immediately back. Example:

subscription {
  lowStockAlert(itemId: "…" alertQuantity: 5) {
    store { 
      storeName
    }
    count # how many do we have now?
  }
}

Show me some HTTP

HTTP is the primary way of accessing GraphQL APIs. The HTTP GET method can be used for operations registered as queries, whereas POST can be used for any operation, including queries. Assuming your GraphQL API is exposed at “/graphql”, the following is an example of a valid request:

GET /graphql?query={item(id:1234){name,upc}}

The “query” we’ve been making examples of so far is simply passed as a raw text value in the URL query string, as the “query” parameter, escaped as needed. Variables can be passed in as well, as “variables” JSON, but I won’t go into those details. The URL query string is not used with POST. One possibility is to use JSON:

POST /graphql
Content-Type: application/json
… (more headers)

{"query":"{item(id:1234){name,upc}}"}
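The “escaped as needed” part is easy to get wrong by hand, so here is a small Python sketch that builds both request shapes using only the standard library. The “/graphql” path comes from the example above; the variable names are mine, and no request is actually sent.

```python
import json
from urllib.parse import parse_qs, urlencode

QUERY = "{item(id:1234){name,upc}}"

# GET: the query text goes into the URL query string, percent-encoded.
get_target = "/graphql?" + urlencode({"query": QUERY})

# POST: the same query text becomes a string value inside a JSON body.
post_body = json.dumps({"query": QUERY})

# Decoding either form gives back the original query text.
decoded_from_get = parse_qs(get_target.split("?", 1)[1])["query"][0]
decoded_from_post = json.loads(post_body)["query"]
```

Letting urlencode and json.dumps do the escaping also guards against mismatched braces in a hand-written body.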

Tomayto, Tomahto, Potayto, Potahto

The terms chosen for GraphQL make sense if viewed from the perspective of the target JSON structure and the initial intent behind the creation of GraphQL. However, these do not match the power GraphQL ended up having despite that initial intent. To unlock that power, it helps to think of those GraphQL concepts under different, more creative names.

For example, the GraphQL specification uses the term “field”, not “function”, despite allowing input arguments and talking of “resolver functions”. I’ve prepared a cheat sheet of these.

Is this it?

No, it isn’t. GraphQL has more. I skimmed over variables and didn’t mention inline and named (reusable) fragments, which can help apply a common projection (“field selection”) throughout. There are some extra “curiosities” about field merging, execution order, subscription lifecycles, and directives, and I only mentioned SDL in passing. They don’t change what GraphQL is and are more advanced topics worthy of separate posts, if there’s interest and I get to them.