Chapter 4. GraphQL and the Gatsby Data Layer

Up until now, all of our work on implementing Gatsby has focused on use cases that don’t require data retrieval and processing. Before we turn our attention to how Gatsby integrates with external data, in this chapter we’ll cover the data layer in Gatsby, whether data comes from files in your Gatsby site (the local filesystem) or from external sources such as CMSs, commerce systems, or backend databases (external data sources that require a source plugin, discussed in more detail in Chapter 5).

Within Gatsby’s data layer, GraphQL mediates the relationship between Gatsby pages and components and the data that populates those pages and components. Though GraphQL is popular in web development as a query language, Gatsby uses it internally to provide a single unified approach to handling and managing data. With data potentially originating from a variety of disparate sources, Gatsby flattens the differences between discrete serialization formats (forms of articulating data) and thus third-party systems by populating a GraphQL API that is accessible from any Gatsby page or component.

In this chapter, we’ll explore the foundations of GraphQL that apply to any GraphQL API before switching gears to look at how Gatsby uses GraphQL specifically in its data layer and in page and component queries.

GraphQL Fundamentals

GraphQL is a query language that provides client-tailored queries. That is, unlike REST APIs, which adhere to the requirements dictated by the server, GraphQL APIs respond to client queries with a response that adheres to the shape of that query. Today, GraphQL APIs are commonplace for backend database access, headless CMS consumption, and other cross-system use cases, but it’s still quite rare to find GraphQL used internally within a framework.

GraphQL has become popular thanks to the flexibility it provides developers and the more favorable developer experience it facilitates through client-driven queries. Common motivations for using it include:

Avoiding response bloat
GraphQL improves query performance by serving only the data necessary to satisfy the client-issued query. In traditional REST APIs, response payloads can be larger than necessary, or additional requests may be required to acquire the needed information.
Query-time data transformations
In many JavaScript implementations, data postprocessing needs to occur to harmonize data formats or to perform a sort operation. GraphQL offers means to perform data transformations on the fly at query time through the use of explicitly defined arguments within a GraphQL API.
Offloading request complexity
In many JavaScript implementations, correctly issuing a query often requires a complex interplay between promises, XMLHttpRequest implementations, and waiting for the requested data to arrive. Because GraphQL only requires a query, as opposed to a particular URL (and potentially headers and request bodies), it may provide a smoother developer experience.

Using GraphQL does have some disadvantages, too. Notably, GraphQL APIs can be difficult to scale because each response must be assembled to fit the client’s query, which adds processing time on the server. Fortunately, because Gatsby uses GraphQL during development and build-time compilation, that cost affects only the build duration rather than the end user’s time to first interaction.

To get you up to speed, in the coming sections we’ll cover GraphQL queries, fields, arguments, query variables, directives, fragments, and finally schemas and types.

GraphQL Queries

The primary means of interacting with a GraphQL API from the client is a query, which is a declarative expression of data requirements from the server.

Consider the following example GraphQL query, which in Gatsby returns the title of the Gatsby site from its metadata:

{
  site {
    siteMetadata {
      title
    }
  }
}

Gatsby’s internal GraphQL API will return the following response to this query:

{
  "data": {
    "site": {
      "siteMetadata": {
        "title": "A Gatsby site!"
      }
    }
  },
  "extensions": {}
}

Notice how the GraphQL query and response are structurally identical: they share the same hierarchy and the same sequence of names. In other words, a typical GraphQL query issued by a client outlines the shape that the GraphQL response issued by the server should take. The pseudo-JSON structure of the GraphQL query becomes a valid JSON object in the GraphQL response.
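To make the idea of a query-shaped response concrete, here is a minimal sketch in plain JavaScript. It is not how Gatsby resolves queries internally; the select function, the nested selection object, and the extra description, author, and buildTime values are all hypothetical stand-ins used only to show that the response contains exactly the fields the query names and nothing else:

```javascript
// Hypothetical in-memory data standing in for what a server knows about the site.
const source = {
  site: {
    siteMetadata: {
      title: "A Gatsby site!",
      description: "An example description",
      author: "Example Author",
    },
    buildTime: "2021-11-05",
  },
};

// The selection mirrors the query: nested objects for nested fields,
// and `true` for each leaf field the client wants back.
const selection = { site: { siteMetadata: { title: true } } };

// Copy only the selected fields from `data`, preserving the hierarchy.
function select(selectionSet, data) {
  const result = {};
  for (const [field, sub] of Object.entries(selectionSet)) {
    result[field] = sub === true ? data[field] : select(sub, data[field]);
  }
  return result;
}

const response = { data: select(selection, source) };
console.log(JSON.stringify(response, null, 2));
```

The description, author, and buildTime values never reach the response because the selection does not ask for them, which is the same property that keeps GraphQL responses free of bloat.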

The preceding GraphQL query is anonymous; it lacks an explicit name. But in GraphQL, you can identify queries with an operation type and an operation name. Here, query is the operation type, and GetSiteInfo is the operation name:

query GetSiteInfo {
  site {
    siteMetadata {
      title
    }
  }
}

You can also identify the operation type without an operation name, if you wish to write an anonymous query. This and the previous query return identical responses:

query {
  site {
    siteMetadata {
      title
    }
  }
}
Note

GraphQL queries are read operations; they retrieve data. There is another operation type, mutation, which handles write operations. However, Gatsby does not provide mutation support within its internal GraphQL API due to its impact on how Gatsby functions, so we do not cover mutations here. For more information about GraphQL beyond Gatsby, including GraphQL mutations, consult Learning GraphQL by Eve Porcello and Alex Banks (O’Reilly).

GraphQL Fields

Let’s take another look at the anonymous version of the query:

{
  site {
    siteMetadata {
      title
    }
  }
}

In GraphQL, each of the words contained within the query (site, siteMetadata, title) that identifies an inner element is known as a field. Those fields located at the top level (e.g., site) are occasionally referred to as root-level fields, but keep in mind that all GraphQL fields behave identically, and there is no functional difference between fields at any point in a query’s hierarchy.

GraphQL fields are crucial because they tell the GraphQL API what information the client desires for further processing and rendering. In GraphQL APIs outside Gatsby, the client and server are generally distinct from an architectural perspective. But in Gatsby, the GraphQL server and client are contained within the same framework; for our purposes, React and Gatsby components are our GraphQL clients.

Fields can take aliases, which allow us to arbitrarily rename any GraphQL field in the schema to something else in the resulting response from the API. In the following example, though the title field is identified as such in the schema, we’ve aliased it to siteName in our query, so the server will return a JSON object containing siteName as the identifier rather than title:

{
  site {
    siteMetadata {
      siteName: title
    }
  }
}

Aliases can be particularly useful when you wish to serve the same data multiple times but in a different form, since each name can appear only once at a given level of a GraphQL response, and requesting the same field twice with different arguments under the same name is an error:

{
  defaultSite: site {
    siteMetadata {
      title
    }
  }
  aliasedSite: site {
    metadata: siteMetadata {
      siteName: title
    }
  }
}
Note

Gatsby structures its data according to common GraphQL conventions. Each individual data object in Gatsby is a node. In Gatsby’s GraphQL API, nodes are connected through edges: each edge wraps a single node, and together the edges represent all the nodes returned for a given query.
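On the consuming side, aliases change the property names you read from the response. The following sketch assumes the response shape implied by the two-alias query shown earlier, reusing the "A Gatsby site!" title from the first example; the data constant is written by hand here rather than fetched from a real API:

```javascript
// Hand-written response matching the aliased query: defaultSite keeps the
// schema's field names, while aliasedSite uses the aliases from the query.
const data = {
  defaultSite: { siteMetadata: { title: "A Gatsby site!" } },
  aliasedSite: { metadata: { siteName: "A Gatsby site!" } },
};

const { defaultSite, aliasedSite } = data;

// Same underlying value, reached through different names.
const originalTitle = defaultSite.siteMetadata.title;
const aliasedTitle = aliasedSite.metadata.siteName;

console.log(originalTitle === aliasedTitle);
// The schema's own names are absent wherever an alias replaced them.
console.log("siteMetadata" in aliasedSite);
```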

GraphQL Arguments

In GraphQL, arguments are used to apply certain criteria to the fields in the response. These can be as simple as sort criteria, such as ascending or descending alphabetical order, or as complex as date formatters, which return dates according to the format stipulated by an argument applied to a field.

Let’s take a look at a more complex query to understand how to refine the sort of response that comes back through arguments applied to fields—we’ll discuss many of the unfamiliar aspects of this query in subsequent chapters:

query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      frontmatter {
        date(formatString: "MMMM DD, YYYY")
        title
        description
      }
    }
  }
}

In this anonymous query, we see an example of an argument on the date field, named formatString:

date(formatString: "MMMM DD, YYYY")

In many databases and servers, dates are stored as Unix timestamps or in a machine-friendly format rather than a human-readable format. To display a date like 2021-11-05 in a more user-friendly form in the response, we can use a GraphQL argument on the date field to perform query-time date formatting. In this example, the date would be formatted as:

"November 05, 2021"

In addition to providing a date format to the formatString argument on the date field, we can also define a locale to adapt the outputted date to the preferred locale or language, which will have different names for months and days of the week. For example:

date(
  formatString: "D MMMM YYYY"
  locale: "tr"
)

Because tr represents the Turkish language and region, our date will have the Turkish name for the month of November (note that we’ve also supplied a formatString appropriate to the region):

"5 Kasım 2021"

As mentioned previously, aliases allow us to serve multiple dates in different formats in the same response, since requesting the date field twice with different arguments under the same name would otherwise be rejected:

englishDate: date(formatString: "MMMM DD, YYYY")
turkishDate: date(
  formatString: "D MMMM YYYY",
  locale: "tr"
)

You can also use fromNow, which returns a string showing how long ago or how far in the future the returned date is, and difference, which returns the difference between the date and current time in the specified unit (e.g., days or weeks):

firstDate: date(fromNow: true)
secondDate: date(difference: "weeks")
Note

Gatsby depends on a library known as Moment.js to format dates. The Moment.js documentation has a full accounting of tokens for date formatters. Note that to introduce locales unavailable by default (Moment.js ships with English–US locale strings by default), you will need to load the locale into Moment.js.
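To see what the difference argument computes, consider this rough sketch in plain JavaScript. It is an approximation written for illustration, not Gatsby’s or Moment.js’s implementation: the differenceFrom function and the fixed now value are hypothetical, and only the days and weeks units are handled. It assumes the result is the whole number of units elapsed between the stored date and the current time:

```javascript
// Milliseconds in each supported unit (only two units, to keep the sketch short).
const MS_PER_UNIT = {
  days: 24 * 60 * 60 * 1000,
  weeks: 7 * 24 * 60 * 60 * 1000,
};

// Whole units elapsed between `dateString` and `now`, truncated toward zero.
function differenceFrom(dateString, unit, now) {
  const elapsed = now.getTime() - new Date(dateString).getTime();
  return Math.trunc(elapsed / MS_PER_UNIT[unit]);
}

// A fixed "current time" keeps the example deterministic.
const now = new Date("2021-11-30T00:00:00Z");

console.log(differenceFrom("2021-11-05", "days", now)); // 25
console.log(differenceFrom("2021-11-05", "weeks", now)); // 3
```

Twenty-five days is three full weeks plus four days, so the weeks result is truncated to 3 rather than rounded up.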

Though string formatters are a common type of argument found in GraphQL APIs, other arguments influence not only the field they are attached to but also all fields nested within. For instance, given a query that returns multiple objects of a given type, GraphQL arguments exist that allow you to arbitrarily limit, skip, filter, or sort the objects returned in the response.

Note

The group field in Gatsby’s GraphQL API also accepts a field argument that names the field by which to group results. For an illustration of using group, see Chapter 8.

limit and skip

In the following query, we have added a limit argument to ensure that only four items are returned:

{
  allMarkdownRemark(limit: 4) {
    edges {
      node {
        frontmatter {
          title
        }
      }
    }
  }
}

The response to this query will contain only four items, even if there are more than four present.

In the following query, we’ve added a skip argument so the GraphQL API excludes the first five items from the list and returns only four items from that point onward:

{
  allMarkdownRemark(limit: 4, skip: 5) {
    edges {
      node {
        frontmatter {
          title
        }
      }
    }
  }
}

The response to this query will contain only four items, having skipped the first five in the list. Assuming there are at least nine items available, the response will contain items six through nine.
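The combined effect of limit and skip is the same windowing you would get from slicing an array. The sketch below is an analogy in plain JavaScript rather than Gatsby’s implementation; the paginate function and the twelve placeholder post titles are hypothetical:

```javascript
// Return at most `limit` items after skipping the first `skip` items.
function paginate(items, { limit, skip = 0 } = {}) {
  const end = limit === undefined ? undefined : skip + limit;
  return items.slice(skip, end);
}

// Twelve placeholder items: "Post 1" through "Post 12".
const posts = Array.from({ length: 12 }, (_, i) => `Post ${i + 1}`);

console.log(paginate(posts, { limit: 4 }));
// [ 'Post 1', 'Post 2', 'Post 3', 'Post 4' ]

console.log(paginate(posts, { limit: 4, skip: 5 }));
// [ 'Post 6', 'Post 7', 'Post 8', 'Post 9' ]
```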

filter

In the following query, we’ve added a filter argument that uses the ne (not equal) operator to ensure that any item whose title field within the frontmatter field contains no content will be excluded from the response:

{
  allMarkdownRemark(
    filter: {
      frontmatter: {
        title: {
          ne: ""
        }
      }
    }
  ) {
    edges {
      node {
        frontmatter {
          title
        }
      }
    }
  }
}

The response to this query will exclude any items in which the title field is empty.

Gatsby uses a package known as Sift to perform filtering through a MongoDB-like syntax that will be familiar to developers who routinely work with MongoDB databases. Therefore, Gatsby’s GraphQL API supports common operators such as eq (equal), ne (not equal), in (is this item in an arbitrary list?), and regex (arbitrary regular expressions). You can also filter on multiple fields, as in the following example, which checks for an item that contains the title “Their Eyes Were Watching God” and does not have an empty date:

allMarkdownRemark(
  filter: {
    frontmatter: {
      title: {
        eq: "Their Eyes Were Watching God"
      }
      date: {
        ne: ""
      }
    }
  }
)

These operators can even be combined on the same field. This example filters for items containing the string “Watching” but excludes “Their Eyes Were Watching God” from the returned results:

allMarkdownRemark(
  filter: {
    frontmatter: {
      title: {
        regex: "/Watching/"
        ne: "Their Eyes Were Watching God"
      }
    }
  }
)

Thanks to Sift, Gatsby provides a variety of useful operators for filter arguments (see Table 4-1).

Table 4-1. A full list of operators that can be used in Gatsby’s GraphQL API for filtering
Operator | Meaning | Definition
eq | Equal | Must match the given data exactly
ne | Not equal | Must differ from the given data
regex | Regular expression | Must match the given regular expression (in Gatsby 2.0, backslashes must be escaped twice, so /\+/ must be written as /\\\\+/)
glob | Global | Permits the use of * as a wildcard, which acts as a placeholder for any nonempty string
in | In array | Must be a member of the array
nin | Not in array | Must not be a member of the array
gt | Greater than | Must be greater than the given value
gte | Greater than or equal | Must be greater than or equal to the given value
lt | Less than | Must be less than the given value
lte | Less than or equal | Must be less than or equal to the given value
elemMatch | Element match | Indicates that the field being filtered will return an array of individual elements, on which filters can be applied with the preceding operators
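The semantics of several of these operators can be sketched in a few lines of plain JavaScript. This is an illustration of what the operators mean, not Sift’s or Gatsby’s actual code: the operators table, the matches and toRegExp functions, and the sample titles other than “Their Eyes Were Watching God” are hypothetical, and glob and elemMatch are left out for brevity:

```javascript
// Convert a "/pattern/flags" string, as passed to the regex operator, into a RegExp.
function toRegExp(literal) {
  const lastSlash = literal.lastIndexOf("/");
  return new RegExp(literal.slice(1, lastSlash), literal.slice(lastSlash + 1));
}

// One predicate per operator; each receives the field value and the operand.
const operators = {
  eq: (value, expected) => value === expected,
  ne: (value, expected) => value !== expected,
  in: (value, list) => list.includes(value),
  nin: (value, list) => !list.includes(value),
  gt: (value, bound) => value > bound,
  gte: (value, bound) => value >= bound,
  lt: (value, bound) => value < bound,
  lte: (value, bound) => value <= bound,
  regex: (value, literal) => toRegExp(literal).test(value),
};

// A value passes only when every operator in the condition object passes.
function matches(value, condition) {
  return Object.entries(condition).every(([name, operand]) =>
    operators[name](value, operand)
  );
}

const titles = [
  "Their Eyes Were Watching God",
  "Watching the Detectives",
  "Beloved",
  "",
];

const watching = titles.filter((title) =>
  matches(title, { regex: "/Watching/", ne: "Their Eyes Were Watching God" })
);
console.log(watching); // [ 'Watching the Detectives' ]

const nonEmpty = titles.filter((title) => matches(title, { ne: "" }));
console.log(nonEmpty.length); // 3
```

Combining operators on one field behaves like a logical AND in this sketch, which matches the combined regex and ne example shown earlier.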

sort

Just as the filter argument can ensure the inclusion or exclusion of certain data in a query response, the sort argument can reorder or resequence the data returned. Consider the following query, in which we’re sorting the returned data in ascending alphabetical order (ASC):

allMarkdownRemark(
  sort: {
    fields: [frontmatter___title]
    order: ASC
  }
) {
  edges {
    node {
      frontmatter {
        title
        date
      }
    }
  }
}

Note here that we’ve used three underscores in succession within the fields value to identify a nested field within the frontmatter field (frontmatter___title).

We can also sort according to multiple fields, as in the following query, which sorts the items by title in ascending alphabetical order before sorting by date in ascending chronological order:

allMarkdownRemark(
  sort: {
    fields: [frontmatter___title, frontmatter___date]
    order: ASC
  }
) {
  edges {
    node {
      frontmatter {
        title
        date
      }
    }
  }
}

If we want to sort by title in ascending alphabetical order but sort by date in descending chronological order, we can use the order argument to identify how the two fields should influence the sort distinctly. Note the addition of square brackets:

allMarkdownRemark(
  sort: {
    fields: [frontmatter___title, frontmatter___date]
    order: [ASC, DESC]
  }
) {
  edges {
    node {
      frontmatter {
        title
        date
      }
    }
  }
}
Note

By default, the sort keyword will sort fields in ascending order when no order is indicated. The sort keyword can be used only once on a given field, so multiple sorts need to occur through successive field identifications, as shown in the preceding example.
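The way paired fields and order lists interact can be sketched with an ordinary comparator. This is a simplified model in plain JavaScript, not Gatsby’s implementation; the resolve and sortNodes functions and the three sample nodes are hypothetical. It assumes, as described above, that any field without a matching order entry falls back to ascending:

```javascript
// Resolve a path such as "frontmatter___title" to node.frontmatter.title.
function resolve(node, path) {
  return path.split("___").reduce((value, key) => value[key], node);
}

// Sort by each field in turn; later fields only break ties in earlier ones.
function sortNodes(nodes, { fields, order = [] }) {
  const orders = Array.isArray(order) ? order : [order];
  return [...nodes].sort((a, b) => {
    for (let i = 0; i < fields.length; i++) {
      const left = resolve(a, fields[i]);
      const right = resolve(b, fields[i]);
      if (left === right) continue;
      const direction = (orders[i] || "ASC") === "DESC" ? -1 : 1;
      return left < right ? -direction : direction;
    }
    return 0;
  });
}

// Placeholder nodes; ISO date strings compare correctly as plain strings.
const nodes = [
  { frontmatter: { title: "Sula", date: "2020-06-15" } },
  { frontmatter: { title: "Beloved", date: "2021-01-01" } },
  { frontmatter: { title: "Beloved", date: "2021-03-01" } },
];

const describe = (list) =>
  list.map((n) => `${n.frontmatter.title} ${n.frontmatter.date}`);

const mixed = sortNodes(nodes, {
  fields: ["frontmatter___title", "frontmatter___date"],
  order: ["ASC", "DESC"],
});
console.log(describe(mixed));
// [ 'Beloved 2021-03-01', 'Beloved 2021-01-01', 'Sula 2020-06-15' ]
```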

GraphQL Query Variables

Though GraphQL query arguments are common for typical limit, skip, filter, and sort operations, GraphQL also provides a mechanism for developers to introduce query variables at the root level of the query. This is particularly useful for situations where a user defines how a list should be sorted or filtered, or what items in the list should be skipped. In short, query variables allow us to provide user-defined arguments as opposed to static arguments that don’t change.

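Consider the following example query, which leaves it up to the query variables to determine how the results should be limited, sorted, and filtered. Each variable is declared after the operation name and then passed to the matching argument on the allMarkdownRemark field:

query GetAllPosts(
  $limit: Int
  $sort: MarkdownRemarkSortInput
  $filter: MarkdownRemarkFilterInput
) {
  allMarkdownRemark(limit: $limit, sort: $sort, filter: $filter) {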
    edges {
      node {
        frontmatter {
          title
        }
      }
    }
  }
}

In order for this query to function properly, we need to provide values for each of these query variables in the form of a JSON object containing each of these variable names:

{
  "limit": 4,
  "sort": {
    "fields": "frontmatter___title",
    "order": "ASC"
  },
  "filter": {
    "frontmatter": {
      "title": {
        "regex": "/Watching/"
      }
    }
  }
}

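As you can see, query variables can supply the values of query arguments when we need to provide arbitrarily defined limits, skips, filters, and sorts. Note, however, that a query that declares variables must begin with the query keyword; the shorthand form that opens directly with a curly brace cannot declare them. Giving such a query an operation name as well, as we’ve done here, makes it easier to identify.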
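When a query and its variables travel to a GraphQL API over HTTP, they are conventionally sent together as a single JSON payload with a query string and a variables object. The sketch below only builds that payload in plain JavaScript and does not send it anywhere; it shows the general GraphQL convention rather than anything specific to how Gatsby supplies variables to its own queries, and the sort variable is omitted to keep it short:

```javascript
// The query declares its variables and passes each one to an argument.
const query = `
  query GetAllPosts($limit: Int, $filter: MarkdownRemarkFilterInput) {
    allMarkdownRemark(limit: $limit, filter: $filter) {
      edges {
        node {
          frontmatter {
            title
          }
        }
      }
    }
  }
`;

// Variable values are keyed by name, without the leading dollar sign.
const variables = {
  limit: 4,
  filter: { frontmatter: { title: { regex: "/Watching/" } } },
};

// The conventional request body pairs the two.
const body = JSON.stringify({ query, variables });
console.log(JSON.parse(body).variables.limit); // 4
```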

Note

Query variables in GraphQL can be either scalar values or objects, as you can see in this example. GraphiQL, a query editor and debugger we’ll cover in “The Gatsby Data Layer”, contains a Query Variables pane where developers can input arbitrary query variable values.

GraphQL Directives

GraphQL query variables allow us to designate arbitrary arguments that apply to fields, but what if we want to define actual logic that conditionally includes or excludes certain fields at query time based on those query variables? For that, we need directives. GraphQL provides two built-in directives for this purpose:

  • The @skip directive indicates to GraphQL that based on a Boolean value defined by a query variable, the field carrying the directive should be excluded from the response.

  • The @include directive indicates to GraphQL that based on a Boolean value defined by a query variable, the field carrying the directive should be included in the response.

Consider the following example query, which defines a query variable $includeDate with a default value of false. If the query variable is set to true, then each item in the response will include its date in addition to its title:

query GetAllPosts(
  $includeDate: Boolean = false
) {
  allMarkdownRemark {
    edges {
      node {
        frontmatter {
          title
          date @include(if: $includeDate)
        }
      }
    }
  }
}

The @skip directive works similarly and is useful for cases where you want to leave out certain information, such as when rendering solely item titles for a list view rather than the full item for the individual item view:

query GetAllPosts(
  $teaser: Boolean = false
) {
  allMarkdownRemark {
    edges {
      node {
        frontmatter {
          title
          date @skip(if: $teaser)
        }
      }
    }
  }
}

GraphQL directives are particularly useful for situations where you wish to perform conditional rendering of only certain data pertaining to a component, and when you prefer not to overload GraphQL API responses to keep payload sizes small. The @skip and @include directives can be added to any field, as long as the query variable is available.
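The decision each directive makes can be modeled as a small predicate. This sketch in plain JavaScript illustrates the rule and is not a GraphQL implementation; the isFieldIncluded and selectFrontmatter functions, the way directives are represented as an object mapping directive names to variable names, and the sample frontmatter are all hypothetical:

```javascript
// A field is returned unless @skip is true or @include is false.
// `directives` maps a directive name to the variable that controls it.
function isFieldIncluded(directives, variables) {
  const skipped =
    "skip" in directives && variables[directives.skip] === true;
  const included =
    !("include" in directives) || variables[directives.include] === true;
  return !skipped && included;
}

// Mimic `title` plus `date @include(if: $includeDate)` for one item.
function selectFrontmatter(frontmatter, variables) {
  const result = { title: frontmatter.title };
  if (isFieldIncluded({ include: "includeDate" }, variables)) {
    result.date = frontmatter.date;
  }
  return result;
}

const frontmatter = { title: "Good trouble", date: "2021-11-05" };

console.log(selectFrontmatter(frontmatter, { includeDate: false }));
// { title: 'Good trouble' }
console.log(selectFrontmatter(frontmatter, { includeDate: true }));
// { title: 'Good trouble', date: '2021-11-05' }
console.log(isFieldIncluded({ skip: "teaser" }, { teaser: true }));
// false
```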

GraphQL Fragments

Sometimes our GraphQL queries can become overly verbose, with multiple hierarchical levels and many identified fields. This often necessitates the extraction of certain parts of the query into separate, reusable sets of fields that can be included where needed in a more concise form. In GraphQL, these repeatable sets of fields are known as fragments.

Consider a scenario where we want to reuse the frontmatter portion of the following query (the frontmatter field and the three fields nested within it) in other queries:

query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      frontmatter {
        date(formatString: "MMMM DD, YYYY")
        title
        description
      }
    }
  }
}

To define this as a fragment, we can separate out the portion of the query we wish to turn into a reusable field collection and identify it as a fragment separately from the query. In addition, we can then include the fragment within a query by referring to the name we give it when we define the fragment:

query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      ...MarkdownFrontmatter
    }
  }
}

fragment MarkdownFrontmatter on MarkdownRemark {
  frontmatter {  
    date(formatString: "MMMM DD, YYYY")
    title
    description
  }
}

As you can see in this example, to use our newly created fragment within the query, we simply reference it with an ellipsis prefix (...) and the name of the fragment (MarkdownFrontmatter). Now we can potentially reuse this fragment in any other query where we need the same data to be extracted.

Fragments can also be inline, where we provide the fragment’s contents directly where the fragment is invoked. The following query is identical to the previous one, and the ellipsis here represents an anonymous fragment that is defined immediately rather than in a separate fragment definition. This approach allows you to include fields by type without resorting to the use of a fragment outside the query itself, which can improve readability:

query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      ... on MarkdownRemark {
        frontmatter {  
          date(formatString: "MMMM DD, YYYY")
          title
          description
        }
      }
    }
  }
}

But there’s an outstanding question that we need to answer: what exactly is the MarkdownRemark name that comes after the fragment name, if included, and the keyword on in the fragment definition? To answer that question, we need to dig a little deeper into GraphQL’s inner workings and take a look at schemas and types.

GraphQL Schemas and Types

Because GraphQL queries are fundamentally about retrieving data that adheres to a shape desired by the client, those writing queries need to have some awareness of what shapes the GraphQL API can accept. Just like databases, GraphQL has an internal schema that assigns types to fields. These types dictate what responses look like for a given field. Most GraphQL schemas are manually written by architects, but Gatsby infers a GraphQL schema based on how it handles data internally and how it manages data from external sources.

A GraphQL schema consists of a series of type definitions that define what type of value a field returns (e.g., a string, a Boolean, an integer, or an object with fields of its own), as well as the possible values for arguments that the field accepts (e.g., ASC or DESC for ascending or descending sort order, respectively). For example, consider a GraphQL response that looks like this:

{
  "title": "Good trouble"
}

In the associated GraphQL schema to which all fields adhere, the type for the title field would be identified as follows—this explicitly limits the type of object issued by the GraphQL API in response to the title field to be a string and nothing else:

title: String

Let’s take another look at the query and fragment we wrote in the previous section:

query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      ...MarkdownFrontmatter
    }
  }
}

fragment MarkdownFrontmatter on MarkdownRemark {
  frontmatter {  
    date(formatString: "MMMM DD, YYYY")
    title
    description
  }
}

In this fragment definition, we’re also indicating on which type the fragment can be applied. In this case, the nodes field within allMarkdownRemark returns objects of type MarkdownRemark. Because the excerpt, fields, and frontmatter fields are all represented as possible fields within the MarkdownRemark type, we know that our fragment containing a top-level frontmatter field will be applicable for all objects of type MarkdownRemark. Every fragment must have an associated type so that a GraphQL API can validate whether that fragment can be interpreted correctly or not.
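The validation described here can be pictured as a lookup against the schema’s type definitions. The following sketch in plain JavaScript is a heavily simplified model and not how a real GraphQL server validates fragments: it ignores interfaces, unions, and nested selections, and the schemaTypes table and canSpread function are hypothetical names:

```javascript
// A toy schema: each type name maps to the fields that type exposes.
const schemaTypes = {
  MarkdownRemark: ["excerpt", "fields", "frontmatter"],
  Site: ["siteMetadata"],
};

// The fragment from the example, reduced to its name, type, and top-level fields.
const fragment = {
  name: "MarkdownFrontmatter",
  on: "MarkdownRemark",
  fields: ["frontmatter"],
};

// A fragment fits when its declared type matches the type being selected on
// and every field it requests exists on that type.
function canSpread(candidate, parentType) {
  const knownFields = schemaTypes[candidate.on] || [];
  return (
    candidate.on === parentType &&
    candidate.fields.every((field) => knownFields.includes(field))
  );
}

console.log(canSpread(fragment, "MarkdownRemark")); // true
console.log(canSpread(fragment, "Site")); // false
```

Without the type named after the on keyword, there would be nothing to look the fragment’s fields up against, which is why the declaration is mandatory.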

But how do we introspect our GraphQL schema to understand how types relate to one another in type definitions, like the relationship between MarkdownRemark and the fields excerpt, fields, and frontmatter? And how do we know what type each individual field is within a given GraphQL query? To answer those questions, we’ll explore the foundational role GraphQL plays in Gatsby and how Gatsby makes available GraphQL tooling that offers query debugging and schema introspection capabilities.

The Gatsby Data Layer

The Gatsby data layer encompasses both Gatsby’s internal GraphQL API and source plugins, which together collect data and define a GraphQL schema that traverses that data. Whether this data comes from the surrounding filesystem in the form of Markdown files or from a REST or GraphQL API in the form of WordPress’s web services, Gatsby’s internal GraphQL API facilitates the single-file co-location of data requirements and data rendering, as all Gatsby GraphQL queries are written into Gatsby components.

A common question asked by Gatsby novices is why GraphQL is necessary in the first place. After all, Gatsby is primarily about generating static sites; why does it need an internal GraphQL API? Because Gatsby can pull in information from so many disparate sources, each with its own approach to exposing data, a unified data layer is required. For this reason, it’s common to hear the word data in Gatsby defined as “anything that doesn’t live in a React or Gatsby component.”

Before we jump into the sorts of GraphQL queries Gatsby’s GraphQL API makes available, we’ll first explore Gatsby’s developer ecosystem for GraphQL. This includes tools such as GraphiQL, GraphiQL Explorer, and GraphQL Playground, all of which are useful means for Gatsby developers to test queries and introspect schemas.

Tip

Though Gatsby’s internal GraphQL API is the easiest way to retrieve and manipulate data within Gatsby, you can also use unstructured data and consume it through the createPages API, which we discuss in Chapter 6.

GraphiQL

There’s no requirement to install a GraphQL dependency or otherwise configure GraphQL when you start implementing a Gatsby site. As soon as you run gatsby develop or gatsby build at the root of your codebase, Gatsby will automatically infer and create a GraphQL schema. Once the site is compiled by running gatsby develop, you can explore Gatsby’s data layer and GraphQL API by navigating to the URL http://localhost:8000/___graphql (note the three underscores; the development server uses plain HTTP by default).

To see this in action, clone a new version of the Gatsby blog starter and run gatsby develop. Once the development server is running, navigate to http://localhost:8000/___graphql:

$ gatsby new gtdg-ch4-graphiql gatsbyjs/gatsby-starter-blog
$ cd gtdg-ch4-graphiql
$ gatsby develop

At this URL, you’ll see a two-sided UI consisting of a query editor on the lefthand side and a response preview pane on the righthand side (Figure 4-1). This is GraphiQL, an interactive GraphQL query editor that allows developers to quickly test queries. Figure 4-1 also shows the Documentation Explorer, accessed by clicking “Docs,” expanded.

Figure 4-1. GraphiQL, an interactive GraphQL query editor for testing and debugging queries according to a given schema

To see GraphiQL in action, let’s try inserting a simple GraphQL query to extract some information about the site:

{
  site {
    siteMetadata {
      title
      description
    }
  }
}

After the description field, try hitting the Enter key and typing Ctrl-Space or Shift-Space. You’ll see an autocomplete list appear with fields that are valid for insertion at that point, including author and siteUrl (Figure 4-2). This autocomplete feature is one way to insert fields into GraphiQL; you can also enter them manually.

Figure 4-2. GraphiQL’s autocomplete feature can be activated at any point by typing Ctrl-Space (or Shift-Space) to see what fields can be inserted at that point in the query

Now, type Ctrl-Enter to run the query (or click the Play button in the header). You’ll see the response in the righthand pane, expressed in valid JSON (Figure 4-3). As you can see, GraphiQL will run the query against Gatsby’s internal GraphQL schema and return a response based on that schema. Testing our queries with GraphiQL, therefore, gives us full confidence that queries running correctly there will work properly within our code as well.

The Documentation Explorer on the right is GraphiQL’s internal schema introspection system, which allows you to search the GraphQL schema and to view what types are associated with certain fields. The information served by the Documentation Explorer matches the information displayed when you hover over a field in the query editor on the lefthand side. To open the Documentation Explorer while it is toggled closed, click the “Docs” link in the upper righthand corner of GraphiQL.

Figure 4-3. To run a GraphQL query in GraphiQL, click the Play button or press Ctrl-Enter; you’ll see the response in the preview pane

GraphiQL Explorer

Packaged with GraphiQL’s query editor is GraphiQL Explorer, a convenient introspection interface for developers to see what fields are available for a given query, along with nested fields within them. You can also use GraphiQL Explorer to construct a query by clicking on available fields and inputs, rather than writing out the query by hand. For developers who prefer more of a graphical query building experience, GraphiQL Explorer is a convenient tool (see Figure 4-4).

GraphiQL Explorer is particularly useful for advanced queries that require complex logic—especially unions, which are generally left up to the GraphQL implementation and lack a unified standard, and inline fragments, which can be frustrating for developers new to GraphQL to work with. GraphiQL Explorer lists all the available union types within the Explorer view and makes it easy to test inline fragments.

Gatsby also includes support for code snippet generation based on GraphiQL Explorer through GraphiQL’s Code Exporter. Rather than generating just the GraphQL query, which needs to be integrated with a Gatsby component, the Code Exporter is capable of generating a Gatsby page or component file based on a query constructed in GraphiQL Explorer.

Note

For more information about GraphiQL Explorer and GraphiQL’s Code Exporter, consult Michal Piechowiak’s blog post on the subject.

Figure 4-4. In GraphiQL Explorer, built into the GraphiQL interface, you can click on fields represented in the Explorer view to construct queries without typing them by hand

GraphQL Playground

Though GraphiQL is useful for most developers’ requirements, sometimes it’s important to have a more fundamental understanding of a GraphQL schema, at the level of how data is served to the schema by external data sources through source plugins. Exploring how schemas are constructed for GraphQL can help identify deeper-seated issues that require work within external data sources rather than within the schema itself.

GraphQL Playground, developed by Prisma, is an integrated development environment (IDE) for GraphQL queries that provides much more power than GraphiQL. It considers the logic of how data enters GraphQL schemas, rather than just allowing query testing like GraphiQL, but it requires installation and isn’t available out of the box in Gatsby. GraphQL Playground provides a range of useful tools, but for the time being it remains an experimental feature in Gatsby.

To use GraphQL Playground with Gatsby, add the GATSBY_GRAPHQL_IDE flag to the value associated with the develop script in your Gatsby site’s package.json file:

"develop": "GATSBY_GRAPHQL_IDE=playground gatsby develop",

Now, instead of running gatsby develop, run the following command. When you visit http://localhost:8000/___graphql, you’ll see the GraphQL Playground interface instead of GraphiQL (Figure 4-5):

$ npm run develop

Tip

If you’re developing Gatsby sites on Windows, you will first need to install cross-env (npm install --save-dev cross-env) and change the value associated with the develop script in your package.json to the following instead:

"develop": "cross-env GATSBY_GRAPHQL_IDE=playground gatsby develop",

Figure 4-5. The initial state of GraphQL Playground when you enable it in Gatsby

Now that you have a solid understanding of the available developer tools for working with GraphQL in Gatsby, we can finally turn our attention to Gatsby’s page and component queries, the most important GraphQL building blocks in Gatsby.

Note

To continue using gatsby develop to instantiate your local development environment instead of npm run develop, require the dotenv package in your gatsby-config.js file and, separately, add an environment variable file. Because we are concerned with the development environment here, name the file .env.development, and add the following line to it:

GATSBY_GRAPHQL_IDE=playground
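
The gatsby-config.js side of this setup might look like the following sketch. It assumes that dotenv has been installed (npm install --save-dev dotenv) and that NODE_ENV is set to development while gatsby develop runs, so that the path resolves to .env.development. The empty plugins array is only a placeholder for your site’s existing configuration.

```javascript
// gatsby-config.js
// Assumption: the dotenv package is installed as a dependency of this site.
// Loads variables from .env.development or .env.production,
// depending on the current NODE_ENV.
require("dotenv").config({
  path: `.env.${process.env.NODE_ENV}`,
})

module.exports = {
  // Placeholder: keep your site's existing siteMetadata and plugins here.
  plugins: [],
}
```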

Page and Component Queries

As we’ve seen in previous chapters, Gatsby works with both pages and components. Up to now, when exploring how we can build pages and components for Gatsby sites, we’ve always explicitly provided the data within the JSX that renders that data rather than pulling data from external sources or from the surrounding filesystem.

In this chapter’s introduction, we defined Gatsby’s data layer as the mediator between data, whether it originates from the local filesystem or from an external source like a database or CMS, and the rendering that occurs within Gatsby’s pages and components. Now, we can connect the dots and see how this rendering happens.

Whenever Gatsby sees a GraphQL query conforming to Gatsby’s standard GraphQL approach, it will parse, evaluate, and inject the query response into the page or component from which the query originates.

For the remainder of this chapter, we’ll focus our attention primarily on GraphQL queries that work with the surrounding filesystem, as Gatsby can pull data from Markdown or other files that it can access. In Chapter 5, we’ll discuss source plugins, which Gatsby employs to retrieve data from external systems and to populate a GraphQL schema.

Page Queries

In Gatsby, pages can be rendered using no data at all (if the data is hardcoded) or using data brought in via GraphQL queries. GraphQL queries in Gatsby pages are known as page queries, and they have a one-to-one relationship with a given Gatsby page. Unlike Gatsby’s static queries, which we’ll examine in the following sections, page queries can accept GraphQL query variables like those we saw in “GraphQL Query Variables”.

Gatsby makes available a graphql tag for arbitrary GraphQL queries defined within a Gatsby page or component. To see this in action, let’s create a new Gatsby blog based on the Gatsby blog starter. Because it already comes with a source plugin enabled, we can jump right in and look at some of the GraphQL queries contained in its pages:

$ gatsby new gtdg-ch4-graphql gatsbyjs/gatsby-starter-blog
$ cd gtdg-ch4-graphql

Open src/pages/index.js, one of our Gatsby pages, and let’s go through it step by step:

// src/pages/index.js
import React from "react"
import { Link, graphql } from "gatsby" 

import Bio from "../components/bio"
import Layout from "../components/layout"
import SEO from "../components/seo"

const BlogIndex = ({ data, location }) => { 
  const siteTitle = data.site.siteMetadata?.title || `Title`
  const posts = data.allMarkdownRemark.nodes

  if (posts.length === 0) {
    return (
      <Layout location={location} title={siteTitle}>
        <SEO title="All posts" />
        <Bio />
        <p>
          No blog posts found. Add Markdown posts to "content/blog" (or the
          directory you specified for the "gatsby-source-filesystem" plugin in
          gatsby-config.js).
        </p>
      </Layout>
    )
  }

  return ( 
    <Layout location={location} title={siteTitle}>
      <SEO title="All posts" />
      <Bio />
      <ol style={{ listStyle: `none` }}>
        {posts.map(post => {
          const title = post.frontmatter.title || post.fields.slug

          return (
            <li key={post.fields.slug}>
              <article
                className="post-list-item"
                itemScope
                itemType="http://schema.org/Article"
              >
                <header>
                  <h2>
                    <Link to={post.fields.slug} itemProp="url">
                      <span itemProp="headline">{title}</span>
                    </Link>
                  </h2>
                  <small>{post.frontmatter.date}</small>
                </header>
                <section>
                  <p
                    dangerouslySetInnerHTML={{
                      __html: post.frontmatter.description || post.excerpt,
                    }}
                    itemProp="description"
                  />
                </section>
              </article>
            </li>
          )
        })}
      </ol>
    </Layout>
  )
}

export default BlogIndex

export const pageQuery = graphql` 
  query { 
    site {
      siteMetadata {
        title
      }
    }
    allMarkdownRemark(sort: { fields: [frontmatter___date], order: DESC }) {
      nodes {
        excerpt
        fields {
          slug
        }
        frontmatter {
          date(formatString: "MMMM DD, YYYY")
          title
          description
        }
      }
    }
  }
`

The import statement near the top of the file brings in Gatsby’s <Link /> component and graphql tag.

In the BlogIndex function signature, the connection is made between our GraphQL query and the data prop, which Gatsby populates with the response to our GraphQL query, in the same shape as the query.

The final return statement performs our rendering in JSX.

In the pageQuery declaration, note that we are using an export statement to ensure that Gatsby is aware of our GraphQL query. The name of the constant (here, pageQuery) isn’t important, because Gatsby inspects our code for an exported graphql string rather than a specific variable name. Many page queries in the wild are simply named query. Only one page query is possible per Gatsby page. Also note that we are using a tagged template surrounded by backticks, allowing for a multiline string that contains our GraphQL query and is indicated by Gatsby’s graphql tag. The contents of these backticks must be a valid GraphQL query in order for Gatsby to successfully populate the data in the page.

The query inside the backticks is the GraphQL query that populates the home page of our Gatsby blog starter.
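
If tagged templates are unfamiliar, the syntax can be tried in isolation in plain JavaScript. The following sketch uses a hypothetical tag named fakeGraphql, which is a stand-in for illustration only and not Gatsby’s graphql tag or its implementation. It shows that a tag function receives the multiline string between the backticks and decides what to return.

```javascript
// Hypothetical stand-in tag for illustration; NOT Gatsby's graphql tag.
// A tag function receives the literal string segments plus any
// interpolated values and can return whatever it likes.
const fakeGraphql = (strings, ...values) => {
  // Stitch the segments and values back together into one string.
  const text = strings.reduce(
    (result, segment, i) =>
      result + segment + (i < values.length ? values[i] : ""),
    ""
  )
  return text.trim()
}

// Same backtick syntax as a page query: a multiline string passed to a tag.
const query = fakeGraphql`
  query {
    site {
      siteMetadata {
        title
      }
    }
  }
`

console.log(query)
```

The point is only the syntax: the backticks delimit a multiline string, and the name in front of them determines how that string is handled.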

Let’s focus on the opening lines of the BlogIndex component, where the data prop arrives:

// src/pages/index.js
const BlogIndex = ({ data, location }) => {
  const siteTitle = data.site.siteMetadata?.title || `Title`

Rather than hardcoding data like we did in earlier chapters, we can now dig further into our data object to access all the data we need. Let’s see this in action with a simple example that pulls from our Gatsby site information in gatsby-config.js. Change the preceding lines to use our description from gatsby-config.js instead:

// src/pages/index.js
const BlogIndex = ({ data, location }) => {
  const siteTitle = data.site.siteMetadata?.description || `Description`

We also need to update our pageQuery GraphQL query to retrieve the site description as well as the title:

// src/pages/index.js
export const pageQuery = graphql`
  query {
    site {
      siteMetadata {
        title
        description
      }
    }
    allMarkdownRemark(sort: { fields: [frontmatter___date], order: DESC }) {
      nodes {
        excerpt
        fields {
          slug
        }
        frontmatter {
          date(formatString: "MMMM DD, YYYY")
          title
          description
        }
      }
    }
  }
`

Now, when you save the file and execute gatsby develop, you’ll see that the blog title has been updated to reflect the description text (“A starter blog demonstrating what Gatsby can do.”) rather than the title (“Gatsby Starter Blog”).
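
To make the correspondence between the query and the data prop concrete, the following plain JavaScript sketch uses a hand-written object in place of the data prop that Gatsby would inject. The post values are illustrative assumptions rather than output from a real build. The access paths and fallbacks are the same ones used in src/pages/index.js.

```javascript
// Hand-written stand-in for the `data` prop passed to BlogIndex.
// Its nesting mirrors the fields requested in pageQuery.
// The post values below are assumptions for illustration.
const data = {
  site: {
    siteMetadata: {
      title: "Gatsby Starter Blog",
      description: "A starter blog demonstrating what Gatsby can do.",
    },
  },
  allMarkdownRemark: {
    nodes: [
      {
        excerpt: "An example excerpt",
        fields: { slug: "/hello-world/" },
        frontmatter: {
          date: "May 01, 2015",
          title: "Hello World",
          description: null,
        },
      },
    ],
  },
}

// The same access paths used in the page component.
const siteTitle = data.site.siteMetadata?.description || `Description`
const posts = data.allMarkdownRemark.nodes

// Mirrors the fallback logic inside posts.map() in the page component.
const summaries = posts.map(post => ({
  title: post.frontmatter.title || post.fields.slug,
  text: post.frontmatter.description || post.excerpt,
}))

console.log(siteTitle)
console.log(summaries)
```

A field that is missing from the query is missing from the object too. This is why description had to be added to pageQuery before data.site.siteMetadata.description could be read.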

Page queries make up the majority of GraphQL queries you’ll construct while building rudimentary Gatsby sites. But what about GraphQL queries contained within components that aren’t pages? We’ll look at that next.

Component Queries with StaticQuery

As of Gatsby v2, Gatsby also allows individual components contained within a page to retrieve data using a GraphQL query through the StaticQuery API. This is particularly useful when you split out a component from a surrounding page but require external data for just that component. We call these component queries.

Though StaticQuery is capable of handling most of the use cases page queries already address, including fragments, static queries differ from page queries in several critical ways:

  • Although page queries can accept query variables, they do not function outside of Gatsby pages.

  • Static queries cannot accept query variables (this is why they’re called “static”), but they can be used in both pages and in-page components.

  • The StaticQuery API does not work properly with React.createElement invocations that fall outside of JSX’s purview. For these cases, Gatsby recommends using JSX and, if needed, explicitly using StaticQuery in a JSX element (<StaticQuery />).

Static queries share one characteristic with page queries: only one static query can be used per component, just as only one page query can be used per page. Therefore, if you have separate data requirements in another portion of the component, you will need to split that logic out into another component before adding a new static query.

Importantly, static queries provide the same benefits of co-location within components that page queries do within pages. Using the StaticQuery API allows you to both issue a query and render the data from the response in a single JSX element. Consider the following example, which demonstrates this co-location:

// src/components/header.js
import React from "react"
import { StaticQuery, graphql } from "gatsby"

export default function Header() {
 return (
   <StaticQuery
     query={graphql`
       query {
         site {
           siteMetadata {
             title
           }
         }
       }
     `}
     render={data => (
       <header>
         <h1>{data.site.siteMetadata.title}</h1>
       </header>
     )}
   />
 )
}

As you can see, the <StaticQuery /> JSX element takes two attributes: query, whose value is our component query, and render, whose value is a function that receives the response data from the component query and returns the JSX we want to render with it.

Now that we have a means of issuing component queries, not just page queries, we’re all set! But there is one outstanding question remaining, particularly for developers who have adopted the React Hooks paradigm: how can we use a React hook to define a component query rather than a JSX element?

Warning

If you are performing type checking through PropTypes, a common API in React applications, using the <StaticQuery /> JSX element will break this. For an example of how to restore PropTypes type checking, consult the Gatsby documentation.

Component Queries with the useStaticQuery Hook

As of Gatsby v2.1.0, a separate means of accessing component queries is available in the form of the useStaticQuery hook. For readers unfamiliar with the React Hooks paradigm, React hooks are functions that give you access to state and other key React features without having to create a class. In short, the useStaticQuery hook accepts a GraphQL query and returns the response data. It can be used in any component, including pages and in-page components.

The useStaticQuery hook has a few limitations, just like <StaticQuery />:

  • The useStaticQuery hook cannot accept query variables (again, this is why it is called “static”).

  • As with page queries and static queries, only one useStaticQuery hook can be used per component.

Tip

You must have React and ReactDOM 16.8.0 or later to use the useStaticQuery hook. If you’re using an older version of Gatsby, run this command to update your React and ReactDOM versions to the appropriate version:

$ npm install react@^16.8.0 react-dom@^16.8.0

Let’s take another look at the example component from the previous section. This time we’ll use the useStaticQuery hook instead of <StaticQuery />:

// src/components/header.js
import React from "react"
import { useStaticQuery, graphql } from "gatsby"

export default function Header() {
 const data = useStaticQuery(graphql`
   query {
     site {
       siteMetadata {
         title
       }
     }
   }
 `)
 return (
   <header>
     <h1>{data.site.siteMetadata.title}</h1>
   </header>
 )
}

React hooks have become popular because developers can easily use them to create chunks of repeatable functionality, much like helper functions. Because useStaticQuery is a hook, we can leverage it to compose and also recycle blocks of reusable functionality rather than invoking that functionality every time.

One common example of this is to create a hook that will provide data for reuse in any component so that the query is only issued once. This shortens the build duration and means that our Gatsby site can deploy slightly faster. For instance, we may want to only query for our site title once. In the following hook definition, we’ve created a React hook that can be reused in any component:

// src/hooks/use-site-title.js
import { useStaticQuery, graphql } from "gatsby"

export const useSiteTitle = () => {
 const { site } = useStaticQuery(
   graphql`
     query {
       site {
         siteMetadata {
           title
         }
       }
     }
   `
 )
 return site.siteMetadata
}

Now, we can import this React hook into our header component and invoke it there to get our Gatsby site title:

// src/components/header.js
import React from "react"
import { useSiteTitle } from "../hooks/use-site-title"

export default function Header() {
 const { title } = useSiteTitle()
 return (
   <header>
     <h1>{title}</h1>
   </header>
 )
}

Note

Consult the React documentation for more information about React Hooks.

Equipped with an understanding of page queries and component queries using either <StaticQuery /> or the useStaticQuery hook, we now have a variety of approaches to query data from within our Gatsby pages and components.

Note

For more information about Gatsby’s GraphQL APIs, which are inspectable in the GraphiQL interface, consult the GraphQL API, query options, and the Node model and Node interface documentation.

Conclusion

This chapter introduced GraphQL and Gatsby’s internal data layer, covering both the principles underlying GraphQL queries and APIs and how GraphQL appears in Gatsby in the form of page and component queries. Though it’s possible to use Gatsby without GraphQL, this is where much of the power inherent to Gatsby comes from, because it mediates the relationship between data—whether it originates from an external source or a local filesystem—and its rendering.

The way we write GraphQL queries in Gatsby, with the graphql tag and exported queries, is deliberate: it is designed to support a favorable developer experience and separation of concerns. Because GraphQL queries usually sit alongside rendering code in Gatsby components, Gatsby’s internal data layer also facilitates the sort of co-location of data requirements and rendering logic that many React developers consider just as much a best practice as testing your queries in GraphiQL or GraphQL Playground.

But where exactly does all this data come from? What does Gatsby do to retrieve data from disparate external sources or the filesystem and populate its internal GraphQL schema? How can we connect CMSs and commerce systems to our Gatsby sites? Next, we cover source plugins and sourcing data—all of the ways Gatsby gets its hands on external data.
