Chapter 4. GraphQL and the Gatsby Data Layer
Up until now, all of our work on implementing Gatsby has focused on use cases that don’t require data retrieval and processing. Before we turn our attention to how Gatsby integrates with external data, in this chapter we’ll cover the data layer in Gatsby, whether data comes from files in your Gatsby site (the local filesystem) or from external sources such as CMSs, commerce systems, or backend databases (external data sources that require a source plugin, discussed in more detail in Chapter 5).
Within Gatsby’s data layer, GraphQL mediates the relationship between Gatsby pages and components and the data that populates those pages and components. Though GraphQL is popular in web development as a query language, Gatsby uses it internally to provide a single unified approach to handling and managing data. With data potentially originating from a variety of disparate sources, Gatsby flattens the differences between discrete serialization formats (forms of articulating data) and thus third-party systems by populating a GraphQL API that is accessible from any Gatsby page or component.
In this chapter, we’ll explore the foundations of GraphQL that apply to any GraphQL API before switching gears to look at how Gatsby uses GraphQL specifically in its data layer and in page and component queries.
GraphQL Fundamentals
GraphQL is a query language that provides client-tailored queries. That is, unlike REST APIs, which adhere to the requirements dictated by the server, GraphQL APIs respond to client queries with a response that adheres to the shape of that query. Today, GraphQL APIs are commonplace for backend database access, headless CMS consumption, and other cross-system use cases, but it’s still quite rare to find GraphQL used internally within a framework.
GraphQL has become popular thanks to the flexibility it provides developers and the more favorable developer experience it facilitates through client-driven queries. Common motivations for using it include:
- Avoiding response bloat: GraphQL improves query performance by serving only the data that is necessary to populate the response according to the client-issued query. In traditional REST APIs, response payload sizes can be larger than necessary, or additional requests may be required to acquire the needed information.
- Query-time data transformations: In many JavaScript implementations, data postprocessing needs to occur to harmonize data formats or to perform a sort operation. GraphQL offers means to perform data transformations on the fly at query time through the use of explicitly defined arguments within a GraphQL API.
- Offloading request complexity: In many JavaScript implementations, correctly issuing a query often requires a complex interplay between promises, XMLHttpRequest implementations, and waiting for the requested data to arrive. Because GraphQL only requires a query, as opposed to a particular URL (and potentially headers and request bodies), it may provide a smoother developer experience.
Using GraphQL does have some disadvantages, too—notably, GraphQL APIs can be difficult to scale due to the need to serve a response that is tightly tailored to the client’s query. Fortunately, because Gatsby uses GraphQL during development and build-time compilation, that latency only impacts the build duration rather than the end user’s time to first interaction.
To get you up to speed, in the coming sections we’ll cover GraphQL queries, fields, arguments, query variables, directives, fragments, and finally schemas and types.
GraphQL Queries
The primary means of interacting with a GraphQL API from the client is a query, which is a declarative expression of data requirements from the server.
Consider the following example GraphQL query, which in Gatsby returns the title of the Gatsby site from its metadata:
{ site { siteMetadata { title } } }
Gatsby’s internal GraphQL API will return the following response to this query:
{
  "data": {
    "site": {
      "siteMetadata": {
        "title": "A Gatsby site!"
      }
    }
  },
  "extensions": {}
}
Notice how the GraphQL query and response are structurally identical: they share the same hierarchy and the same sequence of names. In other words, a typical GraphQL query issued by a client outlines the shape that the GraphQL response issued by the server should take. The pseudo-JSON structure of the GraphQL query becomes a valid JSON object in the GraphQL response.
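To make the idea of a query “outlining the shape” of its response concrete, here is a minimal JavaScript sketch. It is a toy projection, not how Gatsby or any GraphQL server resolves queries; the `project` helper, the `store` object, and its extra fields are hypothetical names invented for illustration.

```javascript
// Toy projection: copy only the fields named in `selection` out of `source`.
// `true` marks a leaf field; a nested object marks a nested selection.
function project(selection, source) {
  const result = {}
  for (const [field, sub] of Object.entries(selection)) {
    result[field] = sub === true ? source[field] : project(sub, source[field])
  }
  return result
}

// Hypothetical data store that holds more than the query asks for.
const store = {
  site: {
    siteMetadata: { title: "A Gatsby site!", description: "Not requested" },
    port: 8000,
  },
}

// Mirrors the query { site { siteMetadata { title } } }
const siteSelection = { site: { siteMetadata: { title: true } } }

const projected = { data: project(siteSelection, store) }
console.log(JSON.stringify(projected))
```

Only the fields named in the selection survive, and they come back in the same hierarchy in which they were requested.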
The preceding GraphQL query is anonymous; it lacks an explicit name. But in GraphQL, you can identify queries with an operation type and an operation name. Here, query is the operation type, and GetSiteInfo is the operation name:
query GetSiteInfo { site { siteMetadata { title } } }
You can also identify the operation type without an operation name, if you wish to write an anonymous query. This and the previous query return identical responses:
query { site { siteMetadata { title } } }
Note
GraphQL queries are read operations; they retrieve data. There is another operation type, mutation, which handles write operations. However, Gatsby does not provide mutation support within its internal GraphQL API due to its impact on how Gatsby functions, so we do not cover mutations here. For more information about GraphQL beyond Gatsby, including GraphQL mutations, consult Learning GraphQL by Eve Porcello and Alex Banks (O’Reilly).
GraphQL Fields
Let’s take another look at the anonymous version of the query:
{ site { siteMetadata { title } } }
In GraphQL, each of the words contained within the query (site, siteMetadata, title) that identify inner elements are known as fields. Those fields located at the top level (e.g., site) are occasionally referred to as root-level fields, but keep in mind that all GraphQL fields behave identically, and there is no functional difference between fields at any point in a query’s hierarchy.
GraphQL fields are crucial because they tell the GraphQL API what information the client desires for further processing and rendering. In GraphQL APIs outside Gatsby, the client and server are generally distinct from an architectural perspective. But in Gatsby, the GraphQL server and client are contained within the same framework; for our purposes, React and Gatsby components are our GraphQL clients.
Fields can take aliases, which allow us to arbitrarily rename any GraphQL field in the schema to something else in the resulting response from the API. In the following example, though the title field is identified as such in the schema, we’ve aliased it to siteName in our query, so the server will return a JSON object containing siteName as the identifier rather than title:
{ site { siteMetadata { siteName: title } } }
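Aliases are particularly useful when you wish to serve the same data multiple times in different forms. Each key can appear only once at a given level of the response, so without aliases GraphQL merges repeated selections of the same field, or rejects the query if their arguments conflict: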
{
  defaultSite: site {
    siteMetadata {
      title
    }
  }
  aliasedSite: site {
    metadata: siteMetadata {
      siteName: title
    }
  }
}
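As a rough illustration of what a client would receive for the aliased query, the following JavaScript sketch writes out the response shape by hand and reads both copies of the title. The response object is an assumption derived from the site title shown earlier in this chapter, not captured output from a running site.

```javascript
// Assumed response to the aliased query, reusing the site title shown earlier.
const aliasedResponse = {
  data: {
    defaultSite: { siteMetadata: { title: "A Gatsby site!" } },
    aliasedSite: { metadata: { siteName: "A Gatsby site!" } },
  },
}

// The aliases, not the schema field names, are the keys the client reads.
const { defaultSite, aliasedSite } = aliasedResponse.data
console.log(defaultSite.siteMetadata.title)
console.log(aliasedSite.metadata.siteName)
```

Both reads return the same underlying value; only the keys used to reach it differ.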
GraphQL Arguments
In GraphQL, arguments are used to apply certain criteria to the fields in the response. These can be as simple as sort criteria, such as ascending or descending alphabetical order, or as complex as date formatters, which return dates according to the format stipulated by an argument applied to a field.
Let’s take a look at a more complex query to understand how to refine the sort of response that comes back through arguments applied to fields—we’ll discuss many of the unfamiliar aspects of this query in subsequent chapters:
query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      frontmatter {
        date(formatString: "MMMM DD, YYYY")
        title
        description
      }
    }
  }
}
In this anonymous query, we see an example of an argument on the date field, named formatString:
date(formatString: "MMMM DD, YYYY")
In many databases and servers, dates are stored as Unix timestamps or in a machine-friendly format rather than a human-readable format. To display a date like 2021-11-05 in a more user-friendly form in the response, we can use a GraphQL argument on the date field to perform query-time date formatting. In this example, the date would be formatted as:
"November 05, 2021"
In addition to providing a date format to the formatString argument on the date field, we can also define a locale to adapt the outputted date to the preferred locale or language, which will have different names for months and days of the week. For example:
date( formatString: "D MMMM YYYY" locale: "tr" )
Because tr represents the Turkish language and region, our date will have the Turkish name for the month of November (note that we’ve also supplied a formatString appropriate to the region):
"5 Kasım 2021"
As mentioned previously, aliases allow us to serve multiple dates in different formats in the same response, because each aliased date selection gets its own key and the differing arguments no longer conflict:
englishDate: date(formatString: "MMMM DD, YYYY") turkishDate: date( formatString: "D MMMM YYYY", locale: "tr" )
You can also use fromNow, which returns a string showing how long ago or how far in the future the returned date is, and difference, which returns the difference between the date and current time in the specified unit (e.g., days or weeks):
firstDate: date(fromNow: true) secondDate: date(difference: "weeks")
Note
Gatsby depends on a library known as Moment.js to format dates. The Moment.js documentation has a full accounting of tokens for date formatters. Note that to introduce locales unavailable by default (Moment.js ships with English–US locale strings by default), you will need to load the locale into Moment.js.
Though string formatters are a common type of argument found in GraphQL APIs, other arguments influence not only the field they are attached to but also all fields nested within. For instance, given a query that returns multiple objects of a given type, GraphQL arguments exist that allow you to arbitrarily limit, skip, filter, or sort the objects returned in the response.
Note
The group field in Gatsby’s GraphQL API also accepts an arbitrary field argument by which to group results. For an illustration of using group, see Chapter 8.
limit and skip
In the following query, we have added a limit argument to ensure that only four items are returned:
{
  allMarkdownRemark(limit: 4) {
    edges {
      node {
        frontmatter {
          title
        }
      }
    }
  }
}
The response to this query will contain only four items, even if there are more than four present.
In the following query, we’ve added a skip argument so the GraphQL API excludes the first five items from the list and returns only four items from that point onward:
{
  allMarkdownRemark(limit: 4, skip: 5) {
    edges {
      node {
        frontmatter {
          title
        }
      }
    }
  }
}
The response to this query will contain only four items, having skipped the first five in the list. Assuming there are at least nine items available, the response will contain items six through nine.
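The combined effect of limit and skip can be pictured as slicing a list. The JavaScript sketch below is a toy equivalent on an in-memory array, not Gatsby’s implementation; the `paginate` helper and the `Post N` titles are hypothetical.

```javascript
// Toy equivalent of limit/skip on an in-memory list.
function paginate(items, { limit, skip = 0 } = {}) {
  // Without a limit, return everything after the skipped items.
  const end = limit === undefined ? undefined : skip + limit
  return items.slice(skip, end)
}

// Twelve hypothetical post titles: "Post 1" through "Post 12".
const postTitles = Array.from({ length: 12 }, (_, i) => `Post ${i + 1}`)

const firstFour = paginate(postTitles, { limit: 4 })
const sixThroughNine = paginate(postTitles, { limit: 4, skip: 5 })

console.log(firstFour)
console.log(sixThroughNine)
```

With twelve items available, skipping five and limiting to four yields items six through nine, matching the behavior described above.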
filter
In the following query, we’ve added a filter argument that uses the ne (not equal) operator to exclude from the response any item whose title field within the frontmatter field contains no content:
{
  allMarkdownRemark(
    filter: { frontmatter: { title: { ne: "" } } }
  ) {
    edges {
      node {
        frontmatter {
          title
        }
      }
    }
  }
}
The response to this query will exclude any items in which the title field is empty.
Gatsby uses a package known as Sift to perform filtering through a MongoDB-like syntax that will be familiar to developers who routinely work with MongoDB databases. Therefore, Gatsby’s GraphQL API supports common operators such as eq (equal), ne (not equal), in (is this item in an arbitrary list?), and regex (arbitrary regular expressions). You can also filter on multiple fields, as in the following example, which checks for an item that contains the title “Their Eyes Were Watching God” and does not have an empty date:
allMarkdownRemark(
  filter: {
    frontmatter: {
      title: { eq: "Their Eyes Were Watching God" }
      date: { ne: "" }
    }
  }
)
These operators can even be combined on the same field. This example filters for items containing the string “Watching” but excludes “Their Eyes Were Watching God” from the returned results:
allMarkdownRemark(
  filter: {
    frontmatter: {
      title: {
        regex: "/Watching/"
        ne: "Their Eyes Were Watching God"
      }
    }
  }
)
Thanks to Sift, Gatsby provides a variety of useful operators for filter arguments (see Table 4-1).
| Operator | Meaning | Definition |
|---|---|---|
| eq | Equal | Must match the given data exactly |
| ne | Not equal | Must differ from the given data |
| regex | Regular expression | Must match the given regular expression (in Gatsby 2.0, backslashes must be escaped twice, so /\+/ must be written as /\\\\+/ ) |
| glob | Global | Permits the use of * as a wildcard, which acts as a placeholder for any nonempty string |
| in | In array | Must be a member of the array |
| nin | Not in array | Must not be a member of the array |
| gt | Greater than | Must be greater than the given value |
| gte | Greater than or equal | Must be greater than or equal to the given value |
| lt | Less than | Must be less than the given value |
| lte | Less than or equal | Must be less than or equal to the given value |
| elemMatch | Element match | Indicates that the field being filtered will return an array of individual elements, on which filters can be applied with the preceding operators |
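To give a feel for how several operators combine on one field, here is a small JavaScript sketch of a few of the operators as plain predicates. It is an illustration only and does not use or reproduce Sift; the `operators` table, the `filterNodes` helper, and the flat `books` nodes are hypothetical, and the regex operand is a JavaScript RegExp rather than the string form Gatsby expects.

```javascript
// Toy predicates for a few of the operators (an illustration, not Sift itself).
const operators = {
  eq: (value, expected) => value === expected,
  ne: (value, expected) => value !== expected,
  in: (value, list) => list.includes(value),
  nin: (value, list) => !list.includes(value),
  gt: (value, bound) => value > bound,
  lt: (value, bound) => value < bound,
  regex: (value, pattern) => pattern.test(value),
}

// A node passes only if every operator on every filtered field passes.
function filterNodes(nodes, filter) {
  return nodes.filter(node =>
    Object.entries(filter).every(([field, conditions]) =>
      Object.entries(conditions).every(([name, operand]) =>
        operators[name](node[field], operand)
      )
    )
  )
}

// Hypothetical flat nodes; real Gatsby nodes nest title under frontmatter.
const books = [
  { title: "Their Eyes Were Watching God" },
  { title: "Watching the Detectives" },
  { title: "Beloved" },
  { title: "" },
]

const watching = filterNodes(books, {
  title: { regex: /Watching/, ne: "Their Eyes Were Watching God" },
})
const nonEmpty = filterNodes(books, { title: { ne: "" } })

console.log(watching.map(book => book.title))
console.log(nonEmpty.length)
```

Operators listed together on the same field are treated here as a logical AND, which mirrors the combined regex and ne example shown earlier.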
sort
Just as the filter argument can ensure the inclusion or exclusion of certain data in a query response, the sort argument can reorder or resequence the data returned. Consider the following query, in which we’re sorting the returned data in ascending alphabetical order (ASC):
allMarkdownRemark(
  sort: {
    fields: [frontmatter___title]
    order: ASC
  }
) {
  edges {
    node {
      frontmatter {
        title
        date
      }
    }
  }
}
Note here that we’ve used three underscores in succession within the fields value to identify a nested field within the frontmatter field (frontmatter___title).
We can also sort according to multiple fields, as in the following query, which sorts the items by title in ascending alphabetical order before sorting by date in ascending chronological order:
allMarkdownRemark(
  sort: {
    fields: [frontmatter___title, frontmatter___date]
    order: ASC
  }
) {
  edges {
    node {
      frontmatter {
        title
        date
      }
    }
  }
}
If we want to sort by title in ascending alphabetical order but sort by date in descending chronological order, we can use the order argument to identify how the two fields should influence the sort distinctly. Note the addition of square brackets:
allMarkdownRemark(
  sort: {
    fields: [frontmatter___title, frontmatter___date]
    order: [ASC, DESC]
  }
) {
  edges {
    node {
      frontmatter {
        title
        date
      }
    }
  }
}
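A multi-field sort with a separate order per field behaves like a comparator that consults each field in turn and moves to the next only on a tie. The JavaScript sketch below illustrates that idea on flat objects; it is not Gatsby’s implementation, and the `sortNodes` helper and sample posts are hypothetical.

```javascript
// Toy multi-field sort: `order` is either one direction or one per field.
function sortNodes(nodes, fields, order) {
  const orders = Array.isArray(order) ? order : fields.map(() => order)
  // Copy first so the caller's array is left untouched.
  return [...nodes].sort((a, b) => {
    for (let i = 0; i < fields.length; i++) {
      const field = fields[i]
      if (a[field] === b[field]) continue // tie: fall through to the next field
      const direction = orders[i] === "DESC" ? -1 : 1
      return a[field] < b[field] ? -direction : direction
    }
    return 0
  })
}

// Hypothetical posts; ISO date strings compare correctly as plain strings.
const samplePosts = [
  { title: "Beloved", date: "2021-01-10" },
  { title: "Atonement", date: "2020-05-01" },
  { title: "Beloved", date: "2021-03-15" },
]

const sortedPosts = sortNodes(samplePosts, ["title", "date"], ["ASC", "DESC"])
console.log(sortedPosts.map(post => `${post.title} ${post.date}`))
```

Titles are ordered alphabetically, and the two posts that share a title are ordered newest first because the second field uses DESC.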
GraphQL Query Variables
Though GraphQL query arguments are common for typical limit, skip, filter, and sort operations, GraphQL also provides a mechanism for developers to introduce query variables at the root level of the query. This is particularly useful for situations where a user defines how a list should be sorted or filtered, or what items in the list should be skipped. In short, query variables allow for us to provide user-defined arguments as opposed to static arguments that don’t change.
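Consider the following example query, which leaves it up to the query variables to determine how the results should be limited, sorted, and filtered. Each variable is declared at the root of the query and then passed to the matching argument on allMarkdownRemark:
query GetAllPosts(
  $limit: Int
  $sort: MarkdownRemarkSortInput
  $filter: MarkdownRemarkFilterInput
) {
  allMarkdownRemark(limit: $limit, sort: $sort, filter: $filter) {
    edges {
      node {
        frontmatter {
          title
        }
      }
    }
  }
}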
In order for this query to function properly, we need to provide values for each of these query variables in the form of a JSON object containing each of these variable names:
{
  "limit": 4,
  "sort": {
    "fields": "frontmatter___title",
    "order": "ASC"
  },
  "filter": {
    "frontmatter": {
      "title": {
        "regex": "/Watching/"
      }
    }
  }
}
As you can see, query variables can take the place of query arguments when we need to provide arbitrarily defined limits, skips, filters, and sorts. Note, however, that a query that declares variables cannot use the shorthand form without the query keyword; it must begin with the operation type, and giving it an operation name as well is good practice.
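Outside of a query editor, a query and its variables usually travel together as one JSON document. The JavaScript sketch below assembles such a payload using the widely used convention of query, operationName, and variables keys; that convention is an assumption about the receiving server rather than something this chapter specifies, and the sketch builds the payload without sending it anywhere.

```javascript
// The query declares three variables and passes each to an argument.
const getAllPostsQuery = `
  query GetAllPosts(
    $limit: Int
    $sort: MarkdownRemarkSortInput
    $filter: MarkdownRemarkFilterInput
  ) {
    allMarkdownRemark(limit: $limit, sort: $sort, filter: $filter) {
      edges { node { frontmatter { title } } }
    }
  }
`

// Values for the variables, keyed by name without the leading "$".
const getAllPostsVariables = {
  limit: 4,
  sort: { fields: "frontmatter___title", order: "ASC" },
  filter: { frontmatter: { title: { regex: "/Watching/" } } },
}

// Assumed payload convention: one JSON object carrying query and variables.
const requestBody = JSON.stringify({
  query: getAllPostsQuery,
  operationName: "GetAllPosts",
  variables: getAllPostsVariables,
})

console.log(requestBody.length > 0)
```

Keeping the variables in their own object means the query text never changes when a user picks a different limit, sort, or filter.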
Note
Query variables in GraphQL can be either scalar values or objects, as you can see in this example. GraphiQL, a query editor and debugger we’ll cover in “The Gatsby Data Layer”, contains a Query Variables pane where developers can input arbitrary query variable values.
GraphQL Directives
GraphQL query variables allow us to designate arbitrary arguments that apply to fields, but what if we want to define actual logic that conditionally includes or excludes certain fields at query time based on those query variables? For that, we need directives. GraphQL makes two directives available:
- The @skip directive indicates to GraphQL that based on a Boolean value defined by a query variable, the field carrying the directive should be excluded from the response.
- The @include directive indicates to GraphQL that based on a Boolean value defined by a query variable, the field carrying the directive should be included in the response.
Consider the following example query, which defines a query variable $includeDate with a default value of false. If the query variable is set to true, then each item in the response will include its date as well as its title:
query GetAllPosts(
  $includeDate: Boolean = false
) {
  allMarkdownRemark {
    edges {
      node {
        frontmatter {
          title
          date @include(if: $includeDate)
        }
      }
    }
  }
}
The @skip directive works similarly and is useful for cases where you want to leave out certain information, such as when rendering solely item titles for a list view rather than the full item for the individual item view:
query GetAllPosts(
  $teaser: Boolean = false
) {
  allMarkdownRemark {
    edges {
      node {
        frontmatter {
          title
          date @skip(if: $teaser)
        }
      }
    }
  }
}
GraphQL directives are particularly useful for situations where you wish to perform conditional rendering of only certain data pertaining to a component, and when you prefer not to overload GraphQL API responses to keep payload sizes small. The @skip and @include directives can be added to any field, as long as the query variable is available.
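The decision each directive makes can be sketched in a few lines of JavaScript. This is a toy model of the rule (a field is dropped when @skip is true or @include is not true), not a GraphQL executor; the `isIncluded` and `selectedFieldNames` helpers and the field descriptors are hypothetical.

```javascript
// Decide whether a field survives its directives for the given variables.
// `skip` and `include` hold the name of the Boolean variable each directive reads.
function isIncluded(field, variables) {
  if (field.skip !== undefined && variables[field.skip] === true) return false
  if (field.include !== undefined && variables[field.include] !== true) return false
  return true
}

// Return the names of the fields that would appear in the response.
function selectedFieldNames(fields, variables) {
  return fields.filter(field => isIncluded(field, variables)).map(field => field.name)
}

// Models: title, date @include(if: $includeDate)
const withInclude = [{ name: "title" }, { name: "date", include: "includeDate" }]
// Models: title, date @skip(if: $teaser)
const withSkip = [{ name: "title" }, { name: "date", skip: "teaser" }]

console.log(selectedFieldNames(withInclude, { includeDate: true }))
console.log(selectedFieldNames(withSkip, { teaser: true }))
```

The two directives are mirror images: the same date field is kept when `includeDate` is true and dropped when `teaser` is true.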
GraphQL Fragments
Sometimes our GraphQL queries can become overly verbose, with multiple hierarchical levels and many identified fields. This often necessitates the extraction of certain parts of the query into separate, reusable sets of fields that can be included where needed in a more concise form. In GraphQL, these repeatable sets of fields are known as fragments.
Consider a scenario where we want to reuse the frontmatter portion of the following query in other queries:
query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      frontmatter {
        date(formatString: "MMMM DD, YYYY")
        title
        description
      }
    }
  }
}
To define this as a fragment, we can separate out the portion of the query we wish to turn into a reusable field collection and identify it as a fragment separately from the query. In addition, we can then include the fragment within a query by referring to the name we give it when we define the fragment:
query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      ...MarkdownFrontmatter
    }
  }
}

fragment MarkdownFrontmatter on MarkdownRemark {
  frontmatter {
    date(formatString: "MMMM DD, YYYY")
    title
    description
  }
}
As you can see in this example, to use our newly created fragment within the query, we simply reference it with an ellipsis prefix (...) and the name of the fragment (MarkdownFrontmatter). Now we can potentially reuse this fragment in any other query where we need the same data to be extracted.
Fragments can also be inline, where we provide the fragment’s contents directly where the fragment is invoked. The following query is identical to the previous one, and the ellipsis here represents an anonymous fragment that is defined immediately rather than in a separate fragment definition. This approach allows you to include fields by type without resorting to the use of a fragment outside the query itself, which can improve readability:
query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      ... on MarkdownRemark {
        frontmatter {
          date(formatString: "MMMM DD, YYYY")
          title
          description
        }
      }
    }
  }
}
But there’s an outstanding question that we need to answer: what exactly is the MarkdownRemark name that comes after the fragment name, if included, and the keyword on in the fragment definition? To answer that question, we need to dig a little deeper into GraphQL’s inner workings and take a look at schemas and types.
GraphQL Schemas and Types
Because GraphQL queries are fundamentally about retrieving data that adheres to a shape desired by the client, those writing queries need to have some awareness of what shapes the GraphQL API can accept. Just like databases, GraphQL has an internal schema that assigns types to fields. These types dictate what responses look like for a given field. Most GraphQL schemas are manually written by architects, but Gatsby infers a GraphQL schema based on how it handles data internally and how it manages data from external sources.
A GraphQL schema consists of a series of type definitions that define what a field returns in the form of an object (e.g., a string, a Boolean, or an integer), as well as possible arguments for that object (e.g., ASC or DESC for ascending or descending sort order, respectively). For example, consider a GraphQL response that looks like this:
{
  "title": "Good trouble"
}
In the associated GraphQL schema to which all fields adhere, the type for the title field would be identified as follows. This explicitly limits the type of object issued by the GraphQL API in response to the title field to be a string and nothing else:
title: String
Let’s take another look at the query and fragment we wrote in the previous section:
query {
  site {
    siteMetadata {
      title
    }
  }
  allMarkdownRemark {
    nodes {
      excerpt
      fields {
        slug
      }
      ...MarkdownFrontmatter
    }
  }
}

fragment MarkdownFrontmatter on MarkdownRemark {
  frontmatter {
    date(formatString: "MMMM DD, YYYY")
    title
    description
  }
}
In this fragment definition, we’re also indicating under which field types the fragment can be applied. In this case, the nodes field within allMarkdownRemark returns objects of type MarkdownRemark. Because the excerpt, fields, and frontmatter fields are all represented as possible fields within the MarkdownRemark type, we know that our fragment containing a top-level frontmatter field will be applicable for all objects of type MarkdownRemark. Every fragment must have an associated type so that a GraphQL API can validate whether that fragment can be interpreted correctly or not.
But how do we introspect our GraphQL schema to understand how types relate to one another in type definitions, like the relationship between MarkdownRemark and the fields excerpt, fields, and frontmatter? And how do we know what type each individual field is within a given GraphQL query? To answer those questions, we’ll explore the foundational role GraphQL plays in Gatsby and how Gatsby makes available GraphQL tooling that offers query debugging and schema introspection capabilities.
The Gatsby Data Layer
The Gatsby data layer encompasses both Gatsby’s internal GraphQL API and source plugins, which together collect data and define a GraphQL schema that traverses that data. Whether this data comes from the surrounding filesystem in the form of Markdown files or from a REST or GraphQL API in the form of WordPress’s web services, Gatsby’s internal GraphQL API facilitates the single-file co-location of data requirements and data rendering, as all Gatsby GraphQL queries are written into Gatsby components.
A common question asked by Gatsby novices is why GraphQL is necessary in the first place. After all, Gatsby is primarily about generating static sites; why does it need an internal GraphQL API? Because Gatsby can pull in information from so many disparate sources, each with its own approach to exposing data, a unified data layer is required. For this reason, it’s common to hear the word data in Gatsby defined as “anything that doesn’t live in a React or Gatsby component.”
Before we jump into the sorts of GraphQL queries Gatsby’s GraphQL API makes available, we’ll first explore Gatsby’s developer ecosystem for GraphQL. This includes tools such as GraphiQL, GraphQL Explorer, and GraphQL Playground, all of which are useful means for Gatsby developers to test queries and introspect schemas.
Tip
Though Gatsby’s internal GraphQL API is the easiest way to retrieve and manipulate data within Gatsby, you can also use unstructured data and consume it through the createPages API, which we discuss in Chapter 6.
GraphiQL
There’s no requirement to install a GraphQL dependency or otherwise configure GraphQL when you start implementing a Gatsby site. As soon as you run gatsby develop or gatsby build at the root of your codebase, Gatsby will automatically infer and create a GraphQL schema. Once the site is compiled by running gatsby develop, you can explore Gatsby’s data layer and GraphQL API by navigating to the URL http://localhost:8000/___graphql (note the three underscores).
To see this in action, clone a new version of the Gatsby blog starter and run gatsby develop. Once the development server is running, navigate to http://localhost:8000/___graphql:
$ gatsby new gtdg-ch4-graphiql gatsbyjs/gatsby-starter-blog
$ cd gtdg-ch4-graphiql
$ gatsby develop
At this URL, you’ll see a two-sided UI consisting of a query editor on the lefthand side and a response preview pane on the righthand side (Figure 4-1). This is GraphiQL, an interactive GraphQL query editor that allows developers to quickly test queries. Figure 4-1 also shows the Documentation Explorer, accessed by clicking “Docs,” expanded.
To see GraphiQL in action, let’s try inserting a simple GraphQL query to extract some information about the site:
{ site { siteMetadata { title description } } }
After the description field, try hitting the Enter key and typing Ctrl-Space or Shift-Space. You’ll see an autocomplete list appear with fields that are valid for insertion at that point, including author and siteUrl (Figure 4-2). This autocomplete feature is one way to insert fields into GraphiQL; you can also enter them manually.
Now, type Ctrl-Enter to run the query (or click the Play button in the header). You’ll see the response in the righthand pane, expressed in valid JSON (Figure 4-3). As you can see, GraphiQL will run the query against Gatsby’s internal GraphQL schema and return a response based on that schema. Testing our queries with GraphiQL, therefore, gives us full confidence that queries running correctly there will work properly within our code as well.
The Documentation Explorer on the right is GraphiQL’s internal schema introspection system, which allows you to search the GraphQL schema and to view what types are associated with certain fields. The information served by the Documentation Explorer matches the information displayed when you hover over a field in the query editor on the lefthand side. To open the Documentation Explorer while it is toggled closed, click the “Docs” link in the upper righthand corner of GraphiQL.
GraphiQL Explorer
Packaged with GraphiQL’s query editor is GraphiQL Explorer, a convenient introspection interface for developers to see what fields are available for a given query, along with nested fields within them. You can also use GraphiQL Explorer to construct a query by clicking on available fields and inputs, rather than writing out the query by hand. For developers who prefer more of a graphical query building experience, GraphiQL Explorer is a convenient tool (see Figure 4-4).
GraphiQL Explorer is particularly useful for advanced queries that require complex logic—especially unions, which are generally left up to the GraphQL implementation and lack a unified standard, and inline fragments, which can be frustrating for developers new to GraphQL to work with. GraphiQL Explorer lists all the available union types within the Explorer view and makes it easy to test inline fragments.
Gatsby also includes support for code snippet generation based on GraphiQL Explorer through GraphiQL’s Code Exporter. Rather than generating just the GraphQL query, which needs to be integrated with a Gatsby component, the Code Exporter is capable of generating a Gatsby page or component file based on a query constructed in GraphiQL Explorer.
Note
For more information about GraphiQL Explorer and GraphiQL’s Code Exporter, consult Michal Piechowiak’s blog post on the subject.
GraphQL Playground
Though GraphiQL is useful for most developers’ requirements, sometimes it’s important to have a more fundamental understanding of a GraphQL schema, at the level of how data is served to the schema by external data sources through source plugins. Exploring how schemas are constructed for GraphQL can help identify deeper-seated issues that require work within external data sources rather than within the schema itself.
GraphQL Playground, developed by Prisma, is an integrated development environment (IDE) for GraphQL queries that provides much more power than GraphiQL. It considers the logic of how data enters GraphQL schemas, rather than just allowing query testing like GraphiQL, but it requires installation and isn’t available out of the box in Gatsby. GraphQL Playground provides a range of useful tools, but for the time being it remains an experimental feature in Gatsby.
To use GraphQL Playground with Gatsby, add the GATSBY_GRAPHQL_IDE flag to the value associated with the develop script in your Gatsby site’s package.json file:
"develop": "GATSBY_GRAPHQL_IDE=playground gatsby develop",
Now, instead of running gatsby develop, run the following command, and when you visit the URL http://localhost:8000/___graphql you’ll see the GraphQL Playground interface instead of GraphiQL (Figure 4-5):
$ npm run develop
Tip
If you’re developing Gatsby sites on Windows, you will first need to install cross-env (npm install --save-dev cross-env) and change the value associated with the develop script in your package.json to the following instead:
"develop": "cross-env GATSBY_GRAPHQL_IDE=playground gatsby develop",
Now that you have a solid understanding of the available developer tools for working with GraphQL in Gatsby, we can finally turn our attention to Gatsby’s page and component queries, the most important GraphQL building blocks in Gatsby.
Note
To continue using gatsby develop to instantiate your local development environment instead of npm run develop, require the dotenv package in your gatsby-config.js file and, separately, add an environment variable file. Because we are concerned with the development environment here, name the file .env.development, and add the following line to it:
GATSBY_GRAPHQL_IDE=playground
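On the gatsby-config.js side, loading that file might look like the following sketch. It assumes the dotenv package has been installed in the project and that NODE_ENV is set to development while gatsby develop runs; the rest of the exported configuration is elided and should remain whatever your site already uses.

```javascript
// gatsby-config.js
// Assumes dotenv is installed and a .env.development file exists at the project root.
require("dotenv").config({
  // Resolves to .env.development during `gatsby develop`.
  path: `.env.${process.env.NODE_ENV}`,
})

module.exports = {
  // ...keep your existing site configuration here
}
```

Loading the file at the top of gatsby-config.js makes the variable available before Gatsby reads the rest of the configuration.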
Page and Component Queries
As we’ve seen in previous chapters, Gatsby works with both pages and components. Up to now, when exploring how we can build pages and components for Gatsby sites, we’ve always explicitly provided the data within the JSX that renders that data rather than pulling data from external sources or from the surrounding filesystem.
In this chapter’s introduction, we defined Gatsby’s data layer as the mediator between data, whether it originates from the local filesystem or from an external source like a database or CMS, and the rendering that occurs within Gatsby’s pages and components. Now, we can connect the dots and see how this rendering happens.
Whenever Gatsby sees a GraphQL query conforming to Gatsby’s standard GraphQL approach, it will parse, evaluate, and inject the query response into the page or component from which the query originates.
For the remainder of this chapter, we’ll focus our attention primarily on GraphQL queries that work with the surrounding filesystem, as Gatsby can pull data from Markdown or other files that it can access. In Chapter 5, we’ll discuss source plugins, which Gatsby employs to retrieve data from external systems and to populate a GraphQL schema.
Page Queries
In Gatsby, pages can be rendered using no data at all (if the data is hardcoded) or using data brought in via GraphQL queries. GraphQL queries in Gatsby pages are known as page queries, and they have a one-to-one relationship with a given Gatsby page. Unlike Gatsby’s static queries, which we’ll examine in the following sections, page queries can accept GraphQL query variables like those we saw in “GraphQL Query Variables”.
Gatsby makes available a graphql tag for arbitrary GraphQL queries defined within a Gatsby page or component. To see this in action, let’s create a new Gatsby blog based on the Gatsby blog starter. Because it already comes with a source plugin enabled, we can jump right in and look at some of the GraphQL queries contained in its pages:
$ gatsby new gtdg-ch4-graphql gatsbyjs/gatsby-starter-blog
$ cd gtdg-ch4-graphql
Open src/pages/index.js, one of our Gatsby pages, and let’s go through it step by step:
// src/pages/index.js
import React from "react"
import { Link, graphql } from "gatsby"

import Bio from "../components/bio"
import Layout from "../components/layout"
import SEO from "../components/seo"

const BlogIndex = ({ data, location }) => {
  const siteTitle = data.site.siteMetadata?.title || `Title`
  const posts = data.allMarkdownRemark.nodes

  if (posts.length === 0) {
    return (
      <Layout location={location} title={siteTitle}>
        <SEO title="All posts" />
        <Bio />
        <p>
          No blog posts found. Add Markdown posts to "content/blog" (or the
          directory you specified for the "gatsby-source-filesystem" plugin in
          gatsby-config.js).
        </p>
      </Layout>
    )
  }

  return (
    <Layout location={location} title={siteTitle}>
      <SEO title="All posts" />
      <Bio />
      <ol style={{ listStyle: `none` }}>
        {posts.map(post => {
          const title = post.frontmatter.title || post.fields.slug

          return (
            <li key={post.fields.slug}>
              <article
                className="post-list-item"
                itemScope
                itemType="http://schema.org/Article"
              >
                <header>
                  <h2>
                    <Link to={post.fields.slug} itemProp="url">
                      <span itemProp="headline">{title}</span>
                    </Link>
                  </h2>
                  <small>{post.frontmatter.date}</small>
                </header>
                <section>
                  <p
                    dangerouslySetInnerHTML={{
                      __html: post.frontmatter.description || post.excerpt,
                    }}
                    itemProp="description"
                  />
                </section>
              </article>
            </li>
          )
        })}
      </ol>
    </Layout>
  )
}

export default BlogIndex

export const pageQuery = graphql`
  query {
    site {
      siteMetadata {
        title
      }
    }
    allMarkdownRemark(sort: { fields: [frontmatter___date], order: DESC }) {
      nodes {
        excerpt
        fields {
          slug
        }
        frontmatter {
          date(formatString: "MMMM DD, YYYY")
          title
          description
        }
      }
    }
  }
`
- This import statement brings in Gatsby’s <Link /> component and graphql tag.
- Here the connection is made between our GraphQL query and the data variable, which is populated with the props generated by the response to our GraphQL query, and in the same shape as our query.
- This code performs all of our rendering in JSX.
- Note that we are using an export statement to ensure that Gatsby is aware of our GraphQL query. The name of the constant (here, pageQuery) isn’t important, because Gatsby inspects our code for an exported graphql string rather than a specific variable name. Many page queries in the wild are simply named query. Only one page query is possible per Gatsby page. Also note that we are using a tagged template surrounded by backticks, allowing for a multiline string that contains our GraphQL query and is indicated by Gatsby’s graphql tag. The contents of these backticks must be a valid GraphQL query in order for Gatsby to successfully populate the data in the page.
- This is the GraphQL query that populates the home page of our Gatsby blog starter.
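Tagged templates are ordinary JavaScript syntax rather than anything specific to Gatsby, which is why the contents of the backticks can span several lines. As a rough illustration only, the sketch below defines a hypothetical tag named fakeGraphql that stitches the template pieces back into one string. It is not how Gatsby’s graphql tag behaves internally; it only shows what a tag function receives.

```javascript
// Hypothetical stand-in for illustration; NOT Gatsby's real graphql tag.
// A tag function receives the literal string pieces plus any interpolated values.
function fakeGraphql(strings, ...values) {
  return strings.reduce(
    (result, piece, i) => result + piece + (i < values.length ? values[i] : ""),
    ""
  )
}

// The backticks let the query span several lines.
const query = fakeGraphql`
  query {
    site {
      siteMetadata {
        title
      }
    }
  }
`

console.log(typeof query) // the hypothetical tag hands back an ordinary string
console.log(query.includes("siteMetadata"))
```

In a real Gatsby page you would keep using the graphql tag imported from "gatsby"; the stand-in exists only to make the syntax less mysterious.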
Let’s focus on the second of these sections, where the component receives the data prop:
// src/pages/index.js
const BlogIndex = ({ data, location }) => {
  const siteTitle = data.site.siteMetadata?.title || `Title`
Rather than hardcoding data like we did in earlier chapters, we can now dig further into our data object to access all the data we need. Let’s see this in action with a simple example that pulls from our Gatsby site information in gatsby-config.js. Change the preceding lines to use our description from gatsby-config.js instead:
// src/pages/index.js
const BlogIndex = ({ data, location }) => {
  const siteTitle = data.site.siteMetadata?.description || `Description`
We also need to update our pageQuery GraphQL query to retrieve the site description as well as the title:
// src/pages/index.js
export const pageQuery = graphql`
  query {
    site {
      siteMetadata {
        title
        description
      }
    }
    allMarkdownRemark(sort: { fields: [frontmatter___date], order: DESC }) {
      nodes {
        excerpt
        fields {
          slug
        }
        frontmatter {
          date(formatString: "MMMM DD, YYYY")
          title
          description
        }
      }
    }
  }
`
Now, when you save the file and execute gatsby develop, you’ll see that the blog title has been updated to reflect the description text (“A starter blog demonstrating what Gatsby can do.”) rather than the title (“Gatsby Starter Blog”).
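The reason the query has to change along with the component can be seen with plain objects. The sketch below uses hand-written stand-ins for the data prop (the helper name pickSiteTitle is hypothetical, and the values are the starter defaults quoted above) to show that a field left out of the query is absent from the response, so the fallback string is used instead.

```javascript
// Hypothetical helper mirroring the line we edited in BlogIndex.
const pickSiteTitle = data =>
  data.site.siteMetadata?.description || `Description`

// Assumed response shape when the query asks only for title.
const withoutDescription = {
  site: { siteMetadata: { title: "Gatsby Starter Blog" } },
}

// Assumed response shape once description has been added to the query.
const withDescription = {
  site: {
    siteMetadata: {
      title: "Gatsby Starter Blog",
      description: "A starter blog demonstrating what Gatsby can do.",
    },
  },
}

console.log(pickSiteTitle(withoutDescription)) // falls back to "Description"
console.log(pickSiteTitle(withDescription)) // uses the description text
```

If the page shows the literal word “Description” after your change, the most likely cause is that the field was not added to pageQuery.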
Page queries make up the majority of GraphQL queries you’ll construct while building rudimentary Gatsby sites. But what about GraphQL queries contained within components that aren’t pages? We’ll look at that next.
Component Queries with StaticQuery
As of Gatsby v2, Gatsby also allows individual components contained within a page to retrieve data using a GraphQL query through the StaticQuery API. This is particularly useful when you split out a component from a surrounding page but require external data for just that component. We call these component queries.
Though StaticQuery is capable of handling most of the use cases page queries already address, including fragments, static queries differ from page queries in several critical ways:
- Although page queries can accept query variables, they do not function outside of Gatsby pages.
- Static queries cannot accept query variables (this is why they’re called “static”), but they can be used in both pages and in-page components.
- The StaticQuery API does not work properly with React.createElement invocations that fall outside of JSX’s purview. For these cases, Gatsby recommends using JSX and, if needed, explicitly using StaticQuery in a JSX element (<StaticQuery />).
Static queries share one characteristic with page queries: only one static query can be used per component, just as only one page query can be used per page. Therefore, if you have separate data requirements in another portion of the component, you will need to split that logic out into another component before adding a new static query.
Importantly, static queries provide the same benefits of co-location within components that page queries do within pages. Using the StaticQuery API allows you to both issue a query and render the data from the response in a single JSX element. Consider the following example, which demonstrates this co-location:
// src/components/header.js
import React from "react"
import { StaticQuery, graphql } from "gatsby"

export default function Header() {
  return (
    <StaticQuery
      query={graphql`
        query {
          site {
            siteMetadata {
              title
            }
          }
        }
      `}
      render={data => (
        <header>
          <h1>{data.site.siteMetadata.title}</h1>
        </header>
      )}
    />
  )
}
As you can see, using the <StaticQuery /> JSX element gives us the query attribute, whose value is our component query, and the render attribute, whose value is a function that receives the response data from the component query and returns the JSX we want to render.
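The render attribute follows the render prop pattern familiar from React: something obtains the data and hands it to a function you supply. The plain-JavaScript sketch below imitates that flow with a hypothetical fakeStaticQuery function and a hard-coded response. It returns strings instead of JSX and is meant only to show the direction in which data travels, not Gatsby’s implementation.

```javascript
// Hypothetical stand-in: pretends the query has already been resolved.
// The query argument is accepted but ignored in this sketch.
function fakeStaticQuery({ query, render }) {
  // Hard-coded response in the same shape as the query (assumed data).
  const data = { site: { siteMetadata: { title: "Gatsby Starter Blog" } } }
  return render(data)
}

const html = fakeStaticQuery({
  query: `query { site { siteMetadata { title } } }`,
  // The render function decides how the response appears in the output.
  render: data => `<header><h1>${data.site.siteMetadata.title}</h1></header>`,
})

console.log(html)
```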
Now that we have a means of issuing component queries, not just page queries, we’re all set! But one question remains, particularly for developers who have adopted the React Hooks paradigm: how can we use a React hook to define a component query rather than a JSX element?
Warning
If you are performing type checking through PropTypes, a common API in React applications, using the <StaticQuery /> JSX element will break this. For an example of how to restore PropTypes type checking, consult the Gatsby documentation.
Component Queries with the useStaticQuery Hook
As of Gatsby v2.1.0, a separate means of accessing component queries is available in the form of the useStaticQuery hook. For readers unfamiliar with the React Hooks paradigm, React hooks are functions that give you access to state and other key React features without having to create a class. In short, the useStaticQuery hook accepts a GraphQL query and returns the response data. It can be used in any component, including pages and in-page components.
The useStaticQuery hook has a few limitations, just like <StaticQuery />:
- The useStaticQuery hook cannot accept query variables (again, this is why it is called “static”).
- As with page queries and static queries, only one useStaticQuery hook can be used per component.
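Because a static query cannot take variables, one workaround sometimes used is to query a whole collection and narrow it down in JavaScript afterward. The sketch below shows only that narrowing step, using a hand-written object in place of a real query response; the helper name, slugs, and titles are made up for illustration.

```javascript
// Assumed shape of a response from a query such as
// allMarkdownRemark { nodes { fields { slug } frontmatter { title } } }
const data = {
  allMarkdownRemark: {
    nodes: [
      { fields: { slug: "/hello-world/" }, frontmatter: { title: "Hello World" } },
      { fields: { slug: "/second-post/" }, frontmatter: { title: "Second Post" } },
    ],
  },
}

// Hypothetical helper: pick one node after the fact instead of passing
// a variable into the query. Returns null when nothing matches.
function findPostBySlug(nodes, slug) {
  return nodes.find(node => node.fields.slug === slug) || null
}

const post = findPostBySlug(data.allMarkdownRemark.nodes, "/second-post/")
console.log(post.frontmatter.title)
```

In a component, data would come from useStaticQuery rather than a literal. The trade-off is that the whole collection ends up in the component’s data, so this suits small lists better than large ones.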
Tip
You must have React and ReactDOM 16.8.0 or later to use the useStaticQuery hook. If you’re using an older version of Gatsby, run this command to update your React and ReactDOM versions to the appropriate version:
$ npm install react@^16.8.0 react-dom@^16.8.0
Let’s take another look at the example component from the previous section. This time we’ll use the useStaticQuery hook instead of <StaticQuery />:
// src/components/header.js
import React from "react"
import { useStaticQuery, graphql } from "gatsby"

export default function Header() {
  const data = useStaticQuery(
    graphql`
      query {
        site {
          siteMetadata {
            title
          }
        }
      }
    `
  )
  return (
    <header>
      <h1>{data.site.siteMetadata.title}</h1>
    </header>
  )
}
React hooks have become popular because developers can easily use them to create chunks of repeatable functionality, much like helper functions. Because useStaticQuery is a hook, we can leverage it to compose and also recycle blocks of reusable functionality rather than invoking that functionality every time.
One common example of this is to create a hook that will provide data for reuse in any component so that the query is only issued once. This shortens the build duration and means that our Gatsby site can deploy slightly faster. For instance, we may want to only query for our site title once. In the following hook definition, we’ve created a React hook that can be reused in any component:
// src/hooks/use-site-title.js
import { useStaticQuery, graphql } from "gatsby"

export const useSiteTitle = () => {
  const { site } = useStaticQuery(
    graphql`
      query {
        site {
          siteMetadata {
            title
          }
        }
      }
    `
  )
  return site.siteMetadata
}
Now, we can import this React hook into our header component and invoke it there to get our Gatsby site title:
// src/components/header.js
import React from "react"
import { useSiteTitle } from "../hooks/use-site-title"

export default function Header() {
  const { title } = useSiteTitle()
  return (
    <header>
      <h1>{title}</h1>
    </header>
  )
}
Note
Consult the React documentation for more information about React Hooks.
Equipped with an understanding of page queries and component queries using either <StaticQuery /> or the useStaticQuery hook, we now have a variety of approaches to query data from within our Gatsby pages and components.
Note
For more information about Gatsby’s GraphQL APIs, which are inspectable in the GraphiQL interface, consult the GraphQL API, query options, and the Node model and Node interface documentation.
Conclusion
This chapter introduced GraphQL and Gatsby’s internal data layer, covering both the principles underlying GraphQL queries and APIs and how GraphQL appears in Gatsby in the form of page and component queries. Though it’s possible to use Gatsby without GraphQL, this is where much of the power inherent to Gatsby comes from, because it mediates the relationship between data—whether it originates from an external source or a local filesystem—and its rendering.
The way we write GraphQL queries in Gatsby, with the graphql tag and exported queries, is designed to provide a favorable developer experience and separation of concerns. Because GraphQL queries usually sit alongside rendering code in Gatsby components, Gatsby’s internal data layer also facilitates the sort of co-location of data requirements and rendering logic that many React developers consider just as much a best practice as testing your queries in GraphiQL or GraphQL Playground.
But where exactly does all this data come from? What does Gatsby do to retrieve data from disparate external sources or the filesystem and populate its internal GraphQL schema? How can we connect CMSs and commerce systems to our Gatsby sites? Next, we cover source plugins and sourcing data—all of the ways Gatsby gets its hands on external data.