Create Massive Amounts of Fake Data Using GraphQL Schemas
Have you ever found yourself in need of fake user profiles for testing your app? Perhaps you’re racing
against the clock in a hackathon, striving to develop a proof of concept without the necessary data for a demo.
Enter gqlfake.
gqlfake is your command-line companion, simplifying the creation of structured, synthetic data using GraphQL schemas to define fields and data types.
Installation
To install and use gqlfake, you must have Node.js installed. We can install gqlfake with npm:
npm install gqlfake --location=global
This command installs gqlfake globally so you can access the CLI tool in the terminal from any path.
Now that we have gqlfake installed, let's take a look at a quick example of how to use it.
Generating Shaped Fake Data
Say we have a GraphQL schema file titled schema.graphql with the following content:
type User {
name: String
avatar_url: String
}
The above schema defines a User type with specific attributes. Now, let's see what we'd have to do if we wanted to generate 100 such fake User objects.
Adding Directives to the Schema
To let gqlfake know what kind of data to generate for each field, we'll have to use directives and a little FakerJS magic.
Edit schema.graphql to contain the following:
type User {
name: String @generate(code: "return faker.person.fullName()")
avatar_url: String @generate(code: "return faker.internet.avatar()")
}
Here, we attach the @generate directive to both fields and pass in the code argument. The code argument is a string containing any valid JavaScript. It tells gqlfake what kind of data to populate each field with (for example, you wouldn't want the name field to be populated by an email address, so we use the code argument to specify exactly what kind of data each field should hold).
Now that we've set up our schema file correctly, we can use gqlfake to generate fake but realistic data. The gqlfake generate command allows us to do this:
gqlfake generate --schema-path ./schema.graphql --num-documents 100
This creates a JSON file containing 100 User objects. The file is stored in a newly created directory titled datagen (if you run this command twice, the datagen directory won't be deleted, but the JSON file will be overwritten with new fake data).
INFO: --schema-path can be abbreviated to -s and --num-documents to -n when passing in command line arguments to gqlfake.
Here is an example of what this file will look like:
[
{
"name": "Beverly Block",
"avatar_url": "https://cloudflare-ipfs.com/ipfs/Qmd3W5DuhgHirLHGVixi6V76LhCkZUz6pnFt5AJBiyvHye/avatar/624.jpg"
},
{
"name": "Wilson Zulauf",
"avatar_url": "https://cloudflare-ipfs.com/ipfs/Qmd3W5DuhgHirLHGVixi6V76LhCkZUz6pnFt5AJBiyvHye/avatar/684.jpg"
},
{
"name": "Kirk Kris",
"avatar_url": "https://cloudflare-ipfs.com/ipfs/Qmd3W5DuhgHirLHGVixi6V76LhCkZUz6pnFt5AJBiyvHye/avatar/866.jpg"
},
...97 more
]
Sharing Variables
You can also define and share code snippets across multiple fields and types. Let's take a look at an example with our User type. Modify your schema.graphql file to contain the following:
type User {
firstName: String
@generate(
code: """
firstNameOfUser = faker.person.firstName()
return firstNameOfUser
"""
)
lastName: String
@generate(
code: """
lastNameOfUser = faker.person.lastName()
return lastNameOfUser
"""
)
emailID: String
@generate(
code: """
return faker.internet.email({
firstName: firstNameOfUser,
lastName: lastNameOfUser
})
"""
)
}
In the above example, we generate a fake firstName and lastName for each User. We also store these values in variables called firstNameOfUser and lastNameOfUser. This allows us to use the generated first and last names in the emailID field, where we produce a realistic email address from the two pieces of data.
CAUTION: Do not use const, let, or var when defining variables you intend to share. If you use these variable keywords, the variables you define will only be usable within that specific code snippet.
Compiling Generated Data Into One File
There may be cases where you want to compile all the generated data into one file.
This is especially useful if you want to, for example, start up a fake API
with tools such as json-server.
We can compile data generated by the gqlfake generate command using the gqlfake compile command. Let's say our schema.graphql contains the following:
type Book {
id: ID!
title: String!
authorID: ID!
publicationYear: Int!
}
type Movie {
id: ID!
title: String!
directorID: ID!
releaseYear: Int!
genre: String!
}
We can now run the gqlfake generate command with the following options:
gqlfake generate --schema-path ./schema.graphql --num-documents 2
We get two resulting JSON files, both in the datagen directory:
Book.json:
[
{
"id": "0bbf3f82-794e-4f05-bf30-9f269683c5a1",
"title": "odit sint veniam",
"authorID": "085d8b32-b0f9-4883-aef7-f16233c6a235",
"publicationYear": 1987
},
{
"id": "b44d9e5e-17e4-43e6-ad66-8d4ad9b305aa",
"title": "perspiciatis magnam ea",
"authorID": "9b7a9732-6a42-45ef-9948-51632fafeb7a",
"publicationYear": 1987
}
]
Movie.json:
[
{
"id": "640a0d20-3e37-477e-88f9-0ecbd22b8176",
"title": "repudiandae id eius",
"directorID": "3b67a142-29bb-4a5a-bdab-bbb6bae7d118",
"releaseYear": 1987,
"genre": "exercitationem repellat"
},
{
"id": "c6711541-360f-470d-8b89-60f5a7ad8700",
"title": "repudiandae placeat voluptates",
"directorID": "f8643b5e-c464-4ebc-9848-29661418aa77",
"releaseYear": 1987,
"genre": "quibusdam accusamus"
}
]
Let us now compile all the generated data into one file with gqlfake compile. Run the following command:
gqlfake compile --output-path ./data.json
Executing this command creates a data.json file containing the following:
{
"books": [
{
"id": "0bbf3f82-794e-4f05-bf30-9f269683c5a1",
"title": "odit sint veniam",
"authorID": "085d8b32-b0f9-4883-aef7-f16233c6a235",
"publicationYear": 1987
},
{
"id": "b44d9e5e-17e4-43e6-ad66-8d4ad9b305aa",
"title": "perspiciatis magnam ea",
"authorID": "9b7a9732-6a42-45ef-9948-51632fafeb7a",
"publicationYear": 1987
}
],
"movies": [
{
"id": "640a0d20-3e37-477e-88f9-0ecbd22b8176",
"title": "repudiandae id eius",
"directorID": "3b67a142-29bb-4a5a-bdab-bbb6bae7d118",
"releaseYear": 1987,
"genre": "exercitationem repellat"
},
{
"id": "c6711541-360f-470d-8b89-60f5a7ad8700",
"title": "repudiandae placeat voluptates",
"directorID": "f8643b5e-c464-4ebc-9848-29661418aa77",
"releaseYear": 1987,
"genre": "quibusdam accusamus"
}
]
}
We can now use json-server to serve up a mock API using data.json. Install json-server with:
npm install json-server --location=global
We can start the server with:
json-server ./data.json --watch
This mock API can now be used by, for example, your frontend code to display the generated data.
Executing Initial Code per Type
There may be cases where you want to execute some initial code in each type before the data for each field is generated.
Here's an example where we want the id of each User object to be incremented every time one is generated.
schema.graphql:
type User
@init(
code: """
count = 0
"""
) {
id: Int
@generate(
code: """
count += 1
return count
"""
)
fullName: String @generate(code: "return faker.person.fullName()")
}
The above example uses the @init directive on the User type to initialize the variable count to 0. (Notice how we don't use any variable initialization keywords like const, let, or var, because we want the count variable to be accessible in the different fields.) After this, every time an id is generated, we run code that increments the count by 1 and returns its value.
When gqlfake generate is run with the appropriate options, the following JSON file is generated:
[
{
"id": 1,
"fullName": "Elizabeth Ankunding"
},
{
"id": 2,
"fullName": "Ashley Stehr"
},
{
"id": 3,
"fullName": "Rosalie Kessler"
}
...
]
As you can see, the id field is incremented on the generation of each User object.
Using External Dependencies and Libraries
You may also want to use external dependencies in the code you write within @generate directives. In that case, you can point gqlfake to a JavaScript file which exports your dependencies. To do this, we first need to create a JavaScript file that imports the necessary dependencies and exports them at the bottom of the file.
myDependencies.js:
const axios = require('axios')
// Export the required dependencies
module.exports = {
axios: axios
}
Now that we've exported our dependencies, we can use them in our GraphQL schema's @generate directives:
type User {
fullName: String @generate(code: "return faker.person.fullName()")
favoriteQuote: String
@generate(
code: """
const response = await axios.get('https://api.quotable.io/random')
return response.data.content
"""
)
}
In the above schema, we use axios to fetch a random quote and set it as the favoriteQuote for a User. gqlfake allows you to use top-level await, so you don't have to create an async function to use the await keyword.
To generate our data, we use the gqlfake generate command with the --dependency-script option:
gqlfake generate -s ./schema.graphql -n 3 --dependency-script ./myDependencies.js
gqlfake imports the dependencies exported by the file pointed to by --dependency-script so they can be used in your GraphQL schema. The above command generates a file called User.json with the following content:
[
{
"fullName": "Jimmie Gleason",
"favoriteQuote": "The highest stage in moral culture at which we can arrive is when we recognize that we ought to control our thoughts."
},
{
"fullName": "Jody Rogahn II",
"favoriteQuote": "If you accept the expectations of others, especially negative ones, then you never will change the outcome."
},
{
"fullName": "Dr. Jody Thompson",
"favoriteQuote": "I love wisdom. And you can never be great at anything unless you love it. Not be in love with it, but love the thing, admire the thing. And it seems that if you love the thing, and you don't just want to possess it, it will find you."
}
]
The favoriteQuote property for each User object was fetched using axios from api.quotable.io.
Exporting Data to Cloud Databases
Exporting data generated by gqlfake to cloud databases can be useful if you are trying to mock features that require web services (CRON jobs, Cloud Functions, etc.).
Supported Cloud Databases
Google Cloud Firestore
Exporting to Cloud Firestore is as easy as running a single command:
gqlfake export-firestore --keypath ./serviceAccountKey.json
INFO: Note that --keypath is a required option that can be abbreviated to -k. It points gqlfake to your service account key file. A service account key is a form of credential that allows access to your web resources from any environment. Google allows you to generate such keys for your Google Cloud Projects. Learn more about service account keys and how to generate them here.
Conclusion
gqlfake is a powerful tool for generating massive amounts of fake data using GraphQL schemas. Whether you're in a hackathon crunch or need synthetic data for testing and development, gqlfake simplifies the process. By combining GraphQL schemas, directives, and the flexibility of JavaScript code snippets, you can generate structured and realistic data.