Today we're open sourcing pg_graphql
, a work-in-progress native PostgreSQL extension adding GraphQL support. The extension keeps schema generation, query parsing, and resolvers all neatly contained on your database server requiring no external services.
pg_graphql
inspects an existing PostgreSQL schema and reflects a GraphQL schema with resolvers that are:
- performant
- always up-to-date
- compliant with best practices
- serverless
- open source
Interested? You're 3 commands away from a live GraphiQL demo.
Motivation
The Supabase stack is centered around PostgreSQL as the single source of truth. All data, configuration, and security are housed in the database so any GraphQL solution needed to be equivalently SQL-centric.
With that in mind, we took a look at the landscape and considered two excellent technologies, Graphile, and Hasura.
Requirements | Graphile | Hasura |
---|---|---|
Open Source | ✅ | ✅ |
Reflected GraphQL Schema | ✅ | ✅ |
Reflected Resolvers | ✅ | ✅ |
Always up-to-date | ✅ | ✅ |
Performant | ✅ | ✅ |
We found both options to be largely viable for the core feature set.
Which left us with one final hang-up: we host free-tier projects on VMs with 1 GB of memory. After tallying the resources reserved for PostgreSQL, PostgREST, Kong, GoTrue, and a handful of smaller services, we were left with a total memory budget of ... 0 MB 😬. Unsurprisingly, our pathological memory target disqualified any option that required launching another process in those VMs.
For that reason, we decided to invest in a lightweight alternative that runs in the database, and can be exposed over HTTP using the existing PostgREST deployments' RPC functionality.
By our most conservative estimate, that reduces the platform's memory requirements by 525 TB/hours every month, saving 💰 and 🌳.
Design
As a native PostgreSQL extension, pg_graphl
is written in a combination of C and SQL. Each GraphQL query is parsed, validated, and transpiled to SQL, all within the database.
Each GraphQL request is resolved by a single SQL statement. That SQL statement aggregates requested data as a JSON document to return to the caller. This approach results in blazing fast response times, avoids the N+1 query problem, and hits the theoretical minimum achievable network IO overhead of any GraphQL to SQL resolver. No special permissions are required for the PostgreSQL role executing queries, so pg_graphql
is fully compatible with your existing row level security policies.
Embedding the GraphQL server directly in the database allows us to leverage PostgreSQL's built-in solutions for common challenges:
Caching → PREPARE STATEMENT
Errors → RAISE EXCEPTION
Bad Data → ROLLBACK
Authorization → CREATE POLICY
Similarly, pg_graphql
benefits from PostgreSQL's strong ACID guarantees and can expose them through its API.
Ever wanted to execute multiple operations in a single transaction? Each request is managed in a single transaction so with a multi-operation GraphQL request and pg_graphql
, that behavior falls out for free!
Schema Reflection
As a limited example of how the reflection engine works, here's how it converts a single table into a full GraphQL schema.
1# schema.sql
2create table account(
3 id serial primary key,
4 email varchar(255) not null,
5 created_at timestamp not null,
6 updated_at timestamp not null
7);
8
Translates into
1# schema.graphql
2scalar Cursor
3scalar DateTime
4scalar JSON
5scalar UUID
6scalar BigInt
7
8type PageInfo {
9 hasNextPage: Boolean!
10 hasPreviousPage: Boolean!
11 startCursor: String!
12 endCursor: String!
13}
14
15type Query {
16 account(nodeId: ID!): Account
17 allAccounts(
18 after: Cursor,
19 before: Cursor,
20 first: Int,
21 last: Int
22 ): AccountConnection
23}
24
25type Account {
26 nodeId: ID!
27 id: String!
28 email: String!
29 createdAt: DateTime!
30 updatedAt: DateTime!
31}
32
33type AccountEdge {
34 cursor: String!
35 node: Account
36}
37
38type AccountConnection {
39 totalCount: Int!
40 pageInfo: PageInfo!
41 edges: [AccountEdge]
42}
43
Where Query
type's account
field selects a single account by its globally unique ID
and allAccounts
enables pagination via the relay connections specification. Under the SQL hood, iterating through pages is handled using keyset pagination giving consistent retrieval times on every page.
For a more complete examples with relationships, enums, and more exotic types check out the reflection doc.
API
pg_graphql
's public API is a single SQL function that returns JSON.
1gql.resolve( 2 stmt text, -- the graphql query/mutation 3 variables jsonb default '{}'::jsonb, -- key value pairs 4) 5 returns jsonb 6
For example, a GraphQL query selecting the id
field for a collection of type Book
would look like this:
1gqldb= select gql.resolve($$
2
3query {
4 allBooks {
5 edges {
6 node {
7 id
8 }
9 }
10 }
11}
12
13$$);
14
15 resolve
16----------------------------------------------------------------------
17{"data": {"allBooks": {"edges": [{"node": {"id": 1}}]}}, "errors": []}
18
We're opting to expose the function over HTTP through PostgREST but you could also connect to the PostgreSQL database and call the function directly from your server code in any programming language.
Performance
When it comes to APIs, performance counts. Here are some figures from Apache Bench showing 2,205 requests/second on a 4 core machine with 16 GB of memory.
1Concurrency Level: 8 2Time taken for tests: 3.628 seconds 3Complete requests: 8000 4Failed requests: 0 5Total transferred: 1768000 bytes 6Total body sent: 1928000 7HTML transferred: 368000 bytes 8Requests per second: 2205.21 [#/sec] (mean) 9Time per request: 3.628 [ms] (mean) 10Time per request: 0.453 [ms] (mean, across all concurrent requests) 11Transfer rate: 475.93 [Kbytes/sec] received 12
Full steps to reproduce this output are available in the docs.
Open Source
pg_graphql
is open source software. As always, Issues and PRs are welcome.
Try pg_graphql
today to see a live GraphiQL demo.