There are plenty of resources out there for preparing for PostgreSQL interview questions. Most posts are for technical interviews with a focus on PostgreSQL, however many just cover the basics and the advanced resources often conflate transactional SQL with analytical SQL (WINDOW/RANK functions, aggregates etc.).
Here, we're going to focus on PostgreSQL interview questions that are aimed to understand the transactional side of PostgreSQL, and offer some areas that you may want to go a little deeper on in order to really impress your interviewer (and more importantly, become a kick-ass software engineer).
1. Modeling
Know how to model 1-M, M-M, 1-1 relationships. And know how to use foreign keys.
A review of database normalization is a great place to start when thinking about how to correctly model relationships. However if you don't have the time to read through lecture notes, head over to DBDesigner and inspect their example schema. The table StudentCourses is a great example of how to model a Many-to-Many relationship, by using a join table. (side note: you can export these visual schemas to SQL using Ctrl+E). Modelling your data correctly is arguably the most important part of any software project, writing applications becomes a breeze if you can get the data layer right.
Know how to use pg rich type system: arrays, domains, JSONB, timestamptz, enums
Postgres has tons of useful types beyond the basics, knowing how to use them will show you can leverage the true power of Postgres. JSONB for example can be incredibly useful for storing non-structured data, which you can query using syntax like:
1-- grades = {'geography': 'A', 'history': 'B', 'postgres': 'A++'}
2SELECT * FROM students WHERE grades->>'geography' = 'A';
3
Know about namespacing with SCHEMAs
In Supabase for example we keep system schemas such as extensions
and auth
in separate schemas so that we don't pollute the default public
schema.
2. INDEXes
Know how speed up queries with indexes.
The art of indexing in Postgres could fill an entire book. In some circumstances it can happen that a bad index is worse for performance than no index, so it's worth spending a little time to learn some of the common strategies.
An index can be simple, for example, if your students table is most frequently queried on surname alone, you create an index:
1CREATE INDEX idx_students_surname 2ON students(surname); 3
The default index type used here is btree
(you could have specified this as USING btree
), but there are other types of indexes, such as BRIN
, GiST
, GIN
, hash
, and more. Readers wanting to go deeper may also want to explore Partial or Multicolumn Indexes.
Know how to analyze with EXPLAIN ANALYZE
Running
1EXPLAIN (ANALYZE) SELECT *
2FROM students
3WHERE surname = 'Krobb';
4
Before and after adding your index will show you the difference in approach the query planner took to finding your data. Note that using EXPLAIN alone will give us estimated plan costs. When used together with ANALYZE like: EXPLAIN (ANALYZE)
you will receive both estimated and actual costs.
3. VIEWs
Know how to create different representations of data with VIEWs.
We might create a VIEW transcripts
which pulls out data from students
, courses
, and grades
. It's useful for security, and logical abstractions. Check out our longer post on VIEWs here: /blog/2020/11/18/postgresql-views. Some purists may argue that you should always query VIEWs and never TABLEs.
Know about Autoupdatable views.
If a VIEW is named as the target relation in an INSERT, UPDATE, or DELETE and only SELECTs from a single base relation, then the underlying subquery is automatically rewritten to update the underlying base relation instead.
Limitations on VIEWs
One example of a limitation is when a VIEW is not Autoupdatable. This happens when the VIEW does not SELECT from a single base relation. If the user does not specify an INSTEAD OF trigger that upgrades the underlying query, then an error will be thrown, since the executor cannot update a view as such.
4. ROLEs
Know how to secure their database. Permissions at the table, column, row level.
All databases have different user types, your client for example doesn't usually need the ability to create and drop schemas, but your DB admin does. You should play around with creating roles, and granting various permissions.
Know about ROLEs, application ROLEs, the PUBLIC role, and GRANTs
In Postgres, the special “role” name PUBLIC can be used to grant a privilege to every role on the system. For example, if you want to grant insert access to all users on table students:
1GRANT INSERT ON students TO PUBLIC; 2
Know how to do RLS - Policies
We use Row Level Security in Supabase as a way to grant/restrict access on a row level basis. For example if you're writing a Discord clone, perhaps only a given user should be able to write their own messages:
1CREATE POLICY "Individuals can only write their own messages." ON messages FOR
2 INSERT WITH CHECK (auth.uid() = user_id);
3
4-- auth.uid() is a function provided by Supabase which plucks the uid out
5-- of the JWT sent along with an API request more on this here:
6-- https://www.youtube.com/watch?v=0LvCOlELs5U
7
5. FUNCTIONs
Know how to do business logic on SQL/PLPGSQL
PL/pgSQL is a procedural programming language that can be used to write functions inside of your database. It can be useful for making remote procedure calls from an API. You can go as deep as you want here, since it's an entire programming language, but understanding the basics will really go a long way, and give you super powers when working with your data.
You can use FUNCTIONs in combination with TRIGGERs to do cool stuff like have auto-updating updated_at
columns on your data:
1-- a function that sets the updated_at value to now()
2CREATE FUNCTION set_updated_at()
3RETURNS TRIGGER AS $$
4BEGIN
5 new.updated_at = now();
6 RETURN NEW;
7END;
8$$ LANGUAGE plpgsql;
9
10-- a trigger that fires when students table is updated
11CREATE TRIGGER handle_updated_at
12 BEFORE UPDATE ON students
13 FOR EACH ROW
14 EXECUTE PROCEDURE set_updated_at();
15
Thinks in SETs when doing logic
Whilst PL/pgSQL does have loops and cursors, there is usually a faster and more legible pure SQL based solution available using JOIN/UNION etc. So it's important to become well acquainted with thinking in these terms.
Know how to use CTEs
Common table expressions are temporary or intermediate result sets. They can make your queries more readable and even enable recursion. The typical form is:
1WITH ten_strumpers AS (
2 SELECT id, first_name
3 FROM students
4 WHERE surname = 'Strumper'
5 ORDER BY first_name
6 LIMIT 10
7)
8SELECT id
9FROM ten_strumpers
10WHERE first_name LIKE "S%";
11
If you can reason about most of the topics in this post then you'll be in a very strong position to impress with your answers to PostgreSQL interview questions. As with all programming topics however, the real learning starts when you put these things into practice. At Supabase we offer a very very fast (the fastest?) way to spin up a PostgreSQL database and start querying it, and our browser based SQL editor is getting more powerful every day.
Get started on Supabase for free here
Thanks Steve Chavez for providing all the good bits of this post :)