Believe it or not, I don't think that title is clickbait.
There is a set of things that you can do when working with a Postgres database which I have found made my and my coworker's lives much more pleasant. Each one is by itself small, but in aggregate have a noticeable effect.
UUIDs have downsides
But I've found those to be far outweighed by the upsides
CREATE TABLE person(
id uuid not null default gen_random_uuid() primary key,
not null
name text )
It's not a full history, but knowing when a record was created or last changed is a useful breadcrumb when debugging. Its also something you can't retroactively get unless you were recording it.
So just always slap a created_at
and updated_at
on your tables. You can maintain updated_at
automatically with a trigger.
CREATE TABLE person(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null
name text
);
CREATE FUNCTION set_current_timestamp_updated_at()
TRIGGER AS $$
RETURNS DECLARE
record;
_new BEGIN
:= NEW;
_new "updated_at" = now();
_new.RETURN _new;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER set_person_updated_at
BEFORE UPDATE ON person
FOR EACH ROW
EXECUTE PROCEDURE set_current_timestamp_updated_at();
You need to create the trigger for each table, but you only need to create the function once.
When you make a foreign key constraint on a table, always mark it with on update restrict on delete restrict
.
This makes it so that if you try and delete the referenced row you will get an error. Storage is cheap, recovering data is a nightmare. Better to error than do something like cascade
.
CREATE TABLE person(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null
name text
);
CREATE TABLE pet(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null,
name text not null references person(id)
owner_id uuid on update restrict
on delete restrict
);
By default, every table in Postgres will go into the "public
" schema. This is fine, but you are missing out if you don't take advantage of your ability to make new schemas.
Schemas work as namespaces for tables and for any moderate to large app you are going to have a lot of tables. You can do joins and have relationships between tables in different schemas so there isn't much of a downside.
CREATE SCHEMA vet;
CREATE TABLE vet.person(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null
name text
);
CREATE TABLE vet.pet(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null,
name text not null references vet.person(id)
owner_id uuid on update restrict
on delete restrict
);
There are a lot of ways to make "enums" in sql. One is to use the actual "enum types," another is to use a check constraint.
The pattern introduced to me by Hasura was enum tables.
Have a table with some text
value as a primary key and make columns in other tables reference it with a foreign key.
CREATE TABLE vet.pet_kind(
value text not null primary key
);
INSERT INTO vet.pet_kind(value)
VALUES ('dog'), ('cat'), ('bird');
CREATE TABLE vet.pet(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null references vet.person(id)
owner_id uuid on update restrict
on delete restrict,
not null references vet.pet_kind(value)
kind text on update restrict
on delete restrict
);
This way you can insert into a table to add more allowed values or attach metadata like a comment to explain what each value means.
CREATE TABLE vet.pet_kind(
value text not null primary key,
comment text not null default ''
);
INSERT INTO vet.pet_kind(value, comment)
VALUES
'dog', 'A Canine'),
('cat', 'A Feline'),
('bird', 'A 50 Year Commitment');
(
CREATE TABLE vet.pet(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null references vet.person(id)
owner_id uuid on update restrict
on delete restrict,
not null references vet.pet_kind(value)
kind text on update restrict
on delete restrict
);
This isn't even Postgres specific, just please name your tables using the singular form of a noun.
SELECT * FROM pets
might seem nicer than SELECT * FROM pet
but the moment you start doing anything more interesting with your queries you will notice that your queries are actually working in terms of individual rows.
SELECT *
FROM pet
-- It's a cruel coincidence that in english an "s"
-- suffix can sometimes work both as a plural
-- and a possessive, but notice how the where clause
-- is asserting a condition about a single row.
WHERE pets.name = 'sally'
The deeper you dig the more annoying edge cases you'll run into with plural table names. Just name your tables the same as what an individual row in that table represents.
Sometimes there are sensible names to give "join tables" - tables which form the basis for "many to many" relationships between data - but often there isn't. In those cases don't hesitate to just concatenate the names of the tables you are joining between.
CREATE TABLE vet.person(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now()
updated_at timestamptz
);
CREATE TABLE vet.pet(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now()
updated_at timestamptz
);
-- pet_owner would work in this context, but
-- I just want to demonstrate the table_a_table_b naming scheme
CREATE TABLE vet.person_pet(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null references vet.person(id)
person_id uuid on update restrict
on delete restrict,
not null references vet.pet(id)
pet_id uuid on update restrict
on delete restrict
);
CREATE UNIQUE INDEX ON vet.person_pet(person_id, pet_id);
I will reiterate that storage is cheap and recovering data is a nightmare.
If you have some domain specific need to delete (or otherwise mark as irrelevant) some data, use a nullable timestamptz
column. If there is a timestamp filled in, that's when it was deleted. If there is no timestamp it isn't deleted yet.
CREATE TABLE vet.prescription(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null references vet.pet(id)
pet_id uuid on update restrict
on delete restrict,
not null,
issued_at timestamptz -- Instead of deleting a prescription,
-- explicitly mark when it was revoked
revoked_at timestamptz );
Even outside the context of a soft delete, timestamps are usually more useful than a boolean. If you want to know whether something happened, you generally also want to know when it happened.
It is very tempting to represent the status of something as a single column. You submit some paperwork and it has a status
of submitted
. Someone starts to look at it then it transitions to in_review
. From there maybe its rejected
or approved
.
There are two problems with this
approved
, or by whom.Webhooks are a prime example of the 2nd situation. There's no way in the laws of physics to be sure you'll get events in exactly the right order.
To handle this you should have a table where each row represents the status of the thing at a given point in time. Instead of overloading created_at
or updated_at
for this, have an explicit valid_at
which says when that information is valid for.
CREATE TABLE vet.adoption_approval_status(
value text not null primary key
);
INSERT INTO vet.adoption_approval_status(value)
VALUES ('submitted'), ('in_review'), ('rejected'), ('approved');
CREATE TABLE vet.adoption_approval(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null references vet.person(id)
person_id uuid on update restrict
on delete restrict,
not null references vet.adoption_approval_status(value)
status text on update restrict
on delete restrict,
not null
valid_at timestamptz
);
CREATE INDEX ON vet.adoption_approval(person_id, valid_at DESC);
Just having an index on valid_at
can work for a while, but eventually your queries will get too slow. There are a lot of ways to handle this, but the one we've found that works the best is to have an explicit latest
column with a cheeky unique index and trigger to make sure that only the row with the newest valid_at
is the latest
one.
CREATE TABLE vet.adoption_approval(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null references vet.person(id)
person_id uuid on update restrict
on delete restrict,
not null references vet.adoption_approval_status(value)
status text on update restrict
on delete restrict,
not null,
valid_at timestamptz boolean default false
latest
);
CREATE INDEX ON vet.adoption_approval(person_id, valid_at DESC);
-- Conditional unique index makes sure we only have one latest
CREATE UNIQUE INDEX ON vet.adoption_approval(person_id, latest)
WHERE latest = true;
-- Then a trigger to keep latest up to date
CREATE OR REPLACE FUNCTION vet.set_adoption_approval_latest()
trigger
RETURNS
LANGUAGE plpgsqlAS $function$
BEGIN
UPDATE vet.adoption_approval
SET latest = false
WHERE latest = true and person_id = NEW.person_id;
UPDATE vet.adoption_approval
SET latest = true
WHERE id = (
SELECT id
FROM vet.adoption_approval
WHERE person_id = NEW.person_id
ORDER BY valid_at DESC
LIMIT 1
);
RETURN null;
END;
$function$;
CREATE TRIGGER adoption_approval_insert_trigger
AFTER INSERT ON vet.adoption_approval
FOR EACH ROW
EXECUTE FUNCTION vet.set_adoption_approval_latest();
system_id
It's not uncommon to end up with "special rows." By this I mean rows in a table that the rest of your system will rely on the presence of to build up behavior.
All rows in an enum table are like this, but you will also end up with rows in tables of otherwise normal "generated during the course of normal use" rows. For these, give them a special system_id
.
Unique indexes don't mind multiple rows with null values, so you can make a unique index on this system_id
and look up your special rows later as you need to.
CREATE TABLE vet.contact_info(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz references vet.person(id)
person_id uuid on update restrict
on delete restrict,
not null,
mailing_address text
system_id text
);
CREATE UNIQUE INDEX ON vet.contact_info(system_id);
-- Not hard to imagine wanting to build functionality that
-- automatically contacts the CDC for cases of rabies or similar,
-- but maybe every other bit of contact_info in the system is
-- for more "normal" purposes
INSERT INTO vet.contact_info(system_id, mailing_address)
VALUES ('cdc', '4770 Buford Highway, NE');
Views are amazing and terrible.
They are amazing in their ability to wrap up a relatively complex or error-prone query into something that looks basically like a table.
They are terrible in that removing obsolete columns requires a drop and recreation, which can become a nightmare when you build views on views. The query planner also seems to have trouble seeing through them in general.
So do use views, but only as many as you need and be very wary of building views on views.
CREATE TABLE vet.prescription(
id uuid not null default gen_random_uuid() primary key,
not null default now(),
created_at timestamptz not null default now(),
updated_at timestamptz not null references vet.pet(id)
pet_id uuid on update restrict
on delete restrict,
not null,
issued_at timestamptz -- Instead of deleting a prescription,
-- explicitly mark when it was revoked
revoked_at timestamptz
);
CREATE INDEX ON vet.prescription(revoked_at);
-- There are pros and cons to having this view
CREATE VIEW vet.active_prescription AS
SELECT
id,
vet.prescription.
vet.prescription.created_at,
vet.prescription.updated_at,
vet.prescription.pet_id,
vet.prescription.issued_atFROM
vet.prescriptionWHERE
IS NULL; vet.prescription.revoked_at
You might have heard that Postgres "supports JSON." This is true, but I had mostly heard it in the context of storing and querying JSON. If you want a table with some blob of info slap a jsonb
column on one your tables.
That is neat, but I've gotten way more mileage out of using JSON as the result of a query. This has definite downsides like losing type information, needing to realize your results all at once, and the overhead of writing into json.
But the giant upside is that you can get all the information you want from the database in one trip, no cartesian product nightmares or N+1 problems in sight.
SELECT jsonb_build_object(
'id', vet.person.id,
'name', vet.person.name,
'pets', array(
SELECT jsonb_build_object(
'id', vet.pet.id,
'name', vet.pet.name,
'prescriptions', array(
SELECT jsonb_build_object(
'issued_at', vet.prescription.issued_at
)FROM vet.prescription
WHERE vet.prescription.pet_id = vet.pet.id
)
)FROM vet.person_pet
LEFT JOIN vet.pet
ON vet.pet.id = vet.person_pet.pet_id
WHERE vet.person_pet.person_id = vet.person.id
),'contact_infos', array(
SELECT jsonb_build_object(
'mailing_address', vet.contact_info.mailing_address
)FROM vet.contact_info
WHERE vet.contact_info.person_id = vet.person.id
)
) FROM vet.person
WHERE id = '29168a93-cd14-478f-8c70-a2b7a782c714';
Which can net you something like the following.
{
"id": "29168a93-cd14-478f-8c70-a2b7a782c714",
"name": "Jeff Computers",
"pets": [
{
"id": "3e5557c0-c628-44ef-b4d1-86012c5f48bf",
"name": "Rhodie",
"prescriptions": [
{
"issued_at": "2025-03-11T23:46:18.345146+00:00"
}
]
},
{
"id": "ed63ca7d-3368-4353-9747-6b6b2fa6657a",
"name": "Jenny",
"prescriptions": []
}
],
"contact_infos": [
{
"mailing_address": "123 Sesame St."
}
]
}
You can find all the setup you'd need to do for that query here. You can try it out on https://onecompiler.com/postgresql if setting up a local postgres is a bit much.
If there is something I missed or got wrong, tell me very loudly in person or here on the internet.