ER diagrams with SQL and Mermaid

02.2023 | Category: How To | Tags: development

What does ER stand for?

An Entity Relationship (ER) diagram is one of the most important tools for database design. It helps you visualize the relationships between different entities and how they interact with each other. Many GUI clients ship their own ER diagram builders, e.g. pgAdmin 4, DBeaver, etc.
In this blog post, we'll explore how to create an ER diagram for a PostgreSQL database using plain SQL and Mermaid. Mermaid is a JavaScript-based diagramming and charting tool that renders Markdown-inspired text definitions to create and modify diagrams dynamically.

[Figure: Mermaid ER diagram example]

Setup

Today I will show you how to generate an ER diagram for any PostgreSQL database using only plain SQL and Mermaid. I propose to use the Pagila example database as a target. You may either install it locally or run it with a Docker Compose script.

The task is to create an SQL script which will output valid Mermaid syntax. Later we can use either Mermaid Live Editor or a local installation to view and save the ER diagram in one of the preferred formats, e.g. .svg, .png, etc.
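To make this workflow a bit more concrete, here is a minimal Python sketch that only assembles the two command lines involved. It assumes the SQL is saved as er.sql, that the target database is called pagila, and that the local Mermaid installation is the mmdc command from the mermaid-cli package. All file names here are placeholders, not something the script requires.

```python
import shlex

def psql_command(script="er.sql", database="pagila", output="pagila.mmd"):
    # -A: unaligned output, -t: rows only (no header or row count),
    # -d: database name, -f: read SQL from a file, -o: write results to a file.
    return ["psql", "-A", "-t", "-d", database, "-f", script, "-o", output]

def mmdc_command(source="pagila.mmd", target="pagila.svg"):
    # mmdc picks the output format from the target file extension.
    return ["mmdc", "-i", source, "-o", target]

# Print the commands instead of running them, so the sketch has no side effects.
for cmd in (psql_command(), mmdc_command()):
    print(shlex.join(cmd))
```

Unaligned, tuples-only output matters here: the default aligned table format would wrap every line in borders and padding that Mermaid cannot parse.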

Script

If you examine the Mermaid ER syntax, you will find that:

you can define tables separately from references;
you can specify table columns with types;
you can add a name to each relationship.
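The three points above can be illustrated with a small Python sketch that builds a tiny Mermaid definition by hand. The table, column, and constraint names are illustrative only (loosely modeled on Pagila), and the helper functions are hypothetical, not part of the SQL script.

```python
def entity(name, columns):
    # An entity block: the table name, then one "type name" line per column in braces.
    body = "\n".join(f"        {col_type} {col_name}" for col_type, col_name in columns)
    return f"    {name} {{\n{body}\n    }}"

def relationship(child, parent, label):
    # A relationship line is independent of the entity blocks and carries a label.
    return f"    {child} }}|..|| {parent} : {label}"

diagram = "\n".join([
    "erDiagram",
    entity("actor", [("integer", "actor_id"), ("text", "first_name")]),
    entity("film_actor", [("integer", "actor_id"), ("integer", "film_id")]),
    relationship("film_actor", "actor", "film_actor_actor_id_fkey"),
])
print(diagram)
```

Because entities and relationships are separate statements, the SQL script can emit them from two independent queries and simply concatenate the results.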

Mermaid ER header

I've split the SQL script into 3 parts. The first one is very basic - it outputs the special keyword erDiagram to indicate how Mermaid should visualize the diagram.

select 'erDiagram'
union all
...

Collecting tables and columns

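select 
    -- one Mermaid entity block per table: tab, table name, columns in braces
    format(E'\t%s{\n%s\n}', 
        c.relname, 
        -- one "~type~ name" line per column, indented with two tabs
        string_agg(format(E'\t\t~%s~ %s', 
            format_type(t.oid, a.atttypmod), 
            a.attname
        ), E'\n'))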
from 
    pg_class c 
    join pg_namespace n on n.oid = c.relnamespace
    left join pg_attribute a ON c.oid = a.attrelid and a.attnum > 0 and not a.attisdropped
    left join pg_type t ON a.atttypid = t.oid
where 
    c.relkind in ('r', 'p') 
    and not c.relispartition
    and n.nspname !~ '^pg_' AND n.nspname <> 'information_schema'
group by c.relname
union all
...

Here's what the above snippet does:

It selects the table name (c.relname) and the associated column names (a.attname) with their data types (resolved through t.oid) from the pg_class, pg_namespace, pg_attribute, and pg_type system catalogs.
It uses left joins for columns and types because PostgreSQL allows the creation of tables without attributes, e.g. create table foo(); is a valid DDL statement.
The join condition for attributes explicitly keeps only user-defined columns (a.attnum > 0; system columns have negative numbers) and columns that have not been dropped (not a.attisdropped).
It filters out any relations that are not regular tables (partitioned tables are OK) and excludes any tables in the PostgreSQL system schemas or in information_schema. The partitions themselves are not interesting for us, because they are only implementation details.
To produce a correct type name, it uses the format_type() system function, which takes the type OID and the type modifier. Note that the snippet uses string constants with C-style escapes (E'...'), which makes it easy to emit tab indents (\t) as well as new lines (\n).
To aggregate the column definitions of one table, the string_agg() aggregate function is used with a new line as the delimiter.
It then formats the output as the table name followed by the new-line-delimited column definitions in braces.
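The formatting steps above can be mimicked outside the database. The following Python sketch is an approximation under stated assumptions: the helper names and the sample columns are made up for illustration, and Python's % operator stands in for PostgreSQL's format(). It shows what one table block looks like and why the C-style escapes matter.

```python
def column_line(type_name, column_name):
    # Counterpart of format(E'\t\t~%s~ %s', ...): two tabs, then "~type~ name".
    return "\t\t~%s~ %s" % (type_name, column_name)

def table_block(relname, columns):
    # string_agg(..., E'\n') joins the column lines with new lines.
    body = "\n".join(column_line(t, n) for t, n in columns)
    # Counterpart of format(E'\t%s{\n%s\n}', ...): tab, name, braces around the body.
    return "\t%s{\n%s\n}" % (relname, body)

columns = [("integer", "actor_id"), ("text", "first_name")]
print(table_block("actor", columns))

# Without the backslashes the same template yields plain letters, not whitespace.
print("t%s{n%sn}" % ("actor", "..."))
```

The last line prints tactor{n...n}, which is not valid Mermaid. That is exactly what happens when the template loses its backslashes.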

Collecting relationships

select 
    format('%s }|..|| %s : %s', c1.relname, c2.relname, c.conname)
from 
    pg_constraint c
    join pg_class c1 on c.conrelid = c1.oid and c.contype = 'f'
    join pg_class c2 on c.confrelid = c2.oid
where
    not c1.relispartition and not c2.relispartition;

This snippet is much simpler:

It gets all the foreign key constraints in the database (c.contype = 'f').
It filters out any foreign keys that are defined on a partition or that reference a partition.
It formats the output to show the referencing table name, the referenced table name, and the constraint name as the relationship label. The cardinality is hard-coded: every foreign key is drawn as one-to-many (}|..||), i.e. one or more referencing rows for exactly one referenced row.
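As a rough illustration of the same filtering and formatting logic, here is a Python sketch that works on a hand-written list instead of pg_constraint. The ForeignKey tuple, the sample rows, and the constraint names are assumptions made for this example.

```python
from typing import NamedTuple

class ForeignKey(NamedTuple):
    child: str                      # referencing table (c1.relname)
    parent: str                     # referenced table (c2.relname)
    name: str                       # constraint name (c.conname)
    child_is_partition: bool = False
    parent_is_partition: bool = False

def relationship_lines(foreign_keys):
    # Skip constraints that touch a partition on either side, then render each
    # remaining one as "child }|..|| parent : constraint_name".
    return [
        f"{fk.child} }}|..|| {fk.parent} : {fk.name}"
        for fk in foreign_keys
        if not fk.child_is_partition and not fk.parent_is_partition
    ]

sample = [
    ForeignKey("film_actor", "actor", "film_actor_actor_id_fkey"),
    ForeignKey("payment_p2022_01", "customer", "payment_p2022_01_customer_id_fkey",
               child_is_partition=True),
]
for line in relationship_lines(sample):
    print(line)
```

Only the first sample row survives the filter, because the second one is defined on a partition.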

Finally

Here is the final script to copy-paste:

select 'erDiagram'
union all
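select 
    -- one Mermaid entity block per table: tab, table name, columns in braces
    format(E'\t%s{\n%s\n}', 
        c.relname, 
        -- one "~type~ name" line per column, indented with two tabs
        string_agg(format(E'\t\t~%s~ %s', 
            format_type(t.oid, a.atttypmod), 
            a.attname
        ), E'\n'))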
from 
    pg_class c 
    join pg_namespace n on n.oid = c.relnamespace
    left join pg_attribute a ON c.oid = a.attrelid and a.attnum > 0 and not a.attisdropped
    left join pg_type t ON a.atttypid = t.oid
where 
    c.relkind in ('r', 'p') 
    and not c.relispartition
    and n.nspname !~ '^pg_' AND n.nspname <> 'information_schema'
group by c.relname
union all
select 
    format('%s }|..|| %s : %s', c1.relname, c2.relname, c.conname)
from 
    pg_constraint c
    join pg_class c1 on c.conrelid = c1.oid and c.contype = 'f'
    join pg_class c2 on c.confrelid = c2.oid
where
    not c1.relispartition and not c2.relispartition;

And here is how the final result looks for the Pagila database:

Thanks for reading, I hope you enjoyed it! If you liked this one, check out my blog post about usql, universal psql!

10 responses to “ER diagrams with SQL and Mermaid”

Florian Klein says:

February 21, 2023 at 12:48 pm

I love it! Thanks. Just tried it on a schema of mine containing names
with spaces (I like problems), and the mermaid parser failed. It should
render names with double quotes, something that format() should be
able to handle with a %I formatter, no?

- Pavlo Golub says:
  
  February 21, 2023 at 1:04 pm
  
  Yeah, it's not perfect. I already broke it several times ( https://github.com/mermaid-js/mermaid/issues/4008 ). And AFAIK one cannot use double quoting for relation names. It should be improved, agree. You can check the sources and maybe propose a PR ( https://github.com/mermaid-js/mermaid/tree/develop/packages/mermaid/src/diagrams/er ).
  
  - Florian Klein says:
    
    February 21, 2023 at 1:55 pm
    
    from what I tried, it looks like it works, just wrap names with double quotes as in:
    
    family_has_attribute }|..|| family : "family has attribute"
    
Yuriy_D says:

February 23, 2023 at 7:44 am

I want to make updates to your SQL sample:

from
    pg_class c
    join pg_namespace n on n.oid = c.relnamespace
    left join pg_attribute a ON c.oid = a.attrelid

I would write like this:

FROM pg_class c
  JOIN pg_namespace n
    ON n.oid = c.relnamespace
  LEFT JOIN pg_attribute a
    ON a.attrelid = c.oid

please note:
- reserved keywords are in upper case
- visually easy to find joined tables and conditions
- in join condition expression for joined table is always on left

- Pavlo Golub says:
  
  February 23, 2023 at 9:24 am
  
  Thanks. UPPERCASE reserved SQL words are kind of holy war nowadays. 🙂 If you check my previous posts, I always use uppercase there. This time I decided to use lowercase.
  
  Are these rules specified somewhere or are they your personal preferences?
  
  - Yuriy_D says:
    
    February 23, 2023 at 10:21 am
    
    its my best practice after 27 years of experience working with sql
    
- Pavlo Golub says:
  
  February 23, 2023 at 9:29 am
  
  The choice of using all uppercase or lowercase SQL reserved words in
  coding is a matter of personal preference, which may have originated
  from a time when code editors did not have color-coding features. While I
  used to favor all uppercase SQL reserved words, I am now gravitating
  towards using all lowercase. Regardless of the choice, it's essential to
  maintain consistency throughout.
  
Chris says:

June 21, 2024 at 4:11 pm

The code samples don't show the backslashes that are required to represent tab and newline characters. Copying and pasting the SQL code as-is doesn't work. For example:

format(E't%s{n%sn}',

should be

format(E'\t%s{\n%s\n}',

Chris says:

June 21, 2024 at 4:13 pm

It seems like the web page strips single backslashes from string. My comment from a few months ago had as its 2nd code sample: format(E' followed by backslash-t -- but the backslash isn't shown in my comment. Probably the same is true with the code sample.

dfg says:

September 7, 2024 at 4:27 pm

I love this, thanks!


Pavlo Golub

Senior Developer & Consultant

Pavlo is a PostgreSQL expert and developer at CYBERTEC, PostgreSQL contributor and GSoC Org Admin. He's been working with PostgreSQL since 2002. He has an M.Ed. with an emphasis in Mathematics and Computer Science from the Central Ukrainian State Pedagogical University, and previously worked as a professor at the Kirovograd Medical Professional College. He is the co-founder of PostgreSQL Ukraine. He develops and maintains the PostgresDAC and pgxmock libraries, as well as the pgwatch, pg_timetable, and vip_manager projects.
