SQL is query a language that is usually pretty easy to read. However, if people don’t format their queries properly even SQL turns out to be a nightmare. That’s why developers often turn to an SQL beautifier to turn an ugly query into a nicely formatted string. Various tools are available on the web to achieve exactly that.

 

PostgreSQL as a SQL Beautifier

 

Can the same thing be achieved using only PostgreSQL onboard tools? The answer is yes. This post will show you how to achieve that.

How PostgreSQL handles views

In PostgreSQL a view is not stored as plain text. Instead, it is stored inside the system table in binary, machine readable format:

test=# \d pg_rewrite
Table "pg_catalog.pg_rewrite"
   Column   |    Type      | Collation | Nullable | Default
------------+--------------+-----------+----------+---------
 oid        | oid          |           | not null |
 rulename   | name         |           | not null |
 ev_class   | oid          |           | not null |
 ev_type    | "char"       |           | not null |
 ev_enabled | "char"       |           | not null |
 is_instead | boolean      |           | not null |
 ev_qual    | pg_node_tree | C         | not null |
 ev_action  | pg_node_tree | C         | not null |

Indexes:
  "pg_rewrite_oid_index" UNIQUE, btree (oid)
  "pg_rewrite_rel_rulename_index" UNIQUE, btree (ev_class, rulename)

The pg_node_tree data type contains all the magic here. This makes a lot of sense because data is more directly accessible during query execution. In addition, it allows PostgreSQL to easily handle changing column names and so on without breaking views. Internally, PostgreSQL is only using an object ID, and therefore names, and so on don’t matter at all. Views will not be invalidated by renaming tables or a column.

However, if you use \d+: How does PostgreSQL then provide the definition of a view in human readable format? The answer is: PostgreSQL reassembles the query again. This mechanism can be used to format an SQL string and turn it into something more beautiful.

Keep in mind: The mechanism was never intended to do that, but it is a nice illustration of what can be done.

Turning a view into a proper query string

The pg_get_viewdef function returns the definition of a view as a string. We can make use of that. Let’s take a look at the following function:

CREATE OR REPLACE FUNCTION format_sql(text)
RETURNS text AS
$$
   DECLARE
      v_ugly_string       ALIAS FOR $1;
      v_beauty            text;
      v_tmp_name          text;
   BEGIN
      -- let us create a unique view name
      v_tmp_name := 'temp_' || md5(v_ugly_string);
      EXECUTE 'CREATE TEMPORARY VIEW ' ||
      v_tmp_name || ' AS ' || v_ugly_string;

      -- the magic happens here
      SELECT pg_get_viewdef(v_tmp_name) INTO v_beauty;

      -- cleanup the temporary object
      EXECUTE 'DROP VIEW ' || v_tmp_name;
      RETURN v_beauty;
   EXCEPTION WHEN OTHERS THEN
      RAISE EXCEPTION 'you have provided an invalid string: % / %',
            sqlstate, sqlerrm;
   END;
$$ LANGUAGE 'plpgsql';

What it basically does is to take a string and turn it into a temporary view. This view is then turned into a string again and dropped. Here is the function in action:

test=# SELECT format_sql('SELECT * FROM
                  pg_tables UNION
                    ALL SELECT * FROM
          pg_tables');

           format_sql
-------------------------------
SELECT pg_tables.schemaname,+
       pg_tables.tablename, +
       pg_tables.tableowner, +
       pg_tables.tablespace, +
       pg_tables.hasindexes, +
       pg_tables.hasrules, +
       pg_tables.hastriggers, +
       pg_tables.rowsecurity + 
 FROM  pg_tables +
 UNION ALL +
 SELECT pg_tables.schemaname,+
       pg_tables.tablename, +
       pg_tables.tableowner, +       
       pg_tables.tablespace, +
       pg_tables.hasindexes, +
       pg_tables.hasrules, +
       pg_tables.hastriggers, +
       pg_tables.rowsecurity +
 FROM  pg_tables;
(1 row)

As you can see the string is returned in a proper format. PostgreSQL will even resolve the “*” for you and turn it into a proper column list. What you see here is that psql has added a + in case a newline appears. We can easily fix that and tell psql to change its behavior:

test=# \pset format unaligned
Output format is unaligned.

test=# SELECT format_sql('SELECT * FROM
                 pg_tables UNION
               ALL
                SELECT * FROM
                    pg_tables');
 format_sql
SELECT pg_tables.schemaname,
       pg_tables.tablename,
       pg_tables.tableowner,
       pg_tables.tablespace,
       pg_tables.hasindexes,
       pg_tables.hasrules,
       pg_tables.hastriggers,
       pg_tables.rowsecurity
 FROM  pg_tables
 UNION ALL

 SELECT pg_tables.schemaname,
       pg_tables.tablename,
       pg_tables.tableowner,
       pg_tables.tablespace,
       pg_tables.hasindexes,
       pg_tables.hasrules,
       pg_tables.hastriggers,
       pg_tables.rowsecurity
 FROM pg_tables;
(1 row)

PostgreSQL has created a beautiful SQL string for us.

Limitations of this solution

Of course, this approach comes with a couple of limitations. First of all, the function will only work if you provide a query that actually has a chance of being executed. If tables don’t exist, an error will be thrown:

test=# SELECT format_sql('SELECT * FROM not_there');
ERROR: you have provided an invalid string: 42P01 / relation "not_there" does not exist
CONTEXT: PL/pgSQL function format_sql(text) line 19 at RAISE

This can be seen as a feature or as a problem – it depends on what you are trying to achieve. In addition to that comments will be removed. There is no way to prevent that from happening.

Further reading

If you want to learn more about views in PostgreSQL, I highly encourage you to check out our post about views and row-level-security.

If you want to read more about views and dependency tracking check out our post about that.