The biggest change in PostgreSQL v12 from the viewpoint of backward compatibility is that recovery.conf
has been absorbed into postgresql.conf
.
This article describes the changes and how you can adapt to them.
recovery.conf
Up to now, the presence of the file recovery.conf
was the trigger for PostgreSQL to go into recovery mode upon server start. In addition, the file contained all parameters to configure recovery, for example
standby_mode
: determines whether this is normal archive recovery or standby moderestore_command
: command to restore archived WAL segmentsrecovery_target*
parameters to determine which point to recover toprimary_conninfo
: how to connect to the streaming replication primary serverrecovery.conf has been perceived as a wart for a long time, since it is unreasonable to have configuration parameters in more than one file.
In v12 PostgreSQL the move has been made, and the recovery.conf
parameters are now part of postgresql.conf
. These parameters are ignored until PostgreSQL is in recovery mode.
While this makes PostgreSQL work more naturally for newcomers, it has a few consequences that experienced users have to adapt to:
Up to now, the existence of the file recovery.conf
triggered recovery mode.
Since the file is gone, two new files have taken its place:
recovery.signal
: tells PostgreSQL to enter normal archive recoverystandby.signal
: tells PostgreSQL to enter standby modeThe files can be empty, only the fact that they exist is relevant. They are deleted as soon as recovery finishes.
The parameter “standby_mode
” is not necessary anymore and has been removed.
Before, recovery.conf
was renamed to recovery.done
after recovery was finished. This effectively removed the recovery parameters.
From v12 on, the recovery.signal
or standby.signal
files will be deleted when recovery finishes, but the parameters in postgresql.conf
remain. They will be ignored as long as PostgreSQL is not in recovery, but it is a good idea to disable them with a “#
” comment. This will avoid nasty surprises later on, for example when you create a standby server.
If you set the recovery parameters using ALTER SYSTEM
, reset them with ALTER SYSTEM RESET
.
--write-recovery-conf
” option of pg_basebackup
?The option is still there, and that is no mistake. It doesn't create recovery.conf
any more, but it adds recovery configuration parameters.
Rather than adding the parameters to postgresql.conf
, it adds them to postgresql.auto.conf
, just like ALTER SYSTEM
does. This is safer, since that file is made for automatic editing.
recovery.conf
?If you didn't notice the changes and try to recover PostgreSQL v12 with recovery.conf
, the following will happen:
recovery.conf
is ignored.backup_label
, but no signal file, so it will perform crash recovery from the checkpoint in backup_label
.pg_basebackup
and left the option “--wal-method
” at its default level “stream
”, PostgreSQL will recover successfully to the end of the backup, but no further.pg_wal
, crash recovery will fail with the error message:ERROR: could not find redo location referenced by checkpoint record
Backup is not affected by the change. So your backup scripts need no modification.
Scripts that perform archive recovery or set up standby servers will have to be adapted. The main difficulty will be to add the necessary recovery parameters.
My recommendation is to add the parameters to postgresql.auto.conf
, because as I said before, that file is made for automatic editing.
Here is some sample bash
code to add recovery_target_time
:
1 2 3 4 |
sed --in-place -e '/^recovery_target/d' -e '$ a recovery_target_time = '2019-11-20 12:00:00'' $PGDATA/postgresql.auto.conf |
The command first removes all options starting with “recovery_target
” and then adds the parameter.
Don't forget to do the following before starting the server:
1 |
touch $PGDATA/recovery.signal |
I recommend that you remove the recovery parameters again once recovery is done.
Any dedicated PostgreSQL backup software that is worth its salt should have adapted to these changes by now. Upgrade to a current release.
I have checked PostgreSQL v12 support for the most widely-used programs:
The changes in recovery will need some getting used to for experienced users, but it shouldn't be too hard to adapt.
Scripts that perform automatic recovery will have to be modified.
Take this as an opportunity to test your backup and recovery procedure!
For more information on backup in PostgreSQL, see my post on the deprecation of the "exclusive" backup method as of PostgreSQL v15.
In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.
A lot has been written about configuring postgresql.conf, postgresql.auto.conf and so on. However, sometimes it requires to take a second look in order to understand, how PostgreSQL really handles configuration parameters. You will notice that PostgreSQL configuration offers more than meets the eye at first glance. So let us dive into PostgreSQL GUCs and configuration on a more theoretical level!
Most people will directly change settings in postgresql.conf silently, assuming that this is the place to change PostgreSQL configuration parameters. However, this is not the only place you can use. The purpose of this blog is to show you which other options you have and how you can use these features to make your database configuration better.
For the sake of simplicity, I will use an easy configuration parameter to demonstrate how PostgreSQL operates:
1 2 3 4 5 |
test=# SELECT now(); now ------------------------------- 2019-11-23 13:08:32.869274+01 (1 row) |
The first thing you have to learn is how to figure out where configuration parameters actually come from. To do that, take a look at the pg_settings view:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
test=# x Expanded display is on. test=# SELECT * FROM pg_settings WHERE name = 'TimeZone'; -[ RECORD 1 ]---+---------------------------------------------------------------- name | TimeZone setting | Europe/Vienna unit | category | Client Connection Defaults / Locale and Formatting short_desc | Sets the time zone for displaying and interpreting time stamps. extra_desc | context | user vartype | string source | configuration file min_val | max_val | enumvals | boot_val | GMT reset_val | Europe/Vienna sourcefile | /home/hs/db12/postgresql.conf sourceline | 651 pending_restart | f |
postgresql.conf allows to include files. The idea is to give users the chance to break up postgresql.conf into smaller chunks.
The rule here is simple: If your parameter is used inside a configuration file more than once, the LAST entry is going to be taken. In general, a parameter should only be in a config file once, but in case an error happens, you can be sure that the last entry is the one that counts.
After the builtin settings, after taking what there is in postgresql.conf and after taking those include files into account, PostgreSQL will take a look at postgresql.auto.conf. The main question is: What is postgresql.auto.conf? It happens quite frequently, that administrators don’t have full access to the system (e.g. no SSH access). In this case superusers can take advantage of ALTER SYSTEM, which allows you to change PostgreSQL parameters using plain SQL. Here is how it works:
1 2 |
test=# ALTER SYSTEM SET timezone = 'UTC-4'; ALTER SYSTEM |
If you run ALTER SYSTEM, the database will made changes to postgresql.auto.conf:
1 2 3 4 |
[hs@asus db12]$ cat postgresql.auto.conf # Do not edit this file manually! # It will be overwritten by the ALTER SYSTEM command. timezone = 'UTC-4' |
These values will have precedence over postgresql.conf.
As you can see, the parameter is now GMT. This is the default value set by the PostgreSQL binaries, in case there are no configuration parameters at all.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
test=# x Expanded display is on. test=# SELECT * FROM pg_settings WHERE name = 'TimeZone'; -[ RECORD 1 ]---+---------------------------------------------------------------- name | TimeZone setting | GMT unit | category | Client Connection Defaults / Locale and Formatting short_desc | Sets the time zone for displaying and interpreting time stamps. extra_desc | context | user vartype | string source | default min_val | max_val | enumvals | boot_val | GMT reset_val | GMT sourcefile | sourceline | pending_restart | f |
However, in many cases you don’t want to set a value permanently. For instance, you might only want to set it during maintenance mode. Maybe you want to start PostgreSQL on a different port to manually, while fixing a problem, to lock out users. In this case you can pass parameters via pg_ctl directly:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
[hs@asus db12]$ pg_ctl -D /home/hs/db12/ -l /dev/null -o '--timezone=UTC-3' restart waiting for server to shut down.... done server stopped waiting for server to start.... done server started [hs@asus db12]$ psql test psql (12.0) Type 'help' for help. test=# SELECT now(); now ------------------------------- 2019-11-23 15:11:17.906164+03 (1 row) |
In 80% of cases it is totally enough to either take the built-ins, postgresql.conf, or postgresql.auto.conf. Using -o is already quite rare. However, there is a lot more. Sometimes you want your configuration to be way finer grained. What if a parameter should only be used inside a specific database? Here is how it works:
1 2 |
test=# ALTER DATABASE test SET timezone = 'UTC-5'; ALTER DATABASE |
After reconnecting to the database, you will see that the value is set correctly:
1 2 3 4 5 |
test=# SELECT now(); now ------------------------------- 2019-11-23 17:15:15.587692+05 (1 row) |
Not all changes can be made at the database level. Things such as “shared_buffers”, “port” can only be changed at the instance level and are not possible at the database level anymore, as shown in the next example:
1 2 |
test=# ALTER DATABASE test SET port = 6000; ERROR: parameter 'port' cannot be changed without restarting the server |
So far changes have been made to postgresql.conf, postgresql.auto.conf, on startup as well as on a per-database level. However, how about specific users? To do that, consider ALTER USER … SET …:
1 2 |
test=# ALTER USER hs SET timezone = 'UTC-6'; ALTER ROLE |
After a reconnect the value will be shown:
1 2 3 4 5 |
test=# SELECT now(); now ------------------------------- 2019-11-23 18:16:29.362417+06 (1 row) |
But what if this is still not fine-grained enough? What if you only want to set a value for a user inside a transaction? PostgreSQL can even do that:
1 2 |
test=# ALTER USER hs IN DATABASE test SET timezone = 'UTC-7'; ALTER ROLE |
After a reconnect the value will be shown:
1 2 3 4 5 |
test=# SELECT now(); now ------------------------------- 2019-11-23 19:17:39.890558+07 (1 row) |
Why is this kind of configuration useful? Suppose you are using a “datawarehouse” user to run some specific aggregations in of the databases. These specific operations might need special memory parameters, such as work_mem, to be efficient.
Sometimes hardwiring configuration settings is still not flexible enough. In PostgreSQL configuration, parameters can even be changed on a per session level. But be careful: This seemingly simple feature is highly sophisticated. The important thing to consider, is that in PostgreSQL everything is transaction. This includes PostgreSQL configuration parameters, as you can see in the next example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
test=# BEGIN; BEGIN test=# SET timezone = 'UTC-9'; SET test=# SAVEPOINT a; SAVEPOINT test=# SELECT now(); now ------------------------------- 2019-11-23 21:18:39.625348+09 (1 row) test=# SET timezone = 'UTC-10'; SET test=# ROLLBACK TO SAVEPOINT a; ROLLBACK test=# SELECT now(); now ------------------------------- 2019-11-23 21:18:39.625348+09 (1 row) test=# ROLLBACK; ROLLBACK test=# SELECT now(); now ------------------------------- 2019-11-23 20:19:05.245293+08 (1 row) |
What you can see, is that PostgreSQL even takes savepoints et cetera into account. If a transaction is not committed, the configuration parameters will be rolled back.
After this introduction, there is a final feature I want to share: Parameters can be assigned to functions. Consider the following scenario:
1 2 3 |
SELECT accounting_tokyo(), accounting_miami(), accounting_berlin(); |
The problem is that a “day” is not the same everywhere on the planet. So let us assume you want to calculate the turnover of every office per day. You can basically assign the timezone setting to each of those functions. Every function could run in a different timezone within the same SELECT statements.
CREATE FUNCTION shows how a setting can be passed to a function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
test=# h CREATE FUNCTION Command: CREATE FUNCTION Description: define a new function Syntax: CREATE [ OR REPLACE ] FUNCTION name ( [ [ argmode ] [ argname ] argtype [ { DEFAULT | = } default_expr ] [, ...] ] ) [ RETURNS rettype | RETURNS TABLE ( column_name column_type [, ...] ) ] … | SET configuration_parameter { TO value | = value | FROM CURRENT } | AS 'definition' | AS 'obj_file', 'link_symbol' } … |
Configuring PostgreSQL parameters is really way more powerful than most users recognize. There are many ways to set parameters and it makes sense to explore these options to optimize your configuration. If you want to learn more about PostgreSQL configuration, you might want to check out my post about configuring parallel index creation.
SQL is query a language that is usually pretty easy to read. However, if people don't format their queries properly even SQL turns out to be a nightmare. That's why developers often turn to an SQL beautifier to turn an ugly query into a nicely formatted string. Various tools are available on the web to achieve exactly that.
Can the same thing be achieved using only PostgreSQL onboard tools? The answer is yes. This post will show you how to achieve that.
In PostgreSQL a view is not stored as plain text. Instead, it is stored inside the system table in binary, machine readable format:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
test=# d pg_rewrite Table 'pg_catalog.pg_rewrite' Column | Type | Collation | Nullable | Default ------------+--------------+-----------+----------+--------- oid | oid | | not null | rulename | name | | not null | ev_class | oid | | not null | ev_type | 'char' | | not null | ev_enabled | 'char' | | not null | is_instead | boolean | | not null | ev_qual | pg_node_tree | C | not null | ev_action | pg_node_tree | C | not null | Indexes: 'pg_rewrite_oid_index' UNIQUE, btree (oid) 'pg_rewrite_rel_rulename_index' UNIQUE, btree (ev_class, rulename) |
The pg_node_tree data type contains all the magic here. This makes a lot of sense because data is more directly accessible during query execution. In addition, it allows PostgreSQL to easily handle changing column names and so on without breaking views. Internally, PostgreSQL is only using an object ID, and therefore names, and so on don't matter at all. Views will not be invalidated by renaming tables or a column.
However, if you use d+: How does PostgreSQL then provide the definition of a view in human readable format? The answer is: PostgreSQL reassembles the query again. This mechanism can be used to format an SQL string and turn it into something more beautiful.
Keep in mind: The mechanism was never intended to do that, but it is a nice illustration of what can be done.
The pg_get_viewdef function returns the definition of a view as a string. We can make use of that. Let's take a look at the following function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
CREATE OR REPLACE FUNCTION format_sql(text) RETURNS text AS $ DECLARE v_ugly_string ALIAS FOR $1; v_beauty text; v_tmp_name text; BEGIN -- let us create a unique view name v_tmp_name := 'temp_' || md5(v_ugly_string); EXECUTE 'CREATE TEMPORARY VIEW ' || v_tmp_name || ' AS ' || v_ugly_string; -- the magic happens here SELECT pg_get_viewdef(v_tmp_name) INTO v_beauty; -- cleanup the temporary object EXECUTE 'DROP VIEW ' || v_tmp_name; RETURN v_beauty; EXCEPTION WHEN OTHERS THEN RAISE EXCEPTION 'you have provided an invalid string: % / %', sqlstate, sqlerrm; END; $ LANGUAGE 'plpgsql'; |
What it basically does is to take a string and turn it into a temporary view. This view is then turned into a string again and dropped. Here is the function in action:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
test=# SELECT format_sql('SELECT * FROM pg_tables UNION ALL SELECT * FROM pg_tables'); format_sql ------------------------------- SELECT pg_tables.schemaname,+ pg_tables.tablename, + pg_tables.tableowner, + pg_tables.tablespace, + pg_tables.hasindexes, + pg_tables.hasrules, + pg_tables.hastriggers, + pg_tables.rowsecurity + FROM pg_tables + UNION ALL + SELECT pg_tables.schemaname,+ pg_tables.tablename, + pg_tables.tableowner, + pg_tables.tablespace, + pg_tables.hasindexes, + pg_tables.hasrules, + pg_tables.hastriggers, + pg_tables.rowsecurity + FROM pg_tables; (1 row) |
As you can see the string is returned in a proper format. PostgreSQL will even resolve the “*” for you and turn it into a proper column list. What you see here is that psql has added a + in case a newline appears. We can easily fix that and tell psql to change its behavior:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
test=# pset format unaligned Output format is unaligned. test=# SELECT format_sql('SELECT * FROM pg_tables UNION ALL SELECT * FROM pg_tables'); format_sql SELECT pg_tables.schemaname, pg_tables.tablename, pg_tables.tableowner, pg_tables.tablespace, pg_tables.hasindexes, pg_tables.hasrules, pg_tables.hastriggers, pg_tables.rowsecurity FROM pg_tables UNION ALL SELECT pg_tables.schemaname, pg_tables.tablename, pg_tables.tableowner, pg_tables.tablespace, pg_tables.hasindexes, pg_tables.hasrules, pg_tables.hastriggers, pg_tables.rowsecurity FROM pg_tables; (1 row) |
PostgreSQL has created a beautiful SQL string for us.
Of course, this approach comes with a couple of limitations. First of all, the function will only work if you provide a query that actually has a chance of being executed. If tables don't exist, an error will be thrown:
1 2 3 |
test=# SELECT format_sql('SELECT * FROM not_there'); ERROR: you have provided an invalid string: 42P01 / relation 'not_there' does not exist CONTEXT: PL/pgSQL function format_sql(text) line 19 at RAISE |
This can be seen as a feature or as a problem - it depends on what you are trying to achieve. In addition to that comments will be removed. There is no way to prevent that from happening.
If you want to learn more about views in PostgreSQL, I highly encourage you to check out our post about views and row-level-security.
If you want to read more about views and dependency tracking check out our post about that.