Manage encryption keys with PostgreSQL TDE

PostgreSQL TDE is a version of PostgreSQL which offers on-disk encryption for PostgreSQL. Currently we at CYBERTEC maintain this piece of software for public use. What we have noticed recently is that people kept asking about how to integrate PostgreSQL TDE with existing key management solutions. To help you manage encryption keys with PostgreSQL TDE, we have decided to outline the basic concepts and help people to implement key management strategies for PostgreSQL to secure their infrastructures.

PostgreSQL TDE: How encryption keys are handled

Running PostgreSQL TDE is easy. However, it requires some basic understanding of how we handle keys in general. The following image shows the basic architecture of how to manage encryption keys using our security-enabled version of PostgreSQL:

Manage encryption keys with TDE

Let’s focus on each of these parts separately.

Deploying a PostgreSQL instance

When a new PostgreSQL instance is created, initdb is executed behind the scenes. Most systems do that for you when the service is started for the first time. The purpose of initdb is to create the initial content of the data directory, deploy a config file and so on.

The point is: In the case of TDE (= Transparent Data Encryption), the data files on disk have to be encrypted. The natural question which arises is: How can we find the key? It’s not a good idea to store the key directly in the data directory, so we have to be a little creative to handle keys properly.

When we developed PostgreSQL TDE, we decided on a simple plugin architecture. Key management can be quite diverse. Many different keystores exist and the requirements of every single company differ depending on use cases, philosophy, guidelines and sometimes even due to legal requirements.

The way PostgreSQL TDE fetches the key is by using a plugin which is an executable we have to point to:

[tde@cybertec]$ initdb --help

initdb initializes a PostgreSQL database cluster.

Usage:

  initdb [OPTION]... [DATADIR]

Options:

...

  -k, --data-checksums      use data page checksums

  -K, --encryption-key-command

                            command that returns encryption key

The -K option requires you to point to a script that returns the key as a string. Note that this key is never visible to anybody but the database server. It does not show up in a process table. It does not show up anywhere. The beauty of this design is that you can write any plugin for any key manager of your choice. For testing purposes, you can even call up the command line and make the administrator type in the key. In the real world, almost no-one does that – but even that approach is possible.

As stated: Provide a script or an executable that echoes the key to stdout. PostgreSQL will read it and use it to encrypt your data files. Having this script gives you total flexibility to integrate with every possible kind of key management.

Here is an example:

[tde@cybertec]$ ./initdb -D /path/db -K /path/key_manager.sh

The files belonging to this database system will be owned by user “hs”.

This user must also own the server process.

The database cluster will be initialized with locales

  COLLATE:  en_US.UTF-8
  CTYPE:    en_US.UTF-8
  MESSAGES: en_US.UTF-8
  MONETARY: de_DE.UTF-8
  NUMERIC:  de_DE.UTF-8
  TIME:     de_DE.UTF-8

The default database encoding has been set accordingly, to “UTF8”.

The default text search configuration will be set to “english”.

Data page checksums are disabled.

Data encryption is enabled.

creating directory /path/db ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Europe/Vienna
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

[tde@cybertec]$ ./pg_ctl -D /path/db -l logfile start

The -K option is really the path to success here. In the simplest of all cases the script simply returns a constant:

[tde@cybertec]$ cat /path/key_manager.sh 
#!/bin/sh
echo 4e5358ab309bcdea23450934546298ab

But, as stated earlier, this is really just a showcase. DO NOT store the key on the same server – instead, integrate it with a proper key store and manage the key remotely. Otherwise, TDE would be pointless and simply a waste of resources.

What is really important to note here is that the key has to be in hex and it has to be 32 characters long. Otherwise initdb will error out.

initdb: fatal: encryption key is too short, should be a 32 character hex key

For using secrets that humans can memorize there is a bundled pg_keytool utility that takes as input a passphrase and uses a key derivation function to form a correctly formatted output key. By using a key derivation function that takes a non-trivial amount of time to calculate, this adds an extra layer of security against dictionary attacks.

Once the instance has been deployed, we can actually start PostgreSQL normally.

Starting PostgreSQL

What initdb has done for us is to adjust the relevant values in postgresql.conf:

# - Encryption -

encryption_key_command = '/path/key_manager.sh'

Every time the server starts up, it will run the same program to fetch the key. Remember, we need this key to decrypt those data files. The server has already taken care to ensure that nothing can go wrong – assuming that the program does indeed return the correct key. If the key cannot be provided, then the server cannot be started. There is NO BACKDOOR, no emergency key or anything of that kind. If you cannot provide the correct key your data is lost. That is the entire idea of TDE. We do not want backdoors.

Ensuring encryption is enabled

People keep asking: How can I figure out if my PostgreSQL instance is indeed encrypted? The way to do that is to control pg_controldata:

[tde@cybertec]$ pg_controldata /path/db/
...
Database system identifier:           7036323720088499175
Database cluster state:               shut down
pg_control last modified:             Di 30 Nov 2021 12:29:41 CET
...
Mock authentication nonce:            f12fd3308a81f946ffbb36a0a3bd4d41c87bbb9ba4d612b809a001a2202cdc6a
Data encryption:                      on
Data encryption fingerprint:          D2D2CED8FE9F3980599289B1F468A9FB

At the end of the listing, you can find the encryption details. Make sure that you’re using the RIGHT version of pg_controldata. Otherwise, you won’t be able to find that information.

Popular keystores recommended for PostgreSQL TDE encryption keys

One of the main questions is: Which key store can one use for PostgreSQL? The answer is: Any of them. Here are some which are quite popular and widely used by many companies. Keep in mind that the list is not exhaustive:

A typical solution for a production system is to use a scheme where the database master key is generated randomly, then encrypted using a key from a key management system, and finally stored in the PostgreSQL data directory. This allows for keys to be rotated or even the key management system to be swapped out without having to re-encrypt the whole database.

Finally…

You can read more about the topic of encryption in our introduction to Transparent Layer Security (TLS).

Exciting news in the realm of GIS – we recently released a website that offers ready-to-use GIS data. If you are using PostGIS and OSM data, this is the place to go. We provide you with ready-made database dumps featuring the entire planet. Check out our new GIS site.