target_session_attrs used in real life - well, almost real
© Laurenz Albe 2021

 

PostgreSQL commit ee28cacf61 has added new options for the target_session_attrs connection parameter. This article explores how the new options can be used to simplify high availability and horizontal scaling solutions.

What is target_session_attrs?

The PostgreSQL client shared library libpq has support for connection strings to more than one database server:

psql 'host=server1.my.org,server2.my.org port=5432,5432 dbname=mydb user=myself'

In the URL connection string format, that would look like

psql 'postgresql://myself@server1.my.org:5432,server2.my.org:5432/mydb'

Then the client will first try to connect to server1.my.org, and if it cannot reach that server, it will try to connect to server2.my.org.

This is useful in combination with high-availability solutions, because it allows the client to fail over to a secondary server without the need for additional load balancing software. However, there is one shortcoming: if we don’t know which of the servers is the currently active primary server, we might accidentally end up connecting to a standby server. This is usually not what you want, because you cannot run data modifying statements there.

To make up for this, you can specify the additional parameter “target_session_attrs”, which determines when a server is acceptable. If you specify

target_session_attrs=read-write

the PostgreSQL client will not accept a connection to a server where it cannot modify data. This allows you to connect to the streaming replication primary, regardless of the order of the servers in the connection string.

Improved options for target_session_attrs in v14

PostgreSQL v14 introduces these new options:

  • read-only: only connect to a server that does not allow data modifications. This is the converse of read-write.
  • primary: only connect to a database that is not a streaming replication standby.
  • standby: only connect to a streaming replication standby.
  • prefer-standby: connect to a standby server if there is one. Only if there is no standby server on the list, accept a connection to a server that is not in recovery mode.

You are probably confused by the distinction between “read-only” and “standby” or between “read-write” and “primary”.
The difference is subtle: if the parameter “default_transaction_read_only” is set to “on”, a server can be read-only even if it is not a streaming replication standby server.

Which client software can use target_session_attrs?

Since this functionality is part of the client C library libpq, all clients that link with that library automatically support this connection parameter. This includes the Python library psycopg2, drivers for Go, Rust and most other programming languages.

Notable exceptions to this are the PostgreSQL JDBC driver and the .NET provider Npgsql, which do not link with libpq. However, the JDBC driver already has support for this functionality with the “targetServerType” connection parameter, which you can set to “primary”, “secondary” and “preferSecondary” to achieve the same behavior. Npgsql does not offer this functionality.

Making use of target_session_attrs for high availability

The application is obvious: you always want to connect to the primary server. While target_session_attrs=read-write served the purpose fairly well, you should switch to using target_session_attrs=primary. This will do the right thing even in the unlikely case that somebody set default_transaction_read_only on the primary server.

Making use of target_session_attrs for horizontal scaling

You can use PostgreSQL streaming replication for horizontal scaling by directing read requests to standby servers. There are some limitations to this approach:

  • You can only scale read requests, not write requests.
  • Data modifications on the primary may not immediately be visible on the standby. The delay is normally small, but can increase through replication conflicts with queries on the standby server.
  • Your application has to be able to execute read requests on a different data source than write requests.

Other than that, you can use target_session_attrs to implement scaling without the need for additional load balancing software:

  • The connection string for write requests contains target_session_attrs=read-write.
  • The connection string for read requests contains target_session_attrs=prefer-standby. If you have more than one standby server, you could also use standby — then your application will fail rather than directing read requests to the primary server.

Conclusion

The new settings for target_session_attrs allow a more reliable high-availability setup and some degree of horizontal scaling without the need for additional software. This can simplify your architecture, thus making it more reliable.