[Cartoon: "Before there was TCP keepalive" (© Laurenz Albe 2021)]

(Updated 2023-06-22) If you've heard about TCP keepalive but aren't sure what that is, read on. If you've ever been surprised by error messages like these (typical examples; the exact wording depends on your PostgreSQL version and client):
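    server closed the connection unexpectedly
            This probably means the server terminated abnormally
            before or while processing the request.

    SSL SYSCALL error: EOF detected

    unexpected EOF on client connection with an open transaction

    could not receive data from client: Connection reset by peer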

then this article is for you.

Reasons for broken connections

There are several possible causes for broken connections:

Database server crashes

The first two messages in the above list can be the consequence of a PostgreSQL server problem. If the server crashes for whatever reason, you'll see a message like that. To investigate whether there is a server problem, you should first look into the PostgreSQL log and see if you can find a matching crash report.

We won't deal with that case in the following, since it isn't a network problem.

Connections abandoned by the client

If the client exits without properly closing the database connection, the server will get an end-of-file or an error while communicating on the network socket. With the new session statistics introduced in v14, you can track the number of such “abandoned” database connections in pg_stat_database.sessions_abandoned.
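A quick way to look at those counters is a simple query against the statistics view (available from v14 on):

    SELECT datname, sessions, sessions_abandoned
    FROM pg_stat_database
    WHERE datname IS NOT NULL;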

For example, if an application server fails and is restarted, it typically won't close the connections to the database server. That by itself isn't alarming: the database server will quickly detect the problem as soon as it tries to send data to the client. But if the database session is idle, the server process is waiting for the client to send the next statement (you can see the wait_event “ClientRead” in pg_stat_activity). Then the server won't immediately notice that the client is no longer there! Such lingering backend processes occupy a process slot and can cause you to exceed max_connections.
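You can list such candidate sessions like this (a minimal sketch; an idle session waiting in “ClientRead” is not necessarily dead, of course):

    SELECT pid, usename, state, state_change
    FROM pg_stat_activity
    WHERE backend_type = 'client backend'
      AND wait_event = 'ClientRead'
      AND state = 'idle';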

PostgreSQL v14 has introduced a new parameter idle_session_timeout which closes idle connections after a while. But that will terminate “healthy” idle connections as well, so it isn't a very good solution. TCP keepalive provides a much better solution to this problem.

Connections closed by a network component

Sometimes both ends of the database connection experience the same problem: each sees that the other end “hung up on them”. In that case, the problem lies somewhere between the database client and the server.

Network connections can get disconnected if there is a real connectivity problem. There's nothing you can do to change that on the software level. But very often, disconnections are caused by the way firewalls or routers are configured. The network component may have to “memorize” the state of each open connection, and the resources for that are limited. So it can seem expedient to “forget” and drop connections that have been idle for a longer time.

Since a lot of today's TCP traffic is via HTTP, and HTTP is stateless, that's not normally a problem. If your HTTP connection is broken, you simply establish a new connection for your next request, which isn't very expensive. But databases are different: a database session is long-lived, establishing a new connection is comparatively expensive (a server process has to be started, and the client has to authenticate), and the session carries state like open transactions, prepared statements and temporary tables that is lost when the connection breaks.

This is where TCP keepalive comes in handy as a way to keep idle connections open.

What is TCP keepalive?

Keepalive is a functionality of the TCP protocol. When you set the SO_KEEPALIVE option on a TCP network socket, a timer will start running as soon as the socket becomes idle. When the keepalive idle time has expired without further activity on the socket, the kernel will send a “keepalive packet” to the communication partner. If the partner answers, the connection is considered good, and the timer starts running again.

If there is no answer, the kernel waits for the keepalive interval before sending another keepalive packet. This process is repeated until the number of keepalive packets sent reaches the keepalive count. After that, the connection is considered dead, and attempts to use the network socket will result in an error.

Note that it is the operating system kernel, not the application (database server or client) that sends keepalive messages. The application is not aware of this process.

TCP keepalive serves two purposes: it can keep idle connections alive, so that no network component drops them, and it can detect dead connections in a timely fashion. The following sections deal with each of these in turn.

TCP keepalive default settings

The default values for the keepalive parameters vary from operating system to operating system. On Linux, MacOS and Windows, the usual defaults are (check your system, since distributions can differ):
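                          Linux     MacOS     Windows
    keepalive idle        7200 s    7200 s    7200 s
    keepalive interval    75 s      75 s      1 s
    keepalive count       9         8         10 (fixed)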

Thanks to Vahid Saber for the MacOS settings!

Using TCP keepalive to keep an idle database session alive

To keep firewalls and routers from closing an idle connection, we need a much lower setting for the keepalive idle time. Then keepalive packets get sent before the connection is closed. This will trick the offending network component into believing that the connection isn't idle, even if neither database client nor server send any data.

For this use case, keepalive count and keepalive interval are irrelevant. All we need is for the first keepalive packet to be sent early enough.

Using TCP keepalive to detect dead connections

For this use case, reducing the keepalive idle time is often not enough. If the server sends nine keepalive packets with an interval of 75 seconds, it will take more than 10 minutes before a dead connection is detected. So we'll also reduce the keepalive count, the keepalive interval, or both.
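For example, with a keepalive idle time of 60 seconds, an interval of 10 seconds and a count of 3, a dead connection would be detected after at most 60 + 3 × 10 = 90 seconds.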

There is still one missing piece to the puzzle: even if the operating system detects that a network connection is broken, the database server won't notice, unless it tries to use the network socket. If it's waiting for a request from the client, that will happen immediately. But if the server is busy executing a long-running SQL statement, it won't notice the dead connection until the query is done and it tries to send the result back to the client! To prevent this from happening, PostgreSQL v14 has introduced the new parameter client_connection_check_interval, which is currently only supported on Linux. Setting this parameter causes the server to “poll” the socket regularly, even if it has nothing to send yet. That way, it can detect a closed connection and interrupt the execution of the SQL statement.
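A minimal way to enable the check for all sessions (the interval is a matter of taste):

    ALTER SYSTEM SET client_connection_check_interval = '10s';
    SELECT pg_reload_conf();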

Setting TCP keepalive parameters on the PostgreSQL server

The PostgreSQL server always sets SO_KEEPALIVE on TCP sockets to detect broken connections, but the default idle timeout of two hours is very long.

You can set the configuration parameters tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count (the last one is not supported on Windows) to change the settings for all server sockets.
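For example, to get the values from the calculation above (a sketch; pick values that match your network):

    ALTER SYSTEM SET tcp_keepalives_idle = 60;
    ALTER SYSTEM SET tcp_keepalives_interval = 10;
    ALTER SYSTEM SET tcp_keepalives_count = 3;
    SELECT pg_reload_conf();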

This is the most convenient way to configure TCP keepalive for all database connections, regardless of the client used.

Setting TCP keepalive parameters on the PostgreSQL client

The PostgreSQL client shared library libpq has the connection parameters keepalives_idle, keepalives_interval and keepalives_count (again, the latter is not supported on Windows) to configure keepalive on the client side.

These parameters can be used in PostgreSQL connection strings with all client interfaces that link with libpq, for example, Psycopg or PHP.
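For example (host and database name are made up):

    psql 'host=dbserver.example.com dbname=appdb keepalives_idle=60 keepalives_interval=10 keepalives_count=3'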

The PostgreSQL JDBC driver, which does not use libpq, only has a connection parameter tcpKeepAlive to enable TCP keepalive (it is disabled by default), but no parameter to configure the keepalive idle time and other keepalive settings.
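For example (again with a made-up host and database):

    jdbc:postgresql://dbserver.example.com/appdb?tcpKeepAlive=true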

Setting TCP keepalive parameters on the operating system

Instead of configuring keepalive settings specifically for PostgreSQL connections, you can change the operating system default values for all TCP connections, which can be useful if you are using a PostgreSQL client application that doesn't allow you to set keepalive connection parameters.

On Linux, this is done by editing the /etc/sysctl.conf file, for example with the values used above:
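    net.ipv4.tcp_keepalive_time = 60
    net.ipv4.tcp_keepalive_intvl = 10
    net.ipv4.tcp_keepalive_probes = 3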

To activate the settings without rebooting the machine, run
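    sudo sysctl -p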

On older MacOS versions, you can also edit /etc/sysctl.conf, but the parameters are different (note that the values are in milliseconds):
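    net.inet.tcp.keepidle=60000
    net.inet.tcp.keepintvl=10000
    net.inet.tcp.keepcnt=3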

On newer MacOS versions (tested on 13), create the file /Library/LaunchDaemons/sysctl.plist with content along these lines (a minimal sketch that runs sysctl at boot time):
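    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
      <key>Label</key>
      <string>sysctl</string>
      <key>ProgramArguments</key>
      <array>
        <string>/usr/sbin/sysctl</string>
        <string>net.inet.tcp.keepidle=60000</string>
        <string>net.inet.tcp.keepintvl=10000</string>
        <string>net.inet.tcp.keepcnt=3</string>
      </array>
      <key>RunAtLoad</key>
      <true/>
    </dict>
    </plist>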

You will have to reboot to activate the changes.

On Windows, you change the TCP keepalive settings by adding these registry keys (values matching the example above):
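    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
        KeepAliveTime     = 60000
        KeepAliveInterval = 10000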

As noted above, there is no setting for the number of keepalive probes, which is hard-coded to 10. The registry keys must be of type DWORD, and the values are in milliseconds rather than in seconds.

After changing these keys, restart Windows to activate them.

Conclusion

Configuring TCP keepalive can improve your PostgreSQL experience, either by keeping idle database connections open, or through the timely detection of broken connections. You can configure keepalive on the PostgreSQL client, the server, or on the operating system.

In addition to configuring keepalive, set the new parameter client_connection_check_interval to cancel long-running queries when the client has abandoned the session.

 

To learn more about terminating database connections, see our blog post here.

If you would like to learn about connections settings, see my post about max_connections.


Both cursors and transactions are basic ingredients for developing a database application. This article describes how cursors and transactions interact and how WITH HOLD can work around their limitations. We will also see the dangers involved and how to properly use WITH HOLD cursors in a PL/pgSQL procedure.

Cursors in PostgreSQL

When a query is ready for execution, PostgreSQL creates a portal from which the result rows can be fetched. During normal query execution, you receive the whole result set in one step. In contrast, a cursor allows you to fetch the result rows one by one. A cursor marks a position within a result set. Cursors are particularly useful in procedural code on the client or in the database, because they allow you to loop through the query results. Another advantage is that a cursor allows you to have more than one SQL statement running at the same time, which is normally not possible in a single database session.

A simple example for PL/pgSQL code that uses a cursor would be a DO block that drops all tables whose names start with “tmp_” (a minimal sketch):
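    DO $$
    DECLARE
       c CURSOR FOR SELECT table_schema, table_name
                    FROM information_schema.tables
                    WHERE table_type = 'BASE TABLE'
                      AND table_name LIKE 'tmp\_%';
       v_schema text;
       v_name   text;
    BEGIN
       OPEN c;
       LOOP
          FETCH c INTO v_schema, v_name;
          EXIT WHEN NOT FOUND;
          EXECUTE format('DROP TABLE %I.%I', v_schema, v_name);
       END LOOP;
       CLOSE c;
    END;$$;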

In this example, the SELECT is executed concurrently with the DROP TABLE statements.

The above is not the most readable way to write this in PL/pgSQL (you could have used “FOR v_schema, v_name IN SELECT ... LOOP ... END LOOP;”, which uses a cursor “under the hood”), but I wanted to make the cursor explicit.

Note that it is often possible to avoid a cursor loop by using a join in the database. Such a join is more efficient, because it does all the work in a single statement. However, we have to use a cursor in our case, since we need to execute a dynamic SQL statement inside the loop.

Cursors and transactions

One basic property of a PostgreSQL cursor is that it only exists for the duration of a database transaction. That is not surprising, since a cursor is a single SQL statement, and an SQL statement is always part of one transaction. In the above example we had no problem, because a DO statement is always executed in a single transaction anyway.

Cursors are automatically closed at the end of a transaction, so it is usually not necessary to explicitly close them, unless they are part of a long-running transaction and you want to free the resources allocated by the statement.

Cursors in SQL

A special feature of PostgreSQL is that you can use cursors in SQL. You create a cursor with the DECLARE statement:
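    DECLARE name [ BINARY ] [ ASENSITIVE | INSENSITIVE ] [ [ NO ] SCROLL ]
        CURSOR [ { WITH | WITHOUT } HOLD ] FOR query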

Here is a short description of the different options:
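- BINARY: the cursor returns data in the internal binary format rather than as text
- ASENSITIVE / INSENSITIVE: determines whether data modifications after the cursor was opened are visible in the result set (see the section on cursor sensitivity below)
- SCROLL / NO SCROLL: determines whether you can move the cursor position backwards
- WITH HOLD / WITHOUT HOLD: determines whether the cursor survives the end of the transaction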

There is also an SQL statement FETCH that is more powerful than its PL/pgSQL equivalent, in that it can fetch more than one row at a time. Like PL/pgSQL, SQL also has a MOVE statement that moves the cursor position without retrieving rows.

SQL cursors are closed with the CLOSE statement, or by the end of the transaction.

SCROLL cursors

Some execution plans, like a B-tree index scan or a sequential scan, can be executed in both directions. A cursor for a query with such an execution plan is implicitly scrollable, that is, you can move the cursor position backwards in the result set. PostgreSQL calculates query result rows “on demand” and streams them to the client, so scrollable cursors for such queries come with no extra cost.

Other, more complicated execution plans require the explicit keyword SCROLL for the cursor to become scrollable. Such cursors incur an overhead, because the server must cache the entire result set.

Here is a little example that showcases scrollable cursors (a minimal sketch using generate_series):
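    BEGIN;

    DECLARE c SCROLL CURSOR FOR
       SELECT * FROM generate_series(1, 10);

    FETCH 3 FROM c;           -- returns 1, 2 and 3

    MOVE LAST IN c;           -- moves to the last row

    FETCH BACKWARD 2 FROM c;  -- returns 9 and 8

    COMMIT;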

Cursor sensitivity

The SQL standard distinguishes SENSITIVE, INSENSITIVE and ASENSITIVE cursors. A sensitive cursor reflects modifications of the underlying data; one consequence of this is that scrolling back to a previous row might fetch a different result. PostgreSQL does not implement sensitive cursors: that would be difficult, because a statement always sees a stable snapshot of the data in PostgreSQL.

PostgreSQL cursors are always insensitive, which means that changes in the underlying data after the cursor has started processing are not visible in the data fetched from the cursor. “Asensitive”, which means that the sensitivity is implementation dependent, is the same as “insensitive” in PostgreSQL.

Note that this insensitivity also applies if you modify a table via the special statements “UPDATE/DELETE ... WHERE CURRENT OF cursor”.

WITH HOLD cursors

Since WITH HOLD cursors live longer than a transaction, but statements don't, PostgreSQL must calculate the complete result set at COMMIT time and cache it on the server. This can result in COMMIT taking an unusually long time.

Moreover, WITH HOLD cursors are not automatically closed at the end of the transaction, so you must not forget to CLOSE them if you don't want the result set to hog server resources until the end of the database session.

Here is an example of a WITH HOLD cursor in action (a minimal sketch; note that it is the COMMIT that takes the time to materialize the result set):
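    BEGIN;

    DECLARE c CURSOR WITH HOLD FOR
       SELECT * FROM generate_series(1, 1000000);

    COMMIT;   -- this computes and caches the whole result set

    FETCH 3 FROM c;

    CLOSE c;  -- free the cached result set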

Cursors in PL/pgSQL

Cursors in PL/pgSQL are variables of the special data type refcursor. The value of such a variable is actually a string. That string is the name of the portal that is opened when a query is bound to the cursor variable and the cursor is opened.

Using refcursor variables, you can also pass cursors between PL/pgSQL functions or procedures (a minimal sketch):
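    CREATE FUNCTION open_cur() RETURNS refcursor
       LANGUAGE plpgsql AS
    $$DECLARE
       c CURSOR FOR SELECT table_name
                    FROM information_schema.tables;
    BEGIN
       OPEN c;
       RETURN c;   -- the value is the portal name, here "c"
    END;$$;

    CREATE FUNCTION fetch_from(cur refcursor) RETURNS text
       LANGUAGE plpgsql AS
    $$DECLARE
       v_name text;
    BEGIN
       FETCH cur INTO v_name;
       RETURN v_name;
    END;$$;

    SELECT fetch_from(open_cur());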

Cursor declarations in PL/pgSQL support SCROLL, but not WITH HOLD, for the historical reason that PostgreSQL functions always run inside a single transaction. Also, you can only FETCH a single row at a time from a PL/pgSQL cursor.

WITH HOLD cursors in PL/pgSQL procedures

Procedures, introduced in PostgreSQL v11, support transaction commands like COMMIT and ROLLBACK under certain circumstances. Consequently, it would be useful to have WITH HOLD cursors in procedures. There are two ways to work around the lack of WITH HOLD cursors in PL/pgSQL:
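- open the cursor with the SQL statement DECLARE ... WITH HOLD before you call the procedure, and pass it in as a refcursor parameter
- create the cursor inside the procedure with dynamic SQL, by running a DECLARE ... WITH HOLD statement with EXECUTE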

Here is sample code that illustrates the second technique (a minimal sketch, using a hypothetical table old_data):
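    CREATE PROCEDURE delete_old_data()
       LANGUAGE plpgsql AS
    $$DECLARE
       c refcursor := 'hold_cur';   -- the name of the portal we will create
       v_id bigint;
    BEGIN
       /* close a cursor that leaked from an earlier, failed call */
       IF EXISTS (SELECT FROM pg_cursors WHERE name = 'hold_cur') THEN
          CLOSE c;
       END IF;

       /* PL/pgSQL cannot declare WITH HOLD cursors, so use dynamic SQL */
       EXECUTE 'DECLARE hold_cur CURSOR WITH HOLD FOR
                SELECT id FROM old_data';

       /* end the transaction; the WITH HOLD cursor stays open */
       COMMIT;

       LOOP
          FETCH c INTO v_id;
          EXIT WHEN NOT FOUND;

          DELETE FROM old_data WHERE id = v_id;
          COMMIT;                   -- COMMIT does not close the cursor
       END LOOP;

       /* essential: nobody else will close the cursor for us */
       CLOSE c;
    END;$$;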

Note how the code takes care that the cursor cannot “leak” from the procedure: it closes the cursor when processing is done, and it also cleans up a leftover cursor from a previous, failed call!

Conclusion

Both cursors and transactions are well-known database features. Normally, cursors exist only within a single database transaction. But by using WITH HOLD, you can escape that limitation. Useful as this feature is, you have to be aware of the performance impact during COMMIT, and you have to make sure that you close the cursor to free the server's resources.

 



In this article, I will answer the questions: why isn't there a SHOW TABLES command in PostgreSQL, when will SHOW TABLES in PostgreSQL be available, and how do I list tables with native PostgreSQL methods?

Why isn't there a SHOW TABLES command in PostgreSQL?

People who come from MySQL are always asking the same question: why doesn't the command SHOW TABLES work in PostgreSQL?

First of all, there is a SHOW command in PostgreSQL, but it's responsible for returning the value of a run-time parameter.
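For example (output from a typical installation):

    SHOW work_mem;

     work_mem
    ----------
     4MB
    (1 row)

    SHOW tables;

    ERROR:  unrecognized configuration parameter "tables"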

Now you know what the error message "unrecognized configuration parameter" means: there is no run-time parameter with the name "tables".

By the way, we have a comprehensive article on how to set PostgreSQL parameters.

But how can it be that the same command means opposite things in the two most popular open-source RDBMS? The answer is quite obvious: the SHOW command is not part of the SQL ISO or ANSI standard. That allows anybody to use it as an extension to standard SQL.

When will SHOW TABLES be available in PostgreSQL?

I would say never! First, the SHOW command has its own semantics in PostgreSQL. Second, it's not part of the SQL standard. And probably it never will be, because the standard committee decided to use a different approach, called the Information Schema. However, even now, information schema support is not at the appropriate level in most databases.

How do I show tables in PostgreSQL?

Answers to this question may be found in a lot of places. I will just repeat them here.

The PostgreSQL way

If you're using the psql command-line utility, then try the \dt built-in command.

Mnemonic rule: dt = Describe Table.
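For example (table names and owner are made up):

    \dt

             List of relations
     Schema |   Name    | Type  | Owner
    --------+-----------+-------+-------
     public | customers | table | alice
     public | orders    | table | alice
    (2 rows)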

If you're using any other utility than psql, then these SQLs are probably the best to show tables in PostgreSQL:
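    SELECT schemaname, tablename
    FROM pg_catalog.pg_tables;

    SELECT nspname AS schemaname, relname AS tablename
    FROM pg_catalog.pg_class c
       JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind = 'r';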

But as you can see, they will list all the tables in the database, so you probably will need filtering anyway.

The ANSI standard way

As I said, there is a SQL-standard way to show tables in PostgreSQL by querying information_schema:
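    SELECT table_schema, table_name
    FROM information_schema.tables
    WHERE table_type = 'BASE TABLE';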

Conclusion

There is no universal way of doing common things in all databases. Even though the SQL standard declares the Information Schema to be the proper way, not every database manufacturer has implemented it to a decent level. If you're using the native command-line psql tool, you have built-in commands; otherwise, you need to query system tables and/or views. If you're using a GUI application, you probably don't have this problem: every known GUI application with PostgreSQL support allows you to list, filter, and find any objects.

In case you were wondering how to compare the content of two PostgreSQL tables, see Hans' post about creating checksums for tables.

In response to repeated customer requests seeking spatial datasets based on the OpenStreetMap service, CYBERTEC decided to start an initiative to address this demand.

CYBERTEC implemented a "download OpenStreetMap" service which periodically generates extracts of OpenStreetMap data in various forms and outputs the data as an SQL dump, to streamline and simplify its usage. Extracts cover a particular region of interest, typically aligned with the boundaries of countries or continents. For your convenience, we have already set suitable import parameters; the import variant states how your dataset was imported into our database. New variants can be created according to your wishes; contact us for more details.

The main entry point of our service, which is free of charge, is located at https://gis.cybertec-postgresql.com/.

Dataset import variants

Importing OpenStreetMap data into PostGIS can be a hard and time-consuming task. To simplify this process and speed up the import, we decided to kick off a service offering two dataset types, periodically served as SQL dumps.

The basis for all import variants are pure OSM extracts, provided as PBF files.

The first type of import variant is drawn from the main objective of OpenStreetMap, which is to solve spatial questions by analyzing and visualizing spatial data. There are endless ways to import and map OSM data to PostGIS in order to serve this purpose. To support most typical use cases while retaining flexibility, we decided to implement a generic import like that of OpenStreetMap Carto (https://github.com/gravitystorm/openstreetmap-carto). Incidentally, this enables us to use this dump seamlessly as a source for tiling services, such as https://switch2osm.org/serving-tiles/ or https://github.com/Overv/openstreetmap-tile-server.
A brief description of the import chain can be found at https://gis.cybertec-postgresql.com/osmtile/readme.txt.

The second type of import variant addresses the demand to solve various kinds of routing-related questions. This time, native datasets are imported utilizing osm2po (https://osm2po.de/), a great Java application which turns pure OpenStreetMap data into a routable graph. Ideally, the resulting table will be used together with pgRouting (https://pgrouting.org/), another terrific extension which brings geospatial routing functionality to PostgreSQL.
Check out the import chain at https://gis.cybertec-postgresql.com/osmrouting/readme.txt.

Dataset coverage

CYBERTEC offers extracts for the whole world. Since we don't see the need to re-invent the wheel, we align the datasets' spatial extents with the boundaries of countries or continents. To simplify access and downloads, we structured the datasets hierarchically, as shown in figure 1.

Figure 1: Dataset hierarchy

Dataset frequency

How often will our datasets be generated? We generate native extracts (PBF files) daily, and produce dumps one to two times per month. Currently, we are trying to optimize the whole process to provide more frequent, up-to-date dumps for our customers.

Dataset usage and import

So how can customers use our service? Let’s go through a typical customer scenario.

A customer plans to set up a tiling server which offers raster data covering Austria. Let’s assume the tiling infrastructure is already up and running; the missing link, however, is a database containing OpenStreetMap data. Instead of utilizing osm2pgsql to import the dataset, the customer can take a nice shortcut and directly download the appropriate SQL dump from our service 😊.

General instructions are as follows:

First, head to https://gis.cybertec-postgresql.com/ and choose your region of interest. Figure 2 shows the datasets currently available for Austria. Since we want to set up a tiling service, our dataset import variant of choice is “Analysis, Tiling”. Let’s download the latest dump for this region from https://gis.cybertec-postgresql.com/osmtile/europe/austria/osmtile_europe_austria_latest_compressed.dump

Figure 2: Dataset download and restore instructions

 

Second, we must import our dump. How can this be accomplished?

You can see instructions beside each published dump as readme.txt. For Austria, instructions can be accessed at https://gis.cybertec-postgresql.com/osmtile/europe/austria/readme.txt.

Let’s quickly look at the instructions to better understand how we must proceed. As a requirement, you must prepare the database with the extensions PostGIS and hstore enabled.
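In SQL, that amounts to (run as a superuser or a user allowed to create extensions):

    CREATE EXTENSION postgis;
    CREATE EXTENSION hstore;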

 

Finally, the dump can be restored with pg_restore, for example (assuming a custom-format dump and a target database called mydb):
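    pg_restore -d mydb osmtile_europe_austria_latest_compressed.dump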

 

This results in a new database schema osmtile_europe_austria.

The listing below shows the tables typically generated within our schema (the usual osm2pgsql tables; the exact contents may vary):
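    osmtile_europe_austria.planet_osm_point
    osmtile_europe_austria.planet_osm_line
    osmtile_europe_austria.planet_osm_polygon
    osmtile_europe_austria.planet_osm_roads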

Outlook

This article briefly introduces CYBERTEC's brand new OpenStreetMap download service. The service has been released recently, and we are very curious what customers think about it. Please leave a message at gis@cybertec.at to give us feedback, discuss further dataset import variants, or in case you need assistance. Check out this post if you want to know how to start out with PostgreSQL and PostGIS.


