August 2016

By Kaarel Moppel - Walbouncer was covered on the blog when it was first announced, but that was almost 2 years ago. So it's a good time to echo it out again, especially in light of compatibility update to support PostgreSQL 9.5, addition of a helper script for bootstrapping replicas called Walbouncer-companion and making the walbouncer code available on GitHub.

About the walbouncer update

To recap the previous blogpost – Walbouncer is a standalone application acting as a replication proxy, developed by Cybertec. It stands between a master and a replica, enabling selective replication by forwarding data only for specific databases/tablespaces to the connected replicas, filtering out other data (replacing it with “no-op”-s). One Walbouncer can serve multiple replicas, with different database/tablespace configurations.

Main use case

The main use case for applying Walbouncer is in environments where many databases happen to be located on the same instance, but say for load-balancing or analytical query reasons, one needs a single database only. With the current means provided by the Postgres project, one would need to copy the whole instance with all of the databases! Thus disk space could become a serious problem. Also with Walbouncer the general system load on the replicas would sink as uninteresting WAL data wouldn’t be applied but just thrown away.

Steps for trying out Walbouncer

# Download and compile Walbouncer
git clone https://github.com/cybertec-postgresql/walbouncer.git
cd walbouncer/src && make

# Customize the sample config file by saying which databases/tablespaces should be replicated
vim ../walbouncer.conf.sample

./walbouncer -c ../walbouncer.conf.sample

# Download the helper Python script for creating a “selective basebackup” for cases when disk space is scarce or network is slow, so that only data files for databases/tablespaces that we’re interested in are copied. NB! When using tablespaces, one needs to ensure that identical paths exist on the replica also
# pg_basebackup could be used as well here, when the initial full size of the cluster is not a problem
git clone https://github.com/cybertec-postgresql/walbouncer-companion.git
cd walbouncer-companion
pip install -r requirements.txt
walbouncer_companion.py -c path_to_walbouncer/walbouncer.conf.sample –replica-name repl1 –pgdata filtered_replica/

# Create an customize a recovery.conf to point to the running Walboucer instance
cat  filtered_replica/recovery.conf
recovery_target_timeline = 'latest'
primary_conninfo = 'host=localhost port=5433 user=postgres application_name=repl1'

# Start the replication server. All done. When databases/tablespaces that are filtered out are connected to, user will just get an error
pg_ctl -D filtered_replica/ start

# Download and compile Walbouncer

git clone https://github.com/cybertec-postgresql/walbouncer.git

cd walbouncer/src && make

# Customize the sample config file by saying which databases/tablespaces should be replicated

vim ../walbouncer.conf.sample

./walbouncer -c ../walbouncer.conf.sample

# Download the helper Python script for creating a “selective basebackup” for cases when disk space is scarce or network is slow, so that only data files for databases/tablespaces that we’re interested in are copied. NB! When using tablespaces, one needs to ensure that identical paths exist on the replica also

# pg_basebackup could be used as well here, when the initial full size of the cluster is not a problem

git clone https://github.com/cybertec-postgresql/walbouncer-companion.git

cd walbouncer-companion

pip install -r requirements.txt

walbouncer_companion.py -c path_to_walbouncer/walbouncer.conf.sample –replica-name repl1 –pgdata filtered_replica/

# Create an customize a recovery.conf to point to the running Walboucer instance

cat filtered_replica/recovery.conf

recovery_target_timeline = 'latest'

primary_conninfo = 'host=localhost port=5433 user=postgres application_name=repl1'

# Start the replication server. All done. When databases/tablespaces that are filtered out are connected to, user will just get an error

pg_ctl -D filtered_replica/ start

Summary

Before full built-in logical replication is integrated into core PostgreSQL, using Walbouncer would currently be the easiest way to implement highlighted special scenarios, where having a full replica is not feasible. Walbouncer currently works with PostgreSQL 14 and any feedback or pull requests on GitHub would be very much appreciated.

In case you need any assistance, please feel free to contact us.

By Kaarel Moppel - In recent years, there has been quite a lot of buzz about “insert-only” approaches and some database-like products (Datomic, Apache Samza) have emerged, being inspired by the idea of having an immutable data store. In light of cheap storage and powerful hardware, I see the idea definitely having potential for certain use cases. So why not to apply the concept using a PostgreSQL database? There are no inherent limitations there. One should just find a suitable use case and design the data model accordingly.

My own interest was more towards performance of the approach in comparison to the standard “update” way, as in theory according to my understanding “insert-only” approach should provide some performance benefits as Postgres implementation of UPDATE is actually a DELETE+INSERT together with the possible update of all index entries in the worst-case scenario.

Pros and Cons

When doing only INSERT’s instead I would reckon that altogether fewer IO operations would be needed, bloat would be minimized (meaning no AUTOVACUUM hiccups), mostly sequential writing would happen (index pages could still split), a lot less full page images need to be written into XLOG. Plus on the logical level one would get the full history of changes making audit tables unnecessary.

Of course there would also be downsides – more disk space and additional indexing would be needed, queries could easily get unwieldy for joins etc, limited foreign key usage, for really big amounts of data one could need custom batch jobs for cleaning up old versions of rows still at some point, to name a few considerations.

Finding a suitable use case

So after some thinking about a good but simple test case for my theoretical hunch about possible performance benefits, I decided to model my “insert-only” schema to match the short version (using the –skip-some-updates flag) of the default “pgbench” TPC-B scenario. What the short version does, is that it leaves out balance updates on the smaller “pgbench_branches” and “pgbench_tellers” tables (where locking could occur, thus not maxing out IO), meaning that my customized “insert-only” script would become very simple, only doing inserts into the “pgbench_accounts”, leaving out also the insert on “pgbench_history” as we would already have the full history in the accounts table itself anyways.

Now to the course of testing itself. Files for converting the schema to insert-only mode and the custom “pgbench” test script can be found here.

# Initialize the normal “pgbench” schema with a small 1.3GB dataset (fits in RAM)
pgbench -i -s 100 &>/dev/null

# run the short version for 5min
pgbench -T 300 --skip-some-updates -c4 -j4 –protocol=prepared
starting vacuum...end.
transaction type: Update only pgbench_accounts
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 300 s
number of transactions actually processed: 1555859
latency average: 0.771 ms
tps = 5186.167970 (including connections establishing)
tps = 5186.221344 (excluding connections establishing)

# Initialize the read-only schema with a small 1.3GB dataset
pgbench -i -s 100 &>/dev/null

# convert the schema to insert-only mode
psql -f insert_only_data_model_changes.sql

# run the custome version for 5min
pgbench -s 100 -T 300 -c4 -j4  --protocol=prepared -f insert_only_tx_pgbench.sql
starting vacuum...end.
transaction type: Custom query
scaling factor: 100
query mode: prepared
number of clients: 4
number of threads: 4
duration: 300 s
number of transactions actually processed: 2228239
latency average: 0.539 ms
tps = 7427.369377 (including connections establishing)
tps = 7427.546212 (excluding connections establishing)

# Initialize the normal “pgbench” schema with a small 1.3GB dataset (fits in RAM)

pgbench -i -s 100 &>/dev/null

# run the short version for 5min

pgbench -T 300 --skip-some-updates -c4 -j4 –protocol=prepared

starting vacuum...end.

transaction type: Update only pgbench_accounts

scaling factor: 100

query mode: prepared

number of clients: 4

number of threads: 4

duration: 300 s

number of transactions actually processed: 1555859

latency average: 0.771 ms

tps = 5186.167970 (including connections establishing)

tps = 5186.221344 (excluding connections establishing)

# Initialize the read-only schema with a small 1.3GB dataset

pgbench -i -s 100 &>/dev/null

# convert the schema to insert-only mode

psql -f insert_only_data_model_changes.sql

# run the custome version for 5min

pgbench -s 100 -T 300 -c4 -j4 --protocol=prepared -f insert_only_tx_pgbench.sql

starting vacuum...end.

transaction type: Custom query

scaling factor: 100

query mode: prepared

number of clients: 4

number of threads: 4

duration: 300 s

number of transactions actually processed: 2228239

latency average: 0.539 ms

tps = 7427.369377 (including connections establishing)

tps = 7427.546212 (excluding connections establishing)

Test results

After looking at the TPS numbers for both runs we see that the performance benefit expressed in TPS was +43% here! And behind the scenes I actually also measured the amount of WAL generated, the difference there was even bigger with +135%.

To sum it up - although a very simplistic example here - “insert-only” data modelling provided a nice TPS boost for our test scenario, while providing exactly the same (and even more, full history) information. So in short it could be a worthwhile investigation when one has to deal with sudden peaks for example, as it should give you more predictable IO when applied correctly. A nice exercise in thought in any case.

You also may be interested in my post about insert-only data modelling to smooth peaks on slow disks.

By Kaarel Moppel - PostgreSQL has a bagful of server configuration parameters (249 according to my counting for version 9.5) at your disposal, which is mostly a good thing, as it enables you to take the maximum out of your hardware, if you’re willing to put in the necessary time. However, some of the parameters might leave the door open for misinterpretation, if you haven’t looked into the documentation properly. So here’s a short reminder about the behavior of the “log_statement” parameter’s “mod” mode, as I’ve seen it misinterpreted when trying to track down how some table rows got modified during the development phase.

When you’re editing the “postgresql.conf” file for your newly created database cluster, under “What to Log” section you’ll see something like that:

#log_statement = 'none'      # none, ddl, mod, all

1	#log_statement = 'none' # none, ddl, mod, all

Looking at the four possible options, it’s easy to think that setting it to “mod” will log all statements doing data modifications on your tables... but not quite so. It actually applies more to the statement type, covering INSERT, UPDATE, DELETE, TRUNCATE and some more exotic options. So what about a common case of calling a simple stored procedure that updates a row for example? Something like:

SELECT * FROM increment_login_count();

1	SELECT * FROM increment_login_count();

Will something be logged then? Nope. As it is a SELECT statement. In this case setting log_statements=‘all’ would do the trick, but most of the time you would want to avoid that for “production” use.

Baseline

In short - it’s only possible to log top level statements with everything happening down the stream (triggers, stored procedures code, CTE parts doing data modification) not logged. So one cannot think about PostgreSQL logging as “log all changes done to table rows” (this could generate billions of log lines easily) but rather “log statements issued by the client”. The parameter name “log_statement” actually hints at that also.

Have a nice day!

In case you need any assistance, please feel free to contact us.

Sometimes it happens that a query slows down dramatically for no obvious reason. In other cases queries go south in case a certain set of input parameters is used. In some of those situations, wrong optimizer estimates can be the root cause of all evil. One of my “favorite” issues with wrong planner estimates: Underestimated nested loops.

What are nested loops?

Nested loops are a means to join things. They are usually used if “one side” of the join consists of only a small set of data. Here is some pseudo code explaining how it works:

for x in a:
    for y in b:
        if a == y:
        return row

for x in a:

for y in b:

if a == y:

return row

Maybe you can join a table containing millions of people with a table containing only a handful of genders - this might be a good candidate for a nested loop.

What can go wrong

But what will happen if the optimizer underestimates the number of rows involved in a nested loop? Runtime will go through the roof. A nested loop is O(n^2) so runtime will grow very fast as the amount of data grows. Logically the consequences will be a disaster.

Here is an example:

Nested Loop (cost=18539.46..30587.66 rows=22 width=111)
    (actual time=160.149..291490.221 rows=35005 loops=1)

1 2	Nested Loop (cost=18539.46..30587.66 rows=22 width=111) (actual time=160.149..291490.221 rows=35005 loops=1)

In this example the optimizer thinks that 22 rows will be needed. However, in reality the nested loop ended up finding 35005 rows. Why is that critical? Just look at the “actual time”. It went up from 160 ms to a staggering 291490 ms. Runtime has simply exploded.

There are a couple of ways to attack the problem:

Fix estimates
Turn nested loops for the specific query off

As you can imagine a.) is definitely the best solution. However, in some cases this is not possible so running “SET enable_nestloop TO off” before the query might just fix the problem. However, make sure they are turned on again after the query in doubt has completed.

In case you need any assistance, please feel free to contact us.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

It happens on a quite regular basis that people contact the CYBERTEC PostgreSQL Support Desk to get support for a pretty common thing: Paginators on websites. What is the main issue? How hard can be it to display simple tables after all?

A typical scenario with paginators

Suppose you want to display a table on your website. You definitely don't want to display 1 million entries at once, so you turn to a pagination. What many people do is:

SELECT …
FROM tab
WHERE …
LIMIT 100

SELECT …

FROM tab

WHERE …

LIMIT 100

… and then …

SELECT count(*)
FROM tab
WHERE …

SELECT count(*)

FROM tab

WHERE …

What you have just seen is a classical recipe for disaster. Why that? The first query might be highly efficient and execute on milliseconds. However, what about the second one? Can anybody be sure that it also takes just milliseconds? No, because if the WHERE clause is not selective, the count might yield a really high number. Just recently our support team had a typical case: The second query took 9 (!) minutes to complete.

Therefore you have to ask yourself: Do you really want to wait for minutes to have an exact page count? The user has to wait minutes just to figure out that his query has yielded exactly 4.353.902.535 pages. How many people do you know, whoever checked more than maybe the first 5 pages?

To fix that problem you basically got two choices

use estimates
display the first 100 rows only

As far as estimates are concerned: Have you ever used Google? Google will stop providing you with answers after around 30 pages. Google does that for a reason: Nobody will ever go to page number 31. Everybody will refine his search long before.

So, estimates are fine. In PostgreSQL you can simply use EXPLAIN to see, how many rows PostgreSQL expects. This is fine in many cases - especially when result sets are very large.

The second option is a query like that:

SELECT …
FROM tab
WHERE …
LIMIT 101

SELECT …

FROM tab

WHERE …

LIMIT 101

Assuming that you want to display exactly 100 rows, you can always tell the user that there are more than 100 rows or more than, say, 10 pages. However, there is an even bigger advantage: Deterministic runtime. If you fetch 101 rows only - can execution times vary? Sure, they can but they will certainly stay within an acceptable range. If there is no limit, runtimes can theoretically be infinite.

Implications of slow queries:

Let us assume we got 1.000.000 queries, which need 1 ms each, and let us assume we got 1.000.000 queries, which need 1 second each.

In the first scenario, life is easy: A single CPU can do that in around 16 minutes.

In the second case 12 CPUs will be busy for a day.
So just imagine what will happen, if your “slow” query does not take 1 second but 1 minute? This is not uncommon in case you really want to count everything, to display an exact page count (which is usually done in grey, font size = 6).

In case you need any assistance, please feel free to contact us.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.