
By Kaarel Moppel - I could make good use of the "slow time" around the turn of the year and managed to push out another set of features for our Open Source PostgreSQL monitoring tool pgwatch2, so here's a quick overview of the changes. Continuing the tradition, I'm calling it "Feature Pack 4", as it's mostly about new features. The Git tags and Docker images carry version number 1.5.0. Since our last pgwatch2-related blog post covered only v1.4.0, I'll also include the most important items from the v1.4.5 minor feature release.

Highlight - Monitoring Postgres with Postgres

This is the biggest one this time: finally, and quite appropriately for "Postgres-minded" people, there's now an option to store all the gathered metrics in Postgres! This doesn't necessarily mean that Postgres is the best fit for storing time-series data, although it performs very nicely thanks to JSONB. In general it's a good compromise: more disk space (~3-4x) at query times comparable to InfluxDB, but with the full power of SQL, which means no time spent learning a new (and quite limited) query language. After all, only a few people are running dozens and dozens of databases, so performance is mostly not an issue. On the plus side, we can now ask questions that were previously plainly not possible (no joins, remember) or were only possible by storing some extra columns of data (de-normalizing).

The new functionality is designed for the latest Postgres version, v11, but since people run all kinds of different versions and might not want to set up a new cluster, there is also a legacy mode, which costs more IO though. In total there are 4 different "schema modes", so that people can optimize their IO according to their needs.

For the partitioned modes there is also automatic "retention management" – by default 1 month of data is kept. Partitions live in a separate "subpartitions" schema, top level tables are in "public".

To test it out fire up a Docker container with something like:
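The exact command from the original post isn't reproduced here; a minimal sketch, assuming the cybertec/pgwatch2-postgres image name and the standard pgwatch2 port layout (Grafana on 3000, admin Web UI on 8080):

```
# Sketch only - image name and port mappings are assumptions based on the
# standard pgwatch2 Docker setup
docker run -d --name pw2 \
    -p 3000:3000 -p 8080:8080 \
    cybertec/pgwatch2-postgres
```

Once the container is up, the Grafana dashboards should be reachable on port 3000 and the admin Web UI on port 8080.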

Call for feedback

And as always, please do let us know on GitHub if you’re still missing something in the tool or are experiencing difficulties - any feedback would be highly appreciated!

Project GitHub link – here.
Full changelog – here.

Most important changes for v1.5.0

There are multiple (4) storage schema types supported, so even legacy PG versions and custom needs should be covered. PG 11+ is needed to use time-based partitioning, though. It comes with automatic "retention policy" enforcement (given the gatherer is running).

A by-product of testing the Postgres metrics storage, this helps to quickly estimate metrics data volumes under real-life conditions, to see if your hardware or the selected storage "schema type" can handle the planned amount of metrics. Metrics are fetched once from a user-specified DB, and the data is then inserted multiple times at the correct intervals for the simulated host count. Some scripts are also provided to generate bigger data amounts faster and to test the speed of typical dashboard queries.

Previously, monitoring user passwords were stored in plain text in both "config DB" and "YAML" mode. Now an encryption key or keyfile can be provided to the gatherer and the Web UI to transparently encrypt passwords. The default will remain plain text, as pgwatch2 is presumably mostly used in safe environments or for ad-hoc troubleshooting, and it's simply more convenient that way.

Previously only 'disabled' and 'require' were supported. Certificates need to be present on the machine where the gatherer is running.

For example, the "pg_stat_database_conflicts" view is always empty on a primary, so it makes no sense to query it there. This should result in fewer errors both in the Postgres and the pgwatch2 logs.

Previously the "public" schema was assumed for extensions and "metric fetching helpers", but now no such assumption is made, allowing any schema – the user just needs to make sure that the monitoring role has its "search_path" set correctly. Also, no more "public" grants for helpers, only grants for the monitoring role.

Paths need to be adjusted manually both for the Gatherer and Web UI. Thanks @slardiere!

50% less internal message passing over channels, meaning much better performance when monitoring 50+ DBs.

Also, when metric definitions are not found and in similar cases, only one error per hour is logged.

Some new metrics ("wal_size") have been added and "db_size" has been split out from "db_stats". Gathering intervals for some less-used metrics have also been increased in the preset configs.

Most important changes for v1.4.5

Two different metric queries can now run simultaneously on a single monitored DB.

All the most important health indicators on a single page, with green / yellow / red indicators.

Top scanned, un-used, biggest, duplicate and invalid indexes overview.

Top-N by size and growth/scan and INSERT/UPDATE/DELETE rates.

 

pgwatch2 is constantly being improved and new features are added.

Screenshots of new Dashboards

"Health-check" dashboard
"Index overview" dashboard
"Top tables" dashboard

Security is an important topic. This is not only true in the PostgreSQL world – it holds true for pretty much any modern IT system. Databases, however, have special security requirements. More often than not, confidential data is stored, and therefore it makes sense to ensure that the data is protected properly. Security first! This blog post describes the extension pg_permissions, which helps you to get an overview of the permissions in your PostgreSQL database.

PostgreSQL: Listing all permissions

Gaining an overview of all permissions granted to users in PostgreSQL can be quite difficult. However, if you want to secure your system, gaining an overview is really everything – it can be quite easy to forget a permission here and there and fixing things can be a painful task. To make life easier, Cybertec has implemented pg_permissions (https://github.com/cybertec-postgresql/pg_permissions). There are a couple of things which can be achieved with pg_permissions:

In short: pg_permissions can do more than just listing what there is.

However, let us get started with the simple case – listing all permissions. pg_permissions provides a couple of views, which can be accessed directly once the extension has been deployed. Here is an example:
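Deployment works as for any other PostgreSQL extension; a minimal sketch:

```sql
-- Install the extension in the current database
CREATE EXTENSION pg_permissions;
```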

The easiest way is to use the "all_permissions" view to gain an overview of EVERYTHING. However, if you are only interested in functions, tables, columns, schemas and so on, there are more specific views you can use. "all_permissions" will simply show you all there is:
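The original output isn't reproduced here; a hedged sketch of such a query - the exact column names are assumptions based on the extension's README, and "appuser" is a made-up role name:

```sql
-- Column names (object_type, role_name, schema_name, object_name, permission)
-- are assumptions; check \d all_permissions on your installation
SELECT object_type, schema_name, object_name, permission
FROM all_permissions
WHERE role_name = 'appuser';
```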

PostgreSQL: Detecting security issues with pg_permissions

Securing your application is not too hard while it is small. However, as your data model changes, small errors and deficiencies might sneak in, which can cause severe security problems in the long run. pg_permissions has a solution to that problem: you can declare how the world is supposed to be. What does that mean? Here is an example: "All bookkeepers should be allowed to read data in the bookkeeping schema", or "Everybody should have USAGE permissions on all schemas". What you can do now is to compare the world as it is with the way you want it to be. Here is how it works:
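A sketch of declaring the desired state - the permission_target table comes from the extension, but its exact column list is an assumption here, and the role and schema names are made up for illustration:

```sql
-- Hypothetical example: role "appuser" should have USAGE
-- on the hypothetical schema "appschema"
INSERT INTO permission_target
    (role_name, permissions, object_type, schema_name)
VALUES
    ('appuser', '{USAGE}', 'SCHEMA', 'appschema');
```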

The user also needs USAGE privileges on the appseq sequence in that schema:
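A sketch of recording that requirement as well - again, the exact permission_target columns are assumptions, and the names are hypothetical:

```sql
-- The same role should have USAGE on the appseq sequence
INSERT INTO permission_target
    (role_name, permissions, object_type, schema_name, object_name)
VALUES
    ('appuser', '{USAGE}', 'SEQUENCE', 'appschema', 'appseq');
```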

You will instantly get an overview and see which differences exist between your desired state and your current state. By checking the differences directly during your deployment process, our extension allows you to react and fix problems quickly.

Changing permissions as fast as possible

Once you have figured out which permissions there are, which ones might be missing and which ones are wrong, you might want to fix things. Basically, there are two choices: you can fix things by hand and assign permissions one by one, which can be quite a pain and result in a lot of work. So why not just update your "all_permissions" view directly? pg_permissions allows you to do exactly that: you can simply update the views, and pg_permissions will execute the desired changes for you (firing GRANT and REVOKE statements behind the scenes). This way you can change hundreds or even thousands of permissions with a simple UPDATE statement. Securing your database has never been easier.
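A hedged sketch of such a mass update - the "granted" column is an assumption about how the extension's updatable views are structured, and the role and schema names are hypothetical:

```sql
-- Grant every currently missing SELECT permission on tables in the
-- hypothetical schema "appschema" to the hypothetical role "appuser"
UPDATE all_permissions
SET granted = TRUE
WHERE role_name = 'appuser'
  AND object_type = 'TABLE'
  AND schema_name = 'appschema'
  AND permission = 'SELECT';
```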

Many people are struggling with GRANT and REVOKE statements. Therefore, being able to use UPDATE might make life easier for many PostgreSQL users out there.

Making pg_permissions even better

We want to make pg_permissions even better: if there are any cool ideas out there, don’t hesitate to contact us anytime. We are eagerly looking for new ideas and even better concepts.

Further reading

For more info about security and permissions in PostgreSQL, read our other blog posts.

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

pgBouncer authentication can be like a bouncer at a discotheque who has to ask for the password himself
© Laurenz Albe 2023

UPDATED August 2024
pgbouncer is the most widely used connection pooler for PostgreSQL.
This blog will provide a simple cookbook recipe for how to configure user authentication with pgbouncer.

I wrote this cookbook using Fedora Linux and installed pgbouncer using the PGDG Linux RPM packages available from the download site.
But it should work pretty similarly anywhere.

What is a connection pooler?

Setting max_connections to a high value can impact performance and can even bring your database to its knees if all these connections become active at the same time.

Also, if database connections are short-lived, a substantial amount of your database resources can be wasted just opening database connections.

To mitigate these two problems, we need a connection pooler. A connection pooler is a proxy between the client and the database: clients connect to the connection pooler, which handles the SQL requests via a relatively stable set of persistent database connections (the “connection pool”).

Since clients connect to pgbouncer, it will have to be able to authenticate them, so we have to configure it accordingly.

The very simple method (authentication file)

This method is useful if the number of database users is small and passwords don't change frequently.

For that, we create a configuration file userlist.txt in the pgbouncer configuration directory (on my system /etc/pgbouncer).
The file contains the database users and their passwords, so that pgbouncer can authenticate the client without resorting to the database server.
It looks like this:
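The file has one user per line, with the user name and the password (or password hash) each in double quotes. A sketch with placeholder hashes:

```
"bouncer" "md5<32 hexadecimal characters>"
"appuser" "md5<32 hexadecimal characters>"
```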

You can write the file by hand using the information from the pg_shadow catalog table, or you can create it automatically.
For that to work, you need superuser access, since the password hashes in pg_shadow are only visible to superusers.

Then you can simply create the file like this:
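A sketch of such a command - it assumes superuser access and a passwordless connection; adjust the connection options as needed:

```
# Dump user names and password hashes from pg_shadow in userlist.txt
# format; requires superuser privileges
psql -Atq -U postgres -d postgres \
     -c "SELECT concat('\"', usename, '\" \"', passwd, '\"') FROM pg_shadow" \
     > /etc/pgbouncer/userlist.txt
```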

Once you have created userlist.txt, add the following to the [pgbouncer] section of /etc/pgbouncer/pgbouncer.ini:
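The relevant settings are auth_type and auth_file:

```
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
```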

Done!

Note: the above example uses md5 authentication, which is pretty much obsolete now. Instead, you should be using scram-sha-256. For that to work, you have to use a password hashed with scram-sha-256 in userlist.txt. You cannot use the trick above to generate these hashes. Instead, query the catalog table pg_authid for the actual password hashes. See my article about moving from md5 authentication to scram-sha-256 for more information.

The slightly more complicated method (querying the database)

If users and passwords change frequently, it would be annoying to have to change the user list all the time.
In that case it is better to use an “authentication user” that can connect to the database and get the password from there.

You don't want everyone to see your database passwords, so we give access to them only to this authentication user.
Using psql, we connect to the database as superuser and run the following:
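A sketch of that setup - the function name user_lookup and its signature follow a common pattern for pgbouncer's auth_query, but are assumptions, not verbatim from the original post:

```sql
-- Create the authentication user that pgbouncer will connect as
CREATE ROLE pgbouncer LOGIN PASSWORD 'secret';

-- SECURITY DEFINER makes the function run with the privileges of its
-- owner (a superuser), so that it may read pg_shadow
CREATE FUNCTION user_lookup(uname text, OUT usename name, OUT passwd text)
RETURNS record
LANGUAGE sql SECURITY DEFINER SET search_path = pg_catalog AS
$$SELECT usename, passwd FROM pg_shadow WHERE usename = $1$$;

-- Nobody except the authentication user should be able to call it
REVOKE EXECUTE ON FUNCTION user_lookup(text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION user_lookup(text) TO pgbouncer;
```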

pg_shadow is only accessible to superusers, so we create a SECURITY DEFINER function to give pgbouncer access to the passwords.

Then we have to create a userlist.txt file as before, but it will only contain a single line for user pgbouncer.

The configuration file in /etc/pgbouncer/pgbouncer.ini should look like this:
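A sketch of that configuration - auth_user and auth_query are real pgbouncer settings, while the query shown assumes a SECURITY DEFINER lookup function named user_lookup exists in each target database (the name is an assumption):

```
[pgbouncer]
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
auth_user = pgbouncer
auth_query = SELECT usename, passwd FROM user_lookup($1)
```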

Now whenever you authenticate as a user other than pgbouncer, the database will be queried for the current password of that user.

Since the auth_query connection will be made to the destination database, you need to add the function to each database that you want to access with pgbouncer.

Advanced authentication with pg_hba.conf

You can determine which connections pgbouncer will accept and reject using a pg_hba.conf file like in PostgreSQL, although pgbouncer only accepts a subset of the authentication methods provided by PostgreSQL.

To allow connections only from two application servers, the file could look like this:
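A sketch of such a file, using documentation IP addresses as stand-ins for the two application servers:

```
# TYPE  DATABASE  USER  ADDRESS           METHOD
host    all       all   192.0.2.10/32     scram-sha-256
host    all       all   198.51.100.20/32  scram-sha-256
```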

If the file is called /etc/pgbouncer/pg_hba.conf, your pgbouncer.ini would look like this:
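The corresponding fragment:

```
[pgbouncer]
auth_type = hba
auth_hba_file = /etc/pgbouncer/pg_hba.conf
```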

You can of course also use auth_query in that case.

Further References


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

I once received an email with the question: Can you tell me why I can't select the column references?

Well, the quick answer is: because REFERENCES is a keyword, you should use double quotes around it, e.g.

=# SELECT "references" FROM v_table_relation;

But why do you sometimes need to explicitly quote identifiers while sometimes it works without them? How do you know exactly which keywords are used in PostgreSQL? And which keywords may be used as object names? And so on...

As usual, one may find the answers in the PostgreSQL manual:
www.postgresql.org/docs/current/sql-keywords-appendix.html

There we have a detailed table that lists all tokens that are key words in the SQL standard and in PostgreSQL. From the manual you will learn about reserved and non-reserved tokens, SQL standard compatibility and much more that is out of scope for this post.

There are two more ways to find out which tokens can or cannot be used in certain situations. The first is for true programmers, the second for true admins. Choose your pill, Neo.

Investigating the sources

PostgreSQL uses a LALR(1) parser to process SQL statements. The implementation of the grammar (the parser itself) may be found in gram.y, and the implementation of the lexical analyzer (lexer) in the scan.l file.

A query is first chewed up by the lexer, which splits the text into tokens and sends them to the parser. It is logical to assume that the lexer is the first to know whether a token is a keyword or not. Examining its source, we find lines like these:
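The relevant lines aren't quoted verbatim here; a simplified, abridged sketch of the {identifier} rule in scan.l:

```
{identifier}    {
                    const ScanKeyword *keyword;

                    /* Is it a keyword? */
                    keyword = ScanKeywordLookup(yytext,
                                                ScanKeywords,
                                                NumScanKeywords);
                    if (keyword != NULL)
                        return keyword->value;

                    /* No: it is an ordinary identifier */
                    ...
                }
```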

So, the lexer knows whether an identifier is a keyword, but this doesn't give us much, since during lexical analysis we lack context. And that is what the parser takes care of.

Our journey started with a simple SELECT statement producing an error, so let's examine gram.y from top to bottom. Our route will be like this:

stmtblock -> stmtmulti -> stmt -> CreateStmt

Here we see that our columns are listed in an OptTableElementList node, so let's dig into it:

OptTableElementList -> TableElementList -> TableElement -> columnDef -> ColId
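The ColId production in gram.y shows which keyword classes are acceptable as column identifiers (abridged from the sources):

```
ColId:      IDENT                   { $$ = $1; }
            | unreserved_keyword    { $$ = pstrdup($1); }
            | col_name_keyword      { $$ = pstrdup($1); }
        ;
```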

Bam! We found the right place!

By the way, I'm absolutely sure PostgreSQL sources are the best academic reading in the world. From here we see that we want to accept as many "reserved" keywords as possible, but this creates difficulties in different places by causing grammar conflicts (mostly reduce/reduce, because shift/reduce conflicts are solved automatically). Thus, to prevent these errors, the developers divided keywords into several classes, e.g. "unreserved_keyword" and "col_name_keyword". And only one step is left: to find these classes. Piece of cake! Let's check the "unreserved_keyword" node:
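The node is simply a long list of alternative tokens (abridged from the sources):

```
unreserved_keyword:
              ABORT_P
            | ABSOLUTE_P
            | ACCESS
            | ACTION
            /* ... hundreds more ... */
        ;
```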

Asking the server

There is one more way of finding out how restrictive keyword usage is: execute the query

SELECT * FROM pg_get_keywords();

The function pg_get_keywords() returns the list of SQL keywords and their categories:

The catcode column contains a category code: U for unreserved, C for column name, T for type or function name, or R for reserved. The catdesc column contains a possibly-localized string describing the category.
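For example, to check the problematic keyword from the beginning of this post:

```sql
SELECT * FROM pg_get_keywords() WHERE word = 'references';
```

REFERENCES comes back with category code R (reserved), which is exactly why it needed quoting in the column position.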

If you need assistance - please feel free to contact us!
