Tag: statistics
pg_stat_io and PostgreSQL 16 performance
Learn about pg_stat_io‘s debugging power: PostgreSQL 16 blesses users around the world with many features which ensure an even better user experience. One of those features is a system view called pg_stat_io. It offers deep insights into the I/O behavior of your PostgreSQL database server. From PostgreSQL 16 onward, it will make it far easier […]
Enabling and disabling autovacuum in PostgreSQL
UPDATED August 2023: Autovacuum has been part of PostgreSQL for a long time. But how does it really work? Can you simply turn it on and off? People keep asking us these questions about enabling and disabling autovacuum. PostgreSQL relies on MVCC to handle concurrency in a multiuser environment. The problem which arises with concurrent […]
Estimating connection pool size with PostgreSQL database statistics
© Laurenz Albe 2021 PostgreSQL v14 has new connection statistics in pg_stat_database. In this article, I want to explore one application for them: estimating the correct size for a connection pool. New connection statistics in v14 Commit 960869da080 introduced some new statistics to pg_stat_database: session_time: total time spent by sessions in the database active_time: time […]
PostgreSQL: CREATE STATISTICS – advanced query optimization
PostgreSQL query optimization with CREATE STATISTICS is an important topic. Usually, the PostgreSQL optimizer (query planner) does an excellent job. This is not only true for OLTP but also for data warehousing. However, in some cases the optimizer simply lacks the information to do its job properly. One of these situations has to do with […]
PostgreSQL: ANALYZE and optimizer statistics
Our PostgreSQL 24×7 support team recently received a request from one of our customers who was facing a performance problem. The solution to the problem could be found in the way PostgreSQL handles query optimization (specifically, statistics). So I thought it would be nice to share some of this knowledge with my beloved readers. The […]
Generating a normal distribution in SQL
SQL and PostgreSQL are perfect tools to analyze data. However, they can also be used to create sample data which has to possess certain statistical properties. One thing many people need quite often is a normal distribution. The main question therefore is: How can I create this kind of sample data? Tablefunc: Creating normal distributions […]
Correlation of PostgreSQL columns explained
After you ANALYZE a PostgreSQL table to collect value distribution statistics, you will find the gathered statistics for each column in the pg_stats system view. This article will explain the meaning of the correlation column and its impact on index scans. Physical vs. logical ordering Most common PostgreSQL data types have an ordering: they support […]
3 ways to detect slow queries in PostgreSQL
(Last updated 18.01.2023) When digging into PostgreSQL performance, it is always good to know which options one has to spot performance problems, and to figure out what is really happening on a server. Finding slow queries in PostgreSQL and pinpointing performance weak spots is therefore exactly what this post is all about. There are many […]
Detect PostgreSQL performance problems easily
Detect PostgreSQL performance problems with ease: Is there a single significantly large and important database in the world which does not suffer from performance problems once in a while? I bet that there are not too many. Therefore, every DBA (database administrator) in charge of PostgreSQL should know how to track down potential performance problems […]
How a bad network configuration can cause table bloat
(Updated 2023-04-20) I recently had an interesting support case with table bloat that shows how the cause of a problem can sometimes be where you would least suspect it. Note: the problem I describe in this article can only occur in PostgreSQL v14 and below, since the statistics collector has been rewritten in v15. About […]
Visualizing data in PostgreSQL with R Shiny
R is a free Open Source programming language and statistics package that is excellent for data analysis, statistics and graphics. It has been developed in the 90s and its community has steadily grown over the years. Developers and analysts can use R to analyze data from many different sources including PostgreSQL. This article shows, how […]
Optimization issues: Cross column correlation
UPDATE Sept. 2023: For more recent information about planner statistics, see the PostgreSQL documentation about extended statistics, or this blog about CREATE STATISTICS. You can also find out more recent info in this blog about Improving Group By with Create Statistics. The blog post below about cross correlation has been preserved for reference. Planner estimates […]