Category: Uncategorized
Calculating differences between rows in SQL
Recently, some of our clients wanted to store time series in PostgreSQL. One question that seems to interest people in this area is how to calculate the difference between values in time series data. How can one calculate the difference between the current and the previous row? To answer this question I […]
Join strategies and performance in PostgreSQL
© Laurenz Albe 2020 There are three join strategies in PostgreSQL that work quite differently. If PostgreSQL chooses the wrong strategy, query performance can suffer a lot. This article explains the join strategies, how you can support them with indexes, what can go wrong with them and how you can tune your joins for better […]
PostgreSQL: ltree vs. WITH RECURSIVE
After my last post about ltree and recursive data in PostgreSQL, people have asked me privately about performance issues. To share this information, I decided to write a follow-up post that discusses the topic in a bit more detail. WITH RECURSIVE in PostgreSQL is efficient. However, ltree does have its strengths as […]
SQL trickery: Hypothetical aggregates
“If we had this data, what would it mean?” – these kinds of questions can be answered using plain SQL. The technique you will need in PostgreSQL is a “hypothetical aggregate”, which is of course part of the ANSI SQL standard. This post will show what a hypothetical aggregate is good for and how it […]
Wrapping Db2 with PostgreSQL
Since SQL/MED (Management of External Data) was implemented in PostgreSQL, hundreds of projects have emerged that try to connect PostgreSQL with other data sources. A simple search on GitHub for the keywords “postgres” + “fdw” is enough to see that. Sadly, not all extensions are well maintained, and as a consequence they are […]
Composite type performance issues in PostgreSQL
PostgreSQL is a really powerful database and offers many features to make SQL even more powerful. One of these impressive features is the composite data type. In PostgreSQL a column can be a fairly complex thing. This is especially important if you want to work with server-side stored procedures or functions. […]
Deduplication in PostgreSQL v13 B-tree indexes
© Laurenz Albe 2020 A while ago, I wrote about B-tree improvements in v12. PostgreSQL v13, which will come out later this year, will feature index entry deduplication as an even more impressive improvement. So I thought it was time for a follow-up. Deduplication for B-tree indexes: If the indexed keys for different table rows […]
PostgreSQL: Speeding up recursive queries and hierarchic data
A hierarchical query is an SQL query that handles hierarchical model data such as the structure of organizations, living species, and a lot more. All important database engines including PostgreSQL, Oracle, DB2 and MS SQL offer support for this type of query. However, in some cases hierarchical queries can come with a price tag. This […]
SQL trickery: Configuring windowing functions
Generating simple data sets. Before we get started, I want to introduce my favorite set-returning functions, which can help you to generate sample data: All we do here is simply to generate a list from 1 to 10 and print it on the screen. Let us play around with windowing a bit now: There are […]
Partition management – do you really need a tool for that?
Table partitioning, which speeds up queries and keeps tables manageable as data volumes grow, has been available in Postgres for a long time, with nicer declarative support since v10, so in general it’s a known technique for developers. But what is not so uniformly clear is the […]
How to count hits on a website in PostgreSQL
Recently we have covered “count” quite extensively on this blog. We discussed optimizing count(*) and also talked about “max(id) – min(id)”, which is of course a bad way to count data in any relational database (not just in PostgreSQL). Today I want to focus your attention on a different kind of problem and its solution: […]
Binary data performance in PostgreSQL
© Laurenz Albe 2020 A frequently asked question in this big data world is whether it is better to store binary data inside or outside of a PostgreSQL database. Also, since PostgreSQL has two ways of storing binary data, which one is better? I decided to benchmark the available options to have some data points […]