CYBERTEC Logo

A quick check on Postgres 10 Beta Analytics Performance vs version 9.6

06.2017 / Category: / Tags:

Some weeks ago the Beta 1 of upcoming Postgres 10 was released. Besides the very visible version numbering scheme change (no more X.Y.Z ) the release notes promised amongst other very cool and long-awaited features - like built-in Logical Replication and Table Partitioning - also many performance improvements, especially in the area of analytics. Release notes stated up to +40% for cases, with large number of rows (which could mean different things to different people of course), with an added note to test it out and report back. So now I found time to exactly do that and I'm again just laying out the numbers on running some analytical queries for you to evaluate.

Test Queries

Postgres has excellent analytical support, so making a choice on what to test exactly without going through the v10 Git changelog in details caused me halt for a moment, but I thought I'll keep it simple this time (will hopefully go deeper for the final Release Candidate) and I conjured up 3 quite simple SELECT queries based on the hints from release notes and on the schema generated by our good old friend pgbench.

For pbench "scale" I chose 100, meaning 10 Million rows (1.3GB) will be populated into the pgbench_accounts table. This gives enough rows but still fits into memory so that runtimes are more predictable.

Hardware & Server Config

I spun up a modest c3.xlarge (4 cores, Intel Xeon E5-2680 v2 @ 2.8GHz, 7.5GB RAM, RHEL 7.3) instance on AWS, but as we're not testing hardware it should be fine. The disk doesn't really matter here, as it's a read-only test and everything will fit into RAM. For additional testing fairness, my test script also performed queries in turns between 9.6 and 10 Beta to try to suppress the effects of possible background load jitter of a shared virtual machine.

For running Postgres, I used official Postgres packages from the YUM repo for both 9.6.3 and 10 Beta1, just changing the port to 5433 for 10 Beta.
Concerning server settings I left everything to defaults, except below changes on both clusters for reasons added as comments.

Results

After running my test script (1h of analytical queries for both clusters), which basically does all the work out of the box and can be found here, I got the below numbers. For generating the percentage differences btw, you could use this query here.

Query Instance Mean time (s) % Change Stddev time (s) % Change
SUM 9.6 9.74 3.86
SUM 10beta1 9.28 -5.0 2.05 -87.9
JOIN 9.6 5.05 2.13
JOIN 10beta1 5.98 +15.5 1.53 -39.0
CUBE 9.6 10.35 1.53
CUBE 10beta1 8.75 -18.3 1.93 21.0

So what do these numbers tell us? 2 queries out of 3 have improved – that's good of course. Any significant changes? Not so many. Only for the grouping sets query. But the advertised unique join feature improvement definitely did not show out, the opposite sadly. For consolation - at least it's more predictably slow – standard deviation fell by 40% 🙂

But to sum it up – the general impression is still positive as aggregate runtimes over all queries still improved by 5%. One could think that 5% is not much – but the fact is that Postgres has matured over decades and is already using very good algorithms and big improvements can come only from venturing into the "parallel worlds".

NB! Not to forget - it's a BETA release and these numbers don't mean too much in the long run.

A drop of tar on another front

After I was done with the analytical queries, I also got hungry for more and thought I'll check out how do normal read queries perform? So I also ran a quick "pgbench –select-only" (single index scans) tests for different client/scale settings... and actually noticed that on average the 10 Beta was consistently a couple of percent slower there when measured in TPS! Not a big difference of course, given that we're still dealing with beta software here... but it still made me a bit sceptical still. So if anyone has time, please test it out yourself (for example using my script here) and see if this was some setup glitch from my side or something else. Or just comment what you think of this test. Thanks a lot!

0 0 votes
Article Rating
Subscribe
Notify of
guest
2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
RustProof Labs
6 years ago

Great post, thanks for writing this.

I believe the calculations for % change in time for the JOIN and CUBE queries are incorrect. The JOIN query took 5.05 on the old, and 5.98 on the new. This calculation is now (5.98-5.05) / 5.05 = 0.1842 = +18.4%, instead of +15.5

The CUBE on the other hand was (8.75 - 10.35) / 10.35 = -0.1555 = -15.6%.

It looks like possibly those two numbers simply got transposed in the post.

Morten Sickel
Morten Sickel
6 years ago

I have a database of gpstracks of a bit over 1 million records. For each logpoint I need to find the logpoint as close as possible to 20 minutes after if there exists one within 19 - 21 minutes, so I do:

select g1.id as gpspointid_base, g2.id as gpspointid_follower, 20 as minutes
from gpspoint as g1
cross join lateral(
select id,datetime
from gpspoint as g2
where g2.datetime between g1.datetime+'19 minutes' and g1.datetime+'21 minutes' and g1.datafileid=g2.datafileid
order by g1.datetime+'20 minutes' g2.datetime asc
limit 1)
as g2(id,datetime)

This query needs up to 1.5 times as much time on 9.4 as on 10 on my system

By adding a where query that restricts g1, I have the following numbers in ms using timing in psql

rows pg 9.4 pg10
6782 1883 1263
25095 27180 20028
137699 812713 566355
Same experience with larger subsets, but I do not have the data available at the moment.

I do not say that your metrics are wrong for what you tested, but for my use case, I got a significant performance boost by using (the released version of ) postgresql 10

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    2
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram