CYBERTEC PostgreSQL Logo

Shrink the storage footprint of PostgreSQL data

08.2014 / Category: / Tags: |

When writing data structures and table layouts, people usually don't care about the order of columns. Why should they anyway? Columns can be reordered inside a SELECT at will, so why care? Well, you should care because the order of columns might have significant impact on the size of your table. That's right: The size of the table might depend on the order of columns - even if the same data is stored.

Why is that so? Internally, there is something called CPU alignment which governs the internal size of a data structure. By cleverly aligning and ordering columns, there is some potential to optimize things. You don't believe it? Let us give it a try:

In this example there are 6 columns. 3 integer columns close to each other and 3 varchar columns also close to each other. Let us add 10 million rows to the table:

The overall size of the table is 574 MB:

Let us try to change the layout of those columns. In the next example there is a varchar field followed by an integer column. This layout is repeated three times:

If 10 million rows are added now …

The data in the table is the same - it has just been chosen in a way to demonstrate the effect. If I had used “abc” instead of “abcd” there would be no difference in size because the 4 character string made sure that the string just did not fit into a smaller space.

Conclusion

The important conclusion here is that it definitely makes sense, to pack similar data types together. And, I have found it useful to pack integer columns at the beginning of the tables. In many cases this can yield in some extra percentage points of speed - just because things are a bit smaller than if you don't make use of this little rule.

Get some great hints and tips about indexes in my post, "Find and Fix a Missing PostgreSQL Index".

0 0 votes
Article Rating
Subscribe
Notify of
guest
3 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Shaun Thomas
10 years ago

This is an important point. It's not something that jumps out as an obvious concern.

Sean Hs
Sean Hs
10 years ago

Thanks for the great post. I'm wondering if changing the varchar's to something like char's would make a difference? My understanding with the varchar datatype is that it's dynamically resized.

Jim Nasby
Jim Nasby
10 years ago
Reply to  Sean Hs

In Postgres, no. char() is stored as a varlena, just like varchar, numeric, bytea and some other data types. So it won't make any difference.

Do note that "char" is a single-character datatype which is not the same as char().

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram