When writing data structures and table layouts, people usually don’t care about the order of columns. Why should they anyway? Columns can be reordered inside a SELECT at will, so why care? Well, you should care because the order of columns might have significant impact on the size of your table. That’s right: The size of the table might depend on the order of columns – even if the same data is stored.

Why is that so? Internally, there is something called CPU alignment which governs the internal size of a data structure. By cleverly aligning and ordering columns, there is some potential to optimize things. You don’t believe it? Let us give it a try:

 

test=# CREATE TABLE t_test (
     i1       int,
     i2       int,
     i3       int, 
     v1       varchar(100),
     v2       varchar(100),
     v3       varchar(100)
);
CREATE TABLE 

 

In this example there are 6 columns. 3 integer columns close to each other and 3 varchar columns also close to each other. Let us add 10 million rows to the table:

 

test=# INSERT INTO t_test SELECT 10, 20, 30, 'abcd', 'abcd', 'abcd'
       FROM generate_series(1, 10000000);
INSERT 0 10000000

The overall size of the table is 574 MB:

test=# SELECT pg_size_pretty(pg_relation_size('t_test'));
 pg_size_pretty
----------------
 574 MB
(1 row)

 

Let us try to change the layout of those columns. In the next example there is a varchar field followed by an integer column. This layout is repeated three times:

 

test=# CREATE TABLE t_test (
     v1       varchar(100),
     i1       int,
     v2       varchar(100),
     i2       int,
     v3       varchar(100),
     i3       int
);
CREATE TABLE

If 10 million rows are added now …

 

test=# INSERT INTO t_test SELECT 'abcd', 10, 'abcd', 20, 'abcd', 30
       FROM generate_series(1, 10000000);
INSERT 0 10000000

… the table will be a lot larger:

test=# SELECT pg_size_pretty(pg_relation_size('t_test'));
 pg_size_pretty
----------------
 651 MB
(1 row)

 

The data in the table is the same – it has just been chosen in a way to demonstrate the effect. If I had used “abc” instead of “abcd” there would be no difference in size because the 4 character string made sure that the string just did not fit into a smaller space.

Conclusion

The important conclusion here is that it definitely makes sense, to pack similar data types together. And, I have found it useful to pack integer columns at the beginning of the tables. In many cases this can yield in some extra percentage points of speed – just because things are a bit smaller than if you don’t make use of this little rule.

Get some great hints and tips about indexes in my post, “Find and Fix a Missing PostgreSQL Index“.