PostgreSQL: LIMIT vs FETCH FIRST ROWS ... WITH TIES

Most people in the SQL and in the PostgreSQL community have used the LIMIT clause provided by many database engines. However, what many do not know is that LIMIT / OFFSET are off standard and are thus not portable. The proper way to handle LIMIT is basically to use SELECT … FETCH FIRST ROWS. However, there is more than meets the eye.

Table of Contents

LIMIT vs. FETCH FIRST ROWS

Before we dig into some of the more advanced features we need to see how LIMIT and FETCH FIRST ROWS can be used. To demonstrate this feature, I have compiled a simple data set:

test=# CREATE TABLE t_test (id int);
CREATE TABLE
test=# INSERT INTO t_test 
        VALUES 	(1), (2), (3), (3), 
                (4), (4), (5);
INSERT 0 7
test=# TABLE t_test;
 id
----
  1
  2
  3
  3
  4
  4
  5
(7 rows)

test=# CREATE TABLE t_test (id int);

CREATE TABLE

test=# INSERT INTO t_test

VALUES (1), (2), (3), (3),

(4), (4), (5);

INSERT 0 7

test=# TABLE t_test;

----

(7 rows)

Our data set has 7 simple rows. Let’s see what happens if we use LIMIT:

test=# SELECT * FROM t_test LIMIT 3;
 id
----
  1
  2
  3
(3 rows)

test=# SELECT * FROM t_test LIMIT 3;

----

(3 rows)

In this case, the first three rows are returned. Note that we are talking about ANY rows here. Whatever can be found first is returned. There is no special order.

The ANSI SQL compatible way of doing things is as follows:

test=# SELECT * 
           FROM  t_test 
           FETCH FIRST 3 ROWS ONLY;
 id
----
  1
  2
  3
(3 rows)

test=# SELECT *

FROM t_test

FETCH FIRST 3 ROWS ONLY;

----

(3 rows)

Many of you may never have used or seen this kind of syntax before, but this is actually the “correct” way to handle LIMIT.

However, there is more: What happens if NULL is used inside your LIMIT clause? The result might surprise you:

test=# SELECT * FROM t_test LIMIT NULL;
 id
----
  1
  2
  3
  3
  4
  4
  5
(7 rows)

test=# SELECT * FROM t_test LIMIT NULL;

----

(7 rows)

The database engine does not know when to stop returning rows. Remember, NULL is undefined, so it does not mean zero. Therefore, all rows are returned. You have to keep that in mind in order to avoid unpleasant surprises...

Fetch first ... rows with ties

WITH TIES has been introduced in PostgreSQL 13 and fixes a common problem: handling duplicates. If you fetch the first couple of rows, PostgreSQL stops at a fixed number of rows. However, what happens if the same data comes again and again? Here is an example:

test=# SELECT * 
           FROM  t_test 
           ORDER BY id 
           FETCH FIRST 3 ROWS WITH TIES;
 id
----
  1
  2
  3
  3
(4 rows)

test=# SELECT *

FROM t_test

ORDER BY id

FETCH FIRST 3 ROWS WITH TIES;

----

(4 rows)

In this case, we’ve actually got 4 rows, not just 3. The reason is that the last value shows up again after 3 rows, so PostgreSQL decided to include it as well. What is important to mention here is that an ORDER BY clause is needed, because otherwise, the result would be quite random. WITH TIES is therefore important if you want to include all rows of a certain kind - without stopping at a fixed number of rows.

Suppose one more row is added:

test=# INSERT INTO t_test VALUES (2);
INSERT 0 1
test=# SELECT * 
           FROM  t_test 
           ORDER BY id 
           FETCH FIRST 3 ROWS WITH TIES;
 id
----
  1
  2
  2
(3 rows)

test=# INSERT INTO t_test VALUES (2);

INSERT 0 1

test=# SELECT *

FROM t_test

ORDER BY id

FETCH FIRST 3 ROWS WITH TIES;

----

(3 rows)

In this case, we indeed get 3 rows, because it is not about 3 types of values, but really about additional, identical data at the end of the data set.

WITH TIES: Managing additional columns

So far we have learned something about the simplest case using just one column. However, that’s far from practical. In a real work application, you will certainly have more than a single column. So let us add one:

test=# ALTER TABLE t_test 
           ADD COLUMN x numeric DEFAULT random();
ALTER TABLE
test=# TABLE t_test;
 id |     	x     	 
----+--------------------
  1 |  0.258814135879447
  2 |  0.561647200043165
  3 |  0.340481941960185
  3 |  0.999635345010109
  4 |  0.467043266494571
  4 |  0.742426363498449
  5 | 0.0611112678267247
  2 |  0.496917052156565
(8 rows)

test=# ALTER TABLE t_test

ADD COLUMN x numeric DEFAULT random();

ALTER TABLE

test=# TABLE t_test;

id | x

----+--------------------

1 | 0.258814135879447

2 | 0.561647200043165

3 | 0.340481941960185

3 | 0.999635345010109

4 | 0.467043266494571

4 | 0.742426363498449

5 | 0.0611112678267247

2 | 0.496917052156565

(8 rows)

In the case of LIMIT nothing changes. However, WITH TIES is a bit special here:

test=# SELECT * 
            FROM  t_test 
            ORDER BY id 
            FETCH FIRST 4 ROWS WITH TIES;
 id |     	x    	 
----+-------------------
  1 | 0.258814135879447
  2 | 0.561647200043165
  2 | 0.496917052156565
  3 | 0.999635345010109
  3 | 0.340481941960185
(5 rows)

test=# SELECT *

FROM t_test

ORDER BY id

FETCH FIRST 4 ROWS WITH TIES;

id | x

----+-------------------

1 | 0.258814135879447

2 | 0.561647200043165

2 | 0.496917052156565

3 | 0.999635345010109

3 | 0.340481941960185

(5 rows)

What you can see here is that 5 rows are returned. The fifth row is added because id = 3 appears more than once. Mind the ORDER BY clause: We are ordering by id. For that reason, the id column is relevant to WITH TIES.

Let’s take a look at what happens when the ORDER BY clause is extended:

test=# SELECT * 
           FROM  t_test 
           ORDER BY id, x 
           FETCH FIRST 4 ROWS WITH TIES;
 id |     	x    	 
----+-------------------
  1 | 0.258814135879447
  2 | 0.496917052156565
  2 | 0.561647200043165
  3 | 0.340481941960185
(4 rows)

test=# SELECT *

FROM t_test

ORDER BY id, x

FETCH FIRST 4 ROWS WITH TIES;

id | x

----+-------------------

1 | 0.258814135879447

2 | 0.496917052156565

2 | 0.561647200043165

3 | 0.340481941960185

(4 rows)

We are ordering by two columns. Therefore WITH TIES is only going to add rows if both columns are identical, which is not the case in my example.

LIMIT... Or finally...

WITH TIES is a wonderful new feature provided by PostgreSQL. However, it is not only there to limit data. If you are a fan of windowing functions you can also make use of WITH TIES as shown in one of my other blog posts covering advanced SQL features provided by PostgreSQL.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

PostgreSQL: LIMIT vs FETCH FIRST ROWS ... WITH TIES

LIMIT vs. FETCH FIRST ROWS

Fetch first ... rows with ties

WITH TIES: Managing additional columns

LIMIT... Or finally...

Leave a Reply Cancel reply

Hans-Jürgen Schönig

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts