Today somebody has asked me how to remove duplicates which accidentally made it into a table.
The problem is: A normal DELETE won’t do because you would delete both values – not just the one which is in there twice. 

To solve the problem you have to use a “secret” column called “ctid”.
The “ctid” identifies a row inside a table.

Here is an example:

test=# CREATE TABLE t_test (idint4);
CREATE TABLE

test=# INSERT INTO t_test VALUES (1), (2), (2), (3);
INSERT 0 4

As you can see two values show up twice.
To find out how we can remove the duplicate value we can query the “ctid”:

test=# SELECT ctid, * FROM t_test;
 ctid  | id
——-+—-
 (0,1) |  1
 (0,2) |  2
 (0,3) |  2
 (0,4) |  3
(4 rows)

We can make use of the fact that the ctid is not the same for our values.
The subselect will check for the lowest ctid for a given value and delete it:

test=# DELETE FROM t_test
                               WHERE ctid IN (SELECT min(ctid)
                                                                                          FROM t_test
                                                                                          GROUP BY id
                                                                                          HAVING count(*) > 1)
RETURNING * ;
 id
—-
  2
(1 row)

This query works nicely if we can rely on the fact that we only got values which don’t show up more often than twice. If we want to do things in a generic way we can use a simple windowing function to make things work:

test=# SELECT ctid FROM (SELECT ctid, id,
                                                                                 count(*) OVER (PARTITION BY id ORDER BY ctid)
                                                                         FROM t_test) AS x
                                       WHERE count > 1;
 ctid
——-
 (0,3)
 (0,4)
(2 rows)

test=# DELETE FROM t_test
                               WHERE ctid IN (SELECT ctid
                                                                            FROM (SELECT ctid, id,
                                                                                                       count(*) OVER (PARTITION BY id ORDER BY ctid)
                                                                                            FROM t_test) AS x
                                                                                    WHERE count > 1)
                        RETURNING *;
 id
—-
  2
  2
(2 rows)

Now we can check for the result:

test=# SELECT ctid, * FROM t_test;
 ctid  | id
——-+—-
 (0,1) |  1
 (0,2) |  2
 (0,5) |  3
(3 rows)