Some time ago I wrote about joins and especially about outer
joins. However, people repeatedly make the same errors over and over again, so I
thought that it might be worth it to address the topic again – maybe I can rescue
some souls and prevent some bugs.

Here is a basic example:

test=# CREATE TABLE a (aid int4);
CREATE TABLE
test=# CREATE TABLE b (bid int4);
CREATE TABLE

We can populate the tables nicely:

test=# INSERT INTO a VALUES (1), (2), (3);
INSERT 0 3
test=# INSERT INTO b VALUES (2), (3), (4);
INSERT 0 3

The most simplistic case is a normal inner join. This is not going to cause any
surprises to most users:

test=# SELECT * FROM a, b WHERE a.aid = b.bid;
 aid | bid
—–+—–
   2 |   2
   3 |   3
(2 rows)

People often ask if the way the join is written will make any difference:

test=# explain SELECT * FROM a, b WHERE a.aid = b.bid;
                           QUERY PLAN                            
—————————————————————–
 Merge Join  (cost=337.49..781.49 rows=28800 width=8)
   Merge Cond: (a.aid = b.bid)
   ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
         Sort Key: a.aid
         ->  Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
   ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
         Sort Key: b.bid
         ->  Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4)
(8 rows)

test=# explain SELECT * FROM a JOIN b ON (a.aid = b.bid);
                           QUERY PLAN                            
—————————————————————–
 Merge Join  (cost=337.49..781.49 rows=28800 width=8)
   Merge Cond: (a.aid = b.bid)
   ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
         Sort Key: a.aid
         ->  Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
   ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
         Sort Key: b.bid
         ->  Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4)
(8 rows)

There is absolutely no difference between an explicit and an implicit join. It
is simply a matter of style but it has no impact on the way the planner works
(unless you exceed join_collapse_limit).

Inner joins should be a pretty common thing. Outer joins cause a lot more
problems. Here is a simple case:

test=# SELECT * FROM a LEFT JOIN b ON (a.aid = b.bid);
 aid | bid
—–+—–
   1 |    
   2 |   2
   3 |   3
(3 rows)

Caution !
The situation starts to be tricky if you start to add filters to the query.
People might expect to reduce the number of rows by a filter like that but this
is NOT the case – keep in mind: A filter in the ON-clause WILL NOT reduce the
number of rows on either side of the join. This is not a bug – this is totally
desired and logical behavior:

test=# SELECT * FROM a LEFT JOIN b ON (a.aid = b.bid AND aid = 2);
 aid | bid
—–+—–
   1 |    
   2 |   2
   3 |    
(3 rows)

Most people would think that we should get just one row instead of three BUT all
we do is to limit the data which PostgreSQL will use to find common rows.

If you want to filter data – use a WHERE-clause instead:

test=# SELECT * FROM a LEFT JOIN b ON (a.aid = b.bid) WHERE a.aid = 2;
 aid | bid
—–+—–
   2 |   2
(1 row)

Please think twice when you are writing outer joins – otherwise your results
will simply be plain wrong.