CYBERTEC PostgreSQL Logo

A PostgreSQL story about "NULL IS NULL = NULL AND NOT NULL"

12.2018 / Category: / Tags:

After years of software development, some might still wonder: What is a NULL value? What does it really mean and what is its purpose? The general rule is: NULL basically means “undefined”. Many books state that NULL means “empty” but I think that is not the ideal way to see things: If you wallet is empty, your financial situation is perfectly defined (= you are broke). But, “undefined” is different. It means that we don’t know the value. If we don’t know how much cash you got, you might still be a millionaire. So using the word “unknown” to describe NULL in SQL is really better than to use word “empty”, which can be pretty misleading in my judgement.

NULL values in PostgreSQL: Basic rules

First of all: NULL is a super useful thing in SQL and people should be aware of the details associated with it. Before digging deeper into NULL it is necessary to take a look at the most basic rules. The following example shows a mistake commonly made by many developers:

Many people assume that the output of this query is actually “false”, which is not correct. The result is NULL. Why is that? Suppose you got 10 bucks in your left pocket and nobody knows how much cash you got in your right pocket. Is the amount of cash in your pockets the same? We don’t know. It might be very well so, but we simply don’t know. Thus the result of this query has to be NULL.

Let's try something else:

The same is true for this query. The result has to be NULL. We don’t know how much cash is in your left pocket and we got no idea how much there is in your right pocket. Is it identical? Again: We have absolutely no idea – the result is undefined.

To figure out if two values are actually NULL we have to use the following syntax:

In this case the result is true because “IS” actually check if both value are indeed NULL.
Consequently the next query is going to return false:

However, there is more to NULL than just simple operations. NULL is key and therefore it is important to also check some of the lesser known aspects and corner cases.

row() and NULL handling

Some of my readers might already have seen the row() function, which can be used to form a tuple on the fly. In general pretty much the same rules will apply in this case.

Consider the following example:

As expected the result is NULL because all values on both sides are “undefined” and therefore there is no way the output of this query can ever be true.

What is important to see is that a row can be compared to a single NULL value. In short: The entire tuple is considered to be NULL by PostgreSQL:

This is not true for row(10, NULL) - in this case the query returns false. True is only returned if all fields are NULL. However, there is one thing which might as a surprise to some people. The “IS” keyword won't work if you are comparing the output of two “row” functions:

PostgreSQL will immediately issue a syntax error.

NULL handling in LIMIT clauses

Some time ago I saw some people using NULL in LIMIT / OFFSET clauses. That is somewhat scary but still and interesting issue to think about. Consider the following example:

The table simply contains 3 rows. Here is what LIMIT NULL does:

As you can see the entire resultset will be returned. That makes sense because PostgreSQL does not really know when to stop returning rows. Thus, the query is equivalent to “SELECT * FROM demo”. In PostgreSQL there is also an ANSI SQL compliant way to limit the result of a query: FETCH FIRST … ROWS ONLY” is the “proper” way to limit the result of a query. In PostgreSQL 11 “FETCH FIRST ROWS ONLY” will also accept NULL and behave the same way as LIMIT NULL. Here is an example:

Mind that this was not always the case. Old versions of PostgreSQL did not accept a NULL value here.

NULL handling in ORDER BY clauses

NULL values are especially tricky if you want to sort data. Usually NULL values appear at the end of a sorted list. The following listing shows an example:

The point is: Suppose you want to sort products by price. The most expensive ones or most likely not the ones without a price. Therefore it is usually a good idea to out NULL values at the end of the list if you are ordering descending.

Here is how it works:

Having the NULL values at the end is somewhat more intuitive and usually offers better user experience.

NULL and sum, count, avg, etc.

The way NULLs are handled are also important if you want to run a more analytical type of workload. In general rule is simple: Aggregate functions will simply ignore NULL values. The only exception to the rule is count(*). Here is an example:

count(*) will count ALL the rows - regardless of its content. count(column) will only count the not-NULL values inside a column, which is simply a different thing than just counting everything. Let us take a look at the next example:

As I have stated before: The aggregates do not count the NULL values, which means that the average of those 4 rows will be 2 and not 1.5.

The fact that count(*) counts all rows can create subtle bugs when used in an outer join. Consider the following example:

In this case every count will be at least 1 - even if the person in the list has no house. Keep in mind: The LEFT JOIN will add NULL values to the right side of the join. count(*) will count those NULL values and therefore even the poorest fellow will end up with at least one house. count(*) and outer joins are usually an alarm signal and should be handled with care.

 


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

0 0 votes
Article Rating
Subscribe
Notify of
guest
1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Pavel Stěhule
Pavel Stěhule
5 years ago

Note: There are null safe compare operator - IS NOT DISTINCT FROM (like =) and IS DISTINCT FROM (like )

CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram