As a user and advocate of PostgreSQL, I have been wondering for a while how this NoSQL hype could ever happen.

I have been thinking about the idea of having no schema for a while and must admit that, to me, it seems more or less impossible. The idea of storing a document instead of a fully normalized data structure can be appealing in many cases, and PostgreSQL has all the functionality needed to store this kind of data in a flexible way.

However, if you store data, it must have a meaning – if not to the database, then at least to the application. One way or the other, people have to agree on what they want to store and how to handle it. Otherwise nasty things might happen. Let us assume we have two apps feeding JSON data into a NoSQL database:

App 1:
{"id":1,"long":35.6,"lat":27.4,"location":"some office"}

App 2:
{"id":1,"lng":35.6,"lat":27.4,"location":"some office"}

Clearly, if the app analyzing and maybe aggregating the data has no idea about the names of the fields, it will be impossible to get anything useful out of the data.
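To make the problem concrete, here is a minimal sketch (the document texts and a hypothetical consumer are my own illustration, not code from either app): a consumer that assumes the field is called "long" silently loses every record written by the second app.

```python
import json

# Documents as the two hypothetical apps would store them:
# App 1 calls the longitude field "long", App 2 calls it "lng".
documents = [
    '{"id": 1, "long": 35.6, "lat": 27.4, "location": "some office"}',
    '{"id": 2, "lng": 14.3, "lat": 48.2, "location": "another office"}',
]

# A consumer assuming the key "long" quietly skips App 2's record --
# no error is raised, the data is simply missing from the result.
longitudes = [
    doc["long"] for doc in map(json.loads, documents) if "long" in doc
]
print(longitudes)
```

Note that nothing fails loudly here: the aggregation runs, produces a plausible-looking result, and the missing records only show up when someone notices the numbers are wrong.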

Developers might argue that people can use NoSQL by simply agreeing on some common rules and conventions. My answer to this has always been: there is no point in having rules if they are not enforced by something or somebody. SQL has always followed a "think first" philosophy, while NoSQL seems to favor a "store first" approach.
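To illustrate what "enforcement" means in practice, here is a sketch of the kind of hand-rolled validator every writer would have to remember to call in a schemaless world (the key set and function are my own hypothetical agreement between the two apps; a relational schema gives you this check for free, on every insert):

```python
import json

# Hypothetical agreed-upon contract: every location document must carry
# exactly these keys. Nothing in a schemaless store enforces this --
# each application has to opt in to calling validate() itself.
REQUIRED_KEYS = {"id", "lng", "lat", "location"}

def validate(raw: str) -> dict:
    doc = json.loads(raw)
    missing = REQUIRED_KEYS - doc.keys()
    unexpected = doc.keys() - REQUIRED_KEYS
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")
    return doc

# App 2's document passes; App 1's "long" spelling is rejected.
ok = validate('{"id": 1, "lng": 35.6, "lat": 27.4, "location": "some office"}')
try:
    validate('{"id": 1, "long": 35.6, "lat": 27.4, "location": "some office"}')
except ValueError as e:
    print("rejected:", e)
```

The point is not that such a check is hard to write – it is that with "store first" every single writer must remember to run it, whereas a database schema enforces it unconditionally.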

In my world, a "store first" philosophy must lead to trouble in the long run. Here is an example: we have already seen people moving from MongoDB back to PostgreSQL because of problems with legacy data, a lack of consistency, and missing integrity – exactly what has to be expected once it turns out that there is actually no such thing as "no schema" for data that needs to be analyzed.

I have recently given a talk on this issue. If you are interested in the presentation: click here.