PGConfEU2023 is over and a New Year has started. For those who did not manage to make it to Prague to attend the conference, I wanted to use the occasion to sum up what I tried to communicate during my keynote session and reflect a bit on the importance of true Open Source versus Closed Source as well as on subscription services.

Longevity in data processing


On August 10th a new baby was born in my family. The question is: Why would anybody care? Kids are born every day all over the world and people leave the face of the earth from time to time.

What is the point of all this? Here is why we should care about those things. What does a new child mean for database people?

  • Life expectancy: 82 years
  • Government data: Stored for 100+ years
  • Pension fund: Stored for 60+ years
  • Banking data: Bank accounts might exist for decades

The important point here is that we have to store data for a really long time in the most reliable way. We are talking about way more than the lifespan of a single person, and we have to do that for entire countries, millions of people, and potentially for entire continents. In other words: This is a huge, important, and relevant task we are facing here. In today’s world people exist not only physically but also digitally – therefore one’s digital existence is of utmost importance.

Data outlives applications

An additional observation is that data usually outlives applications. This might seem odd at first, but it is true. Just consider a simple e-banking system. The data feeding those applications basically lives forever, while your frontends might change once in a while depending on trends, customer requirements, and a lot more. The way the amount of cash in your account is stored stays mostly untouched for way longer.

Why does this observation actually matter?

Commercial database products tend to die

Well, commercial database products die once in a while. What do I mean by that? Let us take a look at various examples here:

  • Sybase Anywhere:
    • 1992 – 2025 (RIP) = 33 years
  • MySQL:
    • First release 1995
    • Oracle bought Sun in 2010
  • DEC / Oracle Rdb:
    • Initial release 1984
    • Bought by Oracle 1994
    • No more Itanium support since 2010 (have fun with those binaries)

When we take into account that we want to keep data around for 100 years, 33 years doesn’t seem to be an attractive proposal. The fact that a product will die once in a while is per se ok BUT what is a problem is the fact that customers usually have NO CLUE how data is stored internally. Let us take a look at the DEC / Oracle Rdb example: Imagine your data is stored in binary format on some old Intel Itanium chip. Even if you throw money at the problem, it will be hard to decipher those data files in 20 years.

The situation is even worse in cloud environments but I decided to leave this topic for another day.

The strength of true Open Source

In the case of Open Source, one can fix issues by throwing money at the problem. There is always someone who can make some old code run somehow if necessary – there is always a way to reverse engineer data files or simply run stuff longer than officially supported.

In other words: Your favorite subscription service might not be around 30 years from now.

Let me recall the problem: The life expectancy of our newborn baby is 82 years and our government infrastructure will store data for 100+ years. Longevity matters.

Think long term – not in quarters

Having said that, the conclusion I have drawn can actually be summed up nicely in one sentence: Think long-term and understand that data is not a short-term thing.
Data matters – especially if it is created in a business or government context. It therefore does make sense to think beyond the next quarter and ensure full independence from vendors and subscription services.

If your life depended on it: Where would you store your data? Would you upload your life to Amazon, Google, Alibaba or Microsoft? Would you want your life to depend on the quality of some kind of subscription service? I wouldn’t want my physical life to depend on external factors and I don’t want the relevant parts of my digital life to depend on such things.

Conclusion

Open Source can help us to protect data – to have a long-term strategy and to provide us with a decent level of safety needed in many areas of life. Sure, your gaming high score might not face the same requirements as your pension fund. However, we should all keep in mind how important data has become these days and pay a little more attention to those things in general.

 
