By Kirk Wolak

Continuous Improvement is an important part of reducing technical debt. Over 30 years, our active database has collected its share of technical debt. We wanted to rename all of our views so that they are consistently named. On top of that, we lacked code reviews for what was happening in the database. The latter situation has been rectified: we now have strictly codified rules on object naming, style, etc.

The problem

We have over 600 tables and 300 views in our database, with 2-3 different naming conventions applied across them. We want to clean this up, but we cannot tell whether these views are even in use today. Some views are run by managers via scripts or other items slightly outside of our control. Others are accessed systematically in operations where the view names are often built up from pieces (e.g. vw_SYSTEM_top_{100 | 50 | 10}), because we have multiple systems that do similar things against various tables and require consistent reporting. The user chooses options, which generates the view names/dashboards.
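As a hypothetical illustration of such name-building (the system name and limit here are made up, not from our schema), the application might assemble a view name like this:

```sql
-- Sketch: the user picks a system and a row limit in the dashboard,
-- and the application assembles the view name from those pieces.
SELECT format('vw_%s_top_%s', 'BILLING', 100) AS view_name;
-- yields: vw_BILLING_top_100
```

This is why a simple grep of our code bases cannot prove a view is unused: the full name never appears in the source.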

View access logging – the double dilemma

We simply cannot afford to rename all of our views and hope for the best. So we have two dilemmas. First, we want to rename all of our views to be consistent; PostgreSQL makes this pretty easy, but we don't want to break anything. Second, we need to track who is potentially using the old views, how often they are being used, and perhaps when.

There are a few challenges here. How do we inject the tracking code we need, but only when the old views are used? And how do we avoid huge performance penalties?

Our solution

The following is the solution we chose to implement. It has one big drawback: it turns previously read-only transactions into read-write transactions. It also does some extra logging, and if you want to know the call stack, that adds even more detail/work to each use of the view. Finally, we want this to work even if the view returns 0 rows of data.

The approach we took to repair our view access logging was:

  1. Rename all views to the new standards
  2. Create “Synonym Views” with the old names that log the hit, and call the new name
  3. Modify our many code bases, and publish these changes as close together as possible
  4. Monitor the logging table, and find/fix whatever we missed, and clear the logging table
  5. Allow for storing the STACK details to determine the actual internal code calling it, but keep it turned off until we are hunting for hard to find code
  6. Only when this stabilizes in “Staging” will we move it to production
  7. At some point in the future, drop the “Synonym Views” and the logging table/function
  8. Via Code Reviews make sure we don’t access the old names in new code!

The key takeaway is that this will take weeks to months. Our view is that you must wait at least 2 quarters to have high confidence that nothing is using an object, and you must exceed 1 year to be certain, because like many companies, we have routines that run on those schedules.

The key magic… “Synonym Views” (with the old names)

Each of these views has the exact same structure:

    WITH qry AS MATERIALIZED (SELECT view_access_log('old_name') AS logged)
    SELECT vw.* FROM vw_new_name vw, qry WHERE qry.logged > 0;

Effectively, this materializes the single-row qry CTE, so the logging function is called exactly once per query.
The view only returns the columns from the new view, and any filters or sorts applied to the old name are pushed down onto the new view naturally.
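Putting it together, a complete synonym view might look like this (vw_old_top_100 and vw_new_top_100 are placeholder names for illustration, not from our schema):

```sql
-- A "Synonym View" carrying the old name: it logs the hit once,
-- then passes everything through to the renamed view.
CREATE OR REPLACE VIEW vw_old_top_100 AS
WITH qry AS MATERIALIZED (
    SELECT view_access_log('vw_old_top_100') AS logged
)
SELECT vw.*
  FROM vw_new_top_100 vw, qry
 WHERE qry.logged > 0;
```

The WHERE qry.logged > 0 predicate forces the CTE to be evaluated even when the new view returns 0 rows, which is one of our requirements.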

But now we have a function that is called. Let’s see what we can do.
The first iteration of this function was simply:

CREATE SEQUENCE seq_view_consumer;
CREATE OR REPLACE FUNCTION view_access_log(view_name text) RETURNS integer
    LANGUAGE sql
AS $$ SELECT nextval('seq_view_consumer')::integer $$;

That sequence was important for validating that, regardless of the number of rows, we only logged one item per query. Disaster would ensue if we had a view with a million rows and we tried logging every row!
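A quick sanity check of that claim (assuming the first-iteration function above and a synonym view named vw_old_top_100, a placeholder name): the sequence should advance by exactly one per query through the view, no matter how many rows come back.

```sql
SELECT nextval('seq_view_consumer') AS before;  -- note the current value
SELECT count(*) FROM vw_old_top_100;            -- one query through the synonym view
SELECT nextval('seq_view_consumer') AS after;   -- exactly 2 higher than "before":
                                                -- 1 for the view's single logged call,
                                                -- 1 for this nextval itself
```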

Now that the concept is tested and proven, we go to the next level

We create a table and log what happens. Here is the table we need:

CREATE TABLE view_access_log (
    id          bigint GENERATED ALWAYS AS IDENTITY
        PRIMARY KEY,
    dtwhen      timestamp WITH TIME ZONE DEFAULT now(),
    view_name   text,
    call_stack  text
);

And the function:

CREATE OR REPLACE FUNCTION view_access_log(view_str text) RETURNS integer
    LANGUAGE plpgsql
AS $$
DECLARE
    s_stack text;
BEGIN
    -- GET CURRENT DIAGNOSTICS s_stack = PG_CONTEXT;  /* Uncomment this to log the call stack. Too intense originally! */
    INSERT INTO view_access_log (view_name, call_stack) VALUES (view_str, s_stack);
    RETURN 1;
END;
$$;

That’s it.
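For step 4 above (monitoring), a simple aggregation over the logging table shows which old names are still being hit, how often, and when:

```sql
-- Which old view names are still in use, and when were they last touched?
SELECT view_name,
       count(*)    AS hits,
       min(dtwhen) AS first_seen,
       max(dtwhen) AS last_seen
  FROM view_access_log
 GROUP BY view_name
 ORDER BY hits DESC;
```

Once a view name stops appearing here for long enough (see the 2-quarters/1-year rule above), its synonym view becomes a candidate for dropping.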

But how do we implement all of this?

Well, we will generate the DDL required to make this work straight from the database. Then we have a choice: edit it or run it directly. Since it involves naming objects, we know we will have to review the DDL and edit it to make things as good as we can. See Laurenz Albe’s article on querying for views here; that portion would be too specific for this blog.
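A minimal sketch of that DDL generation, assuming the views have already been renamed and a mapping table old_to_new(old_name, new_name) holds the old/new pairs (the mapping table is an assumption of this sketch, not part of our actual tooling):

```sql
-- Generate one CREATE VIEW statement per old/new pair.
-- %I quotes identifiers, %L quotes the literal passed to the logger.
SELECT format(
$sql$CREATE VIEW %I AS
WITH qry AS MATERIALIZED (SELECT view_access_log(%L) AS logged)
SELECT vw.* FROM %I vw, qry WHERE qry.logged > 0;$sql$,
           old_name, old_name, new_name) AS ddl
  FROM old_to_new;
```

The output can then be reviewed, hand-edited where needed, and executed.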

The simplicity of this solution for view access logging is what makes me appreciate doing things in PostgreSQL. At the same time, it is strange to some of us that SELECTs can have side effects in the database. This is clearly a case where one of those side effects works to your advantage!

About the author, Kirk Wolak

Here’s what Kirk has to say about himself:
“I sub-contract for multiple companies (literally with my name @ their …). My title is usually Chief [Software] Architect. I’ve worked with Chrysler, Ford, UAW, and even taught at the local university (for a year). Personally, my company, KiraSoft, is named after my daughter. We are a small team with 5 clients… One has been a client for 30 years, and another for 20+… We do custom back office solutions.”

“I’ve been doing software professionally since 1984… I’m just new to PostgreSQL… And loving it!”

Kirk is a client of CYBERTEC and enjoys giving back to the community that supports him. CYBERTEC was his first choice in training and support.

In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.