Intersecting GPS-Tracks to identify infected individuals

In times of COVID-19, governments contemplate tough measures to identify and trace infected people. These measures include the utilization of mobile phone data to trace down infected individuals and subsequently contacts to curb the epidemic. This article shows how PostGIS’ functions can be used to identify “overlapping” sections of infected and healthy individuals by analysing tracks spatio-temporally.
This time we don’t focus on performance and tuning, rather strive boosting your creativity on PostgreSQL’s spatial extension and its functionalities.

Table of Contents

The article is structured as follows:

Set up data structures within PostgreSQL
Set up sample tracks via QGIS
Segment sample tracks to retrieve induvial track points
Intersection of infected individual to retrieve possible contacts

Data structures of GPS tracks

Let’s start by defining tables representing tracks and their points.
Table mobile_tracks acts as a helper table storing artificial tracks of individuals, which have been drawn via QGIS.
Table mobile_points stores points, which result from segmenting tracks. Finally, they have been enriched by timestamps, which are used to identify temporal intersections on top of spatial intersections.

The test area is located in Vienna/Austria; therefore I chose MGI/Austria 34 (EPSG as 31286) as appropriate projection.

create table if not exists mobile_tracks
(
   gid serial not null
      constraint mobile_tracks_ok
         primary key,
   customer_id integer,
   geom geometry(LineString,31286),
   infected boolean default false
);
 
create index if not exists mobile_tracks_geom_idx
   on mobile_tracks using gist (geom);
 
create unique index if not exists mobile_tracks_customer_id_uindex
    on mobile_tracks (customer_id);

create table if not exists mobile_tracks

(

gid serial not null

constraint mobile_tracks_ok

primary key,

customer_id integer,

geom geometry(LineString,31286),

infected boolean default false

);

create index if not exists mobile_tracks_geom_idx

on mobile_tracks using gist (geom);

create unique index if not exists mobile_tracks_customer_id_uindex

on mobile_tracks (customer_id);

create table if not exists mobile_points
(
   gid serial not null
      constraint mobile_points_ok
         primary key,
   customer_id integer,
   geom geometry(Point,31286),
   infected boolean,
   recorded timestamp
);
 
create index if not exists mobile_points_geom_idx
   on mobile_points using gist (geom);

create table if not exists mobile_points

(

gid serial not null

constraint mobile_points_ok

primary key,

customer_id integer,

geom geometry(Point,31286),

infected boolean,

recorded timestamp

);

create index if not exists mobile_points_geom_idx

on mobile_points using gist (geom);

GPS tracks and points

As mentioned in the beginning, for points I decided to generate artificial ones from tracks, I previously digitized via QGIS. Figure 1 shows tracks of infected people in red, healthy individuals are represented as green line strings as foundation.

GPS tracks of healthy and infected individuals, GPS-Tracks — *Figure 1 GPS tracks of healthy and infected individuals*

For our simple example, I made the following assumptions:

Individuals are moving with the same speed
Tracks of individuals start at the same time

To extract individual points for tracks, I utilized PostGIS’s ST_Segmentize function as follows:

with dumpedPoints as (
    select (st_dumppoints(st_segmentize(geom, 1))).geom,
           ((st_dumppoints(st_segmentize(geom, 1))).path[1]) as path,
           customer_id,
           infected,
           gid
    from mobile_tracks),
     aggreg as (
         select *, now() + interval '1 second' * row_number() over (partition by customer_id order by path) as tstamp
         from dumpedPoints)
insert
into mobile_points(geom, customer_id, infected, recorded)
select geom, customer_id, infected, tstamp
from aggreg;

with dumpedPoints as (

select (st_dumppoints(st_segmentize(geom, 1))).geom,

((st_dumppoints(st_segmentize(geom, 1))).path[1]) as path,

customer_id,

infected,

gid

from mobile_tracks),

aggreg as (

select *, now() + interval '1 second' * row_number() over (partition by customer_id order by path) as tstamp

from dumpedPoints)

insert

into mobile_points(geom, customer_id, infected, recorded)

select geom, customer_id, infected, tstamp

from aggreg;

The query creates points every meter, timestamps are subsequently increased by 1000 milliseconds.

Extracted GPS-Points, GPS-Tracks — *Figure 2 Extracted GPS-Points*

Identification of intersection points

Now it’s time to start with our analysis. Did our infected individual meet somebody?
Let’s start with an easy approach and select points of healthy individuals, which have been within 2 meters to infected individuals while respecting a time interval of 2 seconds.

SELECT distinct on (m1.gid) m1.customer_id infectionSourceCust,
                            m2.customer_id infectionTargetCust,
                            m1.gid,
                            m1.recorded,
                            m2.gid,
                            m2.recorded
FROM mobile_points m1
         inner join
     mobile_points m2 on st_dwithin(m1.geom, m2.geom, 2)
where m1.infected = true
  and m2.infected = false
  and m1.gid < m2.gid AND (m2.recorded >= m1.recorded - interval '1seconds' and
       m2.recorded <= m1.recorded + interval '1seconds')
order by m1.gid, st_dwithin(m1.geom, m2.geom, 2) asc

SELECT distinct on (m1.gid) m1.customer_id infectionSourceCust,

m2.customer_id infectionTargetCust,

m1.gid,

m1.recorded,

m2.gid,

m2.recorded

FROM mobile_points m1

inner join

mobile_points m2 on st_dwithin(m1.geom, m2.geom, 2)

where m1.infected = true

and m2.infected = false

and m1.gid < m2.gid AND (m2.recorded >= m1.recorded - interval '1seconds' and

m2.recorded <= m1.recorded + interval '1seconds')

order by m1.gid, st_dwithin(m1.geom, m2.geom, 2) asc

Figure 3 and 4 show results for the given query by highlighting points of contact for our individuals in blue. As mentioned before, this query covers the most basic solution to identify people who met.
Alternatively, PostGIS’ ST_CPAWithin and ST_ClosestPointOfApproach functions could be used here as well to solve this in a similar manner. To do so, our points must get modelled as trajectories first.

Contact points, GPS-Tracks — *Figure 3 Contact points*

To refine our solution, let’s identify temporal coherent segments of spatial proximity and respective passage times. The goal is to understand, how long people have been close enough together to inherit possible infections more realistically. The full query is attached to the end of the post.

Step 1 - aggregStep1

Based on our first query results, for each point we calculate the time interval to its predecessor. Considering our initial assumptions, a non-coherent temporal segment will lead to gaps >1 second.

Temporal gaps between gps points, GPS-Tracks — *Figure 5 Temporal gaps between gps points*

Step 2 – aggregStep2

With our new column indicating the temporal gap between current and previous point, we cluster segments

Step 3 – aggregStep3

For each cluster, we subtract min from max timestamp to extract a passage interval.

Coherent segments + passage time by infected/healthy customer, GPS-Tracks — *Figure 6 Coherent segments + passage time by infected/healthy customer*

Step 4

Finally, for each combination of infected/healthy individual, the segment with the maximum passage time is being extracted and a linestring is generated by utilizing st_makeline (see figure 7).

*Figure 7 Max passage time by infected/healthy customer*

Even though results can serve as foundation for further analysis, our approach still remains constrained by our initially taken assumptions.

Recently, a very promising PostgreSQL extension named MobilityDB has been published, which offers a lot of functionalities to solve all kinds of spatio-temporal related questions. Take a look at this blogpost to learn more.

with points as
         (
             SELECT distinct on (m1.gid) m1.customer_id infectionSourceCust,
 
                                         m2.customer_id infectionTargetCust,
                                         m1.gid,
                                         m1.geom,
                                         m1.recorded    m1rec,
                                         m2.gid,
                                         m2.recorded
             FROM mobile_points m1
                      inner join
                  mobile_points m2 on st_dwithin(m1.geom, m2.geom, 2)
             where m1.infected = true
               and m2.infected = false
               and m1.gid < m2.gid AND (m2.recorded >= m1.recorded - interval '1seconds' and
                    m2.recorded <= m1.recorded + interval '1seconds') order by m1.gid, m1.recorded), aggregStep1 as ( SELECT *, (m1rec - lag(m1rec, 1) OVER (partition by infectionSourceCust,infectionTargetCust ORDER by m1rec ASC)) as lag from points), aggregStep2 as (SELECT *, SUM(CASE WHEN extract('epoch' from lag) > 1 THEN 1 ELSE 0 END)
                 OVER (partition by infectionSourceCust,infectionTargetCust ORDER BY m1rec ASC) AS legSegment
          from aggregStep1),
     aggregStep3 as (
         select *,
                min(m1rec) OVER w   minRec,
                max(m1rec) OVER w   maxRec,
                (max(m1rec) over w) -
                (min(m1rec) OVER w) recDiff
         from aggregStep2
             window w as (partition by infectionSourceCust,infectionTargetCust,legSegment)
     )
select distinct on (infectionsourcecust, infectiontargetcust) infectionsourcecust,
                                                              infectiontargetcust,
                                                              (extract('epoch' from (recdiff))) passageTime,
                                                              st_makeline(geom) OVER (partition by infectionSourceCust,infectionTargetCust,legSegment)
from aggregStep3
order by infectionsourcecust, infectiontargetcust, passageTime desc

with points as

(

SELECT distinct on (m1.gid) m1.customer_id infectionSourceCust,

m2.customer_id infectionTargetCust,

m1.gid,

m1.geom,

m1.recorded m1rec,

m2.gid,

m2.recorded

FROM mobile_points m1

inner join

mobile_points m2 on st_dwithin(m1.geom, m2.geom, 2)

where m1.infected = true

and m2.infected = false

and m1.gid < m2.gid AND (m2.recorded >= m1.recorded - interval '1seconds' and

m2.recorded <= m1.recorded + interval '1seconds') order by m1.gid, m1.recorded), aggregStep1 as ( SELECT *, (m1rec - lag(m1rec, 1) OVER (partition by infectionSourceCust,infectionTargetCust ORDER by m1rec ASC)) as lag from points), aggregStep2 as (SELECT *, SUM(CASE WHEN extract('epoch' from lag) > 1 THEN 1 ELSE 0 END)

OVER (partition by infectionSourceCust,infectionTargetCust ORDER BY m1rec ASC) AS legSegment

from aggregStep1),

aggregStep3 as (

select *,

min(m1rec) OVER w minRec,

max(m1rec) OVER w maxRec,

(max(m1rec) over w) -

(min(m1rec) OVER w) recDiff

from aggregStep2

window w as (partition by infectionSourceCust,infectionTargetCust,legSegment)

)

select distinct on (infectionsourcecust, infectiontargetcust) infectionsourcecust,

infectiontargetcust,

(extract('epoch' from (recdiff))) passageTime,

st_makeline(geom) OVER (partition by infectionSourceCust,infectionTargetCust,legSegment)

from aggregStep3

order by infectionsourcecust, infectiontargetcust, passageTime desc

Intersecting GPS-Tracks to identify infected individuals

Data structures of GPS tracks

GPS tracks and points

Identification of intersection points

Step 1 - aggregStep1

Step 2 – aggregStep2

Step 3 – aggregStep3

Step 4

Leave a Reply Cancel reply

Florian Nadler

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts