CYBERTEC Logo

Open Street Map to PostGIS - The Basics

11.2019 / Category: / Tags: |

OSM to PostGIS – The Basics

Ever wondered how to import OSM (OpenStreetMap) data into PostGIS [1] for the purpose of visualization and further analytics? Here are the basic steps to do so.
There are a bunch of tools on the market— osm2pgsql; imposm; ogr2org; just to mention some of those. In this article I will focus on osm2pgsql [2] .

Let’s start with the software prerequisites. PostGIS comes as a PostgreSQL database extension, which must be installed in addition to the core database. Up till now, the latest PostGIS version is 3, which was released some days ago. For the current tasks I utilized PostGIS 2.5 on top of PostgreSQL 11.
This brings me to the basic requirements for the import – PostgreSQL >= 9.4 and PostGIS 2.2 are required, even though I recommend installing PostGIS >=2.5 on your database;  it’s supported from 9.4 upwards. Please consult PostGIS’ overall compatibility and support matrix [3] to find a matching pair of components.

Osm2pgsql Setup

Let’s start by setting up osm2pgsql on the OS of your choice – I stick to Ubuntu 18.04.04 Bionic Beaver and compiled osm2gsql from source to get the latest updates.

Install required libraries

Grab the repo

Compile

If everything went fine, I suggest checking the resulting binary and its release by executing

Data acquisition

In the world of OSM, data acquisition is a topic of its own, and worth writing a separate post discussing different acquisition strategies depending on business needs, spatial extent and update frequency. I won’t get into details here, instead, I’ll just grab my osm data for my preferred area directly from Geofabrik, a company offering data extracts and related daily updates for various regions of the world. This can be very handy when you are just interested in a subregion and therefore don’t want to deal with splitting the whole planet osm depending on your area of interest - even though osm2pgsql offers the possibility to hand over a bounding box as a spatial mask.
As a side note – osm data’s features are delivered as lon/lat by default.

So let’s get your hands dirty and fetch a pbf of your preferred area from Geofabrik’s download servers [4] [5]. For a quick start, I recommend downloading a dataset covering a small area:

Optionally utilize osmium to check the pbf file by reading its metadata afterwards.

Database setup

Before finally importing the osm into PostGIS, we have to set up a database enabling the PostGIS extension. As easy as it sounds – connect to your database with your preferred database client or pgsql, and enable the extension by executing

.
Subsequently, execute

to validate the PostGIS installation within your database.

Database import

osm2pgsql is a powerful tool to import osmdata into PostGIS offering various parameters to tune. It’s worthwhile to mention the existence of the default-style parameter [6], which defines how osm data is parsed and finally represented in the database. The diagram below shows the common database model generated by osm2pgsql using the default style.

osm_to_postGIS
Figure 1 - Default osm2pgsql db-model [6]

 

It’s hard to give a recommendation on how this style should be adopted, as this heavily depends on its application. As a rule of thumb, the default style is a good starting point for spatial analysis, visualizations and can even be fed with spatial services, since this layout is supported by various solutions (e.g. Mapnik rendering engine).

I will start off with some basic import routines and then move to more advanced ones. To speed up the process in general, I advise you to define the number of processes and cache in MB to use. Even if this blogpost is not intended as a performance report, I attached some execution times and further numbers for the given parametrized commands to better understand the impact the mentioned parameters have.

The imports were performed on a virtualized Ubuntu 18.04 (KVM) machine equipped with 24 cores (out of 32 logical cores provided by an AMD Ryzen Threadripper 2950X), 24GB RAM, and a dedicated 2TB NVMe SSD (Samsung 970 EVO).

Mandatory parameters

Before going into details, internalize the following main parameters:
-U for database user, -W to prompt for the database password, -d refers to the database and finally -H defines the host. The database schema is not exposed as a parameter and therefore must be adjusted via the search path.

Import utilizing default style

Default import of pbf: existing tables will be overwritten. Features are projected to WebMercator (SRID 3857) by default.

The command was executed in ~11 seconds. The table below shows generated objects and cardinalities.

table_schema table_name row_estimate total index table
public planet_osm_line 142681 97 MB 13 MB 67 MB
public planet_osm_point 145017 19 MB 7240 kB 12 MB
public planet_osm_polygon 176204 110 MB 17 MB 66 MB
public planet_osm_roads 8824 14 MB 696 kB 5776 kB

Import utilizing default style with slim

Parameter s (“slim”) forces the tool to store temporary node information in the database instead of in the memory. It is an optional parameter intended to enable huge imports and avoid out-of-memory exceptions. The parameter is mandatory if you want to enable incremental updates instead of full ones.  The parameter can be complemented with

to store this information outside the database as a file.

The command was executed in ~37 seconds. The table below shows generated objects and cardinalities.

table_schema table_name row_estimate total index table
public planet_osm_line 142681 102 MB 18 MB 67 MB
public planet_osm_nodes 6100392 388 MB 131 MB 258 MB
public planet_osm_point 145017 23 MB 11 MB 12 MB
public planet_osm_polygon 176204 117 MB 24 MB 66 MB
public planet_osm_rels 9141 9824 kB 4872 kB 4736 kB
public planet_osm_roads 8824 15 MB 1000 kB 5776 kB
public planet_osm_ways 325545 399 MB 306 MB 91 MB

Import utilizing default style, tag configuration

By default, tags referenced by a column are exposed as separate columns. Parameter hstore forces the tool to store unreferenced tags in a separate hstore column.

Note: Database extension hstore must be installed beforehand.

For the sake of completeness

  • --hstore-match-only and -hstore-all should be mentioned as well:
  • --hstore-all pushes standard tags to individual columns and the hstore column as well

The command was executed in ~35 seconds. The table below shows generated objects and cardinalities.

table_schema table_name row_estimate total index table
public planet_osm_line 142875 106 MB 18 MB 71 MB
public planet_osm_nodes 6100560 388 MB 131 MB 258 MB
public planet_osm_point 151041 27 MB 12 MB 15 MB
public planet_osm_polygon 176342 122 MB 24 MB 72 MB
public planet_osm_rels 9141 9824 kB 4872 kB 4736 kB
public planet_osm_roads 8824 16 MB 1000 kB 6240 kB
public planet_osm_ways 325545 399 MB 306 MB 91 MB

Import utilizing default style, multi-geometry

By default, objects containing multiple disjoint geometries are stored as separate features within the database. Think of Vienna and its districts, which could be represented as 23 individual polygons or one multi-polygon. Parameter -G forces the tool to store geometries belonging to the same object as multi-polygon.

The impact emerges most clearly during spatial operations, since the spatial index utilizes the feature to its fullextent, in order to decide which features must be considered.

The command was executed in ~36 seconds. The table below shows generated objects and cardinalities.

table_schema table_name row_estimate total index table
public planet_osm_line 142681 102 MB 18 MB 67 MB
public planet_osm_nodes 6100392 388 MB 131 MB 258 MB
public planet_osm_point 145017 23 MB 11 MB 12 MB
public planet_osm_polygon 174459 117 MB 24 MB 65 MB
public planet_osm_rels 9141 9824 kB 4872 kB 4736 kB
public planet_osm_roads 8824 15 MB 1000 kB 5776 kB
public planet_osm_ways 325545 399 MB 306 MB 91 MB

Results and next steps

The table highlights the influence of osm2pgsql parameters on execution time, generated objects, cardinalities and subsequently sizing. In addition, it’s worth it to understand the impact of parameters like multi-geometry, which forces the tool to create multi-geometry features instead of single-geometry features. Preferring one over the other might lead to performance issues, especially when executing spatial operators (as those normally take advantage of the extent of the features).

The next posts will complement this post by inspecting and visualizing our import results and subsequently dealing with osm updates to stay up to date.

Figure 2 - QGIS-Map of iceland, SRID 3857 [7]

 

Figure 3 - QGIS-Map of Reykjavík, SRID 3857 [7]

References

[1] „PostGIS Reference,“ [Online]. Available: https://postgis.net/.
[2] „osm2pgsql GitHub Repository,“ [Online]. Available: https://github.com/openstreetmap/osm2pgsql.
[3] „PostGIS Compatiblity and Support,“ [Online]. Available: https://trac.osgeo.org/postgis/wiki/UsersWikiPostgreSQLPostGIS.
[4] „Geofabrik Download Server,“ [Online]. Available: https://download.geofabrik.de/.
[5] „PBF Format,“ [Online]. Available: https://wiki.openstreetmap.org/wiki/PBF_Format.
[6] „Default Style - osm2pgsql,“ [Online]. Available: https://wiki.openstreetmap.org/wiki/Osm2pgsql/schema.
[7] „QGIS Osm styles,“ [Online]. Available: https://github.com/yannos/Beautiful_OSM_in_QGIS.
[8] „osm2pgsql Database schema,“ [Online]. Available: https://wiki.openstreetmap.org/w/images/b/bf/UMLclassOf-osm2pgsql-schema.png.

 

Appendix

Generated objects and cardinalities from statistics

Generated objects and cardinalities with count

0 0 votes
Article Rating
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
CYBERTEC Logo white
CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0
office@cybertec.at

Get the newest PostgreSQL Info & Tools


    This site is protected by reCAPTCHA and the Google Privacy Policy & Terms of Service apply.

    ©
    2024
    CYBERTEC PostgreSQL International GmbH
    phone-handsetmagnifiercrosscross-circle
    0
    Would love your thoughts, please comment.x
    ()
    x
    linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram