Recently I was working on a project which had a need for geocoding. Normally we use PostGIS along with some free data for geocoding (http://postgis.net/docs/Geocode.html). However, to cross check data and to verify results I decided to write a little Python function to see, what Google would actually give me.

To come up with the Python function the first thing to do is to instally geopy, a module freely available on the web:

root@chantal:/tmp# git clone https://github.com/geopy/geopy

Cloning into 'geopy'...

remote: Counting objects: 2848, done.

remote: Total 2848 (delta 0), reused 0 (delta 0)

Receiving objects: 100% (2848/2848), 913.77 KiB | 504 KiB/s, done.

Resolving deltas: 100% (1841/1841), done.

The stuff can be compiled and deployed easily:

root@chantal:/tmp# cd geopy/

root@chantal:/tmp/geopy# python setup.py install

Then PL/Python has to be enabled:

test=# CREATE LANGUAGE plpythonu;

CREATE LANGUAGE

Then the function can already be written. The first thing I have done here is to come up with a data type returned by the function. In the example the position, the longitude and the latitude are relevant:

CREATE TYPE v_position AS

(

        location               text,

        lat                       numeric,

        lng                      numeric

);

Finally there is the function itself, which connects to Google and fetches some data. It is more or less trivial:

CREATE OR REPLACE FUNCTION geo (raw_string text)

        returns setof v_position AS

$$

from geopy import geocoders

geocoder = geocoders.GoogleV3()

 

try:

        for res in geocoder.geocode(raw_string, exactly_one=False):

                yield {'location': res[0], 'lat': res[1][0], 'lng': res[1][1]}

except:

        pass

$$ LANGUAGE plpythonu VOLATILE;

Now we can test the function with some sane input:

test=# \x

Expanded display is on.

test=# SELECT * FROM geo('groehrmuehlgasse 26, wiener neustadt');

-[ RECORD 1 ]----------------------------------------------

location | Gröhrmühlgasse 26, 2700 Wiener Neustadt, Austria

lat      | 47.8178812

lng      | 16.2384326

The result gives us the position of our Cybertec headquarter here in Austria. So far so good …

Unexpected results …

To see if my function works properly I tried to come up with a string, which seemed pretty unlikely to me:

test=# SELECT * FROM geo('dirt hole');

-[ RECORD 1 ]--------------------------------------------------

location | Dirt Hole Creek Road, Tambaroora NSW 2795, Australia

lat      | -32.9826131

lng      | 149.4264436

Well, obviously some places are not blessed with proper names. There seem to be dirt holes in Australia after all 😉

Let us focus our attention on some more “religious” question:

test=# SELECT * FROM geo('infidel');

-[ RECORD 1 ]-----------------------------------------------

location | Infidel Place, Torbay, Auckland 0630, New Zealand

lat      | -36.7024209

lng      | 174.746813

It seems that even this leads to a result …

But, this is pointless.

test=# SELECT * FROM geo('pointless');

-[ RECORD 1 ]--------------------------------------------

location | Truax SE5, Squamish-Lillooet A, BC V0N, Canada

lat      | 50.7844444

lng      | -122.6816666

… or maybe just insane …

test=# SELECT * FROM geo('insane');

-[ RECORD 1 ]--------------------------------------------------------

location | Insane, 115 21 Stockholm, Sweden

lat      | 59.3236986

lng      | 18.095511

-[ RECORD 2 ]--------------------------------------------------------

location | Insane, The Tongass National Forest, Juneau, AK 99801, USA

lat      | 58.2691792

lng      | -134.5122587

Maybe we should not trust everything returned by a webservice