Recently I was working on a project which had a need for geocoding. Normally we use PostGIS along with some free data for geocoding. However, to cross check data and to verify results I decided to write a little Python function to see, what Google would actually give me.
To come up with the Python function the first thing to do is to install geopy, a module freely available on the web:
root@chantal:/tmp# git clone https://github.com/geopy/geopy Cloning into 'geopy'... remote: Counting objects: 2848, done. remote: Total 2848 (delta 0), reused 0 (delta 0) Receiving objects: 100% (2848/2848), 913.77 KiB | 504 KiB/s, done. Resolving deltas: 100% (1841/1841), done.
The stuff can be compiled and deployed easily:
root@chantal:/tmp# cd geopy/ root@chantal:/tmp/geopy# python setup.py install
Then PL/Python has to be enabled:
test=# CREATE LANGUAGE plpythonu; CREATE LANGUAGE
Then the function can already be written. The first thing I have done here is to come up with a data type returned by the function. In the example the position, the longitude and the latitude are relevant:
CREATE TYPE v_position AS ( location text, lat numeric, lng numeric );
Finally there is the function itself, which connects to Google and fetches some data. It is more or less trivial:
CREATE OR REPLACE FUNCTION geo (raw_string text) returns setof v_position AS $$ from geopy import geocoders geocoder = geocoders.GoogleV3() try: for res in geocoder.geocode(raw_string, exactly_one=False): yield {'location': res[0], 'lat': res[1][0], 'lng': res[1][1]} except: pass $$ LANGUAGE plpythonu VOLATILE;
Now we can test the function with some sane input:
test=# \x Expanded display is on. test=# SELECT * FROM geo('groehrmuehlgasse 26, wiener neustadt'); -[ RECORD 1 ]---------------------------------------------- location | Gröhrmühlgasse 26, 2700 Wiener Neustadt, Austria lat | 47.8178812 lng | 16.2384326
The result gives us the position of our Cybertec headquarter here in Austria. So far so good …
Unexpected results …
To see if my function works properly I tried to come up with a string, which seemed pretty unlikely to me:
test=# SELECT * FROM geo('dirt hole'); -[ RECORD 1 ]-------------------------------------------------- location | Dirt Hole Creek Road, Tambaroora NSW 2795, Australia lat | -32.9826131 lng | 149.4264436
Well, obviously some places are not blessed with proper names. There seem to be dirt holes in Australia after all 😉
Let us focus our attention on some more “religious” question:
test=# SELECT * FROM geo('infidel'); -[ RECORD 1 ]----------------------------------------------- location | Infidel Place, Torbay, Auckland 0630, New Zealand lat | -36.7024209 lng | 174.746813
It seems that even this leads to a result …
But, this is pointless.
test=# SELECT * FROM geo('pointless'); -[ RECORD 1 ]-------------------------------------------- location | Truax SE5, Squamish-Lillooet A, BC V0N, Canada lat | 50.7844444 lng | -122.6816666
… or maybe just insane …
test=# SELECT * FROM geo('insane'); -[ RECORD 1 ]-------------------------------------------------------- location | Insane, 115 21 Stockholm, Sweden lat | 59.3236986 lng | 18.095511 -[ RECORD 2 ]-------------------------------------------------------- location | Insane, The Tongass National Forest, Juneau, AK 99801, USA lat | 58.2691792 lng | -134.5122587
Maybe we should not trust everything returned by a webservice