Entries in "geopy"

September Projects

I'm in the situation Ian Bicking was in not long ago—I'm really tired of this blog design and software and it's making me not want to post any of the entries I have pending. This blog will soon redirect to something better.

Pagoda should have a Developer Preview in October. Check out my presentation from the September meeting of the Cleveland Python interest group.

Remember how Ian and I spent months thinking up hundreds of names for our company? We are now incorporated as Unstoppable Rocket—one of the first names that was suggested.

If you've been following the geopy list, you've heard about the new release coming out. It should make things much more flexible and extendable, and fix all the issues from the past year or so. geopy 0.99 will be out this week.

The geopy update is also getting me back into the campus crime map and my Case geocoder service, which is going to be really smart. Updates there soon.

I started a new project called Revisionist, which is like Pagoda's revision model except generalized and using SQLAlchemy 0.4. I'm hoping other people will be interested in using and improving such a project. With the right helpers it should make revisioning complex models really easy.

If anyone has any neat suggestions for what Gary or I should talk about at the October Clepy meeting, let me know.

Geocoding and Python's decimal module

Python has an awesome decimal module for decimal floating point arithmetic. It has configurable precision and keeps track of significant digits and does some other neat stuff.

While I was adding the geopy distance module, I began to wonder if it would be worth the effort to switch everything in geopy over to use Decimals instead of floats. After checking out the decimal module (I had never used it before), I decided that I had nothing to lose, so I went for it...

I quickly ran into some snags when I realized that I'd have to code my own trigonometric functions for use with Decimals, since those that come with Python are for complex or floating point numbers. The decimal recipes page in the documentation has functions for sin and cos, but distance uses asin, acos, atan, and atan2. Don Peterson has a nice decimalfuncs module with most of these, but it's GPL (and would be an uncommon dependency) — geopy is MIT/X11. So I went ahead and started on these...

I decided it would be easiest to define asin and acos in terms of atan, and it turns out there is a (relatively) quickly converging algorithm for that. Here's what I came up with for a Decimal-compatible atan:

def atan(x):
    if x == D('-Inf'):
        return pi() / -2
    elif x == 0:
        return D(0)
    elif x == D('Inf'):
        return pi() / 2
    
    if x < -1:
        c = pi() / -2
        x = 1 / x
    elif x > 1:
        c = pi() / 2
        x = 1 / x
    else:
        c = 0
    
    getcontext().prec += 2
    x_squared = x ** 2
    y = x_squared / (1 + x_squared)
    y_over_x = y / x
    i, lasts, s, coeff, num = D(0), 0, y_over_x, 1, y_over_x
    while s != lasts:
        lasts = s    
        i += 2
        coeff *= i / (i + 1)
        num *= y
        s += num * coeff
    if c:
        s = c - s
    getcontext().prec -= 2
    return +s

It depends on the pi function from the decimal recipes page, which calculates pi to the currently configured precision.

Upon finishing this, Chris came home and I told him what I was doing. Immediately, he tried to talk me out of it, asserting that floating point was good enough for geocoding. I tried to counter by explaining all the floating point calculations being performed in distance, but in the end he won. I no longer think it would be a very important change to convert everything in geopy to use the Decimal type.

What finally convinced me was this quote from the Vincenty distance page I used for reference:

Vincenty’s formula is accurate to within 0.5mm, or 0.000015″ (!), on the ellipsoid being used.

0.000015 arcseconds is about 4.16667e-9 degrees. Well, if floating point is good to about 10 decimal places, I guess Chris wins this time...

Still, if anyone wants Decimal support in the future, maybe I'll just ask Don Peterson for permission to include decimalfuncs with geopy...

Update: On second thought, maybe I will just continue implementing my own trig functions for Decimals. Chris and I just spent a while investigating the precision of my atan vs. decimalfunc's, and mine seems to be faster and more precise.

geopy gets distance and util modules

If you check out geopy trunk right now you'll notice a few changes.

I introduced two modules: util and distance.

util now contains the parse_geo and arc_angle functions, and will grow more in the future.

distance is a bigger addition and contains helpful functions for calculating geodesic distances. I planned to add this eventually, but development was sparked by a request from Chris Mulligan.

There are two distance formulas: Great-circle (aka haversine, aka spherical law of cosines) distance and Vincenty distance.

Great-circle distance uses a spherical model of the earth, using the average great-circle radius of 6372.795 kilometers (this is configurable). This results in an error of up to about 0.5%.

Vincenty distance uses a more accurate ellipsoidal model of the earth. This is the default distance formula, and is thus aliased as distance.distance — so you can easily swap out distance formulas just by changing distance.distance at the top of your code. There are multiple popular ellipsoidal models, and which one will be the most accurate depends on where your points are located on the earth. geopy includes a few good models in the distance.ELLIPSOIDS dictionary:

#             model             major (km)   minor (km)     flattening
ELLIPSOIDS = {'WGS-84':        (6378.137,    6356.7523142,  1 / 298.257223563),
              'GRS-80':        (6378.137,    6356.7523141,  1 / 298.257222101),
              'Airy (1830)':   (6377.563396, 6356.256909,   1 / 299.3249646),
              'Intl 1924':     (6378.388,    6356.911946,   1 / 297.0),
              'Clarke (1880)': (6378.249145, 6356.51486955, 1 / 293.465),
              'GRS-67':        (6378.1600,   6356.774719,   1 / 298.25),
              }

Here's an example usage of distance.distance:

>>> from geopy import distance
>>> import Case
>>> wiki = Case.Geocode.CaseWikiGeocoder()
>>> _, a = wiki.geocode('Wade')
>>> _, b = wiki.geocode('Fribley')
>>> distance.distance(a, b).kilometers
1.342250272726943
>>> distance.distance(a, b).miles
0.83403565192666562

Using Great-circle distance:

>>> distance.distance = distance.GreatCircleDistance
>>> distance.distance(a, b).miles
0.835175984734287

You can change the ellipsoid model used by the Vincenty formula like so:

>>> distance.VincentyDistance.ELLIPSOID = 'Intl 1924'

The above model name will automatically be retrieved from the ELLIPSOIDS dictionary. Alternatively, you can specify the model values directly:

>>> distance.VincentyDistance.ELLIPSOID = (6377., 6356., 1 / 297.)

Oh yeah, you can add distances too (for paths and such). Here's the distance from Fribley to Wade to Phi Kappa Theta:

>>> _, c = wiki.geocode('Phi Kappa Theta')
>>> (distance.distance(b, a) + distance.distance(a, c)).miles
1.0596624112817861

Also included in the distance module are functions for converting between length units (kilometers, miles, feet, nautical miles), and calculating a destination given a starting point, initial bearing, and distance.

This stuff is still just in trunk, no egg or updated documentation yet...

geopy: Now on Google Code, More Geocoders

I decided to try out the Google's Project Hosting feature for geopy. You can find the hosted page at code.google.com/p/geopy. So far it seems pretty sweet and very easy to administer.

I added a geocoder for Microsoft's Windows Live Local (powered by Virtual Earth) to the geocoders module. Sadly, they don't actually have a non-JavaScript geocoding API, so I had to reverse-engineer it.

Norman Khine and I have been investigating issues geocoding UK addresses with the Google Maps API. Due to contractual reasons, they can't offer geocoded addresses with their HTTP geocoder. So instead I again had to reverse-engineer their JavaScript to get it to work. The geocoded results aren't always accurate, but this is Google's problem and not geopy's.

I also tried to add a geocoder for MapQuest's OpenAPI. It is possible to get geocoded results over HTTP (although they don't tell you how, you have to look at their JavaScript or guess), but unfortunately they require you to parse the input location first. This is totally lame. You're telling me they can't parse the address into street, city, country for me? I didn't want to have to do this, but I now plan to add address parsing methods to geopy.