The Math of Geohashing

May 21st, 2008

Although the recent xkcd comic about geohashing was supposed to be part joke, part computer geek, I decided to take it to the next level of computer geekiness. It took me a minute to understand geohashing when I first read that comic and I started to think about it for a while and I once I figured it out, I asked myself an interesting question: What is the maximum distance that I can be from a Munroe Point?

Perhaps I should back up for a second. I need to define some terminology to make this a little easier to talk about. The way he set up the geohashing algorithm creates zones on the map, which I shall call Munroe Zones. Each Munroe Zone has exactly one Munroe Point which is a statistically random point within the Munroe Zone. The zone I live in the rectangle spanning from (42.000000, -72.000000) to (43.000000, -71.000000). This spans an area of 3542.96 square miles. I will spare you the gory details of this calculation. Verify it for yourself :)

I can calculate the maximum distance that I can be from a Munroe Point within my zone with the cunning use of middle school geometry. The distance between (42.000000, -72.000000) and (43.000000, -72.000000) is 69 miles, approximately. The distance between one longitude measurement (42.000000, -72.000000) and (42.000000, -71.000000) is 51 miles. This means that the maximum distance one can be away from the Munroe Point in their Munroe Zone is the hypotenuse of the triangle assuming that the Munroe Point is in the far NW and I live in the far SE. This distance is about 86 miles. However, if I lived in the far SE and the Munroe Point is in the NW, that means I live directly on top of the Munroe Point for another Zone.

Because of this inter-zone relationship, the maximum distance I can be away turns into half of the maximum zone distance, or about 43 miles. The proof is pretty simple.

I thought that since Munroe Points were meant to be meeting points, one might be interested in finding the closest Munroe Point which may or may not be within the ‘zone’ they live in. I wrote a little Python script to help out.

import math, md5, struct, urllib, datetime
 
# Your coordinates!
me = [42.18719304724962, -71.5403938293457]
 
# Dow Jones Industrial Average for today
djia = float(urllib.urlopen(datetime.date.today().strftime("http://irc.peeron.com/xkcd/map/data/%Y/%m/%d")).read())
 
# Returns distance, in miles, between two coordinates
def gps_distance(gps1, gps2):
   r = 6367442.5 * 0.000621371192 # earth radius, miles.
   dLat = math.radians(gps1[0]-gps2[0])
   dLon = math.radians(gps1[1]-gps2[1])
   x = math.sin(dLat / 2) ** 2 + math.cos(math.radians(gps1[0])) * math.cos(math.radians(gps2[0])) * math.sin(dLon / 2) ** 2
   y = 2 * math.atan2(math.sqrt(x), math.sqrt(1-x))
   d = r * y
   return d
 
# Determines if coord is NW, SW, NE, or SE relative to my_munroe_point
def position(coord, my_munroe_point):
   ns = 'N' if coord[0] > my_munroe_point[0] else 'S'
   ew = 'E' if coord[1] > my_munroe_point[1] else 'W'
   return ns + ew
 
# Determines the Munroe point within your zone, as the comic does.
def my_zone_munroe_point(coord):
    day = datetime.date.today().strftime("%Y-%m-%d")
    munroe_coord = [0,0]
    md5_hash = md5.new("%s-%5.2f" % (day, djia)).hexdigest()
    adjust = [long(md5_hash[:16], 16) / 2.**64, 
              long(md5_hash[16:], 16) / 2.**64]
    munroe_coord[0] = math.floor(coord[0]) + adjust[0] if coord[0] > 0 else math.ceil(coord[0]) - adjust[0]
    munroe_coord[1] = math.floor(coord[1]) + adjust[1] if coord[1] > 0 else math.ceil(coord[1]) - adjust[1]
    return munroe_coord
 
# Calculates the nearest Munroe Point, which may or may not be the one
# in your zone.
def nearest_munroe_point(coord):
    my_munroe = my_zone_munroe_point(coord)
    pos = position(coord, my_munroe)
    points = []
 
    if (pos == 'NW'):
        points.append(my_zone_munroe_point([coord[0] + 1, coord[1] - 1]))
        points.append(my_zone_munroe_point([coord[0] + 1, coord[1]])) 
        points.append(my_zone_munroe_point([coord[0], coord[1] - 1])) 
    elif (pos == 'SW'):
        points.append(my_zone_munroe_point([coord[0] - 1, coord[1] - 1]))
        points.append(my_zone_munroe_point([coord[0] - 1, coord[1]])) 
        points.append(my_zone_munroe_point([coord[0], coord[1] - 1]))
    elif (pos == 'NE'):
        points.append(my_zone_munroe_point([coord[0] + 1, coord[1] + 1]))
        points.append(my_zone_munroe_point([coord[0] + 1, coord[1]])) 
        points.append(my_zone_munroe_point([coord[0], coord[1] + 1]))
    elif (pos == 'SE'):
        points.append(my_zone_munroe_point([coord[0] - 1, coord[1] + 1]))
        points.append(my_zone_munroe_point([coord[0] - 1, coord[1]])) 
        points.append(my_zone_munroe_point([coord[0], coord[1] + 1]))
 
    dis = gps_distance(coord, my_munroe)
    nearest = my_munroe
 
    for point in points:
        tmp = gps_distance(coord, point)
        if (tmp < dis):
            nearest = point
            dis = tmp
 
    return nearest
 
my_munroe = my_zone_munroe_point(me)
my_munroe_dis = gps_distance(me, my_munroe)
nearest_munroe = nearest_munroe_point(me)
nearest_munroe_dis = gps_distance(me, nearest_munroe)
 
print "Munroe Point coordinates  (my zone):", my_munroe
print "Distance to my Munroe Point (my zone):", round(my_munroe_dis, 2), "miles"
print "Nearest Munroe Point coordinates:", nearest_munroe
print "Distance to nearest Munroe Point:", round(nearest_munroe_dis, 2), "miles"

The output I got when I ran this for today’s data is as follows:

Munroe Point coordinates (my zone): [42.179467994697781, -71.86153601245762]
Distance to my Munroe Point (my zone): 42.77 miles
Nearest Munroe Point coordinates: [42.179467994697781, -70.86153601245762]
Distance to nearest Munroe Point: 14.94 miles