IBM Streams 4.2

Geohashes

The Geospatial toolkit includes support for encoding and decoding Geohashes. A Geohash is a unique identifier of a specific region on the Earth. The basic idea is that the Earth is divided into regions of user-defined size and each region is assigned a unique ID, which is called its Geohash. For a given location on the Earth, the Geohash algorithm converts its latitude and longitude into a string. This string is the Geohash, and it determines which of the predefined regions the point belongs to. Generally, points within close geographical proximity have the same Geohash.
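The conversion from latitude and longitude to a string can be sketched in a few lines of Python. This is a minimal illustration of the publicly documented Geohash scheme (bits alternate between longitude and latitude, starting with longitude, and each base 32 character packs five bits). It is not the toolkit's own implementation, and the function names `geohash_bits` and `geohash_base32` are hypothetical.

```python
# Standard Geohash base 32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_bits(lat, lon, bit_depth):
    """Return the Geohash of a point as a binary string of bit_depth bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits = []
    for i in range(bit_depth):
        if i % 2 == 0:
            # Even positions halve the current longitude interval.
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits.append("1")
                lon_lo = mid
            else:
                bits.append("0")
                lon_hi = mid
        else:
            # Odd positions halve the current latitude interval.
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append("1")
                lat_lo = mid
            else:
                bits.append("0")
                lat_hi = mid
    return "".join(bits)

def geohash_base32(lat, lon, num_chars):
    """Return the Geohash as a base 32 string (5 bits per character)."""
    bits = geohash_bits(lat, lon, 5 * num_chars)
    return "".join(BASE32[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))

print(geohash_base32(42.605, -5.603, 5))  # ezs42
```

Two points that fall into the same cell at the chosen bit depth produce the same string, which is why nearby points usually share a Geohash.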

The Geospatial toolkit supports encoding points and arbitrary geometries, such as polygons, as Geohashes, either as binary strings or in the more common alphanumeric base 32 format. Functions are also provided to decode Geohashes into geometries. However, encoding a geometry that spans the Prime (Greenwich) Meridian, where longitude is zero, the poles, where longitude is undefined, or the Equator, where latitude is zero, returns the empty string because no single Geohash would contain that geometry. For example, encoding a polygon that spans the Prime Meridian returns an empty string.
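One way to see why such geometries yield an empty string is to encode each vertex of a polygon and keep only the prefix that all vertices share. The sketch below does this with the standard Geohash scheme. It is an illustration of the idea only: how the toolkit actually computes the Geohash of a geometry is not described here, and `geohash_base32` and `common_geohash` are hypothetical names.

```python
import os

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_base32(lat, lon, num_chars):
    """Standard Geohash of a point: interleaved longitude/latitude bits."""
    ranges = {"lon": [-180.0, 180.0], "lat": [-90.0, 90.0]}
    value = {"lon": lon, "lat": lat}
    bits = []
    for i in range(5 * num_chars):
        axis = "lon" if i % 2 == 0 else "lat"
        lo, hi = ranges[axis]
        mid = (lo + hi) / 2
        if value[axis] >= mid:
            bits.append("1")
            ranges[axis][0] = mid
        else:
            bits.append("0")
            ranges[axis][1] = mid
    joined = "".join(bits)
    return "".join(BASE32[int(joined[i:i + 5], 2)] for i in range(0, len(joined), 5))

def common_geohash(vertices, num_chars=8):
    """Longest base 32 prefix shared by the Geohashes of all (lat, lon) vertices."""
    hashes = [geohash_base32(lat, lon, num_chars) for lat, lon in vertices]
    # os.path.commonprefix compares plain strings character by character.
    return os.path.commonprefix(hashes)

# A small polygon in Paris lies inside one cell, so a non-empty prefix remains.
paris = [(48.85, 2.35), (48.86, 2.35), (48.86, 2.36), (48.85, 2.36)]
# A polygon across the Prime Meridian has vertices in both hemispheres.
greenwich = [(51.4, -0.1), (51.4, 0.1), (51.6, 0.1), (51.6, -0.1)]

print(repr(common_geohash(paris)))
print(repr(common_geohash(greenwich)))  # ''
```

The very first bit of a Geohash separates west from east, and the second separates south from north, so vertices on opposite sides of the Prime Meridian or the Equator already differ in their first character.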

Geohash bit depth

The number of bits used to encode the Geohash is referred to as the bit depth. It determines the north-south and east-west extent of each individual region into which the Earth is divided, as well as the length of the generated Geohash string. More bits mean a longer Geohash and smaller individual regions. The following chart shows the relationship between the number of bits used to encode a Geohash, the number of characters in the resulting base 32 string, and the dimensions of each individual region. Note that these dimensions apply to a point at the Equator, on a spherical Earth.
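The relationship between bit depth and cell dimensions can also be approximated directly. The following sketch assumes the standard bit interleaving (odd-numbered bits refine longitude, so longitude receives the extra bit when the bit depth is odd) and a spherical Earth with an equatorial circumference of about 40,075 km. Both are assumptions made for illustration, and `cell_size_at_equator` is a hypothetical helper, not a toolkit function.

```python
# Assumed equatorial circumference of the Earth, in meters.
EARTH_CIRCUMFERENCE_M = 40_075_016.686

def cell_size_at_equator(bit_depth):
    """Return (east_west_m, north_south_m) of one Geohash cell at the Equator."""
    lon_bits = (bit_depth + 1) // 2   # longitude gets the extra bit for odd depths
    lat_bits = bit_depth // 2
    # Longitude covers the full circumference (360 degrees).
    east_west = EARTH_CIRCUMFERENCE_M / 2 ** lon_bits
    # Latitude covers half of it (180 degrees, pole to pole).
    north_south = (EARTH_CIRCUMFERENCE_M / 2) / 2 ** lat_bits
    return east_west, north_south

for bits in (25, 29, 31, 33, 35):
    ew, ns = cell_size_at_equator(bits)
    print(f"{bits} bits: {ew:.1f} m x {ns:.1f} m")
```

Under these assumptions a bit depth of 31 gives cells of roughly 611.5 m in both directions, and each additional pair of bits halves both dimensions.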

All dimensions are in meters. Consult this chart when determining the number of bits to use with the Hangout or SpatialRouter operators. If computation is being performed at or near the Equator, simply choose the number of bits that corresponds to the desired dimensions. For example, if a 250 m x 250 m region is required, a bit depth of 33 results in regions of 305 m x 305 m, which is a sufficient approximation.

Because this data applies only at the Equator, and changes in latitude affect the east-west extent of Geohash cells, some computation is required to determine how many bits to use for an area away from the Equator. When Geohash regions with dimensions L x E are desired, where E is the east-west extent, use the following procedure to determine the appropriate bit depth for the area of interest:
  1. Pick a latitude that will be within the geographical area of interest.
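  2. Determine the width of that region at the Equator:
    	target = E;
    	sampleLatitudeInDegrees = latitude of a point within the area of interest;
    	adjustedExtent = target / cos(sampleLatitudeInDegrees * PI / 180);
    	extentAtEquator = MAX(adjustedExtent, target);
  3. In the chart, find the smallest east-west extent that is greater than or equal to extentAtEquator and use the corresponding bit depth.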
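The procedure above can be sketched in Python as follows. The chart lookup is replaced here by a computed east-west extent, assuming a spherical Earth with an equatorial circumference of about 40,075 km and considering only odd bit depths, for which cells are square at the Equator under the standard bit interleaving. These are assumptions for illustration, and the function names are hypothetical.

```python
import math

# Assumed equatorial circumference of the Earth, in meters.
EARTH_CIRCUMFERENCE_M = 40_075_016.686

def extent_at_equator(target_m, sample_latitude_deg):
    """East-west extent at the Equator that shrinks to target_m at the given latitude."""
    if abs(sample_latitude_deg) >= 90:
        raise ValueError("the formula is undefined at the poles (cos(90 degrees) is 0)")
    # math.cos expects radians, so the latitude is converted first.
    adjusted = target_m / math.cos(math.radians(sample_latitude_deg))
    return max(adjusted, target_m)

def bit_depth_for(target_m, sample_latitude_deg):
    """Odd bit depth whose equatorial cell width is the smallest one >= the required extent."""
    required = extent_at_equator(target_m, sample_latitude_deg)
    lon_bits = 1
    # Add longitude bits while the next finer cell is still wide enough.
    while EARTH_CIRCUMFERENCE_M / 2 ** (lon_bits + 1) >= required:
        lon_bits += 1
    # An odd bit depth uses lon_bits longitude bits and lon_bits - 1 latitude bits.
    return 2 * lon_bits - 1

print(bit_depth_for(300, 45.8))  # 31
```

For a 300 m target at a latitude of 45.8 degrees, the sketch selects a bit depth of 31, whose cells are about 611.5 m wide at the Equator.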

For example, suppose that a target cell size of 300 m x 300 m is required and that the computation covers an area around France. The target is 300 and a sampleLatitudeInDegrees of 45.8 could be used. Computing the extentAtEquator using the above procedure results in an east-west extent of approximately 430.3 meters at the Equator. In the table above, the next larger extent value is 611.5 meters, which corresponds to a bit depth of 31. Thus, a bit depth of 31 would be the proper value to use. Note that the above formula does not work if the absolute value of the sampleLatitude is 90, because the cosine of 90 degrees is zero.

For a detailed explanation of how Geohashes are encoded and decoded, see the Wikipedia article about Geohashes at http://en.wikipedia.org/wiki/Geohash.