Location Clustering

ElasticsearchBook is crafted by Jozef Sorocin.

Use Case

Given a full-text query of "New York City", I want to cluster the selected locations into buckets that I can render on a map. On top of that, I want to retrieve the bounds of the viewport that my map will "fly to".
The points of interest below are marked red:
Base map courtesy of geojson.io

Approach

We'll be utilizing:
  • a geohash_grid to group locations on the geohash grid
    • followed by a geo_centroid sub-aggregation to determine the positions of the weighted centroids → the clusters,
  • and a geo_bounds aggregation to calculate the viewport bounds.
Here's the pseudo-code:
{
  "query": MATCH city,
  "aggs": {
    "clusters": {
      AGGREGATE on geohash grid {
        COMPUTE weighted centroid
      }
    },
    "bounds": {
      COMPUTE viewport bounds
    }
  }
}
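To make the pseudo-code concrete, here is a minimal sketch of the request body and of how the aggregation results could be read back. The field names `city` and `location`, the helper names `buildClusterRequest` and `parseClusterResponse`, and the optional `size: 0` are assumptions for illustration; the mapping is assumed to store `location` as a geo_point.

```javascript
// Hypothetical field names: adjust them to your own mapping.
const CITY_FIELD = "city";         // assumed text field holding the city name
const LOCATION_FIELD = "location"; // assumed geo_point field

// Build the search request body for a city and a geohash precision (1-12).
function buildClusterRequest(city, precision) {
  return {
    size: 0, // optional: skip the hits, only the aggregations are needed
    query: { match: { [CITY_FIELD]: city } },
    aggs: {
      clusters: {
        // group the matching documents into geohash cells
        geohash_grid: { field: LOCATION_FIELD, precision },
        aggs: {
          // one weighted centroid per cell
          centroid: { geo_centroid: { field: LOCATION_FIELD } },
        },
      },
      // bounding box of all matching documents
      bounds: { geo_bounds: { field: LOCATION_FIELD } },
    },
  };
}

// Turn the aggregation part of the response into map-friendly structures.
function parseClusterResponse(response) {
  const { clusters, bounds } = response.aggregations;
  return {
    markers: clusters.buckets.map((bucket) => ({
      geohash: bucket.key,
      count: bucket.doc_count,
      lat: bucket.centroid.location.lat,
      lon: bucket.centroid.location.lon,
    })),
    // geo_bounds reports the top_left and bottom_right corners
    viewport: bounds.bounds,
  };
}
```

The `markers` array can be handed to whatever clustering layer the map library offers, and `viewport` can be converted into that library's bounds format for the "fly to" animation.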
It's tempting to use a geo_centroid aggregation without the parent geohash_grid, but on its own geo_centroid returns a single centroid for all matching documents, so the cluster density cannot be adjusted to the zoom level. UI clustering only makes sense when we provide the map's current zoom level (→ "precision").
ES implements 12 geohash precision levels (1 through 12).
Geo-hashing deserves a chapter of its own but for now, check out this interactive map to gain an intuitive understanding of the geohash mechanism.
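For a rough feel of what the precision means, here is a minimal sketch of the standard geohash encoding, where the precision is simply the number of base-32 characters in the hash. The function name `encodeGeohash` is hypothetical, and the code is meant for intuition only; ES computes the grid cells itself.

```javascript
// Base-32 alphabet used by geohashes (no "a", "i", "l", "o").
const BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Encode a lat/lon pair into a geohash of `precision` characters.
function encodeGeohash(lat, lon, precision) {
  const latRange = [-90, 90];
  const lonRange = [-180, 180];
  let hash = "";
  let bits = 0;      // number of bits collected for the current character
  let value = 0;     // value of the current 5-bit group
  let useLon = true; // bits alternate between lon and lat, starting with lon

  while (hash.length < precision) {
    const range = useLon ? lonRange : latRange;
    const coord = useLon ? lon : lat;
    const mid = (range[0] + range[1]) / 2;
    if (coord >= mid) {
      value = value * 2 + 1; // upper half -> bit 1
      range[0] = mid;
    } else {
      value = value * 2;     // lower half -> bit 0
      range[1] = mid;
    }
    useLon = !useLon;
    bits += 1;
    if (bits === 5) {
      hash += BASE32[value]; // every 5 bits yield one character
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

// Each extra character halves the cell several more times, so a
// lower-precision hash is always a prefix of a higher-precision one.
console.log(encodeGeohash(40.7128, -74.006, 3)); // "dr5"
```

Because shorter hashes are prefixes of longer ones, raising the precision splits each cluster cell into smaller cells rather than reshuffling the points.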
Back to the precision levels: these don't directly correspond to the zoom levels you know from Google Maps, Mapbox, or Leaflet, so an adjustment will be needed. Here's the formula that I use to translate a Google Maps zoom into the precision parameter:
// only integer values are allowed -- mapbox & leaflet support floats too, so round down first
let precision = Math.floor(map_zoom - 8);

if (precision < 1) {
  // Default to the highest "zoom-out".
  // Consider the whole earth & expect 1 - 3 clusters in total
  precision = 1;
} else if (precision > 7) {
  // Don't go beyond 7 because we'll get too many 1-member
  // clusters which could be actual markers/pins instead
  precision = 7;
}

// pass the `precision` into the ES request body
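
The same translation can be wrapped in a small reusable function. This is a sketch under the assumptions stated above (an offset of 8 and a clamp to the range 1 to 7); the name `zoomToPrecision` and the `location` field are hypothetical.

```javascript
const MIN_PRECISION = 1; // whole earth, a handful of clusters
const MAX_PRECISION = 7; // beyond this, most clusters hold a single point

// Translate a (possibly fractional) map zoom into a geohash precision.
function zoomToPrecision(mapZoom) {
  const precision = Math.floor(mapZoom - 8);
  return Math.min(MAX_PRECISION, Math.max(MIN_PRECISION, precision));
}

console.log(zoomToPrecision(3));    // 1 (clamped from -5)
console.log(zoomToPrecision(12.5)); // 4
console.log(zoomToPrecision(18));   // 7 (clamped from 10)

// The result goes straight into the aggregation definition,
// assuming a geo_point field called "location":
const clustersAgg = {
  geohash_grid: { field: "location", precision: zoomToPrecision(12.5) },
};
```

Re-running the query with a fresh precision whenever the map's zoom changes keeps the cluster density in step with what the user sees.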
