ElasticsearchBook is crafted by Jozef Sorocin (Book a consulting hour) and powered by:
- Spatialized.io (Elasticsearch & Google Maps consulting)
- in cooperation with Garages-Near-Me.com (Effortless parking across Germany)
Metrics Aggregations
The aggregations in this family compute (numeric) metrics based on values extracted one way or another from the matched documents. Examples include:
- geo_centroid, as discussed in Location Clustering
- cardinality, as discussed in Aggregation Data Tables
Generally speaking, ES aggregations are run in a distributed, map/reduce-like fashion whereby the intermediate results are collected from all available shards and combined into the final result.
Scripted metric aggregations allow us to hook into each stage of this process and compute our metrics of choice, be it a single numeric metric, a hash map of key-metric pairs, a list of sorted values, etc.
Scripted metrics are composed of the following scripts:
- init_script (optional): Executed prior to any collection of documents. Allows the aggregation to set up an initial state. It's the only script that's not required to be declared.
- map_script: Executed once per document collected.
- combine_script: Executed once on each shard after document collection is complete. Allows the aggregation to consolidate the state returned from each shard.
- reduce_script: Executed once on the coordinating node after all shards have returned their results.
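To make the order of the four scripts concrete, here is a minimal plain-Python emulation of the lifecycle. It is only a sketch: `run_scripted_metric`, the `shards` layout, and the lambda "scripts" are hypothetical stand-ins, not part of any Elasticsearch API, and real scripts would be written in Painless.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]
State = Dict[str, Any]


def run_scripted_metric(
    shards: List[List[Doc]],
    init_script: Callable[[State], None],
    map_script: Callable[[State, Doc], None],
    combine_script: Callable[[State], Any],
    reduce_script: Callable[[List[Any]], Any],
) -> Any:
    """Emulate the init -> map -> combine -> reduce flow (illustration only)."""
    shard_results = []
    for docs in shards:
        state: State = {}
        init_script(state)                 # once per shard, before collection
        for doc in docs:
            map_script(state, doc)         # once per collected document
        shard_results.append(combine_script(state))  # once per shard
    return reduce_script(shard_results)    # once, on the "coordinating node"


# Hypothetical two-shard index holding three documents.
shards = [
    [{"cost": 42}, {"cost": 67}],
    [{"cost": 11}],
]

total = run_scripted_metric(
    shards,
    init_script=lambda state: state.update(costs=[]),
    map_script=lambda state, doc: state["costs"].append(doc["cost"]),
    combine_script=lambda state: sum(state["costs"]),
    reduce_script=lambda results: sum(results),
)
print(total)  # 120
```

Each shard builds its own state independently, so the reduce step only ever sees the per-shard values returned by the combine step, never the individual documents.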
Use Case: Distinct Sum & Average
Given this denormalized data, where the cost is always the same for a given id:
{ "id" : "1", "cost" : 42 }
{ "id" : "2", "cost" : 67 }
{ "id" : "2", "cost" : 67 }
{ "id" : "2", "cost" : 67 }
{ "id" : "4", "cost" : 11 }
{ "id" : "4", "cost" : 11 }
{ "id" : "5", "cost" : 10 }
{ "id" : "6", "cost" : 99 }
How can I get the AVERAGE of the cost but DISTINCT by the id, e.g. the following in SQL:
SELECT AVG(T.cost)
FROM
  (SELECT DISTINCT id, cost
   FROM records_table) AS T
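As a sanity check on the expected answer, the same computation can be sketched in plain Python, mirroring the per-shard and coordinating-node steps. The split into two shards and the names `map_phase` and `reduce_phase` are assumptions made for illustration; this is not the Elasticsearch query itself.

```python
records = [
    {"id": "1", "cost": 42},
    {"id": "2", "cost": 67},
    {"id": "2", "cost": 67},
    {"id": "2", "cost": 67},
    {"id": "4", "cost": 11},
    {"id": "4", "cost": 11},
    {"id": "5", "cost": 10},
    {"id": "6", "cost": 99},
]

# Hypothetical shard layout: id "2" ends up on both shards.
shards = [records[:3], records[3:]]


def map_phase(docs):
    """Per shard: keep one cost per id (the cost is constant for a given id)."""
    seen = {}
    for doc in docs:
        seen[doc["id"]] = doc["cost"]
    return seen


def reduce_phase(shard_states):
    """Coordinating step: merge per-shard maps, then average the distinct costs."""
    merged = {}
    for state in shard_states:
        merged.update(state)  # the same id may arrive from several shards
    return sum(merged.values()) / len(merged) if merged else None


distinct_avg = reduce_phase([map_phase(docs) for docs in shards])
naive_avg = sum(doc["cost"] for doc in records) / len(records)

print(distinct_avg)  # 45.8  -> (42 + 67 + 11 + 10 + 99) / 5
print(naive_avg)     # 46.75 -> duplicates skew a plain average
```

Deduplicating only within each shard would not be enough here, because an id can appear on more than one shard; the merge in the final step is what makes the result truly distinct.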