โกย ElasticsearchBook is crafted by Jozef Sorocin (๐ขย Book a consulting hour) and powered by:
- Spatialized.io (Elasticsearch & Google Maps consulting)
- in cooperation with Garages-Near-Me.com (Effortless parking across Germany)
ย
In a variety of circumstances you may want to calculate trends โ how a metric is changing with respect to some baseline value. Let's look at a typical insurance use case.
Use Case: Percentual Changes in Aggregated Coverage
- I'm an insurance portfolio manager and want to know by how much the total contractual coverage (also called "sum insured") changed in 2020 compared to 2019 โ both in $$$ and in percent.
- On top of that, I want to explore these YoY trends in two separate bands: where the coverage is < $1M and where it's โฅ $1M.
At the end of the day, I'm requesting a table looking like this:
Aggregated Coverage Trends โ 2020 vs. 2019
Coverage Band
Coverage change in $
Coverage Change in %
< $1M
-$727,050.00
-62.41%
โฅ $1M
$413,000.00
7.84%
Approach: Bucket Scripts
Let's assume a highly simplified insurance contract of the form:
{
"created_at": "2019/03/12 07:21:17",
"coverage": 570000
}
The portfolio
index mapping could then look like:
PUT portfolio
{
"mappings": {
"properties": {
"created_at": {
"type": "date",
"format": "yyyy/MM/dd HH:mm:ss"
},
"coverage": {
"type": "float"
}
}
}
}
Let's then index five contracts from 2019 and five from 2020:
POST _bulk
{"index":{"_index":"portfolio"}}
{"created_at":"2019/03/12 07:21:17","coverage":570000}
{"index":{"_index":"portfolio"}}
{"created_at":"2019/06/21 13:43:58","coverage":324000}
{"index":{"_index":"portfolio"}}
{"created_at":"2019/09/07 21:32:38","coverage":1250000}
{"index":{"_index":"portfolio"}}
{"created_at":"2019/10/22 05:52:15","coverage":3600000}
{"index":{"_index":"portfolio"}}
{"created_at":"2019/12/28 14:24:43","coverage":998000}
POST _bulk
{"index":{"_index":"portfolio"}}
{"created_at":"2020/05/01 12:56:17","coverage":860000}
{"index":{"_index":"portfolio"}}
{"created_at":"2020/05/01 23:52:17","coverage":275000}
{"index":{"_index":"portfolio"}}
{"created_at":"2020/08/14 13:33:34","coverage":3600000}
{"index":{"_index":"portfolio"}}
{"created_at":"2020/11/06 10:49:26","coverage":29950}
{"index":{"_index":"portfolio"}}
{"created_at":"2020/11/07 17:51:27","coverage":1663000}
ย
Next, we'll leverage the facts that:
- row-based calculations can be achieved through
filters
aggregations as outlined here