IoT Devices Not Reporting

🏑 Home πŸ“– Chapter Home πŸ‘ˆ Prev πŸ‘‰ Next
⚑  ElasticsearchBook is crafted by Jozef Sorocin (🟒 Book a consulting hour) and powered by:
In the previous chapter we discussed the usefulness of bucket_script aggregations, which allow for per-bucket computations. When they are combined with a scripted_metric aggregation, further practical applications arise.
Let me illustrate.

Use Case: Device Fleet Health

I have a fleet of devices, each of which posts a message to ES every 10 minutes in the form of:
{
    "deviceId": "unique-device-id",
    "timestamp": "2021-01-19 06:54:00",
    "message" : "morning ping at 06:54 AM"
}
I'm trying to get a sense of the health of this fleet by finding devices that haven't reported anything in a given period of time. What I dream of is getting:
  1. the total count of distinct deviceIds seen in the last 7 days
  2. the total count of deviceIds NOT seen in the last hour
  3. the IDs of the devices that stopped reporting (→ reported in the last 2 hours but not in the last hour)

Approach: Bucket Scripts & Scripted Metrics

Let's first set up our fleet_messages index:
PUT fleet_messages
{
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "message": {
        "type": "text"
      },
      "deviceId": {
        "type": "keyword"
      }
    }
  }
}
and then ingest a few messages that occurred between Jan 13, 2021 and Jan 20, 2021.
POST fleet_messages/_doc
{
  "deviceId": "device#1",
  "timestamp": "2021-01-14 10:00:00",
  "message": "device#1 in the last week"
}

POST fleet_messages/_doc
{
  "deviceId": "device#1",
  "timestamp": "2021-01-20 15:40:00",
  "message": "device#1 in the last 2 hours"
}

POST fleet_messages/_doc
{
  "deviceId": "device#1",
  "timestamp": "2021-01-20 16:52:00",
  "message": "device#1 in the last hour"
}

POST fleet_messages/_doc
{
  "deviceId": "device#2",
  "timestamp": "2021-01-15 09:00:00",
  "message": "device#2 in the last week"
}

POST fleet_messages/_doc
{
  "deviceId": "device#2",
  "timestamp": "2021-01-20 15:58:00",
  "message": "device#2 in the last 2hrs"
}
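As a side note, the same five documents could be indexed in a single round trip instead of five separate requests. The following is only a sketch using the _bulk API, not the setup used in the rest of this chapter; the `refresh=true` parameter is an optional addition that makes the documents searchable right away. Run it instead of the individual POST requests above, not in addition to them, to avoid duplicate documents.

```json
POST fleet_messages/_bulk?refresh=true
{ "index": {} }
{ "deviceId": "device#1", "timestamp": "2021-01-14 10:00:00", "message": "device#1 in the last week" }
{ "index": {} }
{ "deviceId": "device#1", "timestamp": "2021-01-20 15:40:00", "message": "device#1 in the last 2 hours" }
{ "index": {} }
{ "deviceId": "device#1", "timestamp": "2021-01-20 16:52:00", "message": "device#1 in the last hour" }
{ "index": {} }
{ "deviceId": "device#2", "timestamp": "2021-01-15 09:00:00", "message": "device#2 in the last week" }
{ "index": {} }
{ "deviceId": "device#2", "timestamp": "2021-01-20 15:58:00", "message": "device#2 in the last 2hrs" }
```

Each `{ "index": {} }` action line applies to the document on the line that follows it, and the request body must end with a newline.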
After that, let's assume it's exactly 5 PM on Jan 20, 2021.
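To make the pinned "now" concrete, here is a minimal sketch of what "the last hour" looks like as explicit bounds. It is an illustrative sanity check rather than part of the final query, and the bounds are written in the same `yyyy-MM-dd HH:mm:ss` format that the mapping expects.

```json
GET fleet_messages/_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "2021-01-20 16:00:00",
        "lte": "2021-01-20 17:00:00"
      }
    }
  }
}
```

With the sample data above, only the 16:52 message from device#1 falls inside this window, which is why device#2 should eventually show up as a device that stopped reporting.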

1. The total count of distinct deviceIds seen in the last 7 days

We're going to use a range filter to restrict the timestamp, plus a cardinality aggregation to obtain the unique device count. In pseudo-code:
"last7d": {
  "filter":
    "range": "2021-01-13 <= timestamp <= 2021-01-20"
  "aggs":
    "cardinality": "on the field deviceId"
}
Translated to query DSL:
"last7d": {
  "filter": {
    "range": {
      "timestamp": {
        "gte": "2021-01-13 00:00:00",
        "lte": "2021-01-20 17:00:00"
      }
    }
  },
  "aggs": {
    "uniq_device_count": {
      "cardinality": {
        "field": "deviceId"
      }
    }
  }
}
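The fragment above is only the aggregation definition. To try it on its own, it can be wrapped in a search request along these lines. This is a sketch: the `"size": 0` setting is an addition that suppresses the document hits, since only the aggregation result is of interest here.

```json
GET fleet_messages/_search
{
  "size": 0,
  "aggs": {
    "last7d": {
      "filter": {
        "range": {
          "timestamp": {
            "gte": "2021-01-13 00:00:00",
            "lte": "2021-01-20 17:00:00"
          }
        }
      },
      "aggs": {
        "uniq_device_count": {
          "cardinality": {
            "field": "deviceId"
          }
        }
      }
    }
  }
}
```

With the five sample documents indexed, the `last7d` bucket should contain all five messages and `uniq_device_count` should come out as 2 (device#1 and device#2). Keep in mind that cardinality is an approximate aggregation, although it is accurate at counts this small. In a live system the fixed bounds would typically be replaced with date math such as `now-7d` and `now`.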