Document types have been deprecated since v7.x, so it's recommended to treat ES indices as single-purpose tables holding single-type documents whose shared fields have data types that can be defined in a straightforward manner. In this book we'll be focusing on single-type documents/indices only.
Elasticsearch's effectiveness is often attributed to its ability to efficiently index and store your documents' fields. Each field has its own data type and the resulting list of these field definitions is called a mapping. The mapping can be:
- defined up front
  - as an explicit mapping (we know what we know)
  - or as a dynamic template mapping (we know what we expect)
- or guessed by ES at ingest time (we don't know what we don't know; rather rare IRL)
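To make the first two options concrete, here is a minimal sketch of an index-creation body that combines an explicit mapping with one dynamic template. The index name `projects`, the field names, and the template name `ids_as_keywords` are all hypothetical, picked only for illustration. The body is built as a plain Python dict so it can be inspected without a running cluster.

```python
import json

# Hypothetical index name, chosen for illustration only.
INDEX_NAME = "projects"

create_index_body = {
    "mappings": {
        # Dynamic template: "we know what we expect".
        # Any new string field whose name ends in "_id" is mapped as a keyword.
        "dynamic_templates": [
            {
                "ids_as_keywords": {  # hypothetical template name
                    "match_mapping_type": "string",
                    "match": "*_id",
                    "mapping": {"type": "keyword"},
                }
            }
        ],
        # Explicit mapping: "we know what we know".
        "properties": {
            "name": {"type": "text"},
            "created_at": {"type": "date"},
        },
    }
}

# The body would be sent when creating the index: PUT /projects
print(f"PUT /{INDEX_NAME}")
print(json.dumps(create_index_body, indent=2))
```

Anything not covered by `properties` or by a matching template would still fall back to ES's own guess at ingest time.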
Remember that if ES encounters any fields that aren't defined in the mapping, the system will still guess the data types and auto-add the definitions to the already defined mapping. This feature can be turned off using the `dynamic` setting, which can keep your mapping locked in place and even reject new documents if the `strict` mode is activated.
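As a rough sketch of what locking a mapping looks like, the body below sets `dynamic` to `"strict"` on a hypothetical mapping. The helper `unmapped_fields` is not part of any Elasticsearch client. It is a local, illustrative pre-check that lists the top-level fields a document carries beyond the mapping, which are the fields `strict` mode would object to.

```python
import json

# Hypothetical mapping. With "dynamic": "strict", ES rejects documents that
# contain unmapped fields; with "dynamic": False it accepts them but leaves
# the unknown fields unindexed (they still show up in _source).
locked_mapping = {
    "mappings": {
        "dynamic": "strict",
        "properties": {
            "name": {"type": "text"},
            "created_at": {"type": "date"},
        },
    }
}


def unmapped_fields(mapping_body, doc):
    """Return the top-level fields of `doc` that the mapping does not define.

    Local illustration only; this is not an Elasticsearch API.
    """
    known = mapping_body["mappings"]["properties"]
    return sorted(field for field in doc if field not in known)


# Hypothetical document with one field ("owner") the mapping doesn't know.
doc = {"name": "Apollo", "created_at": "2021-03-01", "owner": "jane"}

print(json.dumps(locked_mapping, indent=2))
print(unmapped_fields(locked_mapping, doc))  # ['owner']
```

Running a check like this before ingesting can surface surprises earlier than a rejected indexing request would.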
In my experience, letting ES guess the mappings is not good enough because, by default,
- any epoch timestamp will be auto-mapped as a `long` instead of a `date`
- every text field will be mapped as `text` with a `keyword` sub-field, but you may need just the `keyword`, esp. in short attributes like tags and categories
- `nested` fields will not be recognized as nested; ES will default to an `object` instead. More on nested documents in Median Duration of a Project Build.
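The three pitfalls above can be put side by side in a small sketch. The document, its field names, and the "explicit" choices (for example `integer` for the duration) are hypothetical. The "guessed" dict approximates what default dynamic mapping would typically produce for such a document and is written by hand, not fetched from a cluster.

```python
# Hypothetical document, for illustration only.
doc = {
    "created_at": 1614556800000,  # epoch milliseconds
    "tags": ["backend", "ci"],
    "builds": [
        {"status": "ok", "duration_s": 42},
        {"status": "failed", "duration_s": 7},
    ],
}

# Default shape ES gives to dynamically mapped strings.
TEXT_WITH_KEYWORD = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}

# Approximation of what dynamic mapping would guess for `doc`.
guessed = {
    "created_at": {"type": "long"},
    "tags": TEXT_WITH_KEYWORD,
    "builds": {  # no "type" key: a plain object, not nested
        "properties": {
            "status": TEXT_WITH_KEYWORD,
            "duration_s": {"type": "long"},
        }
    },
}

# What we might define explicitly instead (one possible choice).
explicit = {
    "created_at": {"type": "date", "format": "epoch_millis"},
    "tags": {"type": "keyword"},
    "builds": {
        "type": "nested",
        "properties": {
            "status": {"type": "keyword"},
            "duration_s": {"type": "integer"},
        },
    },
}

for field in doc:
    # Fields with only "properties" are objects by default.
    guessed_type = guessed[field].get("type", "object")
    print(f"{field}: guessed={guessed_type}, explicit={explicit[field]['type']}")
```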
Plus there's the overhead of having to drop the index, adjust the mapping, and reindex each time you've incrementally improved the mapping (there are workarounds). Yes, you can update the mapping after you've synced the docs, for instance by adding new fields or sub-fields, but the update won't apply to the docs that are already indexed, only to the ones that'll be indexed from that point onwards.
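The adjust-and-reindex cycle can be sketched as a short sequence of REST calls. This variant assumes the documents can be copied from the old index with the `_reindex` API rather than re-synced from the original data source, which may not match every setup. The index names `projects_v1` and `projects_v2` and the mapping itself are hypothetical. The script only prints the requests and sends nothing.

```python
import json

# Hypothetical index names, for illustration only.
OLD_INDEX = "projects_v1"
NEW_INDEX = "projects_v2"

# The adjusted mapping for the new index (hypothetical field).
new_mapping = {
    "mappings": {
        "properties": {
            "created_at": {"type": "date", "format": "epoch_millis"},
        }
    }
}

# Each step is (HTTP method, path, JSON body or None).
steps = [
    # 1. Create a fresh index that carries the improved mapping.
    ("PUT", f"/{NEW_INDEX}", new_mapping),
    # 2. Copy the documents so they get indexed under the new mapping.
    ("POST", "/_reindex", {"source": {"index": OLD_INDEX}, "dest": {"index": NEW_INDEX}}),
    # 3. Drop the old index once the copy has been verified.
    ("DELETE", f"/{OLD_INDEX}", None),
]

for method, path, body in steps:
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```

Creating the new index before deleting the old one keeps the data available throughout, at the cost of temporarily storing it twice.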