ElasticsearchBook is crafted by Jozef Sorocin (Book a consulting hour) and powered by:
- Spatialized.io (Elasticsearch & Google Maps consulting)
- in cooperation with Garages-Near-Me.com (Effortless parking across Germany)
Relevance is Relative
Relevance is arguably the most important part of search, but there are dozens of strategies for approaching it and just as many factors that affect it.
Everyone wants a better search but "success" means different things to different people:
- as few search-as-you-type keystrokes as possible
- more clicks on the first search result
- increased usage of the search box in general, etc.
This topic is too broad to cover here, so rather than going through the individual techniques I'll refer you to this insightful article (section "Relevancy").
Scouting for Scores
You'll have noticed by now that the Search API response typically includes the _score attribute inside each retrieved hit. By default, each hit has a score of 1.0. This score is then affected by which queries matched a given doc and how good each match was. "How good" brings in the concept of similarity scoring. Since v5.x, scoring in Elasticsearch has been governed by an algorithm called Okapi BM25, which is explained here in great detail.
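To make the similarity idea a bit more tangible, here is a minimal Python sketch of the textbook BM25 score for a single term in a single document. It assumes the default parameters k1 = 1.2 and b = 0.75 and a Lucene-style IDF; the function and argument names are made up for illustration, and the numbers Elasticsearch reports can differ in detail (per-shard statistics, lossy length encoding, query boosts).

```python
import math


def bm25_idf(doc_count, docs_with_term):
    """Inverse document frequency: the rarer the term, the larger the weight."""
    return math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))


def bm25_term_score(term_freq, doc_len, avg_doc_len, doc_count, docs_with_term,
                    k1=1.2, b=0.75):
    """Textbook BM25 contribution of one term to one document's score."""
    # Documents longer than average are penalised; b controls how strongly.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    # Term frequency saturates: each extra occurrence adds less than the previous one.
    tf_part = term_freq * (k1 + 1) / (term_freq + k1 * length_norm)
    return bm25_idf(doc_count, docs_with_term) * tf_part


# Illustrative inputs only: 1000 docs in total, average length of 100 terms.
rare = bm25_term_score(term_freq=2, doc_len=100, avg_doc_len=100,
                       doc_count=1000, docs_with_term=10)
common = bm25_term_score(term_freq=2, doc_len=100, avg_doc_len=100,
                         doc_count=1000, docs_with_term=500)
print(rare > common)  # a match on a rare term counts for more than one on a common term
```

The two properties worth remembering are visible in the code: repeated occurrences of a term have diminishing returns, and a match in a short field counts for more than the same match in a long one.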
Now, when you're completely lost as to why ES assigned a given score to a given doc, or wonder why the response hits are ordered the way they are, you can count on the Explain API, or the equivalent explain=true flag on a search request, to provide a great deal of feedback:
POST index_name/_search?explain=true
{
  "query": {
    "simple_query_string": {
      "query": "abc"
    }
  }
}
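With explain=true, each hit carries an _explanation object made of nested value / description / details nodes, which gets hard to read once a few clauses are involved. A small helper along these lines can flatten it into indented lines. The sample tree below is hand-written purely to mimic that shape; its descriptions and numbers are hypothetical, not real Elasticsearch output.

```python
def flatten_explanation(node, depth=0):
    """Return one indented line per node of an _explanation tree."""
    lines = [f"{'  ' * depth}{node['value']:.4f}  {node['description']}"]
    for child in node.get("details", []):
        lines.extend(flatten_explanation(child, depth + 1))
    return lines


# Hypothetical sample: only the value/description/details structure is real.
sample = {
    "value": 2.0,
    "description": "sum of:",
    "details": [
        {"value": 1.5, "description": "hypothetical clause A", "details": []},
        {"value": 0.5, "description": "hypothetical clause B", "details": []},
    ],
}

print("\n".join(flatten_explanation(sample)))
```

In practice you would pass in hit["_explanation"] for each hit of the search response instead of the sample.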
Ordering & Sorting
By default, hits are ordered by score in descending order.
If you ever need to randomize the search results (i.e. do the "opposite" of scoring), you can use a random function score query.
On the other hand, if you need to assign a constant score, use the constant score query.
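As a rough sketch, the two request bodies could be assembled like this in Python. The helper names, the seed, and the status field are assumptions for illustration only; the printed JSON is what you would send to index_name/_search.

```python
import json


def random_order_body(seed, base_query=None):
    """function_score query whose score is replaced by a seeded random number."""
    return {
        "query": {
            "function_score": {
                "query": base_query or {"match_all": {}},
                # The same seed yields a reproducible "random" order,
                # which is handy for paginating shuffled results.
                "random_score": {"seed": seed, "field": "_seq_no"},
                # Ignore the similarity score and keep only the random one.
                "boost_mode": "replace",
            }
        }
    }


def constant_score_body(field, value, boost=1.0):
    """constant_score query: every doc matching the filter gets the same score."""
    return {
        "query": {
            "constant_score": {
                "filter": {"term": {field: value}},
                "boost": boost,
            }
        }
    }


print(json.dumps(random_order_body(seed=42), indent=2))
print(json.dumps(constant_score_body("status", "published", boost=1.2), indent=2))
```

Changing the seed reshuffles the hits, while the constant score variant is useful when a clause should only filter and never influence the ranking.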
More often than not, it's not the similarity scores that establish relevance but rather the order stemming from actual fields in the index. Your site's visitors may want to sort by price, by sale %, alphabetically, etc.
Consequently, a typical sort request targeting multiple fields would look like this: