
5. Relevance, Scoring, & Sorting

šŸ” Home šŸ‘ˆ Prev šŸ‘‰ Next
āš”Ā  ElasticsearchBook.com is powered by Notion-Paywall.com

Relevance is Relative

Arguably the most important part of search is relevance, but there are dozens of strategies for approaching it asymptotically and just as many factors that affect it.
Everyone wants a better search but "success" means different things to different people:
  • as few search-as-you-type keystrokes as possible
  • more clicks on the first search result
  • increased usage of the search box in general, etc.
This topic is too broad to cover here, so rather than go into the different techniques I'll refer you to this insightful article (section "Relevancy").

Scouting for Scores

You'll have noticed by now that the Search API response typically includes a _score attribute inside each retrieved hit. By default, that is, when nothing but a match_all query is in play, each hit has a score of 1.0. This score is then affected by which queries matched a given doc and how good the match was.
How good the match was introduces the concept of similarity scoring. Since v5.0, scoring in Elasticsearch has by default been governed by an algorithm called Okapi BM25, which is explained here in great detail.
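To make the idea of similarity scoring a bit more tangible, here is a minimal Python sketch of the textbook Okapi BM25 formula for a single query term. The function name and the sample numbers are my own assumptions, and Lucene's actual implementation may differ in details, so treat it as an illustration of the trends rather than a way to reproduce exact scores. The defaults k1 = 1.2 and b = 0.75 match the Elasticsearch defaults.

```python
import math

def bm25_term_score(freq, doc_len, avg_doc_len, docs_with_term, total_docs,
                    k1=1.2, b=0.75):
    """Textbook Okapi BM25 contribution of one query term to one doc (hypothetical helper)."""
    # Inverse document frequency: the rarer the term, the more it weighs.
    idf = math.log(1 + (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Length normalization: b controls how strongly long docs are penalized.
    norm = 1 - b + b * (doc_len / avg_doc_len)
    # Term-frequency saturation: k1 controls how quickly repeats stop helping.
    tf = (freq * (k1 + 1)) / (freq + k1 * norm)
    return idf * tf

# Made-up numbers, chosen only to show the trends.
once = bm25_term_score(freq=1, doc_len=100, avg_doc_len=100,
                       docs_with_term=10, total_docs=1000)
thrice = bm25_term_score(freq=3, doc_len=100, avg_doc_len=100,
                         docs_with_term=10, total_docs=1000)
long_doc = bm25_term_score(freq=1, doc_len=400, avg_doc_len=100,
                           docs_with_term=10, total_docs=1000)
print(once, thrice, long_doc)
```

Three occurrences of a term score higher than one, but less than three times as high, and the same single occurrence counts for less in a doc that is four times longer than average.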
Now, when you're completely lost as to why ES assigned a given score to a given doc, or wonder why the response hits are ordered the way they are, you can count on the Explain API, or simply the explain=true flag on a search request, to provide a great deal of feedback:
POST index_name/_search?explain=true
{
  "query": {
    "simple_query_string": {
      "query": "abc"
    }
  }
}
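With explain=true, each hit carries an _explanation object: a tree of nodes with a value, a description, and nested details. These trees get deep quickly, so a small helper that flattens one into indented lines can help when reading them. The sketch below is in Python; the helper name is hypothetical and the sample tree is a hand-written, simplified stand-in, not real cluster output.

```python
def format_explanation(node, depth=0):
    """Flatten an _explanation tree into indented 'value  description' lines."""
    lines = [f"{'  ' * depth}{node['value']}  {node['description']}"]
    # Each node may nest further explanations under "details".
    for child in node.get("details", []):
        lines.extend(format_explanation(child, depth + 1))
    return lines

# Hand-written sample mirroring the value/description/details shape.
sample = {
    "value": 0.5,
    "description": "weight of the matching term, result of:",
    "details": [
        {
            "value": 0.5,
            "description": "score, computed from:",
            "details": [
                {"value": 2.0, "description": "idf", "details": []},
                {"value": 0.25, "description": "tf", "details": []},
            ],
        }
    ],
}

explanation_lines = format_explanation(sample)
print("\n".join(explanation_lines))
```

In practice you would pass in hit["_explanation"] for whichever hit puzzles you.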

Ordering & Sorting

By default, hits are ordered by their scores in descending order.
🔑
If you ever need to randomize the search results (→ do the "opposite" of scoring), you can use a function_score query with random_score. On the other hand, if you need to assign a constant score, use the constant_score query.
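Both queries are short enough to sketch. The Python helpers below only build the request bodies as dicts (the helper names, the seed, and the category/shoes term are my own placeholders); the printed JSON can be pasted into a POST index_name/_search request. Passing a seed together with a field such as _seq_no is meant to keep the random order reproducible for the same seed.

```python
import json

def random_order_query(seed):
    """Body of a function_score query that replaces scores with seeded random values."""
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "random_score": {"seed": seed, "field": "_seq_no"},
                # Use the random value as the final score instead of multiplying.
                "boost_mode": "replace",
            }
        }
    }

def constant_score_query(field, value, boost=1.0):
    """Body of a constant_score query: every doc matching the filter gets `boost` as its score."""
    return {
        "query": {
            "constant_score": {
                "filter": {"term": {field: value}},
                "boost": boost,
            }
        }
    }

print(json.dumps(random_order_query(seed=42), indent=2))
print(json.dumps(constant_score_query("category", "shoes", boost=2.0), indent=2))
```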
More often than not, it's not the similarity scores that establish the relevance but rather the order stemming from actual fields in the index. Your site's visitors would like to sort by ↓ price, ↓ sale %, ↑ alphabetically etc.
Consequently, a typical sort request targeting multiple fields would look like this:
POST index_name/_search
{
  "sort" : [
    { "price" : {"order" : "desc"} },
    { "sale" : {"order" : "desc"} },
    { "name" : {"order" : "asc"} }
  ]
}
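The order of the entries in the sort array matters: each later field only breaks ties left by the earlier ones. As a rough local emulation (not how Elasticsearch executes the sort), the same ordering can be reproduced in Python with a tuple key. The product data is made up for illustration.

```python
# Made-up docs with the same three fields as the sort request.
products = [
    {"name": "Boots", "price": 80, "sale": 10},
    {"name": "Anorak", "price": 80, "sale": 10},
    {"name": "Cap", "price": 80, "sale": 25},
    {"name": "Dress", "price": 120, "sale": 0},
]

def sort_like_request(docs):
    """Order docs by price desc, then sale desc, then name asc."""
    # Negating the numeric fields turns ascending tuple order into descending.
    return sorted(docs, key=lambda d: (-d["price"], -d["sale"], d["name"]))

ordered = [d["name"] for d in sort_like_request(products)]
print(ordered)  # ['Dress', 'Cap', 'Anorak', 'Boots']
```

Dress wins on price alone, Cap beats the other two on sale %, and Anorak precedes Boots only because the name is the last tiebreaker. In a real index, the name field would typically need a keyword mapping to be sortable this way.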
