Relevancy with Range Indexes
[Editor’s Note: Adapted from Adam Fowler’s blog.]
A new feature of MarkLogic 7’s search API is range index scoring: adjusting relevancy based on specific values within a document. MarkLogic returns results in relevance order, and range index scoring lets the value of an element contribute to that relevancy, rather than just acting as an exact value match. For example, with exact matching, a search for “rating:4” might give a matching document a relevance score of 0.625. With range scoring, that same document might score 0.5, while a separate document rated 5/5 stars receives 0.625 instead.
Here I detail a couple of use cases.
One is ratings: a document with a higher rating should show nearer the top of the search results. A second is the distance from the center point of a geospatial query, just like you get on hotel search websites.
We can now do these directly in MarkLogic without any special voodoo from a developer. Just set up the search options and perform a query. Easy!
Below is the feature in action:
This uses MLJS for rendering results, but the functionality is in core MarkLogic, not MLJS. MarkLogic also calculates a heatmap on the fly. This calculated data is passed to heatmap-openlayers.js – which is much more efficient than just sending lots of data to heatmap.js, especially for thousands of visible points.
Note that the MLJS widgets interact with each other – hovering over a marker on the map highlights it in the search results list with a different background color.
Isn’t this like sorting?
In a word, no.
Sorting orders results purely by a value in a document. Relevancy scoring instead lets several search terms each contribute to one combined score. For example, you could have rating and distance and a word query all contributing to the relevancy score. A result that is a little further away but has a much higher rating may trump one that is dead center on the map but has a low rating.
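To make the difference concrete, here is a toy sketch in Python. It is not MarkLogic's actual scoring formula: the `Hotel` class, the two scoring functions, the equal weights, and the sample data are all assumptions made up for illustration. It only shows how a combined score can rank a highly rated, slightly distant result above a poorly rated one at the center, which a plain sort by distance cannot do.

```python
from dataclasses import dataclass


@dataclass
class Hotel:
    name: str
    rating: int          # 1 to 5 stars
    distance_km: float   # distance from the center of the geospatial query


def rating_score(rating: int, max_rating: int = 5) -> float:
    """Toy linear score: a higher rating gives a higher score (0..1)."""
    return rating / max_rating


def distance_score(distance_km: float, scale_km: float = 1.0) -> float:
    """Toy reciprocal score: a closer result gives a higher score (0..1]."""
    return 1.0 / (1.0 + distance_km / scale_km)


def combined_score(hotel: Hotel) -> float:
    """Both terms contribute to a single relevancy score (equal weights assumed)."""
    return rating_score(hotel.rating) + distance_score(hotel.distance_km)


hotels = [
    Hotel("Dead Center Inn", rating=2, distance_km=0.0),
    Hotel("Nearby Grand", rating=5, distance_km=0.5),
]

# Sorting looks at one value only: the closest hotel always comes first.
by_distance = sorted(hotels, key=lambda h: h.distance_km)

# Relevancy combines both contributions: the better-rated hotel wins here.
by_relevance = sorted(hotels, key=combined_score, reverse=True)

print([h.name for h in by_distance])
print([h.name for h in by_relevance])
```

In this sketch the low-rated hotel at the center is first when sorting by distance, while the five-star hotel half a kilometer away is first by combined score.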
How does it work?
Under the hood you provide a set of options and a query. I’ve documented the REST search options I’m using, the search query I’m sending, and the raw results I’m getting back in a Gist. Go have a read; it’s pretty straightforward. (I do tend to go overboard when setting search options, though!)
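For a rough idea of what such options might look like, here is a minimal Python sketch that builds a JSON search options document for a rating constraint. This is not the configuration from the Gist. The constraint name, the `rating` element, the `xs:int` type, and the helper `build_rating_options` are hypothetical, and the JSON layout reflects my understanding of the REST search options format, so check it against the MarkLogic documentation before relying on it. The `score-function` and `slope-factor` range options are the part that switches on range index scoring.

```python
import json
from typing import Optional

# Hypothetical element name; a matching range index is assumed to exist.
RATING_ELEMENT = "rating"


def build_rating_options(score_function: str = "linear",
                         slope_factor: Optional[float] = None) -> dict:
    """Build search options with a range constraint that contributes to scoring."""
    range_options = [f"score-function={score_function}"]
    if slope_factor is not None:
        # Assumed to control how quickly the score changes with the value.
        range_options.append(f"slope-factor={slope_factor}")
    return {
        "options": {
            "constraint": [
                {
                    "name": "rating",  # hypothetical constraint name
                    "range": {
                        "type": "xs:int",  # assumed to match the range index type
                        "element": {"ns": "", "name": RATING_ELEMENT},
                        "range-option": range_options,
                    },
                }
            ]
        }
    }


options = build_rating_options("linear", slope_factor=10)
print(json.dumps(options, indent=2))
```

The resulting JSON would be saved as named query options through the REST API and then referenced when sending the search; those steps are left out here because the exact requests are in the Gist.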