I'm working with Elasticsearch and trying to come up with a way to filter a text field based on a phrase. I have basic searching working, but I also want to collapse "similar" results rather than duplicating them.
For example, given objects with the following text content:
- Buy 1 car get one car free until March
- Buy 1 car get one car free until April
- 50% off your car insurance when you buy through us
- Get 50% off your oven
If searching for "car", I'd be looking for 2 results:
- 50% off your car insurance [...]
- EITHER the 1st or the 2nd one (with both showing in inner_hits)
I've tried to do this using collapse on the content field, but that only collapses exact matches.
    'query' => [
        'match' => [
            'content' => 'car',
        ],
    ],
    'collapse' => [
        'field' => 'content',
        'inner_hits' => [
            'name' => 'recently_seen_on',
            'size' => 3,
            'sort' => [['seen_on' => 'desc']],
        ],
    ],
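To illustrate the exact-match behaviour, here is a minimal sketch in plain PHP (no Elasticsearch calls). It assumes that collapsing on content amounts to grouping hits by the exact field value; the sample array and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample hits, mirroring the matching documents from the example above.
$hits = [
    'Buy 1 car get one car free until March',
    'Buy 1 car get one car free until April',
    '50% off your car insurance when you buy through us',
];

// Group hits by their exact content value (assumed stand-in for field collapsing).
$groups = [];
foreach ($hits as $hit) {
    $groups[$hit][] = $hit;
}

// The March and April offers differ by one word, so they end up in separate groups.
echo count($groups), " groups\n";
```

With this data every hit lands in its own group, which matches what I see: nothing is collapsed unless the text is identical.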
I've also tried adding a similarity property to the content field, but I couldn't figure out whether it's possible to collapse using that.
I also came across the significant text aggregation (https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket-significanttext-aggregation.html), but when I tried something similar I got 0 results. I set the content type to keyword in the mappings:
    [
        'content' => ['type' => 'keyword'],
    ]
And then used:

    'query' => [
        'match' => [
            'content' => 'car',
        ],
    ],
    'aggs' => [
        'keywords' => [
            'significant_text' => [
                'field' => 'content',
                'filter_duplicate_text' => true,
            ],
        ],
    ],
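A possible reason for the 0 results (an assumption, not verified against this index): a keyword field is not analyzed, so the whole string is stored as a single term and a match query for "car" has nothing to match against. The sketch below is plain PHP with a deliberately simplified stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters); the variable names are hypothetical.

```php
<?php
$content = '50% off your car insurance when you buy through us';

// keyword field (simplified): the entire string is indexed as one term.
$keywordTerms = [$content];

// text field (simplified stand-in for the standard analyzer):
// lowercase the string, then split on anything that is not a letter or digit.
$textTerms = preg_split('/[^a-z0-9]+/', strtolower($content), -1, PREG_SPLIT_NO_EMPTY);

// Looking up the single term "car" in each set of indexed terms.
$keywordMatch = in_array('car', $keywordTerms, true);
$textMatch = in_array('car', $textTerms, true);

var_dump($keywordMatch); // bool(false)
var_dump($textMatch);    // bool(true)
```

If that assumption holds, the query only matches when content is mapped as text, which is also the field type the significant_text documentation describes.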
Is it possible to achieve something like this without manually adding a field that groups documents based on their content?
https://stackoverflow.com/questions/66501391/elasticsearch-collapsing-based-on-text-similarity March 06, 2021 at 09:06AM