mirror of https://github.com/meilisearch/meilisearch.git (synced 2025-11-02 08:56:26 +00:00)
Fix search highlight for non-unicode chars
The `matching_bytes` function now takes a `&Token` and:
- gets the number of bytes to highlight (unchanged).
- uses `Token.num_graphemes_from_bytes` to get the number of grapheme
clusters to highlight.
In essence, the `matching_bytes` function returns the number of matching
grapheme clusters instead of bytes. Should this function be renamed
then?
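To illustrate the byte-to-character conversion described above, here is a minimal std-only sketch. It is not the code from this commit: `num_chars_from_bytes` is a hypothetical name, and it counts Unicode scalar values (`char`s) rather than grapheme clusters, because the standard library has no grapheme segmentation. The commit itself relies on the `unicode-segmentation` crate for that step.

```rust
/// Hypothetical helper (not part of the commit): returns how many `char`s
/// are covered by the first `num_bytes` bytes of `word`.
///
/// Simplification: this counts Unicode scalar values, not grapheme clusters.
/// With the `unicode-segmentation` crate, one would count
/// `prefix.graphemes(true)` instead of `prefix.chars()`.
fn num_chars_from_bytes(word: &str, num_bytes: usize) -> Option<usize> {
    // `str::get` returns `None` when the range is out of bounds or when it
    // would split a multi-byte UTF-8 sequence.
    word.get(..num_bytes).map(|prefix| prefix.chars().count())
}
```

For an ASCII word the two counts coincide, which is presumably why byte-based highlighting only went wrong on multi-byte characters.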
Added proper highlighting in the HTTP UI:
- requires dependency on `unicode-segmentation` to extract grapheme
clusters from tokens
- `<mark>` tag is put around only the matched part
- before this change, the entire word was highlighted even if only a
part of it matched
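The partial highlighting described above can be sketched as follows. This is an illustrative assumption rather than the actual UI code: `highlight_prefix` is a hypothetical helper, it works on `char`s instead of grapheme clusters to stay std-only, and it does not HTML-escape its input.

```rust
/// Hypothetical helper (not part of the commit): wraps the first
/// `matched_chars` characters of `word` in a `<mark>` tag and leaves the
/// rest of the word untouched.
fn highlight_prefix(word: &str, matched_chars: usize) -> String {
    // Byte offset at which the matched prefix ends. If the word has fewer
    // characters than `matched_chars`, the whole word is considered matched.
    let split = word
        .char_indices()
        .nth(matched_chars)
        .map_or(word.len(), |(idx, _)| idx);
    let (matched, rest) = word.split_at(split);
    if matched.is_empty() {
        // Nothing matched, so emit the word without any tag.
        return word.to_string();
    }
    format!("<mark>{}</mark>{}", matched, rest)
}
```

Splitting at an offset taken from `char_indices` keeps the cut on a character boundary, so `split_at` cannot panic on multi-byte input.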
@@ -17,6 +17,7 @@ once_cell = "1.5.2"
 rayon = "1.5.0"
 structopt = { version = "0.3.21", default-features = false, features = ["wrap_help"] }
 tempfile = "3.2.0"
+unicode-segmentation = "1.6.0"
 
 # http server
 askama = "0.10.5"