Commit Graph

1569 Commits

Author SHA1 Message Date
Samyak S Sarnayak
5ab505be33 Fix highlight by replacing num_graphemes_from_bytes
num_graphemes_from_bytes has been renamed in the tokenizer to
num_chars_from_bytes.

Highlight now works correctly!
2022-01-17 13:02:55 +05:30
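A minimal sketch of the rename described in the commit above, assuming `num_bytes` falls on a character boundary; the real method lives on the tokenizer's `Token` type, and the struct here is illustrative.

```rust
struct Token {
    word: String,
}

impl Token {
    // Formerly `num_graphemes_from_bytes`: count how many chars the
    // first `num_bytes` bytes of the token cover.
    fn num_chars_from_bytes(&self, num_bytes: usize) -> usize {
        self.word[..num_bytes].chars().count()
    }
}
```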
Samyak S Sarnayak
e752bd06f7 Fix matching_words tests to compile successfully
The tests still fail due to a bug in https://github.com/meilisearch/tokenizer/pull/59
2022-01-17 11:37:45 +05:30
Samyak S Sarnayak
30247d70cd Fix search highlight for non-unicode chars
The `matching_bytes` function takes a `&Token` now and:
- gets the number of bytes to highlight (unchanged).
- uses `Token.num_graphemes_from_bytes` to get the number of grapheme
  clusters to highlight.

In essence, the `matching_bytes` function returns the number of matching
grapheme clusters instead of bytes. Should this function be renamed
then?

Added proper highlighting in the HTTP UI:
- requires dependency on `unicode-segmentation` to extract grapheme
  clusters from tokens
- `<mark>` tag is put around only the matched part
    - before this change, the entire word was highlighted even if only a
      part of it matched
2022-01-17 11:37:44 +05:30
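A minimal sketch of the HTTP UI highlighting described above, using `unicode-segmentation` as the commit states; the `highlight` helper name is illustrative.

```rust
use unicode_segmentation::UnicodeSegmentation;

// Wrap only the matched prefix of `word` in a <mark> tag, counting in
// grapheme clusters rather than bytes so multi-byte characters are
// never split. `matching` is the count returned by `matching_bytes`.
fn highlight(word: &str, matching: usize) -> String {
    let mut graphemes = word.graphemes(true);
    let matched: String = graphemes.by_ref().take(matching).collect();
    let rest: String = graphemes.collect();
    format!("<mark>{matched}</mark>{rest}")
}
```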
Tamo
98a365aaae store the geopoint in three dimensions 2021-12-14 12:21:24 +01:00
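A sketch of what storing a geopoint in three dimensions can look like, assuming the commit projects (lat, lng) onto a sphere so distance queries behave near the poles and the antimeridian; the constant and function name are illustrative.

```rust
const EARTH_RADIUS_M: f64 = 6_371_000.0;

// Convert a (latitude, longitude) pair in degrees into 3D Cartesian
// coordinates on a sphere of Earth's radius.
fn lat_lng_to_xyz(lat: f64, lng: f64) -> [f64; 3] {
    let (lat, lng) = (lat.to_radians(), lng.to_radians());
    [
        EARTH_RADIUS_M * lat.cos() * lng.cos(),
        EARTH_RADIUS_M * lat.cos() * lng.sin(),
        EARTH_RADIUS_M * lat.sin(),
    ]
}
```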
Tamo
d671d6f0f1 remove an unused file 2021-12-13 19:27:34 +01:00
Clément Renault
25faef67d0 Remove the database setup in the filter_depth test 2021-12-09 11:57:53 +01:00
Clément Renault
65519bc04b Test that empty filters return a None 2021-12-09 11:57:53 +01:00
Clément Renault
ef59762d8e Prefer returning None instead of the Empty Filter state 2021-12-09 11:57:52 +01:00
Clément Renault
ee856a7a46 Limit the max filter depth to 2000 2021-12-07 17:36:45 +01:00
Clément Renault
32bd9f091f Detect the filters that are too deep and return an error 2021-12-07 17:20:11 +01:00
Clément Renault
90f49eab6d Check the filter max depth limit and reject the invalid ones 2021-12-07 16:32:48 +01:00
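The three commits above describe a depth limit on the filter tree. A minimal sketch under assumed types: the enum and error are illustrative, only the 2000 limit comes from the commit messages.

```rust
const MAX_FILTER_DEPTH: usize = 2000;

enum FilterCondition {
    Leaf(String),
    And(Vec<FilterCondition>),
    Or(Vec<FilterCondition>),
}

impl FilterCondition {
    // Depth of the filter tree: a leaf is 1, an operator adds 1 on top
    // of its deepest child.
    fn depth(&self) -> usize {
        match self {
            FilterCondition::Leaf(_) => 1,
            FilterCondition::And(children) | FilterCondition::Or(children) => {
                1 + children.iter().map(|c| c.depth()).max().unwrap_or(0)
            }
        }
    }

    // Reject filters that are too deep instead of risking a stack
    // overflow while evaluating them.
    fn check_depth(&self) -> Result<(), String> {
        if self.depth() > MAX_FILTER_DEPTH {
            Err("the filter exceeds the maximum allowed depth".into())
        } else {
            Ok(())
        }
    }
}
```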
many
8970246bc4 Sort positions before iterating over them during word pair proximity extraction 2021-11-22 18:16:54 +01:00
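Illustrative only: iterating over a word's positions with a sliding window is only meaningful once the positions are ascending, which is the point of the commit above; the function name is an assumption.

```rust
// Gaps between consecutive positions; `windows` assumes ascending
// order, hence the sort.
fn consecutive_gaps(mut positions: Vec<u32>) -> Vec<u32> {
    positions.sort_unstable();
    positions.windows(2).map(|w| w[1] - w[0]).collect()
}
```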
Marin Postma
6e977dd8e8 change visibility of DocumentDeletionResult 2021-11-22 15:44:44 +01:00
many
35f9499638 Export tokenizer from milli 2021-11-18 16:57:12 +01:00
Marin Postma
6eb47ab792 remove update_id in UpdateBuilder 2021-11-16 13:07:04 +01:00
Marin Postma
09b4281cff improve document addition returned meta
2021-11-10 14:08:36 +01:00
Marin Postma
721fc294be improve document deletion returned meta
returns both the remaining number of documents and the number of deleted
documents.
2021-11-10 14:08:18 +01:00
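A sketch of the returned metadata described above; the `DocumentDeletionResult` name appears elsewhere in this log, but the field names and types are assumptions from the commit message.

```rust
// Both counts are returned so callers don't need a second lookup
// after a deletion.
pub struct DocumentDeletionResult {
    pub deleted_documents: u64,
    pub remaining_documents: u64,
}
```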
Irevoire
0ea0146e04 implement deref &str on the tokens 2021-11-09 11:34:10 +01:00
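A minimal sketch of the `Deref` implementation described above, assuming an illustrative `Token` type, so a token can be passed anywhere a `&str` is expected.

```rust
use std::ops::Deref;

struct Token {
    word: String,
}

impl Deref for Token {
    type Target = str;

    // Lets `&Token` coerce to `&str`, so str methods such as
    // `token.to_lowercase()` work directly on a token.
    fn deref(&self) -> &str {
        &self.word
    }
}
```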
Tamo
7483c7513a fix the filterable fields 2021-11-07 01:52:19 +01:00
Tamo
e5af3ac65c rename the filter_condition.rs to filter.rs 2021-11-06 16:37:55 +01:00
Tamo
6831c23449 merge with main 2021-11-06 16:34:30 +01:00
Tamo
b249989bef fix most of the tests 2021-11-06 01:32:12 +01:00
Tamo
27a6a26b4b makes the parse function part of the filter_parser 2021-11-05 10:46:54 +01:00
Tamo
76d961cc77 implements the last errors 2021-11-04 17:42:06 +01:00
Tamo
8234f9fdf3 recreate most filter errors except for the geosearch 2021-11-04 17:24:55 +01:00
Tamo
07a5ffb04c update http-ui 2021-11-04 15:52:22 +01:00
Tamo
a58bc5bebb update milli with the new parser_filter 2021-11-04 15:02:36 +01:00
many
7b3bac46a0 Change Attribute and Ranking rules errors 2021-11-04 13:19:32 +01:00
many
0c0038488c Change last error messages 2021-11-03 11:24:06 +01:00
Tamo
76a2adb7c3 re-enable the tests in the parser and start the creation of an error type 2021-11-02 17:35:17 +01:00
bors[bot]
08ae47e475 Merge #405
405: Change some error messages r=ManyTheFish a=ManyTheFish

Co-authored-by: many <maxime@meilisearch.com>
2021-10-28 13:35:55 +00:00
many
9f1e0d2a49 Refine asc/desc error messages 2021-10-28 14:47:17 +02:00
many
ed6db19681 Fix PR comments 2021-10-28 11:18:32 +02:00
marin postma
183d3dada7 return document count from builder 2021-10-28 10:33:04 +02:00
many
2be755ce75 Lower error check, already checked in meilisearch 2021-10-27 19:50:41 +02:00
many
3599df77f0 Change some error messages 2021-10-27 19:33:01 +02:00
bors[bot]
d7943fe225 Merge #402
402: Optimize document transform r=MarinPostma a=MarinPostma

This PR optimizes the transform of document additions into the obkv format. Instead of accepting any serializable object, we treat JSON and CSV specifically:
- For JSON, we build a serde `Visitor` that transforms the JSON straight into obkv without an intermediate representation.
- For CSV, we directly write the lines into the obkv, applying other optimizations as well.

Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 09:55:28 +00:00
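A rough sketch of the JSON half of the PR above: a serde `Visitor` that streams an object's fields into a key-value buffer instead of first materializing the whole document. The obkv writing is elided and all names are illustrative; the real code would also avoid the per-field `serde_json::Value` used here.

```rust
use std::fmt;

use serde::de::{Deserializer, MapAccess, Visitor};

// Streams a JSON object's (key, value) entries into `out` as they are
// deserialized, without building the full document in memory first.
struct DocumentVisitor<'a> {
    out: &'a mut Vec<(String, String)>,
}

impl<'de> Visitor<'de> for DocumentVisitor<'_> {
    type Value = ();

    fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "a JSON document (a map of fields)")
    }

    fn visit_map<A: MapAccess<'de>>(self, mut map: A) -> Result<(), A::Error> {
        while let Some((key, value)) = map.next_entry::<String, serde_json::Value>()? {
            // In the real code the value would be written straight into
            // the obkv writer; here we just collect it.
            self.out.push((key, value.to_string()));
        }
        Ok(())
    }
}

fn parse_document<'de, D: Deserializer<'de>>(
    deserializer: D,
    out: &mut Vec<(String, String)>,
) -> Result<(), D::Error> {
    deserializer.deserialize_map(DocumentVisitor { out })
}
```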
marin postma
baddd80069 implement review suggestions 2021-10-25 18:29:12 +02:00
marin postma
f9445c1d90 return float parsing error context in csv 2021-10-25 17:27:10 +02:00
Clémentine Urquizar
208903ddde Revert "Replacing pest with nom" 2021-10-25 11:58:00 +02:00
marin postma
3fcccc31b5 add document builder example 2021-10-25 10:26:43 +02:00
marin postma
430e9b13d3 add csv builder tests 2021-10-25 10:26:43 +02:00
marin postma
53c79e85f2 document errors 2021-10-25 10:26:43 +02:00
marin postma
2e62925a6e fix tests 2021-10-25 10:26:42 +02:00
marin postma
0f86d6b28f implement csv serialization 2021-10-25 10:26:42 +02:00
marin postma
8d70b01714 optimize document deserialization 2021-10-25 10:26:42 +02:00
Tamo
1327807caa add some error messages 2021-10-22 19:00:33 +02:00
Tamo
c8d03046bf add a check on the fid in the geosearch 2021-10-22 18:08:18 +02:00
Tamo
3942b3732f re-implement the geosearch 2021-10-22 18:03:39 +02:00
Tamo
7cd9109e2f lowercase value extracted from Token 2021-10-22 17:50:15 +02:00