983 Commits

Author SHA1 Message Date
Samyak S Sarnayak
2d7607734e Run cargo fmt on matching_words.rs 2022-01-17 13:04:33 +05:30
Samyak S Sarnayak
5ab505be33 Fix highlight by replacing num_graphemes_from_bytes
num_graphemes_from_bytes has been renamed in the tokenizer to
num_chars_from_bytes.

Highlight now works correctly!
2022-01-17 13:02:55 +05:30
Samyak S Sarnayak
c10f58b7bd Update tokenizer to v0.2.7 2022-01-17 13:02:00 +05:30
Samyak S Sarnayak
e752bd06f7 Fix matching_words tests to compile successfully
The tests still fail due to a bug in https://github.com/meilisearch/tokenizer/pull/59
2022-01-17 11:37:45 +05:30
Samyak S Sarnayak
30247d70cd Fix search highlight for non-unicode chars
The `matching_bytes` function takes a `&Token` now and:
- gets the number of bytes to highlight (unchanged).
- uses `Token.num_graphemes_from_bytes` to get the number of grapheme
  clusters to highlight.

In essence, the `matching_bytes` function returns the number of matching
grapheme clusters instead of bytes. Should this function be renamed
then?

Added proper highlighting in the HTTP UI:
- requires dependency on `unicode-segmentation` to extract grapheme
  clusters from tokens
- `<mark>` tag is put around only the matched part
    - before this change, the entire word was highlighted even if only a
      part of it matched
2022-01-17 11:37:44 +05:30
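A minimal sketch of the partial-highlight idea described in the commit above, assuming the match length arrives as a byte count that must be converted before slicing the word. The helper names `num_chars_from_bytes` and `highlight_prefix` are hypothetical and are not milli's or the tokenizer's actual signatures. The sketch counts `char`s with the standard library only; real grapheme clusters would need the `unicode-segmentation` crate mentioned in the commit.

```rust
/// Hypothetical helper: converts a byte length within `word` into the number
/// of `char`s it covers. If `num_bytes` falls inside a multi-byte char, that
/// char is counted as covered.
fn num_chars_from_bytes(word: &str, num_bytes: usize) -> usize {
    word.char_indices()
        .take_while(|(byte_idx, _)| *byte_idx < num_bytes)
        .count()
}

/// Wraps only the matched prefix of `word` in `<mark>` tags, leaving the
/// rest of the word unhighlighted.
fn highlight_prefix(word: &str, matched_bytes: usize) -> String {
    let matched_chars = num_chars_from_bytes(word, matched_bytes);
    let matched: String = word.chars().take(matched_chars).collect();
    let rest: String = word.chars().skip(matched_chars).collect();
    if matched.is_empty() {
        rest
    } else {
        format!("<mark>{}</mark>{}", matched, rest)
    }
}
```

For example, `highlight_prefix("héllo", 3)` marks only `hé`, because `é` takes two bytes in UTF-8 and the three-byte prefix therefore covers two characters.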
Tamo
0605c0ac68 apply review comments 2022-01-13 18:51:08 +01:00
Tamo
b22c80106f add some settings to the fuzzed milli and use the published version of arbitrary json 2022-01-13 15:35:24 +01:00
Tamo
c94952e25d update the readme + dependencies 2022-01-12 18:30:11 +01:00
Tamo
e1053989c0 add a fuzzer on milli 2022-01-12 17:57:54 +01:00
Tamo
98a365aaae store the geopoint in three dimensions 2021-12-14 12:21:24 +01:00
Tamo
d671d6f0f1 remove an unused file 2021-12-13 19:27:34 +01:00
Clément Renault
25faef67d0 Remove the database setup in the filter_depth test 2021-12-09 11:57:53 +01:00
Clément Renault
65519bc04b Test that empty filters return a None 2021-12-09 11:57:53 +01:00
Clément Renault
ef59762d8e Prefer returning None instead of the Empty Filter state 2021-12-09 11:57:52 +01:00
Clément Renault
ee856a7a46 Limit the max filter depth to 2000 2021-12-07 17:36:45 +01:00
Clément Renault
32bd9f091f Detect the filters that are too deep and return an error 2021-12-07 17:20:11 +01:00
Clément Renault
90f49eab6d Check the filter max depth limit and reject the invalid ones 2021-12-07 16:32:48 +01:00
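The three commits above introduce a depth limit on filters. A rough sketch of such a check follows, assuming a simplified filter tree. The `Filter` enum, the `TooDeep` error and `check_depth` are hypothetical illustrations, not milli's actual types; only the limit of 2000 comes from the commit message.

```rust
/// Hypothetical, simplified filter tree; the real filter type is richer.
#[derive(Debug)]
enum Filter {
    Condition(String),
    And(Box<Filter>, Box<Filter>),
    Or(Box<Filter>, Box<Filter>),
}

/// Depth limit taken from the commit message above.
const MAX_FILTER_DEPTH: usize = 2000;

/// Hypothetical error returned when a filter nests too deeply.
#[derive(Debug, PartialEq)]
struct TooDeep;

/// Returns the depth of the tree, or `Err(TooDeep)` as soon as a node
/// deeper than `limit` is found.
fn check_depth(filter: &Filter, limit: usize) -> Result<usize, TooDeep> {
    // Walk the tree with an explicit stack so the check itself does not
    // recurse on a deeply nested input.
    let mut stack: Vec<(&Filter, usize)> = vec![(filter, 1)];
    let mut max_depth = 0;
    while let Some((node, depth)) = stack.pop() {
        if depth > limit {
            return Err(TooDeep);
        }
        max_depth = max_depth.max(depth);
        match node {
            Filter::Condition(_) => {}
            Filter::And(left, right) | Filter::Or(left, right) => {
                stack.push((&**left, depth + 1));
                stack.push((&**right, depth + 1));
            }
        }
    }
    Ok(max_depth)
}
```

Calling `check_depth(&filter, MAX_FILTER_DEPTH)` before evaluating a filter would reject over-nested input early with an error instead of risking a stack overflow later.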
many
1b3923b5ce Update all packages to 0.21.0 2021-11-29 12:17:59 +01:00
many
8970246bc4 Sort positions before iterating over them during word pair proximity extraction 2021-11-22 18:16:54 +01:00
Marin Postma
6e977dd8e8 change visibility of DocumentDeletionResult 2021-11-22 15:44:44 +01:00
many
35f9499638 Export tokenizer from milli 2021-11-18 16:57:12 +01:00
many
64ef5869d7 Update tokenizer v0.2.6 2021-11-18 16:56:05 +01:00
Marin Postma
6eb47ab792 remove update_id in UpdateBuilder 2021-11-16 13:07:04 +01:00
Marin Postma
09b4281cff improve document addition returned meta
2021-11-10 14:08:36 +01:00
Marin Postma
721fc294be improve document deletion returned meta
returns both the remaining number of documents and the number of deleted
documents.
2021-11-10 14:08:18 +01:00
Tamo
f28600031d Rename the filter_parser crate into filter-parser
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-11-09 16:41:10 +01:00
Irevoire
0ea0146e04 implement deref &str on the tokens 2021-11-09 11:34:10 +01:00
Tamo
7483c7513a fix the filterable fields 2021-11-07 01:52:19 +01:00
Tamo
e5af3ac65c rename the filter_condition.rs to filter.rs 2021-11-06 16:37:55 +01:00
Tamo
6831c23449 merge with main 2021-11-06 16:34:30 +01:00
Tamo
b249989bef fix most of the tests 2021-11-06 01:32:12 +01:00
Tamo
27a6a26b4b makes the parse function part of the filter_parser 2021-11-05 10:46:54 +01:00
Tamo
76d961cc77 implements the last errors 2021-11-04 17:42:06 +01:00
Tamo
8234f9fdf3 recreate most filter error except for the geosearch 2021-11-04 17:24:55 +01:00
Tamo
07a5ffb04c update http-ui 2021-11-04 15:52:22 +01:00
Tamo
a58bc5bebb update milli with the new parser_filter 2021-11-04 15:02:36 +01:00
many
743ed9f57f Bump milli version 2021-11-04 14:04:21 +01:00
many
7b3bac46a0 Change Attribute and Ranking rules errors 2021-11-04 13:19:32 +01:00
many
702589104d Update version for the next release (v0.20.1) 2021-11-03 14:20:01 +01:00
many
0c0038488c Change last error messages 2021-11-03 11:24:06 +01:00
Tamo
76a2adb7c3 re-enable the tests in the parser and start the creation of an error type 2021-11-02 17:35:17 +01:00
bors[bot]
5a6d22d4ec Merge #407
407: Update version for the next release (v0.20.0) r=curquiza a=curquiza

Breaking because of #405 and #406 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-28 13:43:48 +00:00
bors[bot]
08ae47e475 Merge #405
405: Change some error messages r=ManyTheFish a=ManyTheFish



Co-authored-by: many <maxime@meilisearch.com>
2021-10-28 13:35:55 +00:00
Clémentine Urquizar
056ff13c4d Update version for the next release (v0.20.0) 2021-10-28 14:52:57 +02:00
many
9f1e0d2a49 Refine asc/desc error messages 2021-10-28 14:47:17 +02:00
many
ed6db19681 Fix PR comments 2021-10-28 11:18:32 +02:00
marin postma
183d3dada7 return document count from builder 2021-10-28 10:33:04 +02:00
many
2be755ce75 Lower error check, already check in meilisearch 2021-10-27 19:50:41 +02:00
many
3599df77f0 Change some error messages 2021-10-27 19:33:01 +02:00
bors[bot]
d7943fe225 Merge #402
402: Optimize document transform r=MarinPostma a=MarinPostma

This PR optimizes the transformation of document additions into the obkv format. Instead of accepting any serializable object, we treat JSON and CSV specifically:
- For JSON, we build a serde `Visitor` that transforms the JSON straight into obkv without an intermediate representation.
- For CSV, we write the lines directly into the obkv, applying other optimizations as well.

Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 09:55:28 +00:00
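To illustrate the CSV path described in #402, here is a rough sketch of writing a line directly into a key-value record keyed by field id, with no intermediate document representation. Everything here is an assumption for illustration: `Record` is a `BTreeMap` standing in for an obkv buffer, not the `obkv` crate's API, and the comma splitting is naive (no quoting or escaping).

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an obkv record: field id -> raw value bytes,
/// kept sorted by field id.
type Record = BTreeMap<u16, Vec<u8>>;

/// Reads the header once; a field's id is its position in the header.
fn parse_header(header: &str) -> Vec<String> {
    header.split(',').map(|name| name.trim().to_string()).collect()
}

/// Writes one CSV line straight into a record. Naive: splits on commas and
/// does not handle quoted fields.
fn line_to_record(line: &str) -> Record {
    line.split(',')
        .enumerate()
        .map(|(field_id, value)| (field_id as u16, value.trim().as_bytes().to_vec()))
        .collect()
}
```

With the header `id,title`, the line `1, Carol` becomes a record mapping field 0 to the bytes of `1` and field 1 to the bytes of `Carol`.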