Commit Graph

1877 Commits

Author SHA1 Message Date
23ea3ad738 Remove the useless threshold when computing the word prefix pair proximity 2022-01-25 17:04:23 +01:00
e3c34684c6 Fix a bug where we were skipping most of the prefix pairs 2022-01-25 17:04:23 +01:00
fd177b63f8 Merge #423
423: Remove an unused file r=irevoire a=irevoire

This empty file is not included anywhere

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-19 14:18:05 +00:00
0c84a40298 document batch support
reusable transform

rework update api

add indexer config

fix tests

review changes

Co-authored-by: Clément Renault <clement@meilisearch.com>

fmt
2022-01-19 12:40:20 +01:00
01968d7ca7 ensure we get no documents and no error when filtering on an empty db 2022-01-18 11:40:30 +01:00
8f4499090b Merge #433
433: fix(filter): Fix two bugs. r=Kerollmops a=irevoire

- Stop lowercasing the field when looking in the field id map
- When a field id does not exist it means there are currently zero
  documents containing this field, so we return an empty RoaringBitmap
  instead of throwing an internal error

Will fix https://github.com/meilisearch/MeiliSearch/issues/2082 once meilisearch is released

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-17 14:06:53 +00:00
d1ac40ea14 fix(filter): Fix two bugs.
- Stop lowercasing the field when looking in the field id map
- When a field id does not exist it means there are currently zero
  documents containing this field, so we return an empty RoaringBitmap
  instead of throwing an internal error
2022-01-17 13:51:46 +01:00
2d7607734e Run cargo fmt on matching_words.rs 2022-01-17 13:04:33 +05:30
5ab505be33 Fix highlight by replacing num_graphemes_from_bytes
num_graphemes_from_bytes has been renamed in the tokenizer to
num_chars_from_bytes.

Highlight now works correctly!
2022-01-17 13:02:55 +05:30
e752bd06f7 Fix matching_words tests to compile successfully
The tests still fail due to a bug in https://github.com/meilisearch/tokenizer/pull/59
2022-01-17 11:37:45 +05:30
30247d70cd Fix search highlight for non-unicode chars
The `matching_bytes` function takes a `&Token` now and:
- gets the number of bytes to highlight (unchanged).
- uses `Token.num_graphemes_from_bytes` to get the number of grapheme
  clusters to highlight.

In essence, the `matching_bytes` function returns the number of matching
grapheme clusters instead of bytes. Should this function be renamed
then?

Added proper highlighting in the HTTP UI:
- requires dependency on `unicode-segmentation` to extract grapheme
  clusters from tokens
- `<mark>` tag is put around only the matched part
    - before this change, the entire word was highlighted even if only a
      part of it matched
2022-01-17 11:37:44 +05:30
98a365aaae store the geopoint in three dimensions 2021-12-14 12:21:24 +01:00
d671d6f0f1 remove an unused file 2021-12-13 19:27:34 +01:00
25faef67d0 Remove the database setup in the filter_depth test 2021-12-09 11:57:53 +01:00
65519bc04b Test that empty filters return a None 2021-12-09 11:57:53 +01:00
ef59762d8e Prefer returning None instead of the Empty Filter state 2021-12-09 11:57:52 +01:00
ee856a7a46 Limit the max filter depth to 2000 2021-12-07 17:36:45 +01:00
32bd9f091f Detect the filters that are too deep and return an error 2021-12-07 17:20:11 +01:00
90f49eab6d Check the filter max depth limit and reject the invalid ones 2021-12-07 16:32:48 +01:00
8970246bc4 Sort positions before iterating over them during word pair proximity extraction 2021-11-22 18:16:54 +01:00
6e977dd8e8 change visibility of DocumentDeletionResult 2021-11-22 15:44:44 +01:00
35f9499638 Export tokenizer from milli 2021-11-18 16:57:12 +01:00
6eb47ab792 remove update_id in UpdateBuilder 2021-11-16 13:07:04 +01:00
09b4281cff improve document addition returned meta
2021-11-10 14:08:36 +01:00
721fc294be improve document deletion returned meta
returns both the remaining number of documents and the number of deleted
documents.
2021-11-10 14:08:18 +01:00
0ea0146e04 implement deref &str on the tokens 2021-11-09 11:34:10 +01:00
7483c7513a fix the filterable fields 2021-11-07 01:52:19 +01:00
e5af3ac65c rename the filter_condition.rs to filter.rs 2021-11-06 16:37:55 +01:00
6831c23449 merge with main 2021-11-06 16:34:30 +01:00
b249989bef fix most of the tests 2021-11-06 01:32:12 +01:00
27a6a26b4b makes the parse function part of the filter_parser 2021-11-05 10:46:54 +01:00
76d961cc77 implements the last errors 2021-11-04 17:42:06 +01:00
8234f9fdf3 recreate most filter error except for the geosearch 2021-11-04 17:24:55 +01:00
07a5ffb04c update http-ui 2021-11-04 15:52:22 +01:00
a58bc5bebb update milli with the new parser_filter 2021-11-04 15:02:36 +01:00
7b3bac46a0 Change Attribute and Ranking rules errors 2021-11-04 13:19:32 +01:00
0c0038488c Change last error messages 2021-11-03 11:24:06 +01:00
76a2adb7c3 re-enable the tests in the parser and start the creation of an error type 2021-11-02 17:35:17 +01:00
08ae47e475 Merge #405
405: Change some error messages r=ManyTheFish a=ManyTheFish



Co-authored-by: many <maxime@meilisearch.com>
2021-10-28 13:35:55 +00:00
9f1e0d2a49 Refine asc/desc error messages 2021-10-28 14:47:17 +02:00
ed6db19681 Fix PR comments 2021-10-28 11:18:32 +02:00
183d3dada7 return document count from builder 2021-10-28 10:33:04 +02:00
2be755ce75 Lower error check, already checked in meilisearch 2021-10-27 19:50:41 +02:00
3599df77f0 Change some error messages 2021-10-27 19:33:01 +02:00
d7943fe225 Merge #402
402: Optimize document transform r=MarinPostma a=MarinPostma

This PR optimizes the transformation of document additions into the obkv format. Instead of accepting any serializable object, we treat JSON and CSV specifically:
- For JSON, we build a serde `Visitor` that transforms the JSON straight into obkv without an intermediate representation.
- For CSV, we write the lines directly into the obkv, applying other optimizations as well.

Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 09:55:28 +00:00
baddd80069 implement review suggestions 2021-10-25 18:29:12 +02:00
f9445c1d90 return float parsing error context in csv 2021-10-25 17:27:10 +02:00
208903ddde Revert "Replacing pest with nom " 2021-10-25 11:58:00 +02:00
3fcccc31b5 add document builder example 2021-10-25 10:26:43 +02:00
430e9b13d3 add csv builder tests 2021-10-25 10:26:43 +02:00