Commit Graph

99 Commits

Author SHA1 Message Date
07003704a8 Merge branch 'filter/field-exist' 2022-07-21 14:51:41 +02:00
1506683705 Avoid using too much memory when indexing facet-exists-docids 2022-07-19 14:42:35 +02:00
aed8c69bcb Refactor indexation of the "facet-id-exists-docids" database
The idea is to directly create a sorted and merged list of bitmaps
in the form of a BTreeMap<FieldId, RoaringBitmap> instead of creating
a grenad::Reader where the keys are field_id and the values are docids.

Then we send that BTreeMap to the thing that handles TypedChunks, which
inserts its content into the database.
2022-07-19 10:07:33 +02:00
80b962b4f4 Run cargo fmt 2022-07-19 10:07:33 +02:00
30bd4db0fc Simplify indexing task for facet_exists_docids database 2022-07-19 10:07:33 +02:00
392472f4bb Apply suggestions from code review
Co-authored-by: Tamo <tamo@meilisearch.com>
2022-07-19 10:07:33 +02:00
453d593ce8 Add a database containing the docids where each field exists 2022-07-19 10:07:33 +02:00
2eec290424 Check the validity of the latitute and longitude numbers 2022-07-12 15:14:06 +02:00
d1a4da9812 Generate a real UUIDv4 when ids are auto-generated 2022-07-12 15:14:06 +02:00
fcfc4caf8c Move the Object type in the lib.rs file and use it everywhere 2022-07-12 14:55:51 +02:00
0146175fe6 Introduce the validate_documents_batch function 2022-07-12 14:55:51 +02:00
86ac8568e6 Use Charabia in milli 2022-06-02 16:59:11 +02:00
0af399a6d7 fix the mixed dataset geosearch indexing bug 2022-05-16 17:37:45 +02:00
c55368ddd4 apply code suggestion
Co-authored-by: Kerollmops <kero@meilisearch.com>
2022-05-04 14:11:03 +02:00
3cb1f6d0a1 improve geosearch error messages 2022-05-02 19:20:47 +02:00
4f3ce6d9cd nested fields 2022-04-07 16:58:46 +02:00
201fea0fda limit extract_word_docids memory usage 2022-04-05 14:14:15 +02:00
b85cd4983e remove field_id_from_position 2022-04-05 09:50:34 +02:00
b7694c34f5 remove println 2022-04-04 21:00:07 +02:00
6cabd47c32 fix typo in comment 2022-04-04 20:59:20 +02:00
6b2c2509b2 fix bug in exact search 2022-04-04 20:54:03 +02:00
8d46a5b0b5 extract exact word docids 2022-04-04 20:54:02 +02:00
0a77be4ec0 introduce exact_word_docids db 2022-04-04 20:54:02 +02:00
5f9f82757d refactor spawn_extraction_task 2022-04-04 20:54:02 +02:00
ff8d7a810d Change the behavior of the as_cloneable_grenad by taking a ref 2022-02-16 15:40:08 +01:00
f367cc2e75 Finally bump grenad to v0.4.1 2022-02-16 15:28:48 +01:00
8970246bc4 Sort positions before iterating over them during word pair proximity extraction 2021-11-22 18:16:54 +01:00
c5a6075484 Make max_position_per_attributes changable 2021-10-12 10:10:50 +02:00
360c5ff3df Remove limit of 1000 position per attribute
Instead of using an arbitrary limit we encode the absolute position in a u32
using one strong u16 for the field id and a weak u16 for the relative position in the attribute.
2021-10-12 10:10:50 +02:00
3296bb243c Simplify word level position DB into a word position DB 2021-10-05 12:15:02 +02:00
a84f3a8b31 Apply suggestions from code review
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-09 15:09:35 +02:00
bad8ea47d5 edit the two lasts TODO comments 2021-09-08 18:24:09 +02:00
bd4c248292 improve the error handling in general and introduce the concept of reserved keywords 2021-09-08 18:24:09 +02:00
f73273d71c only call the extractor if needed 2021-09-08 17:51:08 +02:00
a21c854790 handle errors 2021-09-08 17:51:07 +02:00
70ab2c37c5 remove multiple bugs 2021-09-08 17:51:07 +02:00
b4b6ba6d82 rename all the ’long’ into ’lng’ like written in the specification 2021-09-08 17:51:07 +02:00
44d6b6ae9e Index the geo points 2021-09-08 17:51:07 +02:00
e54280fbfc Skip empty normalized words 2021-09-08 15:25:23 +02:00
db0c681bae Fix Pr comments 2021-09-02 15:17:52 +02:00
4860fd4529 Ignore empty facet values 2021-09-01 16:48:40 +02:00
b3a22f31f6 Fix memory consuption in word pair proximity extractor 2021-09-01 16:48:40 +02:00
8f702828ca Ignore errors comming from crossbeam channel senders 2021-09-01 16:48:40 +02:00
e09eec37bc Handle distance addition with hard separators 2021-09-01 16:48:40 +02:00
fc7cc770d4 Add logging timers 2021-09-01 16:48:40 +02:00
a2f59a28f7 Remove unwrap sending errors in channel 2021-09-01 16:48:40 +02:00
2d1727697d Take stop word in account 2021-09-01 16:48:40 +02:00
823da19745 Fix test and use progress callback 2021-09-01 16:48:39 +02:00
1d314328f0 Plug new indexer 2021-09-01 16:48:36 +02:00