Commit Graph

9037 Commits

Author SHA1 Message Date
Louis Dureuil
c534a1b687 Stop using delete documents pipeline in batch runner 2023-10-30 11:41:22 +01:00
Louis Dureuil
2263dff02b Stop using removed delete pipelines almost everywhere 2023-10-30 11:41:22 +01:00
Louis Dureuil
d651b3ef01 Remove delete documents files 2023-10-30 11:41:20 +01:00
ManyTheFish
762b0b47e6 Use deladd merging function in chunks mergers 2023-10-30 11:40:20 +01:00
Louis Dureuil
01d5eedf2f Remove some warnings 2023-10-30 11:40:20 +01:00
Louis Dureuil
073f89db79 Fix facet tests 2023-10-30 11:40:20 +01:00
Louis Dureuil
8370fbc92b Fix snaps 2023-10-30 11:40:20 +01:00
Louis Dureuil
85f42fbc03 Handle external to internal id mapping from TypedChunk::Documents 2023-10-30 11:40:20 +01:00
Louis Dureuil
c6b3c18c85 WIP: Comment out document deletion in other pipelines than update
TODO: fix calls to DELETE route
2023-10-30 11:40:20 +01:00
Louis Dureuil
bafeb892a7 Modify Index after changes to ExternalDocumentsIds 2023-10-30 11:40:20 +01:00
Louis Dureuil
8fb221dae3 Refactor ExternalDocumentsIds
- Remove soft deleted
- Add apply method that takes a list of operations to encapsulate modifications to the external -> internal mapping
2023-10-30 11:40:20 +01:00
Louis Dureuil
5be569e3e2 Update obkv 2023-10-30 11:40:20 +01:00
Louis Dureuil
946c762d28 WIP: reset documents in TypedChunk::Documents 2023-10-30 11:40:20 +01:00
Louis Dureuil
cda6ca1ee6 Remove TypedChunk::NewDocumentIds 2023-10-30 11:40:18 +01:00
Louis Dureuil
696fcf4d18 Fix document insertion into LMDB 2023-10-30 11:39:31 +01:00
ManyTheFish
476e4d3dbe Use value buffer instead of the initial value when writting the final result in the sorter 2023-10-30 11:39:31 +01:00
Clément Renault
576fa9c6da Remove useless comment 2023-10-30 11:39:31 +01:00
Kerollmops
77dcbff6b2 Remove and Insert the DelAdd geo points 2023-10-30 11:39:31 +01:00
Kerollmops
544440c363 Ignore geo fields when the Del and Add content is the same 2023-10-30 11:39:31 +01:00
Clément Renault
a3dae4db9b Extract the geo fields DelAdd and generate a new DelAdd obkv with it 2023-10-30 11:39:31 +01:00
ManyTheFish
ba90a5ec0e update extract fid word count docids 2023-10-30 11:39:31 +01:00
Louis Dureuil
b26dc9aabe Explanatory code comment 2023-10-30 11:39:31 +01:00
Louis Dureuil
66abac9364 Use specialized KvReaderDelAdd type
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-10-30 11:39:31 +01:00
Louis Dureuil
59f88c14b3 Simplify facet update after removing Index::faceted_documents_ids 2023-10-30 11:39:29 +01:00
Louis Dureuil
14832cb324 Remove Index::faceted_documents_ids 2023-10-30 11:37:32 +01:00
Louis Dureuil
04ec293024 Facet Incremental update 2023-10-30 11:37:30 +01:00
Louis Dureuil
f67ff3a738 Facets Bulk update 2023-10-30 11:36:40 +01:00
Clément Renault
560e8f5613 Introduce the CboRoaringBitmapCodec merge_deladd_into and use it 2023-10-30 11:34:55 +01:00
Clément Renault
2d3f15f82c Introduce a function to only serialize the Add side of a DelAdd obkv 2023-10-30 11:34:55 +01:00
Clément Renault
40186bf403 Rename FieldIdWordCountDocids correctly 2023-10-30 11:34:50 +01:00
ManyTheFish
87e3d27878 update extract word pair proximity to support deladd obkvs 2023-10-30 11:34:02 +01:00
ManyTheFish
6bcf8b4f8c update extract word position docids 2023-10-30 11:34:02 +01:00
ManyTheFish
46aa75abdb update extract word docids 2023-10-30 11:34:02 +01:00
ManyTheFish
2597bbd107 Make script language docids map taking a tuple of roaring bitmaps expressing the deletions and the additions 2023-10-30 11:34:00 +01:00
Clément Renault
e2bc054604 Update extract_facet_string_docids to support deladd obkvs 2023-10-30 11:32:36 +01:00
Clément Renault
fcd3a1434d Update extract_facet_number_docids to support deladd obkvs 2023-10-30 11:31:04 +01:00
Clément Renault
a82dee21e0 Rename docid_fid into fid_docid 2023-10-30 11:31:02 +01:00
Clément Renault
bc45c1206d Implement all the facet extraction paths and simplify them 2023-10-30 11:29:08 +01:00
Clément Renault
6ae4100f07 Generate the DelAdd for is_null, is_empty, and exists 2023-10-30 11:29:08 +01:00
Clément Renault
0c47defeee Work on fid docid facet values rewrite 2023-10-30 11:29:06 +01:00
ManyTheFish
313b16bec2 Support diff indexing on extract_docid_word_positions 2023-10-30 11:24:19 +01:00
ManyTheFish
1dd97578a8 Make the transform struct return diff-based documents obkvs 2023-10-30 11:22:07 +01:00
ManyTheFish
f5ef69293b deactivate prefix dbs 2023-10-30 11:22:07 +01:00
ManyTheFish
1c5705c164 clean PR warnings 2023-10-30 11:22:05 +01:00
ManyTheFish
66c2c82a18 Split wpp in several sorters 2023-10-30 11:15:02 +01:00
ManyTheFish
28a8d0ccda Fix word pair proximity 2023-10-30 11:15:02 +01:00
ManyTheFish
96be85396d Use a vecDeque in wpp database 2023-10-30 11:15:02 +01:00
ManyTheFish
df9e5c8651 Generalize usage of CboRoaringBitmap codec to ease the use 2023-10-30 11:15:02 +01:00
ManyTheFish
b541d48847 Add buffer to the obkv writter 2023-10-30 11:15:02 +01:00
ManyTheFish
8ccf32d1a0 Compute word_fid_docids before word_docids and exact_word_docids 2023-10-30 11:15:02 +01:00