Commit Graph

24 Commits

Author SHA1 Message Date
610e62c0e1 plug in the document deletion in cellulite 2025-09-17 10:18:13 +02:00
aa85f0dde8 update to the latest version of cellulite and steppe 2025-09-17 10:18:13 +02:00
d7be43494c fmt 2025-09-17 10:18:13 +02:00
21af00a27b finish plugin cellulite to the new indexer 2025-09-17 10:18:13 +02:00
1e8f4fdf8a Cellulite is almost in the new indexer. We must add the documentID to the geojson pipeline 2025-09-17 10:18:13 +02:00
31cb960992 Make clippy happy 2025-08-26 10:19:55 +02:00
cb16baab18 Add more progress levels to measure merging 2025-03-17 10:13:29 +01:00
4aa7c8f7b1 Remove unused FacetFieldIdOperation 2025-01-09 10:36:37 +01:00
c14967eeac Use new incremental facet indexing and enable sanity checks in debug 2025-01-09 10:36:35 +01:00
71f59749dc Reduce union impact in merging 2024-12-09 15:44:06 +01:00
cac355bfa7 Merge #5124
5124: Optimize Prefixes and Merges r=ManyTheFish a=Kerollmops

In this PR, we optimize the LMDB reads so that entries are read in lexicographic order and make better use of the OS memory-mapping cache:

 - Optimize the prefix generation for word position docids (`@manythefish`)
 - Optimize the parallel merging of the caches to sort entries before merging the caches (`@kerollmops`)
 
## Benchmarks on 1cpu 2gb gpo3 (5k IOps)
 
Before, on the tag `meilisearch-v1.12.0-rc.3`:

```
word_position_docids:merge_and_send_docids: 988s
compute_word_fst: 23.3s
word_pair_proximity_docids:merge_and_send_docids: 428s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 76.3s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 429s
```

After, on this branch, with the whole `HashMap`s sorted into a `Vec`:

```
word_position_docids:merge_and_send_docids: 202s
compute_word_fst: 20.4s
word_pair_proximity_docids:merge_and_send_docids: 427s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 65.5s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 62.5s
```
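
The idea of sorting the caches before merging can be illustrated with a minimal sketch. This is not the code from this PR: `merge_caches_sorted` is a hypothetical helper, and plain `Vec<u32>` docid lists stand in for whatever bitmap type the indexer really uses. It only shows why draining the `HashMap`s into a sorted `Vec` yields entries in byte-wise lexicographic key order, which is the order LMDB uses by default, so the later reads and writes walk the memory-mapped pages sequentially.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of the PR): drain several per-thread caches
/// into one `Vec` sorted by key, merging the docids of duplicate keys.
fn merge_caches_sorted(
    caches: Vec<HashMap<Vec<u8>, Vec<u32>>>,
) -> Vec<(Vec<u8>, Vec<u32>)> {
    // Flatten every cache into a single list of (key, docids) entries.
    let mut entries: Vec<(Vec<u8>, Vec<u32>)> =
        caches.into_iter().flatten().collect();

    // Sort by key bytes so entries come out in lexicographic order.
    entries.sort_unstable_by(|a, b| a.0.cmp(&b.0));

    // Equal keys are now adjacent, so merging is a single linear pass.
    let mut merged: Vec<(Vec<u8>, Vec<u32>)> = Vec::with_capacity(entries.len());
    for (key, mut docids) in entries {
        if let Some((last_key, last_docids)) = merged.last_mut() {
            if *last_key == key {
                last_docids.append(&mut docids);
                continue;
            }
        }
        merged.push((key, docids));
    }

    // Normalize each docid list: sorted and without duplicates.
    for (_, docids) in merged.iter_mut() {
        docids.sort_unstable();
        docids.dedup();
    }
    merged
}
```

Feeding the returned `Vec` to the database in order means each key lands next to the previous one, instead of jumping across the B-tree as an unordered `HashMap` iteration would.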

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2024-12-05 09:35:52 +00:00
52843123d4 Clean up and remove the non-sorted merge_caches function 2024-12-05 10:03:05 +01:00
5f896b1050 Fix geo when spilling 2024-12-04 17:51:12 +01:00
be411435f5 Use the merge_caches_alt function in the docids merging 2024-12-04 16:37:29 +01:00
76d0623b11 Reduce the number of unwraps 2024-12-02 10:05:06 +01:00
a514ce472a Make clippy happy 2024-11-27 14:59:04 +01:00
8442db8101 Implement mostly all senders 2024-11-27 14:16:35 +01:00
e0864f1b21 Separate side effect and debug asserts 2024-11-20 16:25:17 +01:00
7cb8732b45 Introduce a new bincode internal error 2024-11-20 13:23:11 +01:00
9e8367f1e6 Move the rayon thread pool outside the extract method 2024-11-14 10:40:32 +01:00
8e5b1a3ec1 Compute the field distribution and convert _geo into an f64s 2024-11-13 17:44:05 +01:00
b17896d899 Finialize the GeoExtractor 2024-11-13 17:43:02 +01:00
1477b81d38 Support cancelation in merge and send 2024-11-07 11:23:49 +01:00
10feeb88f2 Merge branch 'main' into indexer-edition-2024 2024-11-06 15:19:18 +01:00