Commit Graph

350 Commits

Author SHA1 Message Date
Clément Renault
00a03742ff Prefer using extend over unions when merging bitmaps (fewer allocations) 2025-01-09 10:42:38 +01:00
Louis Dureuil
d11e359244 When spilling on the next fid, no longer ignore children 2025-01-09 10:36:38 +01:00
Louis Dureuil
09d45439c7 Check valid_facet_value as part of a filter of the iterator 2025-01-09 10:36:38 +01:00
Louis Dureuil
5d92da0c73 No longer ignore the first child without parent 2025-01-09 10:36:38 +01:00
Louis Dureuil
677bb39e73 Modernize valid_lmdb_key 2025-01-09 10:36:38 +01:00
Louis Dureuil
85ea77de0b Switch to an iterative algorithm for find_changed_parents 2025-01-09 10:36:38 +01:00
Louis Dureuil
03317be0bd Update after review 2025-01-09 10:36:38 +01:00
Louis Dureuil
4aa7c8f7b1 Remove unused FacetFieldIdOperation 2025-01-09 10:36:37 +01:00
Louis Dureuil
ce57a342a3 center groups 2025-01-09 10:36:37 +01:00
Louis Dureuil
1cc6cd78e0 Fix uselessly deep stack trace 2025-01-09 10:36:37 +01:00
Louis Dureuil
c204afdc79 Update snapshot 2025-01-09 10:36:37 +01:00
Louis Dureuil
c14967eeac Use new incremental facet indexing and enable sanity checks in debug 2025-01-09 10:36:35 +01:00
Louis Dureuil
f38db86120 Add new incremental facet indexing 2025-01-09 10:24:36 +01:00
Louis Dureuil
50b155fa2d add valid_facet_value utility function 2025-01-09 10:24:36 +01:00
Louis Dureuil
a533c8e041 Add sanity checks for facet values 2025-01-09 10:24:36 +01:00
Tamo
908adee6fc Fix the addition of empty payload 2025-01-09 10:24:36 +01:00
Clément Renault
71e5605daa Make clippy happy 2025-01-08 18:24:39 +01:00
Clément Renault
68333424c6 Remove a useless script test 2025-01-08 15:59:43 +01:00
Clément Renault
5e8144b0e1 Remove fuzzing feature 2025-01-08 15:59:03 +01:00
Louis Dureuil
4275833bab Rename compute.rs to post_process.rs 2025-01-07 15:31:20 +01:00
Louis Dureuil
de7f8c4406 refactor indexer mod 2025-01-07 15:29:02 +01:00
Gnosnay
525e67ba93 Fix the format and linter error 2024-12-28 20:35:55 +08:00
Gnosnay
44eb153619 Replace hardcoded string with constants 2024-12-28 20:35:55 +08:00
meili-bors[bot]
ba11121cfc Merge #5159
5159: Fix the New Indexer Spilling r=irevoire a=Kerollmops

Fix two bugs in the merging of the spilled caches. Thanks to `@ManyTheFish` and `@irevoire` 👏

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-12-12 17:16:53 +00:00
ManyTheFish
acdd5aa6ea Use the thread source id instead of the destination id when filtering on the cache to merge 2024-12-12 18:12:00 +01:00
Kerollmops
2f3cc8cdd2 Fix the merge_caches_sorted function 2024-12-12 16:15:37 +01:00
ManyTheFish
961de4d34e Fix facet fst 2024-12-12 15:12:28 +01:00
meili-bors[bot]
1fc90fbacb Merge #5147
5147: Batch progress r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5068

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-12-12 09:15:54 +00:00
Tamo
08fd026ebd fix warning 2024-12-11 18:18:13 +01:00
Tamo
fa885e75b4 rename the send_progress in progress 2024-12-11 18:13:12 +01:00
Tamo
f1beb60204 make the progress use payload instead of documents 2024-12-11 18:07:45 +01:00
Tamo
867e6a8f1d rename the send_progress field to progress since it's not sending anything 2024-12-11 16:25:01 +01:00
Tamo
6f4823fc97 make the number of documents in the document tasks more incremental 2024-12-11 16:25:01 +01:00
Tamo
df9b68f8ed initial implementation of the progress 2024-12-11 16:25:01 +01:00
Louis Dureuil
bfca54cc2c Return docid in case of errors while rendering the document template 2024-12-11 15:26:18 +01:00
Kerollmops
a751972c57 Prefer using a stable hash builder over a random one 2024-12-10 14:25:53 +01:00
Kerollmops
89637bcaaf Use bumparaw-collections in Meilisearch/milli 2024-12-10 11:52:20 +01:00
ManyTheFish
07f42e8057 Do not index a field count when no word is counted 2024-12-09 15:45:12 +01:00
ManyTheFish
71f59749dc Reduce union impact in merging 2024-12-09 15:44:06 +01:00
Kerollmops
f5dd8dfc3e Rollback max memory usage changes 2024-12-09 10:26:30 +01:00
Louis Dureuil
bd5110a2fe Fix clippy warnings 2024-12-05 16:13:07 +01:00
Louis Dureuil
fa8b9acdf6 Ignore documents that didn't change in facets 2024-12-05 16:12:52 +01:00
Louis Dureuil
2b74d1824b Ignore documents that didn't change any field in word pair proximity 2024-12-05 15:56:22 +01:00
Louis Dureuil
c77b00d3ac Don't extract word docids when no searchable changed 2024-12-05 15:51:58 +01:00
Louis Dureuil
c77073efcc Update::has_changed_for_fields 2024-12-05 15:50:12 +01:00
meili-bors[bot]
cac355bfa7 Merge #5124
5124: Optimize Prefixes and Merges r=ManyTheFish a=Kerollmops

In this PR, we plan to optimize LMDB reads by reading the entries in lexicographic order, making better use of the OS cache behind the memory mapping:

 - Optimize the prefix generation for word position docids (`@manythefish`)
 - Optimize the parallel merging of the caches to sort entries before merging the caches (`@kerollmops`)
 
## Benchmarks on 1cpu 2gb gpo3 (5k IOps)
 
Before on the tag meilisearch-v1.12.0-rc.3.

```
word_position_docids:merge_and_send_docids: 988s
compute_word_fst: 23.3s
word_pair_proximity_docids:merge_and_send_docids: 428s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 76.3s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 429s
```

After sorting the whole `HashMap`s in a `Vec` on this branch.

```
word_position_docids:merge_and_send_docids: 202s
compute_word_fst: 20.4s
word_pair_proximity_docids:merge_and_send_docids: 427s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 65.5s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 62.5s
```
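
The idea of sorting whole `HashMap`s into a `Vec` before merging can be sketched as follows. This is a minimal illustration using only the standard library, not the actual Meilisearch code: the `Cache` type, the `merge_caches_in_key_order` function, and the use of `BTreeSet<u32>` in place of a bitmap of document ids are all assumptions made for the example.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for a per-thread cache: key bytes -> set of document ids.
/// (A `BTreeSet<u32>` is used here instead of a real bitmap type.)
type Cache = HashMap<Vec<u8>, BTreeSet<u32>>;

/// Drains every cache into a single `Vec`, sorts it by key, and merges
/// neighbouring entries that share the same key.
/// The returned entries are in lexicographic key order, so a caller could
/// hand them to a sorted key-value store sequentially.
fn merge_caches_in_key_order(caches: Vec<Cache>) -> Vec<(Vec<u8>, BTreeSet<u32>)> {
    // Flatten all the hash maps into one vector of (key, docids) pairs.
    let mut entries: Vec<(Vec<u8>, BTreeSet<u32>)> =
        caches.into_iter().flat_map(|cache| cache.into_iter()).collect();

    // `Vec<u8>` compares byte by byte, which gives lexicographic order.
    entries.sort_unstable_by(|a, b| a.0.cmp(&b.0));

    // After sorting, equal keys are adjacent: fold them into one entry.
    let mut merged: Vec<(Vec<u8>, BTreeSet<u32>)> = Vec::with_capacity(entries.len());
    for (key, docids) in entries {
        if let Some((last_key, last_docids)) = merged.last_mut() {
            if *last_key == key {
                last_docids.extend(docids);
                continue;
            }
        }
        merged.push((key, docids));
    }
    merged
}
```

Iterating a `HashMap` directly yields keys in an arbitrary order, whereas the sorted `Vec` yields them in the order in which a B-tree store such as LMDB lays them out, which is the access pattern the description above aims for.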

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2024-12-05 09:35:52 +00:00
Kerollmops
52843123d4 Clean up and remove the non-sorted merge_caches function 2024-12-05 10:03:05 +01:00
meili-bors[bot]
6298db5bea Merge #5113
5113: Fix the Minimum BBQueue channel threshold r=Kerollmops a=Kerollmops



Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-12-05 09:01:02 +00:00
Louis Dureuil
3a11e39c01 Force max_memory to a min of 100MiB 2024-12-04 17:53:30 +01:00
Louis Dureuil
5f896b1050 Fix geo when spilling 2024-12-04 17:51:12 +01:00