Commit Graph

10302 Commits

Author SHA1 Message Date
6d16230f17 Refactor 2024-10-01 17:19:15 +03:00
b7a5ba100e Move the ParallelIteratorExt into the parallel_iterator_ext module 2024-10-01 11:11:52 +02:00
dead7a56a3 Keep the caches in the AppendOnlyVec 2024-10-01 11:11:39 +02:00
0a8cb471df Introduce the AppendOnlyVec struct for the parallel computing 2024-10-01 11:11:25 +02:00
00e045b249 Rename and use the try_arc_for_each_try_init method 2024-10-01 11:11:25 +02:00
d83c9a4074 Introduce the try_for_each_try_init method to be used with Arced Errors 2024-10-01 11:11:25 +02:00
f3356ddaa4 Fix the errors when using the try_map_try_init method 2024-10-01 11:11:10 +02:00
31de5c747e WIP using try_map_try_init 2024-10-01 11:10:53 +02:00
3843240940 Prefer using Arcs instead of Options 2024-10-01 11:10:53 +02:00
8cb5e7437d try using try_map_try_init 2024-10-01 11:10:53 +02:00
5b776556fe Add ParallelIteratorExt 2024-10-01 11:10:53 +02:00
bb7a503e5d Compute prefix databases
We are now computing the prefix FST and a prefix delta in the Merger thread.
After all the databases are written, the main thread recomputes the prefix databases based on the prefix delta, without needing any grenad temporary file.
2024-10-01 09:57:06 +02:00
eabc14c268 Refactor, handle more cases for phrases 2024-09-30 21:24:41 +03:00
e78da35287 Merge #4930
4930: Return `UserError::InvalidDocumentId` for primary keys with a length greater than 512 bytes r=curquiza a=flevi29

# Pull Request

## Related issue
Fixes #4843

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: F. Levi <55688616+flevi29@users.noreply.github.com>
2024-09-30 15:55:05 +00:00
64589278ac Appease *some* of clippy warnings 2024-09-30 16:08:29 +02:00
8df6daf308 Remove fid_wordcount_docids.rs 2024-09-30 11:52:31 +02:00
5b552caf42 Fix position in insertions 2024-09-30 11:46:32 +02:00
2b51a63418 Remove dead code 2024-09-30 11:42:36 +02:00
3d8024fb2b write the weighted fields ids map 2024-09-30 11:35:03 +02:00
4b0da0ff24 Fix inversion of field_id and position 2024-09-30 11:34:50 +02:00
079f2b5de0 Format error messages consistently 2024-09-30 11:34:31 +02:00
84b4219a4f test: improve delete_index.rs 2024-09-29 10:16:31 +02:00
5539a1904a test: improve performance of create_index.rs 2024-09-28 11:05:52 +02:00
00ccf53ffa Merge branch 'main' into change-matches-position-phrase-search 2024-09-27 15:52:05 +03:00
d20a39b959 Refactor find_best_match_interval 2024-09-27 15:44:30 +03:00
71b364286b Merge #4957
4957: Update charabia feature flags r=dureuill a=ManyTheFish

# Pull Request

Add charabia's `turkish` feature flag into Meilisearch default tokenization flag



[All tests pipeline](https://github.com/meilisearch/meilisearch/actions/runs/11030036031)

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-26 20:19:21 +00:00
86183e0807 Merge #4960
4960: Update rhai r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4956

A fix has been implemented in https://github.com/rhaiscript/rhai/issues/916

## What does this PR do?
- Use the latest version of rhai containing the fix

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-26 15:03:01 +00:00
78a4b7949d update rhai to a version that shouldn’t panic 2024-09-26 15:04:03 +02:00
960060ebdf Fix fst builder when there is no previous FST 2024-09-25 16:53:00 +02:00
3d244451df Reduce the lru key size from 8 to 12 bytes 2024-09-25 16:14:13 +02:00
5f53935c8a Fix a bug in the Lru 2024-09-25 16:09:34 +02:00
29a7623c3f Fix some logs 2024-09-25 15:57:50 +02:00
e97041f7d0 Replace the Lru free list by a simple increment 2024-09-25 15:55:52 +02:00
52d7f3ed1c Reduce the lru key size from 20 to 8 bytes 2024-09-25 15:37:13 +02:00
86d5e6d9ff Use the new Lru 2024-09-25 14:54:56 +02:00
759b9b1546 Introduce a new custom Lru 2024-09-25 14:49:12 +02:00
3f7a500f3b Build prefix fst 2024-09-25 14:36:06 +02:00
dc2cb58cf1 use charabia default for all-tokenization 2024-09-25 11:12:30 +02:00
e9580fe619 Add turkish normalization 2024-09-25 11:03:17 +02:00
8205254f4c Merge #4955
4955: Upgrade "batch failed" log to error level r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4916 


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-25 08:18:44 +00:00
974272f2e9 Merge branch 'main' into indexer-edition-2024 2024-09-25 07:41:16 +02:00
7ad037841f Move the tracing info to eprintln 2024-09-24 18:21:58 +02:00
e0c7067355 Expose an IndexedParallelIterator to the index function 2024-09-24 17:24:59 +02:00
efdc5739d7 Merge #4953
4953: Move the multi arroy index logic to the arroy wrapper r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4948

## What does this PR do?
- Make the `ArroyWrapper` we introduced in the last PR handle all the embeddings for a specific docid itself.


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-24 15:02:24 +00:00
b31e9bea26 while retrieving the readers on an arroywrapper, stops at the first empty reader 2024-09-24 16:33:17 +02:00
6e87332410 Change the way the FST is built 2024-09-24 16:28:31 +02:00
2d1caf27df Use eprintln to log 2024-09-24 15:59:50 +02:00
92678383d6 Update charabia 2024-09-24 15:37:56 +02:00
7f148c127c Measure the SmallVec efficiency 2024-09-24 15:32:15 +02:00
7f048b9732 early exit in the clear and contains 2024-09-24 15:02:38 +02:00