Commit Graph

760 Commits

Author SHA1 Message Date
meili-bors[bot]
885710a07b Merge #5341
5341: Embeddings stats r=ManyTheFish a=ManyTheFish

# Pull Request

## Related issue
Fixes #5321

## What does this PR do?
- Add embedding stats
- force dumpless upgrade to recompute stats
- add tests


Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-02-12 15:46:37 +00:00
ManyTheFish
c55fdad2c3 Fix dumpless upgrade target version 2025-02-12 16:35:05 +01:00
ManyTheFish
1caad4c4b0 Add multiple embeddings for the same embedder in tests 2025-02-12 16:13:34 +01:00
ManyTheFish
8419ed52a1 fix clippy 2025-02-12 14:38:51 +01:00
ManyTheFish
a65c52cc97 Convert dump test into snapshots 2025-02-12 14:14:10 +01:00
ManyTheFish
49e9655c24 Update snapshots 2025-02-12 14:05:32 +01:00
meili-bors[bot]
fa763ca5dc Merge #5339
5339: Add back timeout from v1.11.3 r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #5337

## What does this PR do?
- Fix regression compared with v1.11 by reintroducing the 30s timeout on all REST API calls.

Thanks to `@migueltarga` for reporting the issue


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-12 12:50:27 +00:00
ManyTheFish
c7aeb554b2 Add tests 2025-02-12 13:37:41 +01:00
Louis Dureuil
8e0d8d31f9 Add back timeout from v1.11.3 2025-02-12 11:53:00 +01:00
meili-bors[bot]
81a38099ec Merge #5336
5336: Meilitool Hair Dryer r=dureuill a=Kerollmops

This pull request introduces a new subcommand to hair dry a specific part of specific indexes. It is useful when [the memory-mapped pages are not hot in the cache](https://arc.net/l/quote/ixhcdwcq) and must be. Hair drying those interesting pages makes the search requests using the vector store much faster.

The previous technique used the "cat method," which consists of reading the whole LMDB data file and piping it into the null file descriptor. By doing that, the whole LMDB data file becomes hot in the cache. However, when the database is large, at least 30% of it consists of free, unused pages, and many other pages don't need to be hot, e.g., raw JSON documents or uninteresting parts of the inverted index.
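
As a rough illustration of the "cat method" described above, here is a minimal sketch using only the Rust standard library. It reads a file from start to end and discards the bytes, so every page passes through the OS page cache. The function name `warm_whole_file` and the 1 MiB buffer size are assumptions made for this example; this is not the meilitool implementation.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Reads the whole file at `path` and throws the bytes away, returning the
/// number of bytes read. `warm_whole_file` is a hypothetical name chosen for
/// this sketch; it is not part of meilitool.
fn warm_whole_file(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    // Assumed buffer size: 1 MiB per read call.
    let mut buffer = vec![0u8; 1 << 20];
    let mut total = 0u64;
    loop {
        match file.read(&mut buffer) {
            // A zero-length read means end of file.
            Ok(0) => break,
            // The bytes are discarded on purpose: only the read matters.
            Ok(n) => total += n as u64,
            // Retry reads interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(total)
}
```

Because this touches every page of the file, it also warms the free pages and the parts that search does not need, which is the cost the paragraph above points out.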

This new subcommand reads all the Arroy pages of a given index to make them hot, and only those. More coming...

The current algorithm is single-threaded and takes a lot of time. I am in the process of multithreading it. This is the time it takes to hair dry a 305GiB database with a single thread.

```
real    21m51.054s
user    0m3.155s
sys     0m19.393s
```

## To Do
- [ ] (optional) Do the reads in parallel.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-12 10:45:16 +00:00
ManyTheFish
bd27fe7d02 force dumpless upgrade to recompute stats 2025-02-12 11:45:02 +01:00
ManyTheFish
41203f0931 Add embedders stats 2025-02-12 11:37:47 +01:00
Kerollmops
803a699b15 Remove unsafes 2025-02-12 10:46:45 +01:00
Kerollmops
246ad3b06e Display a progress percentage 2025-02-12 09:56:05 +01:00
meili-bors[bot]
225af069a9 Merge #5149
5149: Ensure the settings routes are now configured when a new field is added to the Settings struct r=curquiza a=MichaScant

# Pull Request
## Related issue
Fixes #5126 

## What does this PR do?
Ensures the settings routes are properly configured before a new field is added to the settings structure. The changes follow what was proposed in the original issue: any new field of the settings struct is added to the [make_settings_route! macro list](6298db5bea/crates/meilisearch/src/routes/indexes/settings.rs (L182-L403)).

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: michascant <89426143+MichaScant@users.noreply.github.com>
2025-02-11 20:10:29 +00:00
meili-bors[bot]
70305b9f71 Merge #5332
5332: Fix geo update r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #5331

## What does this PR do?
- use the merged version that contains all fields instead of the updated version that contains only updated fields
- add test that detects the problem
- As it is the second time that `changes.updated` is causing a bug, I'm changing its name to `only_changed_fields`, hopefully better communicating that old fields are not there
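
To illustrate why the merged version matters, here is a minimal sketch with a simplified document model. The `Document` alias (string fields only) and the helpers `merge_document` and `geo_of` are hypothetical names introduced for this example; they are not the milli API.

```rust
use std::collections::BTreeMap;

/// Simplified document model for this sketch: field name -> raw value.
type Document = BTreeMap<String, String>;

/// Builds the merged view: every field of the stored document, overridden by
/// the fields that the update carries. Hypothetical helper, not the milli API.
fn merge_document(stored: &Document, only_changed_fields: &Document) -> Document {
    let mut merged = stored.clone();
    for (field, value) in only_changed_fields {
        merged.insert(field.clone(), value.clone());
    }
    merged
}

/// Returns the `_geo` field of a document, if present.
fn geo_of(document: &Document) -> Option<&str> {
    document.get("_geo").map(String::as_str)
}
```

If an update only touches `title`, `geo_of` on the changed fields returns `None` even though the stored document has a `_geo` value, while `geo_of` on the merged document still finds it.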


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-11 18:51:33 +00:00
Kerollmops
5dab435d13 Add more logs about read txns 2025-02-11 18:14:48 +01:00
Kerollmops
c83c1a3c51 Introduce the Hair Dryer meilitool subcommand 2025-02-11 18:01:53 +01:00
Louis Dureuil
b83275c9c5 Change the updated* functions to only_new functions, hopefully better communicating what they do 2025-02-11 15:27:10 +01:00
Louis Dureuil
d7f35ee3ba Use merged document instead of updated 2025-02-11 15:27:10 +01:00
Louis Dureuil
1dce341bfb Add test 2025-02-11 15:27:10 +01:00
Tamo
43c8d54501 fix test after rebase 2025-02-11 11:19:13 +01:00
Tamo
84e2a1f836 rename the atomic to something more meaningful 2025-02-11 11:14:49 +01:00
Tamo
00eb47d42e use serde_json::to_writer instead of serializing + writing 2025-02-11 11:14:49 +01:00
Tamo
9293e7f2c1 fix tests after rebase 2025-02-11 11:14:49 +01:00
Tamo
80198aa855 add a dump test with batches and enqueued tasks 2025-02-11 11:14:49 +01:00
Tamo
fa00b42c93 fix the missing batch in the dumps in meilisearch and meilitools 2025-02-11 11:14:49 +01:00
Clément Renault
acb06cb3e6 Improve the error message when missing documents
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-10 16:53:50 +01:00
Kerollmops
7d0d8f4445 Make the feature experimental 2025-02-10 16:11:32 +01:00
Kerollmops
491d115c3c Change the route to get the task documents 2025-02-10 14:55:07 +01:00
Kerollmops
55fa2dda00 Update the Open API example 2025-02-10 14:52:48 +01:00
Kerollmops
c71eea8023 Improve error message when update file has been processed 2025-02-10 14:33:01 +01:00
Kerollmops
df40533741 Expose a route to get the update file content of a task 2025-02-10 14:05:32 +01:00
meili-bors[bot]
0c3e7fe963 Merge #5316
5316: Fix the dumpless upgrade corruption r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5280

## What does this PR do?
- Add a test that ensures we write the version in the index-scheduler even if we have a bug while writing the VERSION file
- Do what was described in the issue
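
The commits in this PR mention updating the version file atomically and flushing and syncing it. One common way to do that, sketched here under assumptions, is to write to a temporary file in the same directory, sync it to disk, and rename it over the target. The function name `write_file_atomically` and the `.tmp` suffix are hypothetical; this is not necessarily how the PR implements it.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Replaces `target` with `contents` through a temporary sibling file.
/// Hypothetical helper for illustration only.
fn write_file_atomically(target: &Path, contents: &[u8]) -> io::Result<()> {
    // The temporary file lives next to the target so that the rename stays
    // on the same filesystem. Assumed naming: "<target>.tmp".
    let temporary = target.with_extension("tmp");
    let mut file = File::create(&temporary)?;
    file.write_all(contents)?;
    // Flush userspace buffers, then ask the OS to persist data and metadata.
    file.flush()?;
    file.sync_all()?;
    drop(file);
    // On POSIX systems, rename replaces the destination in a single step, so
    // readers see either the old file or the new one.
    fs::rename(&temporary, target)
}
```

A crash before the rename leaves the previous file untouched. A stricter version would also sync the parent directory after the rename.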


Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-10 09:53:57 +00:00
Tamo
45f843ccb9 fmt 2025-02-10 10:46:42 +01:00
Tamo
35b6bca598 remove the failing test 2025-02-10 10:20:14 +01:00
Tamo
7f82d33597 update the version file atomically 2025-02-06 18:23:28 +01:00
Tamo
8c5856007c flush+sync the version file just in case 2025-02-06 18:04:43 +01:00
Tamo
ae1d7f4d9b Improve the test and disable it on windows and linux since they don't work on the CI 2025-02-06 17:54:12 +01:00
meili-bors[bot]
792be63567 Merge #5323
5323: exclude network time from processingMs r=Kerollmops a=dureuill



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-06 16:35:44 +00:00
Kerollmops
ca1ad51564 Put the Ollama tests under a feature 2025-02-06 17:27:47 +01:00
Louis Dureuil
70aac71c63 exclude network time from processingMs 2025-02-06 17:18:36 +01:00
Louis Dureuil
56438bdea4 Introduce an Ollama integration test 2025-02-06 17:12:17 +01:00
meili-bors[bot]
a562d6abc1 Merge #5322
5322: Make sure arroy is using the rayon thread-pool r=dureuill a=Kerollmops

This PR fixes #5249 by ensuring arroy uses the rayon thread pool.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-06 15:28:47 +00:00
michascant
33b67b82e1 fixed rustfmt errors 2025-02-06 09:57:39 -05:00
Kerollmops
5f2a1a4fd1 Skip the documents before fetching them 2025-02-06 15:40:22 +01:00
Kerollmops
2b0e17ede0 Make sure arroy is using the rayon thread-pool 2025-02-06 15:28:10 +01:00
Kerollmops
37092adc71 Show a bit of progress 2025-02-06 10:37:05 +01:00
Kerollmops
86fcad788e Introduce a parameter to skip the first documents 2025-02-06 10:32:50 +01:00
Kerollmops
2ea5c57871 Create a new export documents meilitool subcommand based on v1.12 2025-02-06 10:32:39 +01:00