Commit Graph

10455 Commits

Author SHA1 Message Date
4acfb9a1ec Refactor 2024-11-26 16:56:59 +02:00
3ed3188f54 Add Sequence impl, other changes/adjustments 2024-11-26 16:56:59 +02:00
69a51d4a70 Add Serialize impl 2024-11-26 16:56:59 +02:00
c52f9f8c16 Add manual serde Deserialize impl 2024-11-26 16:56:59 +02:00
0848635efc Change Actions enum to bitflags 2024-11-26 16:56:53 +02:00
dd76eaaaec Merge #5076
Some checks failed
Test suite / Tests on ubuntu-20.04 (push) Failing after 11s
Test suite / Tests almost all features (push) Has been skipped
Test suite / Test disabled tokenization (push) Has been skipped
Test suite / Tests on ${{ matrix.os }} (windows-2022) (push) Failing after 24s
Test suite / Run tests in debug (push) Failing after 10s
Test suite / Run Clippy (push) Successful in 6m35s
Test suite / Run Rustfmt (push) Successful in 1m52s
Run the indexing fuzzer / Setup the action (push) Successful in 1h5m8s
Test suite / Tests on ${{ matrix.os }} (macos-13) (push) Has been cancelled
Indexing bench (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of indexing (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Has been cancelled
5076: Update version for the next release (v1.12.0) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
v1.12.0-rc.0
2024-11-21 17:51:32 +00:00
98a785b0d7 Merge #5080
Some checks failed
Indexing bench (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of indexing (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Waiting to run
Run the indexing fuzzer / Setup the action (push) Successful in 1h5m43s
Test suite / Tests on ${{ matrix.os }} (macos-13) (push) Waiting to run
Publish binaries to GitHub release / Publish binary for ${{ matrix.os }} (meilisearch, meilisearch-macos-amd64, macos-13) (push) Waiting to run
Publish binaries to GitHub release / Publish binary for macOS silicon (meilisearch-macos-apple-silicon, aarch64-apple-darwin) (push) Waiting to run
Look for flaky tests / flaky (push) Failing after 21s
Test suite / Tests on ubuntu-20.04 (push) Failing after 10s
Test suite / Tests on ${{ matrix.os }} (windows-2022) (push) Failing after 22s
Test suite / Tests almost all features (push) Failing after 7s
Test suite / Test disabled tokenization (push) Failing after 7s
Test suite / Run tests in debug (push) Failing after 9s
Test suite / Run Rustfmt (push) Successful in 1m24s
Test suite / Run Clippy (push) Successful in 6m14s
Publish binaries to GitHub release / Check the version validity (push) Successful in 7s
Publish binaries to GitHub release / Publish binary for Linux (push) Failing after 9s
Publish binaries to GitHub release / Publish binary for ${{ matrix.os }} (meilisearch.exe, meilisearch-windows-amd64.exe, windows-2022) (push) Failing after 19s
Publish binaries to GitHub release / Publish binary for aarch64 (meilisearch-linux-aarch64, aarch64-unknown-linux-gnu) (push) Failing after 9s
5080: Fix getting a single batch through the GET route r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes a bug where getting a single batch does not work

Related to #5070 


fix by `@Kerollmops` 

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-21 17:08:46 +00:00
ba7500998e Fix getting a single batch through the GET route 2024-11-21 17:59:31 +01:00
19e6f675b3 Merge #4900
4900: Indexer edition 2024 r=Kerollmops a=dureuill

This PR is implementing the indexer edition 2024, largely inspired by [the ideas from this blog post](https://blog.kerollmops.com/meilisearch-is-too-slow).

Fixes https://github.com/meilisearch/meilisearch/issues/4985

## Features
- Stream-first approach to reading documents.
- Minimum disk write operations.
- RAM usage-first approach to avoid modifying common bitmaps on disk but in memory.
- Reduced LMDB fragmentation by writing entries only once...
- ...computing the final version of the entries in parallel...
- ...and storing them in write-optimized data structures before sending them to the BTree (LMDB).
- Indexing in multiple transactions to improve large dataset support (dumps).


Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-21 16:19:10 +00:00
323ecbb885 Add span on document operation 2024-11-21 17:01:10 +01:00
ffb60cb885 Add comment explaining why we fixed the version of insta 2024-11-21 16:56:56 +01:00
dcc3caef0d Remove TopLevelMap 2024-11-21 16:56:46 +01:00
221e547e86 Slight changes 2024-11-21 16:47:44 +01:00
61d0615253 Document the geo point extractor 2024-11-21 16:47:08 +01:00
5727e00374 Remove useless geo skipped 2024-11-21 16:47:08 +01:00
9b60843831 Remove commented lines 2024-11-21 16:47:07 +01:00
36962b943b First batch of PR comment 2024-11-21 16:38:11 +01:00
32bcacefd5 Changes Document::len to Document::top_level_fields_count 2024-11-21 15:01:07 +01:00
4ed195426c remove unused stuff in global.rs 2024-11-21 15:01:07 +01:00
ff38f29981 Update crates/index-scheduler/src/batch.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-21 14:18:39 +01:00
5899861ff0 Update version for the next release (v1.12.0) in Cargo.toml 2024-11-21 11:21:18 +00:00
94b260fd25 Remove orphan span 2024-11-21 12:12:07 +01:00
03ab6b39e7 Revert the change in run count for movies workload 2024-11-21 11:17:34 +01:00
ab2c83f868 Use the disk less when computing prefixes 2024-11-21 10:45:37 +01:00
9a08757a70 Merge #5070
Some checks failed
Test suite / Tests on ubuntu-20.04 (push) Failing after 12s
Test suite / Tests on ${{ matrix.os }} (windows-2022) (push) Failing after 11s
Test suite / Tests almost all features (push) Has been skipped
Test suite / Test disabled tokenization (push) Has been skipped
Test suite / Run tests in debug (push) Failing after 10s
Test suite / Run Clippy (push) Successful in 6m18s
Test suite / Run Rustfmt (push) Successful in 1m34s
Indexing bench (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of indexing (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Waiting to run
Run the indexing fuzzer / Setup the action (push) Successful in 1h4m33s
Test suite / Tests on ${{ matrix.os }} (macos-13) (push) Has been cancelled
5070: Improve the details and stats of the current batch processing r=Kerollmops a=irevoire

Small improvement we missed over https://github.com/meilisearch/meilisearch/pull/5060

The current batch processing had empty details and stats.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-11-20 16:56:01 +00:00
1f9692cd04 Increase map size for tests 2024-11-20 17:52:21 +01:00
1e694ae432 improve the count of the number of tasks in a batch 2024-11-20 17:48:26 +01:00
71807cac6d makes clippy happy 2024-11-20 17:40:58 +01:00
21a2264782 improve the details and stats of the current batch processing 2024-11-20 17:25:55 +01:00
bda2b41d11 update snaps after merge 2024-11-20 17:08:30 +01:00
6e6acfcf1b Merge branch 'main' into indexer-edition-2024 2024-11-20 16:59:58 +01:00
e0864f1b21 Separate side effect and debug asserts 2024-11-20 16:25:17 +01:00
a38344acb3 Replace eprintlns by tracing 2024-11-20 15:29:51 +01:00
4d616f8794 Parse every attributes and filter before tokenization 2024-11-20 15:15:25 +01:00
ff9c92c409 rename documents -> substep 2024-11-20 15:12:02 +01:00
8380ddbdcd Fix progress of into_changes 2024-11-20 15:10:09 +01:00
d4d8becfa7 Merge #5060
Some checks failed
Indexing bench (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of indexing (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Waiting to run
Publish binaries to GitHub release / Check the version validity (push) Successful in 11s
Publish binaries to GitHub release / Publish binary for ${{ matrix.os }} (meilisearch, meilisearch-macos-amd64, macos-13) (push) Waiting to run
Publish binaries to GitHub release / Publish binary for macOS silicon (meilisearch-macos-apple-silicon, aarch64-apple-darwin) (push) Waiting to run
Publish binaries to GitHub release / Publish binary for ${{ matrix.os }} (meilisearch.exe, meilisearch-windows-amd64.exe, windows-2022) (push) Failing after 21s
Publish binaries to GitHub release / Publish binary for Linux (push) Failing after 12s
Publish binaries to GitHub release / Publish binary for aarch64 (meilisearch-linux-aarch64, aarch64-unknown-linux-gnu) (push) Failing after 10s
Run the indexing fuzzer / Setup the action (push) Successful in 1h5m1s
Test suite / Tests on ubuntu-20.04 (push) Failing after 12s
Test suite / Tests on ${{ matrix.os }} (macos-13) (push) Waiting to run
Test suite / Tests almost all features (push) Failing after 9s
Test suite / Test disabled tokenization (push) Failing after 8s
Test suite / Run tests in debug (push) Failing after 10s
Test suite / Tests on ${{ matrix.os }} (windows-2022) (push) Failing after 40s
Test suite / Run Rustfmt (push) Successful in 1m28s
Test suite / Run Clippy (push) Successful in 5m29s
5060: Batch route r=Kerollmops a=irevoire

# Pull Request

See [usage](https://www.notion.so/meilisearch/Enhance-visibility-on-batched-tasks-1194b06b651f810b8fe0fab5d72846a8).

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4977

## What does this PR do?
- For more detailed information, see the PRD.
- Added a `batchUid` to the tasks (that's the cause of all the updates of the dumps):
  - For all enqueued tasks, it's set to `None`
  - For every other tasks it must be set to something
  - ⚠️ For all the tasks imported in a dump, the `batchUid` will be set to `None` as well.
- Add two new routes:
  - `GET /batches/:uid` - to query a batch by its id
  - `GET /batches` - to retrieve a list of batches. It accepts all the same query parameters that are available on the `GET /tasks` route
- Adds new databases to query the batches directly:
  - When doing a query against the batches, the rule of thumb is that we want to return a batch iif **at least one** task in it matches the provided filter.
  - We don't need a `canceledBy` batch specific database because we can just retrieve the task and if it's a `taskCancelation` retrieve its `batchUid`
- The task cancelation has been updated and simplified a bit:
  - Instead of updating the matching tasks on disk while processing the cancelation task, we instead retrieve the task and let the `tick` function do the work afterward.
  - In the `tick` function, we now have to take care of not missing any tasks
- All the tests applied to the tasks were duplicated and updated to works with the new batches routes
- The deletion of batches doesn't contain any tests because it's already tested in the deletion of tasks (and especially highlighted in the snapshots)


Currently, one part of the PRD is not implemented: it's the progress.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-11-20 14:07:48 +00:00
867138f166 Add SP to into_changes 2024-11-20 15:07:05 +01:00
567bd4538b Fxi the into_changes stop processing 2024-11-20 14:58:25 +01:00
84600a10d1 Add MSP to document_update.into_changes() 2024-11-20 14:53:37 +01:00
35bbe1c2a2 Add failing test on settings changes 2024-11-20 14:48:12 +01:00
7d64e8dbd3 Fix Windows compilation 2024-11-20 14:40:38 +01:00
ec06879d28 apply review changes 2024-11-20 14:40:36 +01:00
83d1f858c1 Update crates/index-scheduler/src/lib.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-11-20 14:36:05 +01:00
cae8c89467 "fix" last warnings 2024-11-20 14:03:52 +01:00
a7ac590e9e implements the reverse query parameter for the batches 2024-11-20 13:29:52 +01:00
7cb8732b45 Introduce a new bincode internal error 2024-11-20 13:23:11 +01:00
8ad68dd708 stop leaking the update files of the canceled tasks 2024-11-20 13:17:54 +01:00
fe5d50969a Fix filed selector in extrators 2024-11-20 13:16:44 +01:00
56c7c5d5f0 Fix comments 2024-11-20 13:16:44 +01:00