Commit Graph

11402 Commits

Author SHA1 Message Date
Louis Dureuil
dcc3caef0d Remove TopLevelMap 2024-11-21 16:56:46 +01:00
Louis Dureuil
221e547e86 Slight changes 2024-11-21 16:47:44 +01:00
Clément Renault
61d0615253 Document the geo point extractor 2024-11-21 16:47:08 +01:00
Clément Renault
5727e00374 Remove useless geo skipped 2024-11-21 16:47:08 +01:00
Clément Renault
9b60843831 Remove commented lines 2024-11-21 16:47:07 +01:00
ManyTheFish
36962b943b First batch of PR comment 2024-11-21 16:38:11 +01:00
Louis Dureuil
32bcacefd5 Changes Document::len to Document::top_level_fields_count 2024-11-21 15:01:07 +01:00
Louis Dureuil
4ed195426c remove unused stuff in global.rs 2024-11-21 15:01:07 +01:00
Many the fish
ff38f29981 Update crates/index-scheduler/src/batch.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-21 14:18:39 +01:00
curquiza
5899861ff0 Update version for the next release (v1.12.0) in Cargo.toml 2024-11-21 11:21:18 +00:00
ManyTheFish
94b260fd25 Remove orphan span 2024-11-21 12:12:07 +01:00
Louis Dureuil
03ab6b39e7 Revert the change in run count for movies workload 2024-11-21 11:17:34 +01:00
Clément Renault
ab2c83f868 Use the disk less when computing prefixes 2024-11-21 10:45:37 +01:00
meili-bors[bot]
9a08757a70 Merge #5070
Some checks failed
Test suite / Tests on ubuntu-20.04 (push) Failing after 12s
Test suite / Tests on ${{ matrix.os }} (windows-2022) (push) Failing after 11s
Test suite / Tests almost all features (push) Has been skipped
Test suite / Test disabled tokenization (push) Has been skipped
Test suite / Run tests in debug (push) Failing after 10s
Test suite / Run Clippy (push) Successful in 6m18s
Test suite / Run Rustfmt (push) Successful in 1m34s
Indexing bench (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of indexing (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Waiting to run
Run the indexing fuzzer / Setup the action (push) Successful in 1h4m33s
Test suite / Tests on ${{ matrix.os }} (macos-13) (push) Has been cancelled
5070: Improve the details and stats of the current batch processing r=Kerollmops a=irevoire

Small improvement we missed over https://github.com/meilisearch/meilisearch/pull/5060

The current batch processing had empty details and stats.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-11-20 16:56:01 +00:00
Louis Dureuil
1f9692cd04 Increase map size for tests 2024-11-20 17:52:21 +01:00
Tamo
1e694ae432 improve the count of the number of tasks in a batch 2024-11-20 17:48:26 +01:00
Tamo
71807cac6d makes clippy happy 2024-11-20 17:40:58 +01:00
Tamo
21a2264782 improve the details and stats of the current batch processing 2024-11-20 17:25:55 +01:00
Louis Dureuil
bda2b41d11 update snaps after merge 2024-11-20 17:08:30 +01:00
Louis Dureuil
6e6acfcf1b Merge branch 'main' into indexer-edition-2024 2024-11-20 16:59:58 +01:00
Louis Dureuil
e0864f1b21 Separate side effect and debug asserts 2024-11-20 16:25:17 +01:00
Clément Renault
a38344acb3 Replace eprintlns by tracing 2024-11-20 15:29:51 +01:00
ManyTheFish
4d616f8794 Parse every attributes and filter before tokenization 2024-11-20 15:15:25 +01:00
Louis Dureuil
ff9c92c409 rename documents -> substep 2024-11-20 15:12:02 +01:00
Clément Renault
8380ddbdcd Fix progress of into_changes 2024-11-20 15:10:09 +01:00
meili-bors[bot]
d4d8becfa7 Merge #5060
Some checks failed
Indexing bench (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of indexing (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Waiting to run
Publish binaries to GitHub release / Check the version validity (push) Successful in 11s
Publish binaries to GitHub release / Publish binary for ${{ matrix.os }} (meilisearch, meilisearch-macos-amd64, macos-13) (push) Waiting to run
Publish binaries to GitHub release / Publish binary for macOS silicon (meilisearch-macos-apple-silicon, aarch64-apple-darwin) (push) Waiting to run
Publish binaries to GitHub release / Publish binary for ${{ matrix.os }} (meilisearch.exe, meilisearch-windows-amd64.exe, windows-2022) (push) Failing after 21s
Publish binaries to GitHub release / Publish binary for Linux (push) Failing after 12s
Publish binaries to GitHub release / Publish binary for aarch64 (meilisearch-linux-aarch64, aarch64-unknown-linux-gnu) (push) Failing after 10s
Run the indexing fuzzer / Setup the action (push) Successful in 1h5m1s
Test suite / Tests on ubuntu-20.04 (push) Failing after 12s
Test suite / Tests on ${{ matrix.os }} (macos-13) (push) Waiting to run
Test suite / Tests almost all features (push) Failing after 9s
Test suite / Test disabled tokenization (push) Failing after 8s
Test suite / Run tests in debug (push) Failing after 10s
Test suite / Tests on ${{ matrix.os }} (windows-2022) (push) Failing after 40s
Test suite / Run Rustfmt (push) Successful in 1m28s
Test suite / Run Clippy (push) Successful in 5m29s
5060: Batch route r=Kerollmops a=irevoire

# Pull Request

See [usage](https://www.notion.so/meilisearch/Enhance-visibility-on-batched-tasks-1194b06b651f810b8fe0fab5d72846a8).

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4977

## What does this PR do?
- For more detailed information, see the PRD.
- Added a `batchUid` to the tasks (that's the cause of all the updates of the dumps):
  - For all enqueued tasks, it's set to `None`
  - For every other tasks it must be set to something
  - ⚠️ For all the tasks imported in a dump, the `batchUid` will be set to `None` as well.
- Add two new routes:
  - `GET /batches/:uid` - to query a batch by its id
  - `GET /batches` - to retrieve a list of batches. It accepts all the same query parameters that are available on the `GET /tasks` route
- Adds new databases to query the batches directly:
  - When doing a query against the batches, the rule of thumb is that we want to return a batch iif **at least one** task in it matches the provided filter.
  - We don't need a `canceledBy` batch specific database because we can just retrieve the task and if it's a `taskCancelation` retrieve its `batchUid`
- The task cancelation has been updated and simplified a bit:
  - Instead of updating the matching tasks on disk while processing the cancelation task, we instead retrieve the task and let the `tick` function do the work afterward.
  - In the `tick` function, we now have to take care of not missing any tasks
- All the tests applied to the tasks were duplicated and updated to works with the new batches routes
- The deletion of batches doesn't contain any tests because it's already tested in the deletion of tasks (and especially highlighted in the snapshots)


Currently, one part of the PRD is not implemented: it's the progress.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-11-20 14:07:48 +00:00
Louis Dureuil
867138f166 Add SP to into_changes 2024-11-20 15:07:05 +01:00
Clément Renault
567bd4538b Fxi the into_changes stop processing 2024-11-20 14:58:25 +01:00
Louis Dureuil
84600a10d1 Add MSP to document_update.into_changes() 2024-11-20 14:53:37 +01:00
ManyTheFish
35bbe1c2a2 Add failing test on settings changes 2024-11-20 14:48:12 +01:00
Louis Dureuil
7d64e8dbd3 Fix Windows compilation 2024-11-20 14:40:38 +01:00
Tamo
ec06879d28 apply review changes 2024-11-20 14:40:36 +01:00
Tamo
83d1f858c1 Update crates/index-scheduler/src/lib.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-11-20 14:36:05 +01:00
Louis Dureuil
cae8c89467 "fix" last warnings 2024-11-20 14:03:52 +01:00
Tamo
a7ac590e9e implements the reverse query parameter for the batches 2024-11-20 13:29:52 +01:00
Clément Renault
7cb8732b45 Introduce a new bincode internal error 2024-11-20 13:23:11 +01:00
Tamo
8ad68dd708 stop leaking the update files of the canceled tasks 2024-11-20 13:17:54 +01:00
ManyTheFish
fe5d50969a Fix filed selector in extrators 2024-11-20 13:16:44 +01:00
Clément Renault
56c7c5d5f0 Fix comments 2024-11-20 13:16:44 +01:00
Louis Dureuil
4cdfdddd6d Fix one more 2024-11-20 13:16:43 +01:00
Louis Dureuil
2afa33011a Fix tokenize_document 2024-11-20 13:16:43 +01:00
Louis Dureuil
61feca1f41 More tests pass 2024-11-20 13:16:43 +01:00
Louis Dureuil
f893b5153e Don't mark [""] as empty facet 2024-11-20 13:16:42 +01:00
Louis Dureuil
ca779c21f9 facets: Handle boolean and skip empty strings 2024-11-20 13:16:42 +01:00
Louis Dureuil
477077bdc2 Remove _vectors from fid map when there are no vectors in sight 2024-11-20 13:16:42 +01:00
ManyTheFish
b1f8aec348 Fix index_documents_check_exists_database 2024-11-20 13:16:41 +01:00
ManyTheFish
ba7f091db3 Use tokenizer on numbers and booleans 2024-11-20 13:16:41 +01:00
Louis Dureuil
8049df125b Add depth to facet extraction so that null inside an array doesn't mark the entire field as null 2024-11-20 13:16:40 +01:00
Clément Renault
50d1bd01df We no longer index geo lat and lng 2024-11-20 13:16:40 +01:00
Louis Dureuil
a28d4f5d0c Fix setup_search_index_with_criteria 2024-11-20 13:16:40 +01:00