Commit Graph

312 Commits

Author SHA1 Message Date
42577403d8 Authentication: Directly pass the authfilter to the index scheduler 2023-02-22 16:35:52 +01:00
b08a49a16e Merge #3319 #3470
3319: Transparently resize indexes on MaxDatabaseSizeReached errors r=Kerollmops a=dureuill

# Pull Request

## Related issue
Related to https://github.com/meilisearch/meilisearch/discussions/3280, depends on https://github.com/meilisearch/milli/pull/760

## What does this PR do?

### User standpoint

- Meilisearch no longer fails tasks that encounter the `milli::UserError(MaxDatabaseSizeReached)` error.
- Instead, these tasks are retried after increasing the maximum size allocated to the index where the failure occurred.

### Implementation standpoint

- Add `Batch::index_uid` to get the `index_uid` of a batch of task if there is one
- `IndexMapper::create_or_open_index` now takes an additional `size` argument that allows to (re)open indexes with a size different from the base `IndexScheduler::index_size` field
- `IndexScheduler::tick` now returns a `Result<TickOutcome>` instead of a `Result<usize>`. This offers more explicit control over what the behavior should be wrt the next tick.
- Add `IndexStatus::BeingResized` that contains a handle that a thread can use to await for the resize operation to complete and the index to be available again.
- Add `IndexMapper::resize_index` to increase the size of an index.
- In `IndexScheduler::tick`, intercept task batches that failed due to `MaxDatabaseSizeReached` and resize the index that caused the error, then request a new tick that will eventually handle the still enqueued task.

## Testing the PR

The following diff can be applied to this branch to make testing the PR easier:

<details>


```diff
diff --git a/index-scheduler/src/index_mapper.rs b/index-scheduler/src/index_mapper.rs
index 553ab45a..022b2f00 100644
--- a/index-scheduler/src/index_mapper.rs
+++ b/index-scheduler/src/index_mapper.rs
`@@` -228,13 +228,15 `@@` impl IndexMapper {
 
         drop(lock);
 
+        std:🧵:sleep_ms(2000);
+
         let current_size = index.map_size()?;
         let closing_event = index.prepare_for_closing();
-        log::info!("Resizing index {} from {} to {} bytes", name, current_size, current_size * 2);
+        log::error!("Resizing index {} from {} to {} bytes", name, current_size, current_size * 2);
 
         closing_event.wait();
 
-        log::info!("Resized index {} from {} to {} bytes", name, current_size, current_size * 2);
+        log::error!("Resized index {} from {} to {} bytes", name, current_size, current_size * 2);
 
         let index_path = self.base_path.join(uuid.to_string());
         let index = self.create_or_open_index(&index_path, None, 2 * current_size)?;
`@@` -268,8 +270,10 `@@` impl IndexMapper {
             match index {
                 Some(Available(index)) => break index,
                 Some(BeingResized(ref resize_operation)) => {
+                    log::error!("waiting for resize end");
                     // Deadlock: no lock taken while doing this operation.
                     resize_operation.wait();
+                    log::error!("trying our luck again!");
                     continue;
                 }
                 Some(BeingDeleted) => return Err(Error::IndexNotFound(name.to_string())),
diff --git a/index-scheduler/src/lib.rs b/index-scheduler/src/lib.rs
index 11b17d05..242dc095 100644
--- a/index-scheduler/src/lib.rs
+++ b/index-scheduler/src/lib.rs
`@@` -908,6 +908,7 `@@` impl IndexScheduler {
     ///
     /// Returns the number of processed tasks.
     fn tick(&self) -> Result<TickOutcome> {
+        log::error!("ticking!");
         #[cfg(test)]
         {
             *self.run_loop_iteration.write().unwrap() += 1;
diff --git a/meilisearch/src/main.rs b/meilisearch/src/main.rs
index 050c825a..63f312f6 100644
--- a/meilisearch/src/main.rs
+++ b/meilisearch/src/main.rs
`@@` -25,7 +25,7 `@@` fn setup(opt: &Opt) -> anyhow::Result<()> {
 
 #[actix_web::main]
 async fn main() -> anyhow::Result<()> {
-    let (opt, config_read_from) = Opt::try_build()?;
+    let (mut opt, config_read_from) = Opt::try_build()?;
 
     setup(&opt)?;
 
`@@` -56,6 +56,8 `@@` We generated a secure master key for you (you can safely copy this token):
         _ => (),
     }
 
+    opt.max_index_size = byte_unit::Byte::from_str("1MB").unwrap();
+
     let (index_scheduler, auth_controller) = setup_meilisearch(&opt)?;
 
     #[cfg(all(not(debug_assertions), feature = "analytics"))]
```
</details>

Mainly, these debug changes do the following:

- Set the default index size to 1MiB so that index resizes are initially frequent
- Turn some logs from info to error so that they can be displayed with `--log-level ERROR` (hiding the other infos)
- Add a long sleep between the beginning and the end of the resize so that we can observe the `BeingResized` index status (otherwise it would never come up in my tests)

## Open questions

- Is the growth factor of x2 the correct solution? For a `Vec` in memory it makes sense, but here we're manipulating quantities that are potentially in the order of 500GiBs. For bigger indexes it may make more sense to add at most e.g. 100GiB on each resize operation, avoiding big steps like 500GiB -> 1TiB.

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


3470: Autobatch addition and deletion r=irevoire a=irevoire

This PR adds the capability to meilisearch to batch document addition and deletion together.

Fix https://github.com/meilisearch/meilisearch/issues/3440

--------------

Things to check before merging;

- [x] What happens if we delete multiple time the same documents -> add a test
- [x] If a documentDeletion gets batched with a documentAddition but the index doesn't exist yet? It should not work

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2023-02-20 15:00:19 +00:00
35f6c624bc Make sure we don't leave the in memory hashmap in an inconsistent state 2023-02-20 13:55:32 +01:00
1116788475 Resize indexes when they're full 2023-02-20 13:55:32 +01:00
951a5b5832 Add IndexMapper::resize_index fn 2023-02-20 13:55:32 +01:00
1c670d7fa0 Add IndexStatus::BeingResized 2023-02-20 13:55:32 +01:00
6cc3797aa1 IndexScheduler::tick returns a TickOutcome 2023-02-20 13:55:31 +01:00
faf1e17a27 create_or_open_index takes a map_size argument 2023-02-20 13:55:31 +01:00
4c519c2ab3 Add Batch::index_uid 2023-02-20 13:55:31 +01:00
74d1a67a99 Use the workspace inheritance feature of rust 1.64 2023-02-15 13:51:07 +01:00
29d14bed90 get rids of the let/else syntax 2023-02-14 17:45:46 +01:00
4570d5bf3a Merge remote-tracking branch 'origin/main' into temp-wildcard 2023-02-09 13:14:05 +01:00
eaad84bd1d fix the test to handle the document deletion correctly 2023-02-09 11:29:13 +01:00
ea9ac46f28 stop autobatching the deletion without the index creation right with the addition 2023-02-08 21:24:27 +01:00
93f130a400 fix all warnings 2023-02-08 20:57:35 +01:00
860c993ef7 Handle the autobatching of deletion and addition in the scheduler 2023-02-08 20:53:19 +01:00
67dda0678f cleanup the autobatcher a little bit 2023-02-08 18:10:59 +01:00
2db6347686 update the autobatcher to batch the addition and deletion together 2023-02-08 18:07:59 +01:00
a36b1dbd70 Fix the tasks with the new patterns 2023-02-01 18:21:45 +01:00
924d5d4c11 clippy: remove needless lifetimes 2023-01-31 10:40:48 +01:00
a858531574 apply review comments 2023-01-25 14:51:36 +01:00
bf94f89035 Update index-scheduler/src/lib.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-01-25 11:31:50 +01:00
3bcff60d1c makes clippy happy 2023-01-25 11:31:48 +01:00
c92948b143 Compute the size of the auth-controller, index-scheduler and all update files in the global stats 2023-01-25 11:25:02 +01:00
c7b2e3be87 apply review comments 2023-01-24 17:54:43 +01:00
ea3b269b77 reformat 2023-01-23 23:59:34 +01:00
a4be4c49e8 Update index-scheduler/src/batch.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-01-23 23:58:03 +01:00
7d1ebb7295 add test on the autobatcher layer 2023-01-23 20:56:12 +01:00
767cb725a5 reimplement the batching of task with or without primary key in the autobatcher 2023-01-23 20:18:22 +01:00
5672118bfa When adding documents, trying to update the primary-key now throw an error
While updating the test suite I also noticed an issue with the indexed_documents value of failed task and had to update it.
I also named a bunch of snapshots that had no name sorry 😬
2023-01-23 17:32:13 +01:00
72e2b220ed Fix tests 2023-01-19 15:48:20 +01:00
e8e7070cc6 improve the error message when no task filter are specified for the cancelation or deletion of tasks 2023-01-19 12:42:08 +01:00
3e5b3df487 Merge #3370 #3373 #3375
3370: make the swap indexes not found errors return an IndexNotFound error-code r=irevoire a=irevoire

Fix https://github.com/meilisearch/meilisearch/issues/3368

3373: fix a wrong error code and add tests on the document resource r=irevoire a=irevoire

Fix https://github.com/meilisearch/meilisearch/issues/3371

3375: Avoid deleting all task invalid canceled by r=irevoire a=Kerollmops

Fixes #3369 by making sure that at least one `canceledBy` task filter parameter matches something.

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-01-18 15:21:11 +00:00
e89973f1bf Do not delete all tasks when no canceled-by matches 2023-01-18 15:50:46 +01:00
57da80900d make the swap indexes not found errors return an IndexNotFound error code 2023-01-18 14:16:00 +01:00
2bc2e99ff3 Simplify declaration of the error codes 2023-01-11 19:08:39 +01:00
e706628bb1 fix the error code of the swap index route 2023-01-06 14:48:25 +01:00
50ce0409bc Integrate deserr on the most important routes 2023-01-05 20:48:29 +01:00
2d74678b51 Replace underscores with hyphens in doc link to error code 2023-01-05 10:09:02 +01:00
233372abea Remove --max-index-size and --max-task-db-size 2023-01-04 17:20:01 +01:00
9a39c4e40d Get date from IndexMetaData 2022-12-22 11:46:17 +01:00
0893b175dc Merge branch 'main' into 2983-forward-date-to-milli 2022-12-21 14:31:19 +01:00
d5978d11e1 Refactor 2022-12-21 14:28:00 +01:00
d8fb506c92 handle most io error instead of tagging everything as an internal 2022-12-19 20:50:40 +01:00
aa03e02fdc Apply Rustfmt 2022-12-19 19:24:56 +01:00
869d331680 Clippy fixes after updating Rust to v1.66 2022-12-19 14:17:12 +01:00
b4a73f2d74 Remove redundant date-setting 2022-12-16 08:32:44 +01:00
4e175ae882 Replace Index::new_with_creation_dates(...) with Index::new(...) 2022-12-16 08:20:13 +01:00
5a0a0468df Combine created and added into date 2022-12-16 08:11:12 +01:00
d3eb8d2d5c Enable create_raw_index(...) to specify time 2022-12-14 10:44:25 +01:00