Commit Graph

414 Commits

SHA1 Message Date
f544cfa444 Remove the tasks and content files from S3 2023-09-12 15:19:45 +02:00
a53a0fdb77 Store content files in S3 2023-09-11 18:17:22 +02:00
719fdd701b Fix a crash when the tasks path is unknown 2023-09-07 11:31:18 +02:00
01c13c98ac Mastering minio 2023-09-06 17:54:21 +02:00
5b89276fcc starts using s3 2023-09-05 19:25:09 +02:00
41697c4d65 Introduce the zk-tasks folder 2023-09-04 18:24:34 +02:00
7d85753573 Make the snapshot download work 2023-09-04 17:38:56 +02:00
76657af1f9 Add the options into the IndexScheduler 2023-09-04 16:38:05 +02:00
966cbdab69 make the tests compile again 2023-09-04 15:39:54 +02:00
0c68b9ed4c WIP making the final snapshot swap 2023-08-31 15:56:42 +02:00
d7233ecdb8 Make things compile again 2023-08-31 14:55:14 +02:00
95a011af13 Wrap the IndexScheduler fields into an inner struct 2023-08-31 10:36:33 +02:00
e257710961 WIP fix the tests 2023-08-30 18:03:24 +02:00
8c3ad57ef9 React to changes towards the cluster members 2023-08-30 17:40:12 +02:00
2d1434da81 Keep the ZK flow when enqueuing tasks 2023-08-30 17:15:15 +02:00
c488a4a351 Fixup a lot of small issues on the ZK config 2023-08-30 16:42:55 +02:00
0c7d7c68bc WIP moving to the sync zookeeper API 2023-08-30 15:06:12 +02:00
854745c670 wip: starts working on importing the snapshots 2023-08-16 18:41:05 +02:00
777eebb759 starts creating snapshot, the import is still missing 2023-08-10 15:00:25 +02:00
61ccfaf9bc wake up after registering a task 2023-08-10 09:39:39 +02:00
f0c4d36ff7 implement the deletion of tasks after processing a batch
add a lot of comments and logs
2023-08-10 09:36:43 +02:00
8c20d6e2fe fix the leader election 2023-08-09 17:23:13 +02:00
8e437ed76c Start leader election and task processing (WIP) 2023-08-09 16:52:38 +02:00
1191ec5939 fix the register task watcher 2023-08-08 13:18:55 +02:00
0d20d08daf fix a few warnings 2023-08-08 11:39:48 +02:00
b66bf049b5 Create a task on zookeeper side when a task is created locally 2023-08-07 17:02:51 +02:00
b45c36cd71 Merge branch 'main' into tmp-release-v1.3.0 2023-08-01 15:05:17 +02:00
eef95de30e First iteration on exposing puffin profiling 2023-07-18 17:38:13 +02:00
22762808ab Fix the tests 2023-07-06 12:13:29 +02:00
86b834c9e4 Display the total number of tasks in the tasks route 2023-07-06 10:05:18 +02:00
aae099e330 Merge #3851
3851: Expose lastUpdate and isIndexing in /stats endpoint r=dureuill a=gentcys

# Pull Request

## Related issue
Fixes #3843

## What does this PR do?
- expose lastUpdate in `/stats` endpoint
- expose isIndexing in `/stats` endpoint
- add a method `is_task_processing` in index-scheduler/src/lib.rs.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Cong Chen <cong.chen@ocrlabs.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-07-03 13:41:04 +00:00
71500a4e15 Update tests 2023-07-03 11:20:43 +02:00
324d448236 Format let-else ❤️ 🎉 2023-07-03 10:20:28 +02:00
9859e65d2f fix tests 2023-07-01 09:32:50 +08:00
3bdf01bc1c Fix failed test 2023-06-30 17:39:23 +08:00
a5a31667b0 fix converse result of is_task_processing() 2023-06-30 11:28:18 +08:00
e3fc7112bc use RoaringBitmap::is_empty instead 2023-06-29 11:46:47 +08:00
816d7ed174 Update the Vector Store product feature link 2023-06-27 12:32:42 +02:00
13e9b4c2e5 Add dump support 2023-06-26 16:29:43 +02:00
072d81843f Persistently save to DB the status of experimental features 2023-06-26 16:29:43 +02:00
6d4981ec25 Expose lastUpdate and isIndexing in /stats endpoint 2023-06-23 07:24:25 +08:00
040b5a5b6f Merge #3842
3842: fix some typos r=dureuill a=cuishuang

# Pull Request

## Related issue
Fixes #<issue_number>

## What does this PR do?
- fix some typos


Co-authored-by: cui fliter <imcusg@gmail.com>
2023-06-22 18:01:10 +00:00
530a3e2df3 fix some typos
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-06-22 21:59:00 +08:00
45636d315c Merge #3670
3670: Fix addition deletion bug r=irevoire a=irevoire

The first commit of this PR is a revert of https://github.com/meilisearch/meilisearch/pull/3667. It re-enables the auto-batching of addition and deletion tasks. No new changes have been introduced outside of `milli`, so all the changes you see on the autobatcher have already been reviewed.

It fixes https://github.com/meilisearch/meilisearch/issues/3440.

### What was happening?

The issue was that the `external_documents_ids` generated in the `transform` were used in a very strange way that wasn't compatible with the deletion of documents.
Instead of doing a clear merge between the external document IDs of the DB and the ones returned by the transform, then writing the result to disk, we were doing some weird tricks with the soft-deleted documents to avoid writing the fst to disk as much as possible.
The new algorithm may be a bit slower but is far more straightforward and behaves the same whether or not soft deletion was used. Here is a list of the changes introduced:
1. We now make a clear distinction between the `new_external_documents_ids` coming from the transform, which are held only in RAM, and the `external_documents_ids` coming from the DB.
2. The `new_external_documents_ids` (coming out of the transform) are now represented as an `fst`. We no longer need to struggle with the hard/soft distinction and the soft_deleted, which makes this easier to understand.
3. When indexing documents, we merge the `external_documents_ids` coming from the DB and the `new_external_documents_ids` coming from the transform.
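
The merge described in step 3 can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the `ExternalDocumentsIds` alias and the `merge_external_documents_ids` function are hypothetical names, a `BTreeMap` stands in for the `fst` so the sketch has no dependencies, and the rule that an entry from the transform overrides the DB entry on a key collision is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the external ID -> internal ID mapping.
/// The PR represents this as an `fst`; a sorted `BTreeMap` is used here
/// only to keep the sketch dependency-free.
type ExternalDocumentsIds = BTreeMap<String, u32>;

/// Merge the IDs produced by the transform (held in RAM) into the IDs
/// loaded from the DB, returning a new map and leaving both inputs untouched.
/// Assumption: on a key collision, the entry from the transform wins.
fn merge_external_documents_ids(
    from_db: &ExternalDocumentsIds,
    new_from_transform: &ExternalDocumentsIds,
) -> ExternalDocumentsIds {
    let mut merged = from_db.clone();
    for (external_id, internal_id) in new_from_transform {
        merged.insert(external_id.clone(), *internal_id);
    }
    merged
}
```

Because the merge always produces one plain mapping, the result does not depend on whether soft deletion was used, which is the property the description above emphasizes.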

### Other things introduced in this PR

Since we constantly have to write small, very specialized fuzzers for this kind of bug, we decided to push the one used to reproduce this bug.
It's not perfect, but it's easy to improve in the future.
It'll also run for as long as possible on every merge on the main branch.

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Loïc Lecrenier <loic.lecrenier@icloud.com>
2023-06-19 09:09:30 +00:00
c1e3cc04b0 Merge #3811
3811: Bring back changes from `release-v1.2.0` to `main` r=Kerollmops a=curquiza



Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Filip Bachul <filipbachul@gmail.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-06-06 13:10:24 +00:00
4a3405afec comment the stats method 2023-06-06 12:59:58 +02:00
3cfd653db1 Apply suggestions from code review
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-06 11:38:41 +02:00
2acc3ec5ee fix the type of the document deletion by filter tasks 2023-05-30 15:18:52 +02:00
c9b65677bf return the on disk size actually used by meilisearch 2023-05-25 18:30:30 +02:00
c433bdd1cd add a view for the task queue in the metrics 2023-05-25 12:58:13 +02:00