Compare commits


18 Commits

SHA1 Message Date
83616bc03e Expose a new indexer parameter to enable the creation of a document dictionary 2025-01-21 14:44:07 +01:00
bbbc4410ac Fix the dump creation process 2025-01-21 14:44:07 +01:00
46dfa9f7c1 Remove unused code 2025-01-21 14:44:07 +01:00
5dcd5d8797 Introduce a new experimental document compression parameter 2025-01-21 14:44:07 +01:00
bc62cb0801 Remove span that is called too many times 2025-01-21 14:44:06 +01:00
9109fbaeb0 Add more spans to debug compression 2025-01-21 14:44:06 +01:00
78c9f67550 Generate the dictionary only when necessary 2025-01-21 14:44:06 +01:00
523733db0a Clean up the tests 2025-01-21 14:44:06 +01:00
6d7415a25f Remove last warning by storing rtxn and compressor on each thread 2025-01-21 14:44:05 +01:00
3a32a58d6c Remove TODO and rely on the PR checklist 2025-01-21 14:44:05 +01:00
ecc7741212 Fix some issues after a rebase 2025-01-21 14:44:05 +01:00
d43ddd7205 Fetch the compression dictionary only once to decompress documents 2025-01-21 14:44:05 +01:00
0dcbd2fe07 Fix the usage of compressed documents 2025-01-21 14:44:04 +01:00
df80aaefc9 Compress and send compressed documents to the writer 2025-01-21 14:44:04 +01:00
afec94d1f3 Compress the right documents when a new dictionary is computed 2025-01-21 14:44:04 +01:00
e122970570 Move the compression extractor into a dedicated module 2025-01-21 14:44:03 +01:00
19b0bf7121 Allocate the decompressed documents in the extractor allocator 2025-01-21 14:44:03 +01:00
beef5b5f98 Squash in a single commit and rebase 2025-01-21 14:44:03 +01:00
738 changed files with 17691 additions and 57438 deletions
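The thread running through these commits is dictionary-based document compression: train a shared dictionary from sample documents, compress every document with it, and fetch the dictionary only once at decompression time. Below is a minimal sketch of that technique using the `zstd` crate; the crate choice is inferred from the commit messages, so treat this as an illustration of the general approach, not the code introduced by these commits.

```rust
use std::io;

fn main() -> io::Result<()> {
    // Many small, similar documents are exactly the case where a shared
    // dictionary beats compressing each document independently.
    let samples: Vec<Vec<u8>> = (0..1_000)
        .map(|i| format!(r#"{{"id":{i},"title":"Movie {i}"}}"#).into_bytes())
        .collect();

    // Train the shared dictionary once (the 10 KiB cap is an arbitrary choice).
    let dict = zstd::dict::from_samples(&samples, 10 * 1024)?;

    // Compress each document with the dictionary...
    let mut compressor = zstd::bulk::Compressor::with_dictionary(3, &dict)?;
    let doc = br#"{"id":1000,"title":"Dune"}"#;
    let compressed = compressor.compress(doc)?;

    // ...and decompress by loading the dictionary a single time up front.
    let mut decompressor = zstd::bulk::Decompressor::with_dictionary(&dict)?;
    let restored = decompressor.decompress(&compressed, doc.len() + 1)?;
    assert_eq!(restored, doc);
    Ok(())
}
```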

View File

@@ -22,20 +22,6 @@ Related product discussion:
 <!---If necessary, create a list with technical/product steps-->
-### Are you modifying a database?
-- [ ] If not, add the `no db change` label to your PR, and you're good to merge.
-- [ ] If yes, add the `db change` label to your PR. You'll receive a message explaining what to do.
-### Reminders when modifying the API
-- [ ] Update the openAPI file with utoipa:
-  - [ ] If a new module has been introduced, create a new structure deriving [the OpenAPI proc-macro](https://docs.rs/utoipa/latest/utoipa/derive.OpenApi.html) and nest it in the main [openAPI structure](https://github.com/meilisearch/meilisearch/blob/f2185438eed60fa32d25b15480c5ee064f6fba4a/crates/meilisearch/src/routes/mod.rs#L64-L78).
-  - [ ] If a new route has been introduced, add the [path decorator](https://docs.rs/utoipa/latest/utoipa/attr.path.html) to it and add the route at the top of the file in its openAPI structure.
-  - [ ] If a structure which is deserialized or serialized in the API has been introduced or modified, it must derive the [`schema`](https://docs.rs/utoipa/latest/utoipa/macro.schema.html) or the [`IntoParams`](https://docs.rs/utoipa/latest/utoipa/derive.IntoParams.html) proc-macro.
-    If it's a **new** structure, you must also add it to the big list of structures [in the main `OpenApi` structure](https://github.com/meilisearch/meilisearch/blob/f2185438eed60fa32d25b15480c5ee064f6fba4a/crates/meilisearch/src/routes/mod.rs#L88).
-  - [ ] Once everything is done, start Meilisearch with the swagger flag: `cargo run --features swagger`, open `http://localhost:7700/scalar` in your browser, and ensure everything works as expected.
-    - For more info, refer to [this presentation](https://pitch.com/v/generating-the-openapi-file-jrn3nh).
 ### Reminders when modifying the Setting API
 <!--- Special steps to remind when adding a new index setting -->
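Taken together, the removed API reminders describe a consistent utoipa pattern. A hedged sketch of that pattern follows; all type and route names are hypothetical, not taken from the Meilisearch codebase.

```rust
use utoipa::{IntoParams, OpenApi, ToSchema};

// A body that is serialized or deserialized in the API derives `ToSchema`.
#[derive(serde::Serialize, serde::Deserialize, ToSchema)]
struct HypotheticalDocument {
    id: u32,
    title: String,
}

// Query parameters derive `IntoParams` instead.
#[derive(serde::Deserialize, IntoParams)]
struct HypotheticalQuery {
    limit: Option<usize>,
}

// A new route gets the `path` decorator...
#[utoipa::path(
    get,
    path = "/hypothetical",
    params(HypotheticalQuery),
    responses((status = 200, description = "OK", body = Vec<HypotheticalDocument>))
)]
async fn list_hypothetical() {}

// ...and the module's own `OpenApi` structure lists the route and the new
// schema, so it can be nested in the main `OpenApi` structure.
#[derive(OpenApi)]
#[openapi(paths(list_hypothetical), components(schemas(HypotheticalDocument)))]
struct HypotheticalApi;
```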

View File

@@ -1,27 +1,28 @@
 name: Bench (manual)
 on:
   workflow_dispatch:
     inputs:
       workload:
-        description: "The path to the workloads to execute (workloads/...)"
+        description: 'The path to the workloads to execute (workloads/...)'
         required: true
-        default: "workloads/movies.json"
+        default: 'workloads/movies.json'
 env:
   WORKLOAD_NAME: ${{ github.event.inputs.workload }}
 jobs:
   benchmarks:
     name: Run and upload benchmarks
     runs-on: benchmarks
     timeout-minutes: 180 # 3h
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
       - name: Run benchmarks - workload ${WORKLOAD_NAME} - branch ${{ github.ref }} - commit ${{ github.sha }}
         run: |
           cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Manual [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- ${WORKLOAD_NAME}

View File

@@ -1,82 +1,82 @@
 name: Bench (PR)
 on:
   issue_comment:
     types: [created]
 permissions:
   issues: write
 env:
   GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
 jobs:
   run-benchmarks-on-comment:
     if: startsWith(github.event.comment.body, '/bench')
     name: Run and upload benchmarks
     runs-on: benchmarks
     timeout-minutes: 180 # 3h
     steps:
       - name: Check permissions
         id: permission
         env:
           PR_AUTHOR: ${{github.event.issue.user.login }}
           COMMENT_AUTHOR: ${{github.event.comment.user.login }}
           REPOSITORY: ${{github.repository}}
           PR_ID: ${{github.event.issue.number}}
         run: |
           PR_REPOSITORY=$(gh api /repos/"$REPOSITORY"/pulls/"$PR_ID" --jq .head.repo.full_name)
           if $(gh api /repos/"$REPOSITORY"/collaborators/"$PR_AUTHOR"/permission --jq .user.permissions.push)
           then
             echo "::notice title=Authentication success::PR author authenticated"
           else
             echo "::error title=Authentication error::PR author doesn't have push permission on this repository"
             exit 1
           fi
           if $(gh api /repos/"$REPOSITORY"/collaborators/"$COMMENT_AUTHOR"/permission --jq .user.permissions.push)
           then
             echo "::notice title=Authentication success::Comment author authenticated"
           else
             echo "::error title=Authentication error::Comment author doesn't have push permission on this repository"
             exit 1
           fi
           if [ "$PR_REPOSITORY" = "$REPOSITORY" ]
           then
             echo "::notice title=Authentication success::PR started from main repository"
           else
             echo "::error title=Authentication error::PR started from a fork"
             exit 1
           fi
       - name: Check for Command
         id: command
         uses: xt0rted/slash-command-action@v2
         with:
           command: bench
           reaction-type: "rocket"
           repo-token: ${{ env.GH_TOKEN }}
       - uses: xt0rted/pull-request-comment-branch@v3
         id: comment-branch
         with:
           repo_token: ${{ env.GH_TOKEN }}
       - uses: actions/checkout@v3
         if: success()
         with:
           fetch-depth: 0 # fetch full history to be able to get main commit sha
           ref: ${{ steps.comment-branch.outputs.head_ref }}
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
       - name: Run benchmarks on PR ${{ github.event.issue.id }}
         run: |
           cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" \
             --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" \
             --reason "[Comment](${{ github.event.comment.html_url }}) on [#${{ github.event.issue.number }}](${{ github.event.issue.html_url }})" \
             -- ${{ steps.command.outputs.command-arguments }} > benchlinks.txt
       - name: Send comment in PR
         run: |
           gh pr comment ${{github.event.issue.number}} --body-file benchlinks.txt

View File

@@ -1,22 +1,23 @@
 name: Indexing bench (push)
 on:
   push:
     branches:
       - main
 jobs:
   benchmarks:
     name: Run and upload benchmarks
     runs-on: benchmarks
     timeout-minutes: 180 # 3h
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
       # Run benchmarks
       - name: Run benchmarks - Dataset ${BENCH_NAME} - Branch main - Commit ${{ github.sha }}
         run: |
           cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Push on `main` [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- workloads/*.json

View File

@@ -4,9 +4,9 @@ on:
   workflow_dispatch:
     inputs:
       dataset_name:
-        description: "The name of the dataset used to benchmark (search_songs, search_wiki, search_geo or indexing)"
+        description: 'The name of the dataset used to benchmark (search_songs, search_wiki, search_geo or indexing)'
         required: false
-        default: "search_songs"
+        default: 'search_songs'
 env:
   BENCH_NAME: ${{ github.event.inputs.dataset_name }}
@@ -18,7 +18,7 @@ jobs:
     timeout-minutes: 4320 # 72h
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
@@ -67,7 +67,7 @@ jobs:
         out_dir: critcmp_results
       # Helper
-      - name: "README: compare with another benchmark"
+      - name: 'README: compare with another benchmark'
         run: |
           echo "${{ steps.file.outputs.basename }}.json has just been pushed."
           echo 'How to compare this benchmark with another one?'

View File

@@ -44,7 +44,7 @@ jobs:
           exit 1
         fi
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
        with:
          profile: minimal

View File

@@ -16,7 +16,7 @@ jobs:
     timeout-minutes: 4320 # 72h
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
@@ -69,7 +69,7 @@ jobs:
         run: telegraf --config https://eu-central-1-1.aws.cloud2.influxdata.com/api/v2/telegrafs/08b52e34a370b000 --once --debug
       # Helper
-      - name: "README: compare with another benchmark"
+      - name: 'README: compare with another benchmark'
         run: |
           echo "${{ steps.file.outputs.basename }}.json has just been pushed."
           echo 'How to compare this benchmark with another one?'

View File

@@ -15,7 +15,7 @@ jobs:
     runs-on: benchmarks
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
@@ -68,7 +68,7 @@ jobs:
         run: telegraf --config https://eu-central-1-1.aws.cloud2.influxdata.com/api/v2/telegrafs/08b52e34a370b000 --once --debug
       # Helper
-      - name: "README: compare with another benchmark"
+      - name: 'README: compare with another benchmark'
         run: |
           echo "${{ steps.file.outputs.basename }}.json has just been pushed."
           echo 'How to compare this benchmark with another one?'

View File

@@ -15,7 +15,7 @@ jobs:
     runs-on: benchmarks
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
@@ -68,7 +68,7 @@ jobs:
         run: telegraf --config https://eu-central-1-1.aws.cloud2.influxdata.com/api/v2/telegrafs/08b52e34a370b000 --once --debug
       # Helper
-      - name: "README: compare with another benchmark"
+      - name: 'README: compare with another benchmark'
         run: |
           echo "${{ steps.file.outputs.basename }}.json has just been pushed."
           echo 'How to compare this benchmark with another one?'

View File

@@ -15,7 +15,7 @@ jobs:
     runs-on: benchmarks
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
@@ -68,7 +68,7 @@ jobs:
         run: telegraf --config https://eu-central-1-1.aws.cloud2.influxdata.com/api/v2/telegrafs/08b52e34a370b000 --once --debug
       # Helper
-      - name: "README: compare with another benchmark"
+      - name: 'README: compare with another benchmark'
         run: |
           echo "${{ steps.file.outputs.basename }}.json has just been pushed."
           echo 'How to compare this benchmark with another one?'

View File

@ -1,100 +0,0 @@
name: PR Milestone Check
on:
pull_request:
types: [opened, reopened, edited, synchronize, milestoned, demilestoned]
branches:
- "main"
- "release-v*.*.*"
jobs:
check-milestone:
name: Check PR Milestone
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Validate PR milestone
uses: actions/github-script@v7
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
// Get PR number directly from the event payload
const prNumber = context.payload.pull_request.number;
// Get PR details
const { data: prData } = await github.rest.pulls.get({
owner: 'meilisearch',
repo: 'meilisearch',
pull_number: prNumber
});
// Get base branch name
const baseBranch = prData.base.ref;
console.log(`Base branch: ${baseBranch}`);
// Get PR milestone
const prMilestone = prData.milestone;
if (!prMilestone) {
core.setFailed('PR must have a milestone assigned');
return;
}
console.log(`PR milestone: ${prMilestone.title}`);
// Validate milestone format: vx.y.z
const milestoneRegex = /^v\d+\.\d+\.\d+$/;
if (!milestoneRegex.test(prMilestone.title)) {
core.setFailed(`Milestone "${prMilestone.title}" does not follow the required format vx.y.z`);
return;
}
// For main branch PRs, check if the milestone is the highest one
if (baseBranch === 'main') {
// Get all milestones
const { data: milestones } = await github.rest.issues.listMilestones({
owner: 'meilisearch',
repo: 'meilisearch',
state: 'open',
sort: 'due_on',
direction: 'desc'
});
// Sort milestones by version number (vx.y.z)
const sortedMilestones = milestones
.filter(m => milestoneRegex.test(m.title))
.sort((a, b) => {
const versionA = a.title.substring(1).split('.').map(Number);
const versionB = b.title.substring(1).split('.').map(Number);
// Compare major version
if (versionA[0] !== versionB[0]) return versionB[0] - versionA[0];
// Compare minor version
if (versionA[1] !== versionB[1]) return versionB[1] - versionA[1];
// Compare patch version
return versionB[2] - versionA[2];
});
if (sortedMilestones.length === 0) {
core.setFailed('No valid milestones found in the repository. Please create at least one milestone with the format vx.y.z');
return;
}
const highestMilestone = sortedMilestones[0];
console.log(`Highest milestone: ${highestMilestone.title}`);
if (prMilestone.title !== highestMilestone.title) {
core.setFailed(`PRs targeting the main branch must use the highest milestone (${highestMilestone.title}), but this PR uses ${prMilestone.title}`);
return;
}
} else {
// For release branches, the milestone should match the branch version
const branchVersion = baseBranch.substring(8); // remove 'release-'
if (prMilestone.title !== branchVersion) {
core.setFailed(`PRs targeting release branch "${baseBranch}" must use the matching milestone "${branchVersion}", but this PR uses "${prMilestone.title}"`);
return;
}
}
console.log('PR milestone validation passed!');

View File

@ -1,57 +0,0 @@
name: Comment when db change labels are added
on:
pull_request:
types: [labeled]
env:
MESSAGE: |
### Hello, I'm a bot 🤖
You are receiving this message because you declared that this PR makes changes to the Meilisearch database.
Depending on the nature of the change, additional actions might be required on your part. The following sections detail these additional actions depending on the nature of the change; please copy the relevant section into the description of your PR, and make sure to perform the required actions.
Thank you for contributing to Meilisearch :heart:
## This PR makes forward-compatible changes
*Forward-compatible changes are changes to the database such that databases created in an older version of Meilisearch are still valid in the new version of Meilisearch. They usually represent additive changes, like adding a new optional attribute or setting.*
- [ ] Detail the change to the DB format and why they are forward compatible
- [ ] Forward-compatibility: A database created before this PR and using the features touched by this PR was able to be opened by a Meilisearch produced by the code of this PR.
## This PR makes breaking changes
*Breaking changes are changes to the database such that databases created in an older version of Meilisearch need changes to remain valid in the new version of Meilisearch. This typically happens when the way to store the data changed (change of database, new required key, etc). This can also happen due to breaking changes in the API of an experimental feature. ⚠️ This kind of changes are more difficult to achieve safely, so proceed with caution and test dumpless upgrade right before merging the PR.*
- [ ] Detail the changes to the DB format,
- [ ] which are compatible, and why
- [ ] which are not compatible, why, and how they will be fixed up in the upgrade
- [ ] /!\ Ensure all the read operations still work!
- If the change happened in milli, you may need to check the version of the database before doing any read operation
- If the change happened in the index-scheduler, make sure the new code can immediately read the old database
- If the change happened in the meilisearch-auth database, reach out to the team; we don't know yet how to handle these changes
- [ ] Write the code to go from the old database to the new one
- If the change happened in milli, the upgrade function should be written and called [here](https://github.com/meilisearch/meilisearch/blob/3fd86e8d76d7d468b0095d679adb09211ca3b6c0/crates/milli/src/update/upgrade/mod.rs#L24-L47)
- If the change happened in the index-scheduler, we've never done it yet, but the right place to do it should be [here](https://github.com/meilisearch/meilisearch/blob/3fd86e8d76d7d468b0095d679adb09211ca3b6c0/crates/index-scheduler/src/scheduler/process_upgrade/mod.rs#L13)
- [ ] Write an integration test [here](https://github.com/meilisearch/meilisearch/blob/main/crates/meilisearch/tests/upgrade/mod.rs) ensuring you can read the old database, upgrade to the new database, and read the new database as expected
jobs:
add-comment:
runs-on: ubuntu-latest
if: github.event.label.name == 'db change'
steps:
- name: Add comment
uses: actions/github-script@v7
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
const message = process.env.MESSAGE;
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: message
})
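The breaking-changes section of the message above asks for explicit code that moves a database from the old format to the new one. As a hedged illustration of that upgrade-path idea (the names and signatures here are hypothetical, not Meilisearch's actual upgrade API), version-gated migrations often reduce to a dispatch like this:

```rust
type UpgradeResult = Result<(), String>;

/// Apply, in order, every migration step newer than the version stored on disk.
fn upgrade_database(stored: (u32, u32, u32)) -> UpgradeResult {
    // Each step is labeled with the version it upgrades the database *to*.
    let steps: &[((u32, u32, u32), fn() -> UpgradeResult)] = &[
        ((1, 12, 0), migrate_to_v1_12),
        ((1, 13, 0), migrate_to_v1_13),
    ];
    for (target, step) in steps {
        // Tuples compare lexicographically, which matches vX.Y.Z ordering.
        if *target > stored {
            step()?;
        }
    }
    Ok(())
}

// Hypothetical migration bodies; real ones would rewrite the on-disk format.
fn migrate_to_v1_12() -> UpgradeResult { Ok(()) }
fn migrate_to_v1_13() -> UpgradeResult { Ok(()) }
```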

View File

@ -1,28 +0,0 @@
name: Check db change labels
on:
pull_request:
types: [opened, synchronize, reopened, labeled, unlabeled]
jobs:
check-labels:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Check db change labels
id: check_labels
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
URL=/repos/meilisearch/meilisearch/pulls/${{ github.event.pull_request.number }}/labels
echo ${{ github.event.pull_request.number }}
echo $URL
LABELS=$(gh api -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" /repos/${{ github.repository }}/issues/${{ github.event.pull_request.number }}/labels -q .[].name)
echo "Labels: $LABELS"
if [[ ! "$LABELS" =~ "db change" && ! "$LABELS" =~ "no db change" ]]; then
echo "::error::Pull request must contain either the 'db change' or 'no db change' label."
exit 1
else
echo "The label is set"
fi

View File

@@ -9,22 +9,22 @@ jobs:
   flaky:
     runs-on: ubuntu-latest
     container:
-      # Use ubuntu-22.04 to compile with glibc 2.35
-      image: ubuntu:22.04
+      # Use ubuntu-20.04 to compile with glibc 2.28
+      image: ubuntu:20.04
     steps:
       - uses: actions/checkout@v3
       - name: Install needed dependencies
         run: |
           apt-get update && apt-get install -y curl
           apt-get install build-essential -y
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
       - name: Install cargo-flaky
         run: cargo install cargo-flaky
       - name: Run cargo flaky in the dumps
         run: cd crates/dump; cargo flaky -i 100 --release
       - name: Run cargo flaky in the index-scheduler
         run: cd crates/index-scheduler; cargo flaky -i 100 --release
       - name: Run cargo flaky in the auth
         run: cd crates/meilisearch-auth; cargo flaky -i 100 --release
       - name: Run cargo flaky in meilisearch
         run: cd crates/meilisearch; cargo flaky -i 100 --release

View File

@@ -12,7 +12,7 @@ jobs:
     timeout-minutes: 4320 # 72h
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
        with:
          profile: minimal

View File

@@ -5,7 +5,6 @@ name: Milestone's workflow
 # For each Milestone created (not opened!), and if the release is NOT a patch release (only the patch changed)
 # - the roadmap issue is created, see https://github.com/meilisearch/engine-team/blob/main/issue-templates/roadmap-issue.md
 # - the changelog issue is created, see https://github.com/meilisearch/engine-team/blob/main/issue-templates/changelog-issue.md
-# - update the ruleset to add the current release version to the list of allowed versions and be able to use the merge queue.
 # For each Milestone closed
 # - the `release_version` label is created
@@ -22,9 +21,10 @@ env:
   GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
 jobs:
 # -----------------
 # MILESTONE CREATED
 # -----------------
   get-release-version:
     if: github.event.action == 'created'
@@ -148,41 +148,9 @@
           --body-file $ISSUE_TEMPLATE \
           --milestone $MILESTONE_VERSION
-  update-ruleset:
-    runs-on: ubuntu-latest
-    if: github.event.action == 'created'
-    steps:
-      - uses: actions/checkout@v3
-      - name: Install jq
-        run: |
-          sudo apt-get update
-          sudo apt-get install -y jq
-      - name: Update ruleset
-        env:
-          # gh api repos/meilisearch/meilisearch/rulesets --jq '.[] | {name: .name, id: .id}'
-          RULESET_ID: 4253297
-          BRANCH_NAME: ${{ github.event.inputs.branch_name }}
-        run: |
-          echo "RULESET_ID: ${{ env.RULESET_ID }}"
-          echo "BRANCH_NAME: ${{ env.BRANCH_NAME }}"
-          # Get current ruleset conditions
-          CONDITIONS=$(gh api repos/meilisearch/meilisearch/rulesets/${{ env.RULESET_ID }} --jq '{ conditions: .conditions }')
-          # Update the conditions by appending the milestone version
-          UPDATED_CONDITIONS=$(echo $CONDITIONS | jq '.conditions.ref_name.include += ["refs/heads/release-'${{ env.MILESTONE_VERSION }}'"]')
-          # Update the ruleset from stdin (-)
-          echo $UPDATED_CONDITIONS |
-            gh api repos/meilisearch/meilisearch/rulesets/${{ env.RULESET_ID }} \
-              --method PUT \
-              -H "Accept: application/vnd.github+json" \
-              -H "X-GitHub-Api-Version: 2022-11-28" \
-              --input -
 # ----------------
 # MILESTONE CLOSED
 # ----------------
   create-release-label:
     if: github.event.action == 'closed'

View File

@@ -18,28 +18,28 @@ jobs:
     runs-on: ubuntu-latest
     needs: check-version
     container:
-      # Use ubuntu-22.04 to compile with glibc 2.35
-      image: ubuntu:22.04
+      # Use ubuntu-20.04 to compile with glibc 2.28
+      image: ubuntu:20.04
     steps:
       - name: Install needed dependencies
         run: |
           apt-get update && apt-get install -y curl
           apt-get install build-essential -y
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
       - name: Install cargo-deb
         run: cargo install cargo-deb
       - uses: actions/checkout@v3
       - name: Build deb package
         run: cargo deb -p meilisearch -o target/debian/meilisearch.deb
       - name: Upload debian pkg to release
-        uses: svenstaro/upload-release-action@2.11.1
+        uses: svenstaro/upload-release-action@2.7.0
         with:
           repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
           file: target/debian/meilisearch.deb
           asset_name: meilisearch.deb
           tag: ${{ github.ref }}
       - name: Upload debian pkg to apt repository
         run: curl -F package=@target/debian/meilisearch.deb https://${{ secrets.GEMFURY_PUSH_TOKEN }}@push.fury.io/meilisearch/
   homebrew:
     name: Bump Homebrew formula

View File

@@ -3,7 +3,7 @@ name: Publish binaries to GitHub release
 on:
   workflow_dispatch:
   schedule:
-    - cron: "0 2 * * *" # Every day at 2:00am
+    - cron: '0 2 * * *' # Every day at 2:00am
   release:
     types: [published]
@@ -37,26 +37,26 @@
     runs-on: ubuntu-latest
     needs: check-version
     container:
-      # Use ubuntu-22.04 to compile with glibc 2.35
-      image: ubuntu:22.04
+      # Use ubuntu-20.04 to compile with glibc 2.28
+      image: ubuntu:20.04
     steps:
       - uses: actions/checkout@v3
       - name: Install needed dependencies
         run: |
           apt-get update && apt-get install -y curl
           apt-get install build-essential -y
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
       - name: Build
         run: cargo build --release --locked
       # No need to upload binaries for dry run (cron)
       - name: Upload binaries to release
         if: github.event_name == 'release'
-        uses: svenstaro/upload-release-action@2.11.1
+        uses: svenstaro/upload-release-action@2.7.0
         with:
           repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
           file: target/release/meilisearch
           asset_name: meilisearch-linux-amd64
           tag: ${{ github.ref }}
   publish-macos-windows:
     name: Publish binary for ${{ matrix.os }}
@@ -74,19 +74,19 @@
           artifact_name: meilisearch.exe
           asset_name: meilisearch-windows-amd64.exe
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
       - name: Build
         run: cargo build --release --locked
       # No need to upload binaries for dry run (cron)
       - name: Upload binaries to release
         if: github.event_name == 'release'
-        uses: svenstaro/upload-release-action@2.11.1
+        uses: svenstaro/upload-release-action@2.7.0
         with:
           repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
           file: target/release/${{ matrix.artifact_name }}
           asset_name: ${{ matrix.asset_name }}
           tag: ${{ github.ref }}
   publish-macos-apple-silicon:
     name: Publish binary for macOS silicon
@@ -101,7 +101,7 @@
       - name: Checkout repository
         uses: actions/checkout@v3
       - name: Installing Rust toolchain
-        uses: dtolnay/rust-toolchain@1.85
+        uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
           target: ${{ matrix.target }}
@@ -113,7 +113,7 @@
       - name: Upload the binary to release
         # No need to upload binaries for dry run (cron)
         if: github.event_name == 'release'
-        uses: svenstaro/upload-release-action@2.11.1
+        uses: svenstaro/upload-release-action@2.7.0
         with:
           repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
           file: target/${{ matrix.target }}/release/meilisearch
@@ -127,8 +127,8 @@
     env:
       DEBIAN_FRONTEND: noninteractive
     container:
-      # Use ubuntu-22.04 to compile with glibc 2.35
-      image: ubuntu:22.04
+      # Use ubuntu-20.04 to compile with glibc 2.28
+      image: ubuntu:20.04
     strategy:
       matrix:
         include:
@@ -148,7 +148,7 @@
           add-apt-repository "deb [arch=$(dpkg --print-architecture)] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
           apt-get update -y && apt-get install -y docker-ce
       - name: Installing Rust toolchain
-        uses: dtolnay/rust-toolchain@1.85
+        uses: dtolnay/rust-toolchain@1.81
         with:
           profile: minimal
           target: ${{ matrix.target }}
@@ -178,7 +178,7 @@
       - name: Upload the binary to release
         # No need to upload binaries for dry run (cron)
         if: github.event_name == 'release'
-        uses: svenstaro/upload-release-action@2.11.1
+        uses: svenstaro/upload-release-action@2.7.0
         with:
           repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
           file: target/${{ matrix.target }}/release/meilisearch

View File

@@ -104,22 +104,3 @@ jobs:
         repository: meilisearch/meilisearch-cloud
         event-type: cloud-docker-build
         client-payload: '{ "meilisearch_version": "${{ github.ref_name }}", "stable": "${{ steps.check-tag-format.outputs.stable }}" }'
-
-  # Send notification to Swarmia to notify of a deployment: https://app.swarmia.com
-  # - name: 'Setup jq'
-  #   uses: dcarbone/install-jq-action
-  # - name: Send deployment to Swarmia
-  #   if: github.event_name == 'push' && success()
-  #   run: |
-  #     JSON_STRING=$( jq --null-input --compact-output \
-  #       --arg version "${{ github.ref_name }}" \
-  #       --arg appName "meilisearch" \
-  #       --arg environment "production" \
-  #       --arg commitSha "${{ github.sha }}" \
-  #       --arg repositoryFullName "${{ github.repository }}" \
-  #       '{"version": $version, "appName": $appName, "environment": $environment, "commitSha": $commitSha, "repositoryFullName": $repositoryFullName}' )
-  #     curl -H "Authorization: ${{ secrets.SWARMIA_DEPLOYMENTS_AUTHORIZATION }}" \
-  #       -H "Content-Type: application/json" \
-  #       -d "$JSON_STRING" \
-  #       https://hook.swarmia.com/deployments

View File

@@ -22,7 +22,7 @@ jobs:
     outputs:
       docker-image: ${{ steps.define-image.outputs.docker-image }}
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
       - name: Define the Docker image we need to use
         id: define-image
         run: |
@@ -46,13 +46,13 @@
       MEILISEARCH_VERSION: ${{ needs.define-docker-image.outputs.docker-image }}
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-dotnet
       - name: Setup .NET Core
         uses: actions/setup-dotnet@v4
         with:
-          dotnet-version: "8.0.x"
+          dotnet-version: "6.0.x"
       - name: Install dependencies
         run: dotnet restore
       - name: Build
@@ -75,7 +75,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-dart
       - uses: dart-lang/setup-dart@v1
@@ -103,7 +103,7 @@
         uses: actions/setup-go@v5
         with:
           go-version: stable
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-go
       - name: Get dependencies
@@ -129,7 +129,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-java
       - name: Set up Java
@@ -156,7 +156,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-js
       - name: Setup node
@@ -191,7 +191,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-php
       - name: Install PHP
@@ -220,7 +220,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-python
       - name: Set up Python
@@ -245,7 +245,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-ruby
       - name: Set up Ruby 3
@@ -270,7 +270,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-rust
       - name: Build
@@ -291,7 +291,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-swift
       - name: Run tests
@@ -314,7 +314,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-js-plugins
       - name: Setup node
@@ -345,7 +345,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-rails
       - name: Set up Ruby 3
@@ -369,7 +369,7 @@
     ports:
       - '7700:7700'
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
        with:
          repository: meilisearch/meilisearch-symfony
       - name: Install PHP

View File

@@ -4,9 +4,13 @@ on:
   workflow_dispatch:
   schedule:
     # Everyday at 5:00am
-    - cron: "0 5 * * *"
+    - cron: '0 5 * * *'
   pull_request:
-  merge_group:
+  push:
+    # trying and staging branches are for Bors config
+    branches:
+      - trying
+      - staging
 env:
   CARGO_TERM_COLOR: always
@@ -15,11 +19,11 @@ jobs:
   test-linux:
-    name: Tests on ubuntu-22.04
+    name: Tests on ubuntu-20.04
     runs-on: ubuntu-latest
     container:
-      # Use ubuntu-22.04 to compile with glibc 2.35
-      image: ubuntu:22.04
+      # Use ubuntu-20.04 to compile with glibc 2.28
+      image: ubuntu:20.04
     steps:
       - uses: actions/checkout@v3
       - name: Install needed dependencies
@@ -27,9 +31,9 @@
           apt-get update && apt-get install -y curl
           apt-get install build-essential -y
       - name: Setup test with Rust stable
-        uses: dtolnay/rust-toolchain@1.85
+        uses: dtolnay/rust-toolchain@1.81
       - name: Cache dependencies
-        uses: Swatinem/rust-cache@v2.8.0
+        uses: Swatinem/rust-cache@v2.7.7
       - name: Run cargo check without any default features
         uses: actions-rs/cargo@v1
         with:
@@ -51,8 +55,8 @@
     steps:
       - uses: actions/checkout@v3
       - name: Cache dependencies
-        uses: Swatinem/rust-cache@v2.8.0
+        uses: Swatinem/rust-cache@v2.7.7
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
       - name: Run cargo check without any default features
         uses: actions-rs/cargo@v1
         with:
@@ -68,8 +72,8 @@
     name: Tests almost all features
     runs-on: ubuntu-latest
     container:
-      # Use ubuntu-22.04 to compile with glibc 2.35
-      image: ubuntu:22.04
+      # Use ubuntu-20.04 to compile with glibc 2.28
+      image: ubuntu:20.04
     if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
     steps:
       - uses: actions/checkout@v3
@@ -77,51 +81,19 @@
         run: |
           apt-get update
           apt-get install --assume-yes build-essential curl
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
       - name: Run cargo build with almost all features
         run: |
-          cargo build --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda,test-ollama)"
+          cargo build --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda)"
       - name: Run cargo test with almost all features
         run: |
-          cargo test --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda,test-ollama)"
+          cargo test --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda)"
-  ollama-ubuntu:
-    name: Test with Ollama
-    runs-on: ubuntu-latest
-    env:
-      MEILI_TEST_OLLAMA_SERVER: "http://localhost:11434"
-    steps:
-      - uses: actions/checkout@v3
-      - name: Install Ollama
-        run: |
-          curl -fsSL https://ollama.com/install.sh | sudo -E sh
-      - name: Start serving
-        run: |
-          # Run it in the background, there is no way to daemonise at the moment
-          ollama serve &
-          # A short pause is required before the HTTP port is opened
-          sleep 5
-          # This endpoint blocks until ready
-          time curl -i http://localhost:11434
-      - name: Pull nomic-embed-text & all-minilm
-        run: |
-          ollama pull nomic-embed-text
-          ollama pull all-minilm
-      - name: Run cargo test
-        uses: actions-rs/cargo@v1
-        with:
-          command: test
-          args: --locked --release --all --features test-ollama ollama
   test-disabled-tokenization:
     name: Test disabled tokenization
     runs-on: ubuntu-latest
     container:
-      image: ubuntu:22.04
+      image: ubuntu:20.04
     if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
     steps:
       - uses: actions/checkout@v3
@@ -129,7 +101,7 @@
         run: |
           apt-get update
           apt-get install --assume-yes build-essential curl
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
       - name: Run cargo tree without default features and check lindera is not present
         run: |
           if cargo tree -f '{p} {f}' -e normal --no-default-features | grep -qz lindera; then
@@ -145,17 +117,17 @@
     name: Run tests in debug
     runs-on: ubuntu-latest
     container:
-      # Use ubuntu-22.04 to compile with glibc 2.35
-      image: ubuntu:22.04
+      # Use ubuntu-20.04 to compile with glibc 2.28
+      image: ubuntu:20.04
     steps:
       - uses: actions/checkout@v3
       - name: Install needed dependencies
         run: |
           apt-get update && apt-get install -y curl
           apt-get install build-essential -y
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
       - name: Cache dependencies
-        uses: Swatinem/rust-cache@v2.8.0
+        uses: Swatinem/rust-cache@v2.7.7
       - name: Run tests in debug
         uses: actions-rs/cargo@v1
         with:
@@ -167,12 +139,12 @@
     runs-on: ubuntu-latest
     steps:
      - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
        with:
          profile: minimal
          components: clippy
      - name: Cache dependencies
-        uses: Swatinem/rust-cache@v2.8.0
+        uses: Swatinem/rust-cache@v2.7.7
      - name: Run cargo clippy
        uses: actions-rs/cargo@v1
        with:
@@ -184,14 +156,14 @@
     runs-on: ubuntu-latest
     steps:
      - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
        with:
          profile: minimal
          toolchain: nightly-2024-07-09
          override: true
          components: rustfmt
      - name: Cache dependencies
-        uses: Swatinem/rust-cache@v2.8.0
+        uses: Swatinem/rust-cache@v2.7.7
      - name: Run cargo fmt
        # Since we never ran the `build.rs` script in the benchmark directory we are missing one auto-generated import file.
        # Since we want to trigger (and fail) this action as fast as possible, instead of building the benchmark crate

View File

@@ -4,7 +4,7 @@ on:
   workflow_dispatch:
     inputs:
       new_version:
-        description: "The new version (vX.Y.Z)"
+        description: 'The new version (vX.Y.Z)'
         required: true
 env:
@@ -18,7 +18,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v3
-      - uses: dtolnay/rust-toolchain@1.85
+      - uses: dtolnay/rust-toolchain@1.81
        with:
          profile: minimal
      - name: Install sd

.gitignore (vendored, 11 lines changed)
View File

@@ -11,21 +11,12 @@
 /bench
 /_xtask_benchmark.ms
 /benchmarks
-.DS_Store
 # Snapshots
 ## ... large
 *.full.snap
 ## ... unreviewed
 *.snap.new
-## ... pending
-*.pending-snap
-# Tmp files
-.tmp*
-# Database snapshot
-crates/meilisearch/db.snapshot
 # Fuzzcheck data for the facet indexing fuzz test
 crates/milli/fuzz/update::facet::incremental::fuzz::fuzz/

View File

@@ -57,17 +57,9 @@ This command will be triggered to each PR as a requirement for merging it.
 You can set the `LINDERA_CACHE` environment variable to speed up your successive builds by up to 2 minutes.
 It'll store some built artifacts in the directory of your choice.
-We recommend using the `$HOME/.cache/meili/lindera` directory:
+We recommend using the standard `$HOME/.cache/lindera` directory:
 ```sh
-export LINDERA_CACHE=$HOME/.cache/meili/lindera
+export LINDERA_CACHE=$HOME/.cache/lindera
-```
-
-You can set the `MILLI_BENCH_DATASETS_PATH` environment variable to further speed up your builds.
-It'll store some big files used for the benchmarks in the directory of your choice.
-We recommend using the `$HOME/.cache/meili/benches` directory:
-
-```sh
-export MILLI_BENCH_DATASETS_PATH=$HOME/.cache/meili/benches
 ```
 Furthermore, you can improve incremental compilation by setting the `MEILI_NO_VERGEN` environment variable.
@@ -103,11 +95,6 @@ Meilisearch follows the [cargo xtask](https://github.com/matklad/cargo-xtask) workflow
 Run `cargo xtask --help` from the root of the repository to find out what is available.
-#### Update the openAPI file if the API changed
-
-To update the openAPI file in the code, see [sprint_issue.md](https://github.com/meilisearch/meilisearch/blob/main/.github/ISSUE_TEMPLATE/sprint_issue.md#reminders-when-modifying-the-api).
-If you want to update the openAPI file on the [open-api repository](https://github.com/meilisearch/open-api), see [update-openapi-issue.md](https://github.com/meilisearch/engine-team/blob/main/issue-templates/update-openapi-issue.md).
 ### Logging
 Meilisearch uses [`tracing`](https://lib.rs/crates/tracing) for logging purposes. Tracing logs are structured and can be displayed as JSON to the end user, so prefer passing arguments as fields rather than interpolating them in the message.
@@ -158,7 +145,7 @@ Some notes on GitHub PRs:
 - The PR title should be accurate and descriptive of the changes.
 - [Convert your PR as a draft](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/changing-the-stage-of-a-pull-request) if your changes are a work in progress: no one will review it until you pass your PR as ready for review.<br>
   The draft PRs are recommended when you want to show that you are working on something and make your work visible.
-- The branch related to the PR must be **up-to-date with `main`** before merging. Fortunately, this project uses [GitHub Merge Queues](https://github.blog/news-insights/product-news/github-merge-queue-is-generally-available/) to automatically enforce this requirement without the PR author having to rebase manually.
+- The branch related to the PR must be **up-to-date with `main`** before merging. Fortunately, this project uses [Bors](https://github.com/bors-ng/bors-ng) to automatically enforce this requirement without the PR author having to rebase manually.
@@ -166,7 +153,8 @@ Meilisearch tools follow the [Semantic Versioning Convention](https://semver.org/).
 ### Automation to rebase and Merge the PRs
-This project uses GitHub Merge Queues, which help us manage pull request merging.
+This project integrates a bot that helps us manage pull request merging.<br>
+_[Read more about this](https://github.com/meilisearch/integration-guides/blob/main/resources/bors.md)._
 ### How to Publish a new Release

Cargo.lock (generated, 3531 lines changed)

File diff suppressed because it is too large.

View File

@@ -22,7 +22,7 @@ members = [
 ]
 [workspace.package]
-version = "1.16.0"
+version = "1.12.2"
 authors = [
   "Quentin de Quelen <quentin@dequelen.me>",
   "Clément Renault <clement@meilisearch.com>",
@@ -36,12 +36,6 @@ license = "MIT"
 [profile.release]
 codegen-units = 1
-
-# We now compile heed without the NDEBUG define for better performance.
-# However, we still enable debug assertions for a better detection of
-# disk corruption on the cloud or in OSS.
-[profile.release.package.heed]
-debug-assertions = true
 [profile.dev.package.flate2]
 opt-level = 3
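The removed `[profile.release.package.heed]` override is worth a note: `debug-assertions = true` keeps `debug_assert!` checks alive in release builds of the heed package, which is how the comment's "detection of disk corruption" works. A hedged sketch of the kind of check this enables; the function below is hypothetical, not heed's actual code:

```rust
// Compiled out of an ordinary release build, but kept by the profile override.
fn validate_page(page: &[u8], expected_checksum: u32) {
    debug_assert_eq!(
        simple_checksum(page),
        expected_checksum,
        "page checksum mismatch: possible disk corruption"
    );
}

fn simple_checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32))
}
```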

View File

@@ -1,5 +1,5 @@
 # Compile
-FROM rust:1.85-alpine3.20 AS compiler
+FROM rust:1.81.0-alpine3.20 AS compiler
 RUN apk add -q --no-cache build-base openssl-dev

View File

@ -20,7 +20,7 @@
<p align="center"> <p align="center">
<a href="https://deps.rs/repo/github/meilisearch/meilisearch"><img src="https://deps.rs/repo/github/meilisearch/meilisearch/status.svg" alt="Dependency status"></a> <a href="https://deps.rs/repo/github/meilisearch/meilisearch"><img src="https://deps.rs/repo/github/meilisearch/meilisearch/status.svg" alt="Dependency status"></a>
<a href="https://github.com/meilisearch/meilisearch/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-informational" alt="License"></a> <a href="https://github.com/meilisearch/meilisearch/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-informational" alt="License"></a>
<a href="https://github.com/meilisearch/meilisearch/queue"><img alt="Merge Queues enabled" src="https://img.shields.io/badge/Merge_Queues-enabled-%2357cf60?logo=github"></a> <a href="https://ms-bors.herokuapp.com/repositories/52"><img src="https://bors.tech/images/badge_small.svg" alt="Bors enabled"></a>
</p> </p>
<p align="center">⚡ A lightning-fast search engine that fits effortlessly into your apps, websites, and workflow 🔍</p> <p align="center">⚡ A lightning-fast search engine that fits effortlessly into your apps, websites, and workflow 🔍</p>
@ -41,7 +41,7 @@
- [**Movies**](https://where2watch.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=organization) — An application to help you find streaming platforms to watch movies using [hybrid search](https://www.meilisearch.com/solutions/hybrid-search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos). - [**Movies**](https://where2watch.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=organization) — An application to help you find streaming platforms to watch movies using [hybrid search](https://www.meilisearch.com/solutions/hybrid-search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos).
- [**Ecommerce**](https://ecommerce.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Ecommerce website using disjunctive [facets](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos), range and rating filtering, and pagination. - [**Ecommerce**](https://ecommerce.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Ecommerce website using disjunctive [facets](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos), range and rating filtering, and pagination.
- [**Songs**](https://music.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search through 47 million songs. - [**Songs**](https://music.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search through 47 million songs.
- [**SaaS**](https://saas.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search for contacts, deals, and companies in this [multi-tenant](https://www.meilisearch.com/docs/learn/security/multitenancy_tenant_tokens?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) CRM application. - [**SaaS**](https://saas.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search for contacts, deals, and companies in this [multi-tenant](https://www.meilisearch.com/docs/learn/security/multitenancy_tenant_tokens?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) CRM application.
See the list of all our example apps in our [demos repository](https://github.com/meilisearch/demos). See the list of all our example apps in our [demos repository](https://github.com/meilisearch/demos).
@ -99,7 +99,7 @@ If you want to know more about the kind of data we collect and what we use it fo
## đź“« Get in touch! ## đź“« Get in touch!
Meilisearch is a search engine created by [Meili](https://www.meilisearch.com/careers), a software development company headquartered in France and with team members all over the world. Want to know more about us? [Check out our blog!](https://blog.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=contact) Meilisearch is a search engine created by [Meili]([https://www.welcometothejungle.com/en/companies/meilisearch](https://www.meilisearch.com/careers)), a software development company headquartered in France and with team members all over the world. Want to know more about us? [Check out our blog!](https://blog.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=contact)
đź—ž [Subscribe to our newsletter](https://meilisearch.us2.list-manage.com/subscribe?u=27870f7b71c908a8b359599fb&id=79582d828e) if you don't want to miss any updates! We promise we won't clutter your mailbox: we only send one edition every two months. đź—ž [Subscribe to our newsletter](https://meilisearch.us2.list-manage.com/subscribe?u=27870f7b71c908a8b359599fb&id=79582d828e) if you don't want to miss any updates! We promise we won't clutter your mailbox: we only send one edition every two months.


@ -1501,300 +1501,6 @@
"title": "Task queue latency", "title": "Task queue latency",
"type": "timeseries" "type": "timeseries"
}, },
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 15,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 11,
"w": 12,
"x": 12,
"y": 51
},
"id": 29,
"interval": "5s",
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "right",
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "8.1.4",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "builder",
"exemplar": true,
"expr": "meilisearch_task_queue_used_size{instance=\"$instance\", job=\"$job\"}",
"interval": "",
"legendFormat": "{{value}} ",
"range": true,
"refId": "A"
}
],
"title": "Task queue used size in bytes",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 15,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 11,
"w": 12,
"x": 12,
"y": 51
},
"id": 29,
"interval": "5s",
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "right",
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "8.1.4",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "builder",
"exemplar": true,
"expr": "meilisearch_task_queue_size_until_stop_registering{instance=\"$instance\", job=\"$job\"}",
"interval": "",
"legendFormat": "{{value}} ",
"range": true,
"refId": "A"
}
],
"title": "Task queue available size until it stop receiving tasks.",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 15,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 11,
"w": 12,
"x": 12,
"y": 51
},
"id": 29,
"interval": "5s",
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "right",
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "8.1.4",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "builder",
"exemplar": true,
"expr": "meilisearch_task_queue_max_size{instance=\"$instance\", job=\"$job\"}",
"interval": "",
"legendFormat": "{{value}} ",
"range": true,
"refId": "A"
}
],
"title": "Task queue maximum possible size",
"type": "stat"
},
{ {
"collapsed": true, "collapsed": true,
"datasource": { "datasource": {

Binary file not shown (image; before: 578 KiB).

bors.toml (new file, 11 lines)

@ -0,0 +1,11 @@
status = [
'Tests on ubuntu-20.04',
'Tests on macos-13',
'Tests on windows-2022',
'Run Clippy',
'Run Rustfmt',
'Run tests in debug',
]
pr_status = ['Milestone Check']
# 3 hours timeout
timeout-sec = 10800


@ -11,27 +11,27 @@ edition.workspace = true
license.workspace = true license.workspace = true
[dependencies] [dependencies]
anyhow = "1.0.98" anyhow = "1.0.95"
bumpalo = "3.18.1" bumpalo = "3.16.0"
csv = "1.3.1" csv = "1.3.1"
memmap2 = "0.9.5" memmap2 = "0.9.5"
milli = { path = "../milli" } milli = { path = "../milli" }
mimalloc = { version = "0.1.47", default-features = false } mimalloc = { version = "0.1.43", default-features = false }
serde_json = { version = "1.0.140", features = ["preserve_order"] } serde_json = { version = "1.0.135", features = ["preserve_order"] }
tempfile = "3.20.0" tempfile = "3.15.0"
[dev-dependencies] [dev-dependencies]
criterion = { version = "0.6.0", features = ["html_reports"] } criterion = { version = "0.5.1", features = ["html_reports"] }
rand = "0.8.5" rand = "0.8.5"
rand_chacha = "0.3.1" rand_chacha = "0.3.1"
roaring = "0.10.12" roaring = "0.10.10"
[build-dependencies] [build-dependencies]
anyhow = "1.0.98" anyhow = "1.0.95"
bytes = "1.10.1" bytes = "1.9.0"
convert_case = "0.8.0" convert_case = "0.6.0"
flate2 = "1.1.2" flate2 = "1.0.35"
reqwest = { version = "0.12.20", features = ["blocking", "rustls-tls"], default-features = false } reqwest = { version = "0.12.12", features = ["blocking", "rustls-tls"], default-features = false }
[features] [features]
default = ["milli/all-tokenizations"] default = ["milli/all-tokenizations"]
@ -51,8 +51,3 @@ harness = false
[[bench]] [[bench]]
name = "indexing" name = "indexing"
harness = false harness = false
[[bench]]
name = "sort"
harness = false


@ -10,9 +10,9 @@ use milli::documents::PrimaryKey;
use milli::heed::{EnvOpenOptions, RwTxn}; use milli::heed::{EnvOpenOptions, RwTxn};
use milli::progress::Progress; use milli::progress::Progress;
use milli::update::new::indexer; use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings}; use milli::update::{IndexDocumentsMethod, IndexerConfig, Settings};
use milli::vector::RuntimeEmbedders; use milli::vector::EmbeddingConfigs;
use milli::{FilterableAttributesRule, Index}; use milli::Index;
use rand::seq::SliceRandom; use rand::seq::SliceRandom;
use rand_chacha::rand_core::SeedableRng; use rand_chacha::rand_core::SeedableRng;
use roaring::RoaringBitmap; use roaring::RoaringBitmap;
@ -35,11 +35,10 @@ fn setup_dir(path: impl AsRef<Path>) {
fn setup_index() -> Index { fn setup_index() -> Index {
let path = "benches.mmdb"; let path = "benches.mmdb";
setup_dir(path); setup_dir(path);
let options = EnvOpenOptions::new(); let mut options = EnvOpenOptions::new();
let mut options = options.read_txn_without_tls();
options.map_size(100 * 1024 * 1024 * 1024); // 100 GB options.map_size(100 * 1024 * 1024 * 1024); // 100 GB
options.max_readers(100); options.max_readers(100);
Index::new(options, path, true).unwrap() Index::new(options, path).unwrap()
} }
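The hunk above shows the heed/milli environment API change in one place; here is a minimal sketch of the left-column variant, assuming exactly the calls shown (`read_txn_without_tls`, and `Index::new` with an explicit creation flag):

```rust
use milli::heed::EnvOpenOptions;
use milli::Index;

// Left-column API: the options are converted so read transactions are not
// tied to thread-local storage, then the index is opened with `creation: true`.
fn open_bench_index(path: &str) -> Index {
    let options = EnvOpenOptions::new();
    let mut options = options.read_txn_without_tls();
    options.map_size(100 * 1024 * 1024 * 1024); // 100 GB
    options.max_readers(100);
    Index::new(options, path, true).unwrap()
}
```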
fn setup_settings<'t>( fn setup_settings<'t>(
@ -58,14 +57,13 @@ fn setup_settings<'t>(
let searchable_fields = searchable_fields.iter().map(|s| s.to_string()).collect(); let searchable_fields = searchable_fields.iter().map(|s| s.to_string()).collect();
builder.set_searchable_fields(searchable_fields); builder.set_searchable_fields(searchable_fields);
let filterable_fields = let filterable_fields = filterable_fields.iter().map(|s| s.to_string()).collect();
filterable_fields.iter().map(|s| FilterableAttributesRule::Field(s.to_string())).collect();
builder.set_filterable_fields(filterable_fields); builder.set_filterable_fields(filterable_fields);
let sortable_fields = sortable_fields.iter().map(|s| s.to_string()).collect(); let sortable_fields = sortable_fields.iter().map(|s| s.to_string()).collect();
builder.set_sortable_fields(sortable_fields); builder.set_sortable_fields(sortable_fields);
builder.execute(&|| false, &Progress::default(), Default::default()).unwrap(); builder.execute(|_| (), || false).unwrap();
} }
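Two API changes meet in this helper: filterable fields are now wrapped in `FilterableAttributesRule::Field` instead of being plain strings, and `Settings::execute` takes a cancellation closure, a `Progress`, and a default-able third argument rather than the old callback pair. A sketch of the left-column shape, assuming the signatures exactly as they appear in the hunk:

```rust
use milli::update::Settings;
use milli::FilterableAttributesRule;

// Left-column shape: field names become rule values before being handed
// to the settings builder.
fn set_filterable(builder: &mut Settings<'_, '_, '_>, fields: &[&str]) {
    let rules: Vec<FilterableAttributesRule> = fields
        .iter()
        .map(|s| FilterableAttributesRule::Field(s.to_string()))
        .collect();
    builder.set_filterable_fields(rules);
}
```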
fn setup_index_with_settings( fn setup_index_with_settings(
@ -140,9 +138,10 @@ fn indexing_songs_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -166,10 +165,9 @@ fn indexing_songs_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
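This pattern repeats in every benchmark below, so it is worth spelling out once: `DocumentOperation::new` loses its `IndexDocumentsMethod` argument, `add_documents` is renamed to `replace_documents`, the embedders argument changes type from `EmbeddingConfigs` to `RuntimeEmbedders`, and the `indexer::index` call gains a trailing `&Default::default()`. A sketch of the left-column construction, assuming the API as shown in the hunk:

```rust
use memmap2::Mmap;
use milli::update::new::indexer;
use milli::vector::RuntimeEmbedders;

// Left-column shape: no indexing-method argument; the mmap'd payload is
// registered as a replacement operation.
fn build_replace_op(documents: &Mmap) -> indexer::DocumentOperation<'_> {
    let mut op = indexer::DocumentOperation::new();
    op.replace_documents(documents).unwrap();
    op
}

// The embedders value later passed to `indexer::index` (replaces the old
// `EmbeddingConfigs::default()`).
fn default_embedders() -> RuntimeEmbedders {
    RuntimeEmbedders::default()
}
```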
@ -207,9 +205,10 @@ fn reindexing_songs_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -233,10 +232,9 @@ fn reindexing_songs_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -252,9 +250,10 @@ fn reindexing_songs_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -278,10 +277,9 @@ fn reindexing_songs_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -321,9 +319,10 @@ fn deleting_songs_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -347,10 +346,9 @@ fn deleting_songs_in_batches_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -398,9 +396,10 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS_1_2, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_SONGS_1_2, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -424,10 +423,9 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -443,9 +441,10 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS_3_4, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_SONGS_3_4, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -469,10 +468,9 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -484,9 +482,10 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS_4_4, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_SONGS_4_4, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -510,10 +509,9 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -551,10 +549,11 @@ fn indexing_songs_without_faceted_numbers(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -578,10 +577,9 @@ fn indexing_songs_without_faceted_numbers(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -619,9 +617,10 @@ fn indexing_songs_without_faceted_fields(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -645,10 +644,9 @@ fn indexing_songs_without_faceted_fields(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -686,9 +684,10 @@ fn indexing_wiki(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -712,10 +711,9 @@ fn indexing_wiki(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -752,9 +750,10 @@ fn reindexing_wiki(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -778,10 +777,9 @@ fn reindexing_wiki(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -797,9 +795,10 @@ fn reindexing_wiki(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -823,10 +822,9 @@ fn reindexing_wiki(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -865,9 +863,10 @@ fn deleting_wiki_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv"); let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -891,10 +890,9 @@ fn deleting_wiki_in_batches_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -941,10 +939,11 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = let documents =
utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES_1_2, "csv"); utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES_1_2, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -968,10 +967,9 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -987,10 +985,11 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = let documents =
utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES_3_4, "csv"); utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES_3_4, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1014,10 +1013,9 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1029,10 +1027,11 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = let documents =
utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES_4_4, "csv"); utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES_4_4, "csv");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1056,10 +1055,9 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1097,9 +1095,10 @@ fn indexing_movies_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES, "json"); let documents = utils::documents_from(datasets_paths::MOVIES, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1123,10 +1122,9 @@ fn indexing_movies_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1163,9 +1161,10 @@ fn reindexing_movies_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES, "json"); let documents = utils::documents_from(datasets_paths::MOVIES, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1189,10 +1188,9 @@ fn reindexing_movies_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1208,9 +1206,10 @@ fn reindexing_movies_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES, "json"); let documents = utils::documents_from(datasets_paths::MOVIES, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1234,10 +1233,9 @@ fn reindexing_movies_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1276,9 +1274,10 @@ fn deleting_movies_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES, "json"); let documents = utils::documents_from(datasets_paths::MOVIES, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1302,10 +1301,9 @@ fn deleting_movies_in_batches_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1351,10 +1349,9 @@ fn delete_documents_from_ids(index: Index, document_ids_to_delete: Vec<RoaringBi
new_fields_ids_map, new_fields_ids_map,
Some(primary_key), Some(primary_key),
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1390,9 +1387,10 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES_1_2, "json"); let documents = utils::documents_from(datasets_paths::MOVIES_1_2, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1416,10 +1414,9 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1435,9 +1432,10 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES_3_4, "json"); let documents = utils::documents_from(datasets_paths::MOVIES_3_4, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1461,10 +1459,9 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1476,9 +1473,10 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES_4_4, "json"); let documents = utils::documents_from(datasets_paths::MOVIES_4_4, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1502,10 +1500,9 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1566,9 +1563,10 @@ fn indexing_nested_movies_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::NESTED_MOVIES, "json"); let documents = utils::documents_from(datasets_paths::NESTED_MOVIES, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1592,10 +1590,9 @@ fn indexing_nested_movies_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1657,9 +1654,10 @@ fn deleting_nested_movies_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::NESTED_MOVIES, "json"); let documents = utils::documents_from(datasets_paths::NESTED_MOVIES, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1683,10 +1681,9 @@ fn deleting_nested_movies_in_batches_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1740,9 +1737,10 @@ fn indexing_nested_movies_without_faceted_fields(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::NESTED_MOVIES, "json"); let documents = utils::documents_from(datasets_paths::NESTED_MOVIES, "json");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1766,10 +1764,9 @@ fn indexing_nested_movies_without_faceted_fields(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1807,9 +1804,10 @@ fn indexing_geo(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl"); let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1833,10 +1831,9 @@ fn indexing_geo(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1873,9 +1870,10 @@ fn reindexing_geo(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl"); let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1899,10 +1897,9 @@ fn reindexing_geo(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1918,9 +1915,10 @@ fn reindexing_geo(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl"); let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -1944,10 +1942,9 @@ fn reindexing_geo(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -1986,9 +1983,10 @@ fn deleting_geo_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap(); let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl"); let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl");
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -2012,10 +2010,9 @@ fn deleting_geo_in_batches_default(c: &mut Criterion) {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();


@ -3,7 +3,6 @@ mod utils;
use criterion::{criterion_group, criterion_main}; use criterion::{criterion_group, criterion_main};
use milli::update::Settings; use milli::update::Settings;
use milli::FilterableAttributesRule;
use utils::Conf; use utils::Conf;
#[cfg(not(windows))] #[cfg(not(windows))]
@ -22,10 +21,8 @@ fn base_conf(builder: &mut Settings) {
["name", "alternatenames", "elevation"].iter().map(|s| s.to_string()).collect(); ["name", "alternatenames", "elevation"].iter().map(|s| s.to_string()).collect();
builder.set_searchable_fields(searchable_fields); builder.set_searchable_fields(searchable_fields);
let filterable_fields = ["_geo", "population", "elevation"] let filterable_fields =
.iter() ["_geo", "population", "elevation"].iter().map(|s| s.to_string()).collect();
.map(|s| FilterableAttributesRule::Field(s.to_string()))
.collect();
builder.set_filterable_fields(filterable_fields); builder.set_filterable_fields(filterable_fields);
let sortable_fields = let sortable_fields =


@ -3,7 +3,6 @@ mod utils;
use criterion::{criterion_group, criterion_main}; use criterion::{criterion_group, criterion_main};
use milli::update::Settings; use milli::update::Settings;
use milli::FilterableAttributesRule;
use utils::Conf; use utils::Conf;
#[cfg(not(windows))] #[cfg(not(windows))]
@ -23,7 +22,7 @@ fn base_conf(builder: &mut Settings) {
let faceted_fields = ["released-timestamp", "duration-float", "genre", "country", "artist"] let faceted_fields = ["released-timestamp", "duration-float", "genre", "country", "artist"]
.iter() .iter()
.map(|s| FilterableAttributesRule::Field(s.to_string())) .map(|s| s.to_string())
.collect(); .collect();
builder.set_filterable_fields(faceted_fields); builder.set_filterable_fields(faceted_fields);
} }


@ -1,114 +0,0 @@
//! This benchmark module is used to compare the performance of sorting documents in /search VS /documents
//!
//! The tests/benchmarks were designed in the context of a query returning only 20 documents.
mod datasets_paths;
mod utils;
use criterion::{criterion_group, criterion_main};
use milli::update::Settings;
use utils::Conf;
#[cfg(not(windows))]
#[global_allocator]
static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
fn base_conf(builder: &mut Settings) {
let displayed_fields =
["geonameid", "name", "asciiname", "alternatenames", "_geo", "population"]
.iter()
.map(|s| s.to_string())
.collect();
builder.set_displayed_fields(displayed_fields);
let sortable_fields =
["_geo", "name", "population", "elevation", "timezone", "modification-date"]
.iter()
.map(|s| s.to_string())
.collect();
builder.set_sortable_fields(sortable_fields);
}
#[rustfmt::skip]
const BASE_CONF: Conf = Conf {
dataset: datasets_paths::SMOL_ALL_COUNTRIES,
dataset_format: "jsonl",
configure: base_conf,
primary_key: Some("geonameid"),
queries: &[""],
offsets: &[
Some((0, 20)), // The most common query in the real world
Some((0, 500)), // A query that ranges over many documents
Some((980, 20)), // The worst query that could happen in the real world
Some((800_000, 20)) // The worst query
],
get_documents: true,
..Conf::BASE
};
fn bench_sort(c: &mut criterion::Criterion) {
#[rustfmt::skip]
let confs = &[
utils::Conf {
group_name: "without sort",
sort: None,
..BASE_CONF
},
utils::Conf {
group_name: "sort on many different values",
sort: Some(vec!["name:asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many similar values",
sort: Some(vec!["timezone:desc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many similar then different values",
sort: Some(vec!["timezone:desc", "name:asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many different then similar values",
sort: Some(vec!["timezone:desc", "name:asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "geo sort",
sample_size: Some(10),
sort: Some(vec!["_geoPoint(45.4777599, 9.1967508):asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many similar values then geo sort",
sample_size: Some(50),
sort: Some(vec!["timezone:desc", "_geoPoint(45.4777599, 9.1967508):asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many different values then geo sort",
sample_size: Some(50),
sort: Some(vec!["name:desc", "_geoPoint(45.4777599, 9.1967508):asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many fields",
sort: Some(vec!["population:asc", "name:asc", "elevation:asc", "timezone:asc"]),
..BASE_CONF
},
];
utils::run_benches(c, confs);
}
criterion_group!(benches, bench_sort);
criterion_main!(benches);
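This removed module was registered as `[[bench]] name = "sort"` in the crate manifest (deleted in the same compare), so it ran with `cargo bench --bench sort`. Its `offsets` entries map directly onto `Search::offset`/`Search::limit`; a sketch of how one `(offset, limit)` pair was applied, matching the harness code in the `run_benches` hunk further down:

```rust
use milli::{Index, TermsMatchingStrategy};

// Sketch: apply one `offsets` entry such as `Some((980, 20))` to a search,
// as the removed harness did.
fn run_query(index: &Index, query: &str, offset: usize, limit: usize) {
    let rtxn = index.read_txn().unwrap();
    let mut search = index.search(&rtxn);
    search.query(query).terms_matching_strategy(TermsMatchingStrategy::default());
    search.offset(offset).limit(limit);
    let _ids = search.execute().unwrap();
}
```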


@ -9,12 +9,11 @@ use anyhow::Context;
use bumpalo::Bump; use bumpalo::Bump;
use criterion::BenchmarkId; use criterion::BenchmarkId;
use memmap2::Mmap; use memmap2::Mmap;
use milli::documents::sort::recursive_sort;
use milli::heed::EnvOpenOptions; use milli::heed::EnvOpenOptions;
use milli::progress::Progress; use milli::progress::Progress;
use milli::update::new::indexer; use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings}; use milli::update::{IndexDocumentsMethod, IndexerConfig, Settings};
use milli::vector::RuntimeEmbedders; use milli::vector::EmbeddingConfigs;
use milli::{Criterion, Filter, Index, Object, TermsMatchingStrategy}; use milli::{Criterion, Filter, Index, Object, TermsMatchingStrategy};
use serde_json::Value; use serde_json::Value;
@ -36,12 +35,6 @@ pub struct Conf<'a> {
pub configure: fn(&mut Settings), pub configure: fn(&mut Settings),
pub filter: Option<&'a str>, pub filter: Option<&'a str>,
pub sort: Option<Vec<&'a str>>, pub sort: Option<Vec<&'a str>>,
/// set to skip documents (offset, limit)
pub offsets: &'a [Option<(usize, usize)>],
/// enable if you want to bench getting documents without querying
pub get_documents: bool,
/// configure the benchmark sample size
pub sample_size: Option<usize>,
/// enable or disable the optional words on the query /// enable or disable the optional words on the query
pub optional_words: bool, pub optional_words: bool,
/// primary key, if there is None we'll auto-generate docids for every document /// primary key, if there is None we'll auto-generate docids for every document
@ -59,9 +52,6 @@ impl Conf<'_> {
configure: |_| (), configure: |_| (),
filter: None, filter: None,
sort: None, sort: None,
offsets: &[None],
get_documents: false,
sample_size: None,
optional_words: true, optional_words: true,
primary_key: None, primary_key: None,
}; };
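These three removed fields are the knobs the deleted sort benchmark relied on. A sketch reusing the identifiers from the removed sort bench above (`Conf` and a base configuration passed in); it only compiles against the left-column struct, where the fields exist:

```rust
use utils::Conf;

// Left-column `Conf`: pagination, the documents-fetch path, and criterion's
// sample size are all configurable per benchmark group.
fn deep_offset_conf(base: Conf<'static>) -> Conf<'static> {
    Conf {
        group_name: "sort with a deep offset",
        sort: Some(vec!["name:asc"]),
        offsets: &[Some((800_000, 20))], // (offset, limit): the worst query
        get_documents: true,             // also bench fetching documents without querying
        sample_size: Some(10),           // shrink criterion's sample size
        ..base
    }
}
```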
@ -75,11 +65,10 @@ pub fn base_setup(conf: &Conf) -> Index {
} }
create_dir_all(conf.database_name).unwrap(); create_dir_all(conf.database_name).unwrap();
let options = EnvOpenOptions::new(); let mut options = EnvOpenOptions::new();
let mut options = options.read_txn_without_tls();
options.map_size(100 * 1024 * 1024 * 1024); // 100 GB options.map_size(100 * 1024 * 1024 * 1024); // 100 GB
options.max_readers(100); options.max_readers(100);
let index = Index::new(options, conf.database_name, true).unwrap(); let index = Index::new(options, conf.database_name).unwrap();
let config = IndexerConfig::default(); let config = IndexerConfig::default();
let mut wtxn = index.write_txn().unwrap(); let mut wtxn = index.write_txn().unwrap();
@ -100,7 +89,7 @@ pub fn base_setup(conf: &Conf) -> Index {
(conf.configure)(&mut builder); (conf.configure)(&mut builder);
builder.execute(&|| false, &Progress::default(), Default::default()).unwrap(); builder.execute(|_| (), || false).unwrap();
wtxn.commit().unwrap(); wtxn.commit().unwrap();
let config = IndexerConfig::default(); let config = IndexerConfig::default();
@ -110,8 +99,8 @@ pub fn base_setup(conf: &Conf) -> Index {
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let documents = documents_from(conf.dataset, conf.dataset_format); let documents = documents_from(conf.dataset, conf.dataset_format);
let mut indexer = indexer::DocumentOperation::new(); let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
indexer.replace_documents(&documents).unwrap(); indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer let (document_changes, _operation_stats, primary_key) = indexer
@ -135,10 +124,9 @@ pub fn base_setup(conf: &Conf) -> Index {
new_fields_ids_map, new_fields_ids_map,
primary_key, primary_key,
&document_changes, &document_changes,
RuntimeEmbedders::default(), EmbeddingConfigs::default(),
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
@ -155,79 +143,25 @@ pub fn run_benches(c: &mut criterion::Criterion, confs: &[Conf]) {
let file_name = Path::new(conf.dataset).file_name().and_then(|f| f.to_str()).unwrap(); let file_name = Path::new(conf.dataset).file_name().and_then(|f| f.to_str()).unwrap();
let name = format!("{}: {}", file_name, conf.group_name); let name = format!("{}: {}", file_name, conf.group_name);
let mut group = c.benchmark_group(&name); let mut group = c.benchmark_group(&name);
if let Some(sample_size) = conf.sample_size {
group.sample_size(sample_size);
}
for &query in conf.queries { for &query in conf.queries {
for offset in conf.offsets { group.bench_with_input(BenchmarkId::from_parameter(query), &query, |b, &query| {
let parameter = match offset { b.iter(|| {
None => query.to_string(), let rtxn = index.read_txn().unwrap();
Some((offset, limit)) => format!("{query}[{offset}:{limit}]"), let mut search = index.search(&rtxn);
}; search.query(query).terms_matching_strategy(TermsMatchingStrategy::default());
group.bench_with_input( if let Some(filter) = conf.filter {
BenchmarkId::from_parameter(parameter), let filter = Filter::from_str(filter).unwrap().unwrap();
&query, search.filter(filter);
|b, &query| { }
b.iter(|| { if let Some(sort) = &conf.sort {
let rtxn = index.read_txn().unwrap(); let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
let mut search = index.search(&rtxn); search.sort_criteria(sort);
search }
.query(query) let _ids = search.execute().unwrap();
.terms_matching_strategy(TermsMatchingStrategy::default());
if let Some(filter) = conf.filter {
let filter = Filter::from_str(filter).unwrap().unwrap();
search.filter(filter);
}
if let Some(sort) = &conf.sort {
let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
search.sort_criteria(sort);
}
if let Some((offset, limit)) = offset {
search.offset(*offset).limit(*limit);
}
let _ids = search.execute().unwrap();
});
},
);
}
}
if conf.get_documents {
for offset in conf.offsets {
let parameter = match offset {
None => String::from("get_documents"),
Some((offset, limit)) => format!("get_documents[{offset}:{limit}]"),
};
group.bench_with_input(BenchmarkId::from_parameter(parameter), &(), |b, &()| {
b.iter(|| {
let rtxn = index.read_txn().unwrap();
if let Some(sort) = &conf.sort {
let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
let all_docs = index.documents_ids(&rtxn).unwrap();
let facet_sort =
recursive_sort(&index, &rtxn, sort, &all_docs).unwrap();
let iter = facet_sort.iter().unwrap();
if let Some((offset, limit)) = offset {
let _results = iter.skip(*offset).take(*limit).collect::<Vec<_>>();
} else {
let _results = iter.collect::<Vec<_>>();
}
} else {
let all_docs = index.documents_ids(&rtxn).unwrap();
if let Some((offset, limit)) = offset {
let _results =
all_docs.iter().skip(*offset).take(*limit).collect::<Vec<_>>();
} else {
let _results = all_docs.iter().collect::<Vec<_>>();
}
}
});
}); });
} });
} }
group.finish(); group.finish();
index.prepare_for_closing().wait(); index.prepare_for_closing().wait();
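
The `get_documents` branch on the newer (left-hand) side is the interesting addition: it benches fetching documents in sort order with no search query at all, via `recursive_sort`. A condensed fragment of that path, with call shapes taken from the diff above rather than checked against the crate:

```rust
use milli::documents::sort::recursive_sort;
use milli::Index;

// Fetch one page of document ids in facet-sort order, without a query.
// `"name:asc".parse()` follows the bench code; its target type is
// inferred from what recursive_sort expects in the real crate.
fn page_without_query(index: &Index, offset: usize, limit: usize) {
    let rtxn = index.read_txn().unwrap();
    let all_docs = index.documents_ids(&rtxn).unwrap();
    let sort = vec!["name:asc".parse().unwrap()];
    let facet_sort = recursive_sort(index, &rtxn, sort, &all_docs).unwrap();
    let _page: Vec<_> = facet_sort.iter().unwrap().skip(offset).take(limit).collect();
}
```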

View File

@ -67,7 +67,7 @@ fn main() -> anyhow::Result<()> {
writeln!( writeln!(
&mut manifest_paths_file, &mut manifest_paths_file,
r#"pub const {}: &str = {:?};"#, r#"pub const {}: &str = {:?};"#,
dataset.to_case(Case::UpperSnake), dataset.to_case(Case::ScreamingSnake),
out_file.display(), out_file.display(),
)?; )?;

View File

@ -11,8 +11,8 @@ license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies] [dependencies]
time = { version = "0.3.41", features = ["parsing"] } time = { version = "0.3.37", features = ["parsing"] }
[build-dependencies] [build-dependencies]
anyhow = "1.0.98" anyhow = "1.0.95"
vergen-git2 = "1.0.7" vergen-git2 = "1.0.2"

View File

@ -11,21 +11,21 @@ readme.workspace = true
license.workspace = true license.workspace = true
[dependencies] [dependencies]
anyhow = "1.0.98" anyhow = "1.0.95"
flate2 = "1.1.2" flate2 = "1.0.35"
http = "1.3.1" http = "1.2.0"
meilisearch-types = { path = "../meilisearch-types" } meilisearch-types = { path = "../meilisearch-types" }
once_cell = "1.21.3" once_cell = "1.20.2"
regex = "1.11.1" regex = "1.11.1"
roaring = { version = "0.10.12", features = ["serde"] } roaring = { version = "0.10.10", features = ["serde"] }
serde = { version = "1.0.219", features = ["derive"] } serde = { version = "1.0.217", features = ["derive"] }
serde_json = { version = "1.0.140", features = ["preserve_order"] } serde_json = { version = "1.0.135", features = ["preserve_order"] }
tar = "0.4.44" tar = "0.4.43"
tempfile = "3.20.0" tempfile = "3.15.0"
thiserror = "2.0.12" thiserror = "2.0.9"
time = { version = "0.3.41", features = ["serde-well-known", "formatting", "parsing", "macros"] } time = { version = "0.3.37", features = ["serde-well-known", "formatting", "parsing", "macros"] }
tracing = "0.1.41" tracing = "0.1.41"
uuid = { version = "1.17.0", features = ["serde", "v4"] } uuid = { version = "1.11.0", features = ["serde", "v4"] }
[dev-dependencies] [dev-dependencies]
big_s = "1.0.2" big_s = "1.0.2"

View File

@ -10,10 +10,8 @@ dump
├── instance-uid.uuid ├── instance-uid.uuid
├── keys.jsonl ├── keys.jsonl
├── metadata.json ├── metadata.json
├── tasks └── tasks
│ ├── update_files ├── update_files
│ │ └── [task_id].jsonl │ └── [task_id].jsonl
│ └── queue.jsonl
└── batches
└── queue.jsonl └── queue.jsonl
``` ```
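
Every queue in this layout is JSONL, one JSON value per line, so the new `batches/queue.jsonl` streams exactly like the task queue. A minimal sketch of reading such a file, assuming only `serde_json` and `anyhow`:

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};

// Stream a queue.jsonl line by line instead of loading the whole file.
fn read_queue(path: &str) -> anyhow::Result<Vec<serde_json::Value>> {
    BufReader::new(File::open(path)?)
        .lines()
        .map(|line| -> anyhow::Result<serde_json::Value> { Ok(serde_json::from_str(&line?)?) })
        .collect()
}
```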

View File

@ -1,17 +1,12 @@
#![allow(clippy::type_complexity)] #![allow(clippy::type_complexity)]
#![allow(clippy::wrong_self_convention)] #![allow(clippy::wrong_self_convention)]
use std::collections::BTreeMap;
use meilisearch_types::batches::BatchId; use meilisearch_types::batches::BatchId;
use meilisearch_types::byte_unit::Byte;
use meilisearch_types::error::ResponseError; use meilisearch_types::error::ResponseError;
use meilisearch_types::keys::Key; use meilisearch_types::keys::Key;
use meilisearch_types::milli::update::IndexDocumentsMethod; use meilisearch_types::milli::update::IndexDocumentsMethod;
use meilisearch_types::settings::Unchecked; use meilisearch_types::settings::Unchecked;
use meilisearch_types::tasks::{ use meilisearch_types::tasks::{Details, IndexSwap, KindWithContent, Status, Task, TaskId};
Details, ExportIndexSettings, IndexSwap, KindWithContent, Status, Task, TaskId,
};
use meilisearch_types::InstanceUid; use meilisearch_types::InstanceUid;
use roaring::RoaringBitmap; use roaring::RoaringBitmap;
use serde::{Deserialize, Serialize}; use serde::{Deserialize, Serialize};
@ -146,15 +141,6 @@ pub enum KindDump {
instance_uid: Option<InstanceUid>, instance_uid: Option<InstanceUid>,
}, },
SnapshotCreation, SnapshotCreation,
Export {
url: String,
api_key: Option<String>,
payload_size: Option<Byte>,
indexes: BTreeMap<String, ExportIndexSettings>,
},
UpgradeDatabase {
from: (u32, u32, u32),
},
} }
impl From<Task> for TaskDump { impl From<Task> for TaskDump {
@ -224,18 +210,6 @@ impl From<KindWithContent> for KindDump {
KindDump::DumpCreation { keys, instance_uid } KindDump::DumpCreation { keys, instance_uid }
} }
KindWithContent::SnapshotCreation => KindDump::SnapshotCreation, KindWithContent::SnapshotCreation => KindDump::SnapshotCreation,
KindWithContent::Export { url, api_key, payload_size, indexes } => KindDump::Export {
url,
api_key,
payload_size,
indexes: indexes
.into_iter()
.map(|(pattern, settings)| (pattern.to_string(), settings))
.collect(),
},
KindWithContent::UpgradeDatabase { from: version } => {
KindDump::UpgradeDatabase { from: version }
}
} }
} }
} }
@ -248,16 +222,14 @@ pub(crate) mod test {
use big_s::S; use big_s::S;
use maplit::{btreemap, btreeset}; use maplit::{btreemap, btreeset};
use meilisearch_types::batches::{Batch, BatchEnqueuedAt, BatchStats};
use meilisearch_types::facet_values_sort::FacetValuesSort; use meilisearch_types::facet_values_sort::FacetValuesSort;
use meilisearch_types::features::{Network, Remote, RuntimeTogglableFeatures}; use meilisearch_types::features::RuntimeTogglableFeatures;
use meilisearch_types::index_uid_pattern::IndexUidPattern; use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::keys::{Action, Key}; use meilisearch_types::keys::{Action, Key};
use meilisearch_types::milli;
use meilisearch_types::milli::update::Setting; use meilisearch_types::milli::update::Setting;
use meilisearch_types::milli::{self, FilterableAttributesRule};
use meilisearch_types::settings::{Checked, FacetingSettings, Settings}; use meilisearch_types::settings::{Checked, FacetingSettings, Settings};
use meilisearch_types::task_view::DetailsView; use meilisearch_types::tasks::{Details, Status};
use meilisearch_types::tasks::{BatchStopReason, Details, Kind, Status};
use serde_json::{json, Map, Value}; use serde_json::{json, Map, Value};
use time::macros::datetime; use time::macros::datetime;
use uuid::Uuid; use uuid::Uuid;
@ -299,10 +271,7 @@ pub(crate) mod test {
let settings = Settings { let settings = Settings {
displayed_attributes: Setting::Set(vec![S("race"), S("name")]).into(), displayed_attributes: Setting::Set(vec![S("race"), S("name")]).into(),
searchable_attributes: Setting::Set(vec![S("name"), S("race")]).into(), searchable_attributes: Setting::Set(vec![S("name"), S("race")]).into(),
filterable_attributes: Setting::Set(vec![ filterable_attributes: Setting::Set(btreeset! { S("race"), S("age") }),
FilterableAttributesRule::Field(S("race")),
FilterableAttributesRule::Field(S("age")),
]),
sortable_attributes: Setting::Set(btreeset! { S("age") }), sortable_attributes: Setting::Set(btreeset! { S("age") }),
ranking_rules: Setting::NotSet, ranking_rules: Setting::NotSet,
stop_words: Setting::NotSet, stop_words: Setting::NotSet,
@ -325,41 +294,11 @@ pub(crate) mod test {
localized_attributes: Setting::NotSet, localized_attributes: Setting::NotSet,
facet_search: Setting::NotSet, facet_search: Setting::NotSet,
prefix_search: Setting::NotSet, prefix_search: Setting::NotSet,
chat: Setting::NotSet,
_kind: std::marker::PhantomData, _kind: std::marker::PhantomData,
}; };
settings.check() settings.check()
} }
pub fn create_test_batches() -> Vec<Batch> {
vec![Batch {
uid: 0,
details: DetailsView {
received_documents: Some(12),
indexed_documents: Some(Some(10)),
..DetailsView::default()
},
progress: None,
stats: BatchStats {
total_nb_tasks: 1,
status: maplit::btreemap! { Status::Succeeded => 1 },
types: maplit::btreemap! { Kind::DocumentAdditionOrUpdate => 1 },
index_uids: maplit::btreemap! { "doggo".to_string() => 1 },
progress_trace: Default::default(),
write_channel_congestion: None,
internal_database_sizes: Default::default(),
},
embedder_stats: Default::default(),
enqueued_at: Some(BatchEnqueuedAt {
earliest: datetime!(2022-11-11 0:00 UTC),
oldest: datetime!(2022-11-11 0:00 UTC),
}),
started_at: datetime!(2022-11-20 0:00 UTC),
finished_at: Some(datetime!(2022-11-21 0:00 UTC)),
stop_reason: BatchStopReason::Unspecified.to_string(),
}]
}
pub fn create_test_tasks() -> Vec<(TaskDump, Option<Vec<Document>>)> { pub fn create_test_tasks() -> Vec<(TaskDump, Option<Vec<Document>>)> {
vec![ vec![
( (
@ -482,15 +421,6 @@ pub(crate) mod test {
index.flush().unwrap(); index.flush().unwrap();
index.settings(&settings).unwrap(); index.settings(&settings).unwrap();
// ========== pushing the batch queue
let batches = create_test_batches();
let mut batch_queue = dump.create_batches_queue().unwrap();
for batch in &batches {
batch_queue.push_batch(batch).unwrap();
}
batch_queue.flush().unwrap();
// ========== pushing the task queue // ========== pushing the task queue
let tasks = create_test_tasks(); let tasks = create_test_tasks();
@ -519,10 +449,6 @@ pub(crate) mod test {
dump.create_experimental_features(features).unwrap(); dump.create_experimental_features(features).unwrap();
// ========== network
let network = create_test_network();
dump.create_network(network).unwrap();
// create the dump // create the dump
let mut file = tempfile::tempfile().unwrap(); let mut file = tempfile::tempfile().unwrap();
dump.persist_to(&mut file).unwrap(); dump.persist_to(&mut file).unwrap();
@ -535,13 +461,6 @@ pub(crate) mod test {
RuntimeTogglableFeatures::default() RuntimeTogglableFeatures::default()
} }
fn create_test_network() -> Network {
Network {
local: Some("myself".to_string()),
remotes: maplit::btreemap! {"other".to_string() => Remote { url: "http://test".to_string(), search_api_key: Some("apiKey".to_string()) }},
}
}
#[test] #[test]
fn test_creating_and_read_dump() { fn test_creating_and_read_dump() {
let mut file = create_test_dump(); let mut file = create_test_dump();
@ -590,9 +509,5 @@ pub(crate) mod test {
// ==== checking the features // ==== checking the features
let expected = create_test_features(); let expected = create_test_features();
assert_eq!(dump.features().unwrap().unwrap(), expected); assert_eq!(dump.features().unwrap().unwrap(), expected);
// ==== checking the network
let expected = create_test_network();
assert_eq!(&expected, dump.network().unwrap().unwrap());
} }
} }
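
Putting the removed fixtures in context: on the writer side (shown further down this compare), batches and network get dedicated writer calls. A condensed sketch of that flow, where `DumpWriter::new(None)` is an assumed constructor signature and everything else mirrors the test above:

```rust
use std::fs::File;

// Build a dump containing the batch queue and network config fixtures.
fn dump_with_batches_and_network() -> File {
    let dump = DumpWriter::new(None).unwrap();

    let mut batch_queue = dump.create_batches_queue().unwrap();
    for batch in &create_test_batches() {
        batch_queue.push_batch(batch).unwrap();
    }
    batch_queue.flush().unwrap();

    dump.create_network(create_test_network()).unwrap();

    let mut file = tempfile::tempfile().unwrap();
    dump.persist_to(&mut file).unwrap();
    file
}
```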

View File

@ -1,4 +1,3 @@
use std::num::NonZeroUsize;
use std::str::FromStr; use std::str::FromStr;
use super::v4_to_v5::{CompatIndexV4ToV5, CompatV4ToV5}; use super::v4_to_v5::{CompatIndexV4ToV5, CompatV4ToV5};
@ -197,10 +196,6 @@ impl CompatV5ToV6 {
pub fn features(&self) -> Result<Option<v6::RuntimeTogglableFeatures>> { pub fn features(&self) -> Result<Option<v6::RuntimeTogglableFeatures>> {
Ok(None) Ok(None)
} }
pub fn network(&self) -> Result<Option<&v6::Network>> {
Ok(None)
}
} }
pub enum CompatIndexV5ToV6 { pub enum CompatIndexV5ToV6 {
@ -323,16 +318,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
v6::Settings { v6::Settings {
displayed_attributes: v6::Setting::from(settings.displayed_attributes).into(), displayed_attributes: v6::Setting::from(settings.displayed_attributes).into(),
searchable_attributes: v6::Setting::from(settings.searchable_attributes).into(), searchable_attributes: v6::Setting::from(settings.searchable_attributes).into(),
filterable_attributes: match settings.filterable_attributes { filterable_attributes: settings.filterable_attributes.into(),
v5::settings::Setting::Set(filterable_attributes) => v6::Setting::Set(
filterable_attributes
.into_iter()
.map(v6::FilterableAttributesRule::Field)
.collect(),
),
v5::settings::Setting::Reset => v6::Setting::Reset,
v5::settings::Setting::NotSet => v6::Setting::NotSet,
},
sortable_attributes: settings.sortable_attributes.into(), sortable_attributes: settings.sortable_attributes.into(),
ranking_rules: { ranking_rules: {
match settings.ranking_rules { match settings.ranking_rules {
@ -374,7 +360,6 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
}, },
disable_on_words: typo.disable_on_words.into(), disable_on_words: typo.disable_on_words.into(),
disable_on_attributes: typo.disable_on_attributes.into(), disable_on_attributes: typo.disable_on_attributes.into(),
disable_on_numbers: v6::Setting::NotSet,
}), }),
v5::Setting::Reset => v6::Setting::Reset, v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet, v5::Setting::NotSet => v6::Setting::NotSet,
@ -389,13 +374,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
}, },
pagination: match settings.pagination { pagination: match settings.pagination {
v5::Setting::Set(pagination) => v6::Setting::Set(v6::PaginationSettings { v5::Setting::Set(pagination) => v6::Setting::Set(v6::PaginationSettings {
max_total_hits: match pagination.max_total_hits { max_total_hits: pagination.max_total_hits.into(),
v5::Setting::Set(max_total_hits) => v6::Setting::Set(
max_total_hits.try_into().unwrap_or(NonZeroUsize::new(1).unwrap()),
),
v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet,
},
}), }),
v5::Setting::Reset => v6::Setting::Reset, v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet, v5::Setting::NotSet => v6::Setting::NotSet,
@ -405,7 +384,6 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
search_cutoff_ms: v6::Setting::NotSet, search_cutoff_ms: v6::Setting::NotSet,
facet_search: v6::Setting::NotSet, facet_search: v6::Setting::NotSet,
prefix_search: v6::Setting::NotSet, prefix_search: v6::Setting::NotSet,
chat: v6::Setting::NotSet,
_kind: std::marker::PhantomData, _kind: std::marker::PhantomData,
} }
} }
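
The pagination branch above bridges a type change: v5 stored `max_total_hits` as a plain integer while v6 uses `NonZeroUsize`, so a zero has to clamp to 1. The same idiom in isolation:

```rust
use std::num::NonZeroUsize;

// TryFrom<usize> for NonZeroUsize fails only on zero; fall back to 1.
fn clamp_max_total_hits(v5_value: usize) -> NonZeroUsize {
    v5_value.try_into().unwrap_or(NonZeroUsize::new(1).unwrap())
}
```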

View File

@ -23,7 +23,6 @@ mod v6;
pub type Document = serde_json::Map<String, serde_json::Value>; pub type Document = serde_json::Map<String, serde_json::Value>;
pub type UpdateFile = dyn Iterator<Item = Result<Document>>; pub type UpdateFile = dyn Iterator<Item = Result<Document>>;
#[allow(clippy::large_enum_variant)]
pub enum DumpReader { pub enum DumpReader {
Current(V6Reader), Current(V6Reader),
Compat(CompatV5ToV6), Compat(CompatV5ToV6),
@ -102,13 +101,6 @@ impl DumpReader {
} }
} }
pub fn batches(&mut self) -> Result<Box<dyn Iterator<Item = Result<v6::Batch>> + '_>> {
match self {
DumpReader::Current(current) => Ok(current.batches()),
DumpReader::Compat(_compat) => Ok(Box::new(std::iter::empty())),
}
}
pub fn keys(&mut self) -> Result<Box<dyn Iterator<Item = Result<v6::Key>> + '_>> { pub fn keys(&mut self) -> Result<Box<dyn Iterator<Item = Result<v6::Key>> + '_>> {
match self { match self {
DumpReader::Current(current) => Ok(current.keys()), DumpReader::Current(current) => Ok(current.keys()),
@ -116,28 +108,12 @@ impl DumpReader {
} }
} }
pub fn chat_completions_settings(
&mut self,
) -> Result<Box<dyn Iterator<Item = Result<(String, v6::ChatCompletionSettings)>> + '_>> {
match self {
DumpReader::Current(current) => current.chat_completions_settings(),
DumpReader::Compat(_compat) => Ok(Box::new(std::iter::empty())),
}
}
pub fn features(&self) -> Result<Option<v6::RuntimeTogglableFeatures>> { pub fn features(&self) -> Result<Option<v6::RuntimeTogglableFeatures>> {
match self { match self {
DumpReader::Current(current) => Ok(current.features()), DumpReader::Current(current) => Ok(current.features()),
DumpReader::Compat(compat) => compat.features(), DumpReader::Compat(compat) => compat.features(),
} }
} }
pub fn network(&self) -> Result<Option<&v6::Network>> {
match self {
DumpReader::Current(current) => Ok(current.network()),
DumpReader::Compat(compat) => compat.network(),
}
}
} }
impl From<V6Reader> for DumpReader { impl From<V6Reader> for DumpReader {
@ -243,10 +219,6 @@ pub(crate) mod test {
insta::assert_snapshot!(dump.date().unwrap(), @"2024-05-16 15:51:34.151044 +00:00:00"); insta::assert_snapshot!(dump.date().unwrap(), @"2024-05-16 15:51:34.151044 +00:00:00");
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None"); insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
// batches didn't exists at the time
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
// tasks // tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap(); let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip(); let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
@ -356,7 +328,6 @@ pub(crate) mod test {
} }
assert_eq!(dump.features().unwrap().unwrap(), RuntimeTogglableFeatures::default()); assert_eq!(dump.features().unwrap().unwrap(), RuntimeTogglableFeatures::default());
assert_eq!(dump.network().unwrap(), None);
} }
#[test] #[test]
@ -368,10 +339,6 @@ pub(crate) mod test {
insta::assert_snapshot!(dump.date().unwrap(), @"2023-07-06 7:10:27.21958 +00:00:00"); insta::assert_snapshot!(dump.date().unwrap(), @"2023-07-06 7:10:27.21958 +00:00:00");
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None"); insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
// batches didn't exists at the time
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
// tasks // tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap(); let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip(); let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
@ -406,27 +373,6 @@ pub(crate) mod test {
assert_eq!(dump.features().unwrap().unwrap(), RuntimeTogglableFeatures::default()); assert_eq!(dump.features().unwrap().unwrap(), RuntimeTogglableFeatures::default());
} }
#[test]
fn import_dump_v6_network() {
let dump = File::open("tests/assets/v6-with-network.dump").unwrap();
let dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2025-01-29 15:45:32.738676 +00:00:00");
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
// network
let network = dump.network().unwrap().unwrap();
insta::assert_snapshot!(network.local.as_ref().unwrap(), @"ms-0");
insta::assert_snapshot!(network.remotes.get("ms-0").as_ref().unwrap().url, @"http://localhost:7700");
insta::assert_snapshot!(network.remotes.get("ms-0").as_ref().unwrap().search_api_key.is_none(), @"true");
insta::assert_snapshot!(network.remotes.get("ms-1").as_ref().unwrap().url, @"http://localhost:7701");
insta::assert_snapshot!(network.remotes.get("ms-1").as_ref().unwrap().search_api_key.is_none(), @"true");
insta::assert_snapshot!(network.remotes.get("ms-2").as_ref().unwrap().url, @"http://ms-5679.example.meilisearch.io");
insta::assert_snapshot!(network.remotes.get("ms-2").as_ref().unwrap().search_api_key.as_ref().unwrap(), @"foo");
}
#[test] #[test]
fn import_dump_v5() { fn import_dump_v5() {
let dump = File::open("tests/assets/v5.dump").unwrap(); let dump = File::open("tests/assets/v5.dump").unwrap();
@ -436,10 +382,6 @@ pub(crate) mod test {
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00"); insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d"); insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// batches didn't exists at the time
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
// tasks // tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap(); let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip(); let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
@ -520,10 +462,6 @@ pub(crate) mod test {
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00"); insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d"); insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// batches didn't exists at the time
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
// tasks // tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap(); let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip(); let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
@ -601,10 +539,6 @@ pub(crate) mod test {
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00"); insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
assert_eq!(dump.instance_uid().unwrap(), None); assert_eq!(dump.instance_uid().unwrap(), None);
// batches didn't exists at the time
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
// tasks // tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap(); let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip(); let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
@ -698,10 +632,6 @@ pub(crate) mod test {
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00"); insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
assert_eq!(dump.instance_uid().unwrap(), None); assert_eq!(dump.instance_uid().unwrap(), None);
// batches didn't exists at the time
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
// tasks // tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap(); let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip(); let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
@ -795,10 +725,6 @@ pub(crate) mod test {
insta::assert_snapshot!(dump.date().unwrap(), @"2023-01-30 16:26:09.247261 +00:00:00"); insta::assert_snapshot!(dump.date().unwrap(), @"2023-01-30 16:26:09.247261 +00:00:00");
assert_eq!(dump.instance_uid().unwrap(), None); assert_eq!(dump.instance_uid().unwrap(), None);
// batches didn't exists at the time
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
// tasks // tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap(); let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip(); let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
@ -875,10 +801,6 @@ pub(crate) mod test {
assert_eq!(dump.date(), None); assert_eq!(dump.date(), None);
assert_eq!(dump.instance_uid().unwrap(), None); assert_eq!(dump.instance_uid().unwrap(), None);
// batches didn't exists at the time
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
// tasks // tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap(); let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip(); let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();

View File

@ -1,5 +1,5 @@
--- ---
source: crates/dump/src/reader/mod.rs source: dump/src/reader/mod.rs
expression: vector_index.settings().unwrap() expression: vector_index.settings().unwrap()
--- ---
{ {
@ -49,7 +49,6 @@ expression: vector_index.settings().unwrap()
"source": "huggingFace", "source": "huggingFace",
"model": "BAAI/bge-base-en-v1.5", "model": "BAAI/bge-base-en-v1.5",
"revision": "617ca489d9e86b49b8167676d8220688b99db36e", "revision": "617ca489d9e86b49b8167676d8220688b99db36e",
"pooling": "forceMean",
"documentTemplate": "{% for field in fields %} {{ field.name }}: {{ field.value }}\n{% endfor %}" "documentTemplate": "{% for field in fields %} {{ field.name }}: {{ field.value }}\n{% endfor %}"
} }
}, },

View File

@ -108,7 +108,7 @@ where
/// not supported on untagged enums. /// not supported on untagged enums.
struct StarOrVisitor<T>(PhantomData<T>); struct StarOrVisitor<T>(PhantomData<T>);
impl<T, FE> Visitor<'_> for StarOrVisitor<T> impl<'de, T, FE> Visitor<'de> for StarOrVisitor<T>
where where
T: FromStr<Err = FE>, T: FromStr<Err = FE>,
FE: Display, FE: Display,

View File

@ -99,7 +99,7 @@ impl Task {
/// Return true when a task is finished. /// Return true when a task is finished.
/// A task is finished when its last state is either `Succeeded` or `Failed`. /// A task is finished when its last state is either `Succeeded` or `Failed`.
pub fn is_finished(&self) -> bool { pub fn is_finished(&self) -> bool {
self.events.last().is_some_and(|event| { self.events.last().map_or(false, |event| {
matches!(event, TaskEvent::Succeded { .. } | TaskEvent::Failed { .. }) matches!(event, TaskEvent::Succeded { .. } | TaskEvent::Failed { .. })
}) })
} }
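
Both task modules on this page get the same one-line modernization: `map_or(false, ..)` becomes `is_some_and`, stable since Rust 1.70, which states the predicate directly. Side by side:

```rust
fn main() {
    let last_event: Option<u32> = Some(3);
    // Right-hand (older) column:
    let finished_old = last_event.map_or(false, |n| n > 2);
    // Left-hand (newer) column:
    let finished_new = last_event.is_some_and(|n| n > 2);
    assert_eq!(finished_old, finished_new);
}
```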

View File

@ -108,7 +108,7 @@ where
/// not supported on untagged enums. /// not supported on untagged enums.
struct StarOrVisitor<T>(PhantomData<T>); struct StarOrVisitor<T>(PhantomData<T>);
impl<T, FE> Visitor<'_> for StarOrVisitor<T> impl<'de, T, FE> Visitor<'de> for StarOrVisitor<T>
where where
T: FromStr<Err = FE>, T: FromStr<Err = FE>,
FE: Display, FE: Display,

View File

@ -114,7 +114,7 @@ impl Task {
/// Return true when a task is finished. /// Return true when a task is finished.
/// A task is finished when its last state is either `Succeeded` or `Failed`. /// A task is finished when its last state is either `Succeeded` or `Failed`.
pub fn is_finished(&self) -> bool { pub fn is_finished(&self) -> bool {
self.events.last().is_some_and(|event| { self.events.last().map_or(false, |event| {
matches!(event, TaskEvent::Succeeded { .. } | TaskEvent::Failed { .. }) matches!(event, TaskEvent::Succeeded { .. } | TaskEvent::Failed { .. })
}) })
} }

View File

@ -1,10 +1,8 @@
use std::ffi::OsStr;
use std::fs::{self, File}; use std::fs::{self, File};
use std::io::{BufRead, BufReader, ErrorKind}; use std::io::{BufRead, BufReader, ErrorKind};
use std::path::Path; use std::path::Path;
pub use meilisearch_types::milli; pub use meilisearch_types::milli;
use meilisearch_types::milli::vector::hf::OverridePooling;
use tempfile::TempDir; use tempfile::TempDir;
use time::OffsetDateTime; use time::OffsetDateTime;
use tracing::debug; use tracing::debug;
@ -20,11 +18,8 @@ pub type Checked = meilisearch_types::settings::Checked;
pub type Unchecked = meilisearch_types::settings::Unchecked; pub type Unchecked = meilisearch_types::settings::Unchecked;
pub type Task = crate::TaskDump; pub type Task = crate::TaskDump;
pub type Batch = meilisearch_types::batches::Batch;
pub type Key = meilisearch_types::keys::Key; pub type Key = meilisearch_types::keys::Key;
pub type ChatCompletionSettings = meilisearch_types::features::ChatCompletionSettings;
pub type RuntimeTogglableFeatures = meilisearch_types::features::RuntimeTogglableFeatures; pub type RuntimeTogglableFeatures = meilisearch_types::features::RuntimeTogglableFeatures;
pub type Network = meilisearch_types::features::Network;
// ===== Other types to clarify the code of the compat module // ===== Other types to clarify the code of the compat module
// everything related to the tasks // everything related to the tasks
@ -48,17 +43,13 @@ pub type ResponseError = meilisearch_types::error::ResponseError;
pub type Code = meilisearch_types::error::Code; pub type Code = meilisearch_types::error::Code;
pub type RankingRuleView = meilisearch_types::settings::RankingRuleView; pub type RankingRuleView = meilisearch_types::settings::RankingRuleView;
pub type FilterableAttributesRule = meilisearch_types::milli::FilterableAttributesRule;
pub struct V6Reader { pub struct V6Reader {
dump: TempDir, dump: TempDir,
instance_uid: Option<Uuid>, instance_uid: Option<Uuid>,
metadata: Metadata, metadata: Metadata,
tasks: BufReader<File>, tasks: BufReader<File>,
batches: Option<BufReader<File>>,
keys: BufReader<File>, keys: BufReader<File>,
features: Option<RuntimeTogglableFeatures>, features: Option<RuntimeTogglableFeatures>,
network: Option<Network>,
} }
impl V6Reader { impl V6Reader {
@ -86,38 +77,13 @@ impl V6Reader {
} else { } else {
None None
}; };
let batches = match File::open(dump.path().join("batches").join("queue.jsonl")) {
Ok(file) => Some(BufReader::new(file)),
// The batch file was only introduced in v1.13; dumps prior to that won't have batches
Err(err) if err.kind() == ErrorKind::NotFound => None,
Err(e) => return Err(e.into()),
};
let network_file = match fs::read(dump.path().join("network.json")) {
Ok(network_file) => Some(network_file),
Err(error) => match error.kind() {
// Allow the file to be missing; this only results in the network config being absent.
ErrorKind::NotFound => {
debug!("`network.json` not found in dump");
None
}
_ => return Err(error.into()),
},
};
let network = if let Some(network_file) = network_file {
Some(serde_json::from_reader(&*network_file)?)
} else {
None
};
Ok(V6Reader { Ok(V6Reader {
metadata: serde_json::from_reader(&*meta_file)?, metadata: serde_json::from_reader(&*meta_file)?,
instance_uid, instance_uid,
tasks: BufReader::new(File::open(dump.path().join("tasks").join("queue.jsonl"))?), tasks: BufReader::new(File::open(dump.path().join("tasks").join("queue.jsonl"))?),
batches,
keys: BufReader::new(File::open(dump.path().join("keys.jsonl"))?), keys: BufReader::new(File::open(dump.path().join("keys.jsonl"))?),
features, features,
network,
dump, dump,
}) })
} }
@ -158,7 +124,7 @@ impl V6Reader {
&mut self, &mut self,
) -> Box<dyn Iterator<Item = Result<(Task, Option<Box<super::UpdateFile>>)>> + '_> { ) -> Box<dyn Iterator<Item = Result<(Task, Option<Box<super::UpdateFile>>)>> + '_> {
Box::new((&mut self.tasks).lines().map(|line| -> Result<_> { Box::new((&mut self.tasks).lines().map(|line| -> Result<_> {
let task: Task = serde_json::from_str(&line?)?; let task: Task = serde_json::from_str(&line?).unwrap();
let update_file_path = self let update_file_path = self
.dump .dump
@ -170,7 +136,8 @@ impl V6Reader {
if update_file_path.exists() { if update_file_path.exists() {
Ok(( Ok((
task, task,
Some(Box::new(UpdateFile::new(&update_file_path)?) as Box<super::UpdateFile>), Some(Box::new(UpdateFile::new(&update_file_path).unwrap())
as Box<super::UpdateFile>),
)) ))
} else { } else {
Ok((task, None)) Ok((task, None))
@ -178,57 +145,15 @@ impl V6Reader {
})) }))
} }
pub fn batches(&mut self) -> Box<dyn Iterator<Item = Result<Batch>> + '_> {
match self.batches.as_mut() {
Some(batches) => Box::new((batches).lines().map(|line| -> Result<_> {
let batch = serde_json::from_str(&line?)?;
Ok(batch)
})),
None => Box::new(std::iter::empty()) as Box<dyn Iterator<Item = Result<Batch>> + '_>,
}
}
pub fn keys(&mut self) -> Box<dyn Iterator<Item = Result<Key>> + '_> { pub fn keys(&mut self) -> Box<dyn Iterator<Item = Result<Key>> + '_> {
Box::new( Box::new(
(&mut self.keys).lines().map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }), (&mut self.keys).lines().map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }),
) )
} }
pub fn chat_completions_settings(
&mut self,
) -> Result<Box<dyn Iterator<Item = Result<(String, ChatCompletionSettings)>> + '_>> {
let entries = match fs::read_dir(self.dump.path().join("chat-completions-settings")) {
Ok(entries) => entries,
Err(e) if e.kind() == ErrorKind::NotFound => return Ok(Box::new(std::iter::empty())),
Err(e) => return Err(e.into()),
};
Ok(Box::new(
entries
.map(|entry| -> Result<Option<_>> {
let entry = entry?;
let file_name = entry.file_name();
let path = Path::new(&file_name);
if entry.file_type()?.is_file() && path.extension() == Some(OsStr::new("json"))
{
let name = path.file_stem().unwrap().to_str().unwrap().to_string();
let file = File::open(entry.path())?;
let settings = serde_json::from_reader(file)?;
Ok(Some((name, settings)))
} else {
Ok(None)
}
})
.filter_map(|entry| entry.transpose()),
))
}
pub fn features(&self) -> Option<RuntimeTogglableFeatures> { pub fn features(&self) -> Option<RuntimeTogglableFeatures> {
self.features self.features
} }
pub fn network(&self) -> Option<&Network> {
self.network.as_ref()
}
} }
pub struct UpdateFile { pub struct UpdateFile {
@ -285,29 +210,7 @@ impl V6IndexReader {
} }
pub fn settings(&mut self) -> Result<Settings<Checked>> { pub fn settings(&mut self) -> Result<Settings<Checked>> {
let mut settings: Settings<Unchecked> = serde_json::from_reader(&mut self.settings)?; let settings: Settings<Unchecked> = serde_json::from_reader(&mut self.settings)?;
patch_embedders(&mut settings);
Ok(settings.check()) Ok(settings.check())
} }
} }
fn patch_embedders(settings: &mut Settings<Unchecked>) {
if let Setting::Set(embedders) = &mut settings.embedders {
for settings in embedders.values_mut() {
let Setting::Set(settings) = &mut settings.inner else {
continue;
};
if settings.source != Setting::Set(milli::vector::settings::EmbedderSource::HuggingFace)
{
continue;
}
settings.pooling = match settings.pooling {
Setting::Set(pooling) => Setting::Set(pooling),
// if the pooling for a hugging face embedder is not set, force it to `forceMean`
// for backward compatibility with v1.13
// dumps created in v1.14 and up will have the setting set for hugging face embedders
Setting::Reset | Setting::NotSet => Setting::Set(OverridePooling::ForceMean),
};
}
}
}
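
The reader changes lean on one pattern twice: open a file that only newer dumps contain (`batches/queue.jsonl`, `network.json`) and treat `NotFound` as "feature absent" rather than an error, so older dumps keep importing. A standalone sketch:

```rust
use std::fs::File;
use std::io::{BufReader, ErrorKind};
use std::path::Path;

// A missing file means the dump predates the feature, not a failure.
fn open_optional(path: &Path) -> std::io::Result<Option<BufReader<File>>> {
    match File::open(path) {
        Ok(file) => Ok(Some(BufReader::new(file))),
        Err(err) if err.kind() == ErrorKind::NotFound => Ok(None),
        Err(err) => Err(err),
    }
}
```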

View File

@ -4,8 +4,7 @@ use std::path::PathBuf;
use flate2::write::GzEncoder; use flate2::write::GzEncoder;
use flate2::Compression; use flate2::Compression;
use meilisearch_types::batches::Batch; use meilisearch_types::features::RuntimeTogglableFeatures;
use meilisearch_types::features::{ChatCompletionSettings, Network, RuntimeTogglableFeatures};
use meilisearch_types::keys::Key; use meilisearch_types::keys::Key;
use meilisearch_types::settings::{Checked, Settings}; use meilisearch_types::settings::{Checked, Settings};
use serde_json::{Map, Value}; use serde_json::{Map, Value};
@ -51,18 +50,10 @@ impl DumpWriter {
KeyWriter::new(self.dir.path().to_path_buf()) KeyWriter::new(self.dir.path().to_path_buf())
} }
pub fn create_chat_completions_settings(&self) -> Result<ChatCompletionsSettingsWriter> {
ChatCompletionsSettingsWriter::new(self.dir.path().join("chat-completions-settings"))
}
pub fn create_tasks_queue(&self) -> Result<TaskWriter> { pub fn create_tasks_queue(&self) -> Result<TaskWriter> {
TaskWriter::new(self.dir.path().join("tasks")) TaskWriter::new(self.dir.path().join("tasks"))
} }
pub fn create_batches_queue(&self) -> Result<BatchWriter> {
BatchWriter::new(self.dir.path().join("batches"))
}
pub fn create_experimental_features(&self, features: RuntimeTogglableFeatures) -> Result<()> { pub fn create_experimental_features(&self, features: RuntimeTogglableFeatures) -> Result<()> {
Ok(std::fs::write( Ok(std::fs::write(
self.dir.path().join("experimental-features.json"), self.dir.path().join("experimental-features.json"),
@ -70,10 +61,6 @@ impl DumpWriter {
)?) )?)
} }
pub fn create_network(&self, network: Network) -> Result<()> {
Ok(std::fs::write(self.dir.path().join("network.json"), serde_json::to_string(&network)?)?)
}
pub fn persist_to(self, mut writer: impl Write) -> Result<()> { pub fn persist_to(self, mut writer: impl Write) -> Result<()> {
let gz_encoder = GzEncoder::new(&mut writer, Compression::default()); let gz_encoder = GzEncoder::new(&mut writer, Compression::default());
let mut tar_encoder = tar::Builder::new(gz_encoder); let mut tar_encoder = tar::Builder::new(gz_encoder);
@ -97,7 +84,7 @@ impl KeyWriter {
} }
pub fn push_key(&mut self, key: &Key) -> Result<()> { pub fn push_key(&mut self, key: &Key) -> Result<()> {
serde_json::to_writer(&mut self.keys, &key)?; self.keys.write_all(&serde_json::to_vec(key)?)?;
self.keys.write_all(b"\n")?; self.keys.write_all(b"\n")?;
Ok(()) Ok(())
} }
@ -108,24 +95,6 @@ impl KeyWriter {
} }
} }
pub struct ChatCompletionsSettingsWriter {
path: PathBuf,
}
impl ChatCompletionsSettingsWriter {
pub(crate) fn new(path: PathBuf) -> Result<Self> {
std::fs::create_dir(&path)?;
Ok(ChatCompletionsSettingsWriter { path })
}
pub fn push_settings(&mut self, name: &str, settings: &ChatCompletionSettings) -> Result<()> {
let mut settings_file = File::create(self.path.join(name).with_extension("json"))?;
serde_json::to_writer(&mut settings_file, &settings)?;
settings_file.flush()?;
Ok(())
}
}
pub struct TaskWriter { pub struct TaskWriter {
queue: BufWriter<File>, queue: BufWriter<File>,
update_files: PathBuf, update_files: PathBuf,
@ -145,7 +114,7 @@ impl TaskWriter {
/// Pushes tasks in the dump. /// Pushes tasks in the dump.
/// If the task has an associated `update_file`, it'll use the `task_id` as its name. /// If the task has an associated `update_file`, it'll use the `task_id` as its name.
pub fn push_task(&mut self, task: &TaskDump) -> Result<UpdateFile> { pub fn push_task(&mut self, task: &TaskDump) -> Result<UpdateFile> {
serde_json::to_writer(&mut self.queue, &task)?; self.queue.write_all(&serde_json::to_vec(task)?)?;
self.queue.write_all(b"\n")?; self.queue.write_all(b"\n")?;
Ok(UpdateFile::new(self.update_files.join(format!("{}.jsonl", task.uid)))) Ok(UpdateFile::new(self.update_files.join(format!("{}.jsonl", task.uid))))
@ -157,30 +126,6 @@ impl TaskWriter {
} }
} }
pub struct BatchWriter {
queue: BufWriter<File>,
}
impl BatchWriter {
pub(crate) fn new(path: PathBuf) -> Result<Self> {
std::fs::create_dir(&path)?;
let queue = File::create(path.join("queue.jsonl"))?;
Ok(BatchWriter { queue: BufWriter::new(queue) })
}
/// Pushes batches in the dump.
pub fn push_batch(&mut self, batch: &Batch) -> Result<()> {
serde_json::to_writer(&mut self.queue, &batch)?;
self.queue.write_all(b"\n")?;
Ok(())
}
pub fn flush(mut self) -> Result<()> {
self.queue.flush()?;
Ok(())
}
}
pub struct UpdateFile { pub struct UpdateFile {
path: PathBuf, path: PathBuf,
writer: Option<BufWriter<File>>, writer: Option<BufWriter<File>>,
@ -192,8 +137,8 @@ impl UpdateFile {
} }
pub fn push_document(&mut self, document: &Document) -> Result<()> { pub fn push_document(&mut self, document: &Document) -> Result<()> {
if let Some(mut writer) = self.writer.as_mut() { if let Some(writer) = self.writer.as_mut() {
serde_json::to_writer(&mut writer, &document)?; writer.write_all(&serde_json::to_vec(document)?)?;
writer.write_all(b"\n")?; writer.write_all(b"\n")?;
} else { } else {
let file = File::create(&self.path).unwrap(); let file = File::create(&self.path).unwrap();
@ -260,8 +205,8 @@ pub(crate) mod test {
use super::*; use super::*;
use crate::reader::Document; use crate::reader::Document;
use crate::test::{ use crate::test::{
create_test_api_keys, create_test_batches, create_test_documents, create_test_dump, create_test_api_keys, create_test_documents, create_test_dump, create_test_instance_uid,
create_test_instance_uid, create_test_settings, create_test_tasks, create_test_settings, create_test_tasks,
}; };
fn create_directory_hierarchy(dir: &Path) -> String { fn create_directory_hierarchy(dir: &Path) -> String {
@ -336,10 +281,8 @@ pub(crate) mod test {
let dump_path = dump.path(); let dump_path = dump.path();
// ==== checking global file hierarchy (we want to be sure there isn't too many files or too few) // ==== checking global file hierarchy (we want to be sure there isn't too many files or too few)
insta::assert_snapshot!(create_directory_hierarchy(dump_path), @r" insta::assert_snapshot!(create_directory_hierarchy(dump_path), @r###"
. .
├---- batches/
│ └---- queue.jsonl
├---- indexes/ ├---- indexes/
│ └---- doggos/ │ └---- doggos/
│ │ ├---- documents.jsonl │ │ ├---- documents.jsonl
@ -352,9 +295,8 @@ pub(crate) mod test {
├---- experimental-features.json ├---- experimental-features.json
├---- instance_uid.uuid ├---- instance_uid.uuid
├---- keys.jsonl ├---- keys.jsonl
├---- metadata.json └---- metadata.json
└---- network.json "###);
");
// ==== checking the top level infos // ==== checking the top level infos
let metadata = fs::read_to_string(dump_path.join("metadata.json")).unwrap(); let metadata = fs::read_to_string(dump_path.join("metadata.json")).unwrap();
@ -407,16 +349,6 @@ pub(crate) mod test {
} }
} }
// ==== checking the batch queue
let batches_queue = fs::read_to_string(dump_path.join("batches/queue.jsonl")).unwrap();
for (batch, expected) in batches_queue.lines().zip(create_test_batches()) {
let mut batch = serde_json::from_str::<Batch>(batch).unwrap();
if batch.details.settings == Some(Box::new(Settings::<Unchecked>::default())) {
batch.details.settings = None;
}
assert_eq!(batch, expected, "{batch:#?}{expected:#?}");
}
// ==== checking the keys // ==== checking the keys
let keys = fs::read_to_string(dump_path.join("keys.jsonl")).unwrap(); let keys = fs::read_to_string(dump_path.join("keys.jsonl")).unwrap();
for (key, expected) in keys.lines().zip(create_test_api_keys()) { for (key, expected) in keys.lines().zip(create_test_api_keys()) {

View File

@ -11,7 +11,7 @@ edition.workspace = true
license.workspace = true license.workspace = true
[dependencies] [dependencies]
tempfile = "3.20.0" tempfile = "3.15.0"
thiserror = "2.0.12" thiserror = "2.0.9"
tracing = "0.1.41" tracing = "0.1.41"
uuid = { version = "1.17.0", features = ["serde", "v4"] } uuid = { version = "1.11.0", features = ["serde", "v4"] }

View File

@ -14,7 +14,7 @@ license.workspace = true
[dependencies] [dependencies]
nom = "7.1.3" nom = "7.1.3"
nom_locate = "4.2.0" nom_locate = "4.2.0"
unescaper = "0.1.6" unescaper = "0.1.5"
[dev-dependencies] [dev-dependencies]
# fixed version due to format breakages in v1.40 # fixed version due to format breakages in v1.40

View File

@ -30,25 +30,6 @@ pub enum Condition<'a> {
StartsWith { keyword: Token<'a>, word: Token<'a> }, StartsWith { keyword: Token<'a>, word: Token<'a> },
} }
impl Condition<'_> {
pub fn operator(&self) -> &str {
match self {
Condition::GreaterThan(_) => ">",
Condition::GreaterThanOrEqual(_) => ">=",
Condition::Equal(_) => "=",
Condition::NotEqual(_) => "!=",
Condition::Null => "IS NULL",
Condition::Empty => "IS EMPTY",
Condition::Exists => "EXISTS",
Condition::LowerThan(_) => "<",
Condition::LowerThanOrEqual(_) => "<=",
Condition::Between { .. } => "TO",
Condition::Contains { .. } => "CONTAINS",
Condition::StartsWith { .. } => "STARTS WITH",
}
}
}
/// condition = value ("==" | ">" ...) value /// condition = value ("==" | ">" ...) value
pub fn parse_condition(input: Span) -> IResult<FilterCondition> { pub fn parse_condition(input: Span) -> IResult<FilterCondition> {
let operator = alt((tag("<="), tag(">="), tag("!="), tag("<"), tag(">"), tag("="))); let operator = alt((tag("<="), tag(">="), tag("!="), tag("<"), tag(">"), tag("=")));
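
The new `Condition::operator` helper (left-hand column) maps every variant back to its surface syntax. A plausible caller, assumed for illustration rather than shown in this diff, would echo the operator in an error message:

```rust
// Hypothetical caller; `Condition` is the enum defined above.
fn unsupported(cond: &Condition<'_>) -> String {
    format!("the `{}` operator is not supported in this position", cond.operator())
}
```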

View File

@ -35,7 +35,7 @@ impl<E> NomErrorExt<E> for nom::Err<E> {
pub fn cut_with_err<'a, O>( pub fn cut_with_err<'a, O>(
mut parser: impl FnMut(Span<'a>) -> IResult<'a, O>, mut parser: impl FnMut(Span<'a>) -> IResult<'a, O>,
mut with: impl FnMut(Error<'a>) -> Error<'a>, mut with: impl FnMut(Error<'a>) -> Error<'a>,
) -> impl FnMut(Span<'a>) -> IResult<'a, O> { ) -> impl FnMut(Span<'a>) -> IResult<O> {
move |input| match parser.parse(input) { move |input| match parser.parse(input) {
Err(nom::Err::Error(e)) => Err(nom::Err::Failure(with(e))), Err(nom::Err::Error(e)) => Err(nom::Err::Failure(with(e))),
rest => rest, rest => rest,
@ -121,7 +121,7 @@ impl<'a> ParseError<Span<'a>> for Error<'a> {
} }
} }
impl Display for Error<'_> { impl<'a> Display for Error<'a> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let input = self.context.fragment(); let input = self.context.fragment();
// When printing our error message we want to escape all `\n` to be sure we keep our format with the // When printing our error message we want to escape all `\n` to be sure we keep our format with the

View File

@ -80,7 +80,7 @@ pub struct Token<'a> {
value: Option<String>, value: Option<String>,
} }
impl PartialEq for Token<'_> { impl<'a> PartialEq for Token<'a> {
fn eq(&self, other: &Self) -> bool { fn eq(&self, other: &Self) -> bool {
self.span.fragment() == other.span.fragment() self.span.fragment() == other.span.fragment()
} }
@ -226,7 +226,7 @@ impl<'a> FilterCondition<'a> {
} }
} }
pub fn parse(input: &'a str) -> Result<Option<Self>, Error<'a>> { pub fn parse(input: &'a str) -> Result<Option<Self>, Error> {
if input.trim().is_empty() { if input.trim().is_empty() {
return Ok(None); return Ok(None);
} }
@ -527,7 +527,7 @@ pub fn parse_filter(input: Span) -> IResult<FilterCondition> {
terminated(|input| parse_expression(input, 0), eof)(input) terminated(|input| parse_expression(input, 0), eof)(input)
} }
impl std::fmt::Display for FilterCondition<'_> { impl<'a> std::fmt::Display for FilterCondition<'a> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self { match self {
FilterCondition::Not(filter) => { FilterCondition::Not(filter) => {
@ -576,8 +576,7 @@ impl std::fmt::Display for FilterCondition<'_> {
} }
} }
} }
impl<'a> std::fmt::Display for Condition<'a> {
impl std::fmt::Display for Condition<'_> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self { match self {
Condition::GreaterThan(token) => write!(f, "> {token}"), Condition::GreaterThan(token) => write!(f, "> {token}"),
@ -595,8 +594,7 @@ impl std::fmt::Display for Condition<'_> {
} }
} }
} }
impl<'a> std::fmt::Display for Token<'a> {
impl std::fmt::Display for Token<'_> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{{{}}}", self.value()) write!(f, "{{{}}}", self.value())
} }

View File

@ -52,7 +52,7 @@ fn quoted_by(quote: char, input: Span) -> IResult<Token> {
} }
// word = (alphanumeric | _ | - | .)+ except for reserved keywords // word = (alphanumeric | _ | - | .)+ except for reserved keywords
pub fn word_not_keyword<'a>(input: Span<'a>) -> IResult<'a, Token<'a>> { pub fn word_not_keyword<'a>(input: Span<'a>) -> IResult<Token<'a>> {
let (input, word): (_, Token<'a>) = let (input, word): (_, Token<'a>) =
take_while1(is_value_component)(input).map(|(s, t)| (s, t.into()))?; take_while1(is_value_component)(input).map(|(s, t)| (s, t.into()))?;
if is_keyword(word.value()) { if is_keyword(word.value()) {

View File

@ -16,7 +16,7 @@ license.workspace = true
serde_json = "1.0" serde_json = "1.0"
[dev-dependencies] [dev-dependencies]
criterion = { version = "0.6.0", features = ["html_reports"] } criterion = { version = "0.5.1", features = ["html_reports"] }
[[bench]] [[bench]]
name = "benchmarks" name = "benchmarks"

View File

@ -12,11 +12,11 @@ license.workspace = true
[dependencies] [dependencies]
arbitrary = { version = "1.4.1", features = ["derive"] } arbitrary = { version = "1.4.1", features = ["derive"] }
bumpalo = "3.18.1" bumpalo = "3.16.0"
clap = { version = "4.5.40", features = ["derive"] } clap = { version = "4.5.24", features = ["derive"] }
either = "1.15.0" either = "1.13.0"
fastrand = "2.3.0" fastrand = "2.3.0"
milli = { path = "../milli" } milli = { path = "../milli" }
serde = { version = "1.0.219", features = ["derive"] } serde = { version = "1.0.217", features = ["derive"] }
serde_json = { version = "1.0.140", features = ["preserve_order"] } serde_json = { version = "1.0.135", features = ["preserve_order"] }
tempfile = "3.20.0" tempfile = "3.15.0"

View File

@ -12,8 +12,8 @@ use milli::documents::mmap_from_objects;
use milli::heed::EnvOpenOptions; use milli::heed::EnvOpenOptions;
use milli::progress::Progress; use milli::progress::Progress;
use milli::update::new::indexer; use milli::update::new::indexer;
use milli::update::IndexerConfig; use milli::update::{IndexDocumentsMethod, IndexerConfig};
use milli::vector::RuntimeEmbedders; use milli::vector::EmbeddingConfigs;
use milli::Index; use milli::Index;
use serde_json::Value; use serde_json::Value;
use tempfile::TempDir; use tempfile::TempDir;
@ -57,14 +57,13 @@ fn main() {
let opt = opt.clone(); let opt = opt.clone();
let handle = std::thread::spawn(move || { let handle = std::thread::spawn(move || {
let options = EnvOpenOptions::new(); let mut options = EnvOpenOptions::new();
let mut options = options.read_txn_without_tls();
options.map_size(1024 * 1024 * 1024 * 1024); options.map_size(1024 * 1024 * 1024 * 1024);
let tempdir = match opt.path { let tempdir = match opt.path {
Some(path) => TempDir::new_in(path).unwrap(), Some(path) => TempDir::new_in(path).unwrap(),
None => TempDir::new().unwrap(), None => TempDir::new().unwrap(),
}; };
let index = Index::new(options, tempdir.path(), true).unwrap(); let index = Index::new(options, tempdir.path()).unwrap();
let indexer_config = IndexerConfig::default(); let indexer_config = IndexerConfig::default();
std::thread::scope(|s| { std::thread::scope(|s| {
@ -89,8 +88,10 @@ fn main() {
let mut new_fields_ids_map = db_fields_ids_map.clone(); let mut new_fields_ids_map = db_fields_ids_map.clone();
let indexer_alloc = Bump::new(); let indexer_alloc = Bump::new();
let embedders = RuntimeEmbedders::default(); let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new(); let mut indexer = indexer::DocumentOperation::new(
IndexDocumentsMethod::ReplaceDocuments,
);
let mut operations = Vec::new(); let mut operations = Vec::new();
for op in batch.0 { for op in batch.0 {
@ -114,7 +115,7 @@ fn main() {
for op in &operations { for op in &operations {
match op { match op {
Either::Left(documents) => { Either::Left(documents) => {
indexer.replace_documents(documents).unwrap() indexer.add_documents(documents).unwrap()
} }
Either::Right(ids) => indexer.delete_documents(ids), Either::Right(ids) => indexer.delete_documents(ids),
} }
@ -144,13 +145,12 @@ fn main() {
embedders, embedders,
&|| false, &|| false,
&Progress::default(), &Progress::default(),
&Default::default(),
) )
.unwrap(); .unwrap();
// after executing a batch we check if the database is corrupted // after executing a batch we check if the database is corrupted
let res = index.search(&wtxn).execute().unwrap(); let res = index.search(&wtxn).execute().unwrap();
index.documents(&wtxn, res.documents_ids).unwrap(); index.compressed_documents(&wtxn, res.documents_ids).unwrap();
progression.fetch_add(1, Ordering::Relaxed); progression.fetch_add(1, Ordering::Relaxed);
} }
wtxn.abort(); wtxn.abort();
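
The fuzzer's setup captures the API drift on the newer side of this compare: `EnvOpenOptions` is consumed into a read-txn-without-TLS builder, and `Index::new` takes an extra boolean whose meaning isn't visible in this diff. Condensed, with call shapes copied from the fuzzer above:

```rust
use std::path::Path;

use milli::heed::EnvOpenOptions;
use milli::Index;

// The trailing `true` flag on Index::new is opaque in this diff;
// treat it as an assumption lifted verbatim from the fuzzer.
fn open_scratch_index(dir: &Path) -> Index {
    let options = EnvOpenOptions::new();
    let mut options = options.read_txn_without_tls();
    options.map_size(1024 * 1024 * 1024 * 1024); // 1 TiB of virtual address space
    Index::new(options, dir, true).unwrap()
}
```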

View File

@@ -11,31 +11,29 @@ edition.workspace = true
 license.workspace = true

 [dependencies]
-anyhow = "1.0.98"
+anyhow = "1.0.95"
 bincode = "1.3.3"
-byte-unit = "5.1.6"
-bumpalo = "3.18.1"
+bumpalo = "3.16.0"
 bumparaw-collections = "0.1.4"
-convert_case = "0.8.0"
+convert_case = "0.6.0"
 csv = "1.3.1"
 derive_builder = "0.20.2"
 dump = { path = "../dump" }
 enum-iterator = "2.1.0"
 file-store = { path = "../file-store" }
-flate2 = "1.1.2"
-indexmap = "2.9.0"
+flate2 = "1.0.35"
 meilisearch-auth = { path = "../meilisearch-auth" }
 meilisearch-types = { path = "../meilisearch-types" }
 memmap2 = "0.9.5"
 page_size = "0.6.0"
 rayon = "1.10.0"
-roaring = { version = "0.10.12", features = ["serde"] }
-serde = { version = "1.0.219", features = ["derive"] }
-serde_json = { version = "1.0.140", features = ["preserve_order"] }
+roaring = { version = "0.10.10", features = ["serde"] }
+serde = { version = "1.0.217", features = ["derive"] }
+serde_json = { version = "1.0.135", features = ["preserve_order"] }
 synchronoise = "1.0.1"
-tempfile = "3.20.0"
-thiserror = "2.0.12"
-time = { version = "0.3.41", features = [
+tempfile = "3.15.0"
+thiserror = "2.0.9"
+time = { version = "0.3.37", features = [
     "serde-well-known",
     "formatting",
     "parsing",
@@ -43,12 +41,12 @@ time = { version = "0.3.41", features = [
 ] }
 tracing = "0.1.41"
 ureq = "2.12.1"
-uuid = { version = "1.17.0", features = ["serde", "v4"] }
-backoff = "0.4.0"
+uuid = { version = "1.11.0", features = ["serde", "v4"] }

 [dev-dependencies]
+arroy = "0.5.0"
 big_s = "1.0.2"
-crossbeam-channel = "0.5.15"
+crossbeam-channel = "0.5.14"
 # fixed version due to format breakages in v1.40
 insta = { version = "=1.39.0", features = ["json", "redactions"] }
 maplit = "1.0.2"
View File
@@ -1,11 +1,8 @@
 use std::collections::HashMap;
-use std::io;

 use dump::{KindDump, TaskDump, UpdateFile};
-use meilisearch_types::batches::{Batch, BatchId};
 use meilisearch_types::heed::RwTxn;
-use meilisearch_types::index_uid_pattern::IndexUidPattern;
-use meilisearch_types::milli;
+use meilisearch_types::milli::documents::DocumentsBatchBuilder;
 use meilisearch_types::tasks::{Kind, KindWithContent, Status, Task};
 use roaring::RoaringBitmap;
 use uuid::Uuid;
@@ -16,15 +13,9 @@ pub struct Dump<'a> {
     index_scheduler: &'a IndexScheduler,
     wtxn: RwTxn<'a>,

-    batch_to_task_mapping: HashMap<BatchId, RoaringBitmap>,
-
     indexes: HashMap<String, RoaringBitmap>,
     statuses: HashMap<Status, RoaringBitmap>,
     kinds: HashMap<Kind, RoaringBitmap>,
-
-    batch_indexes: HashMap<String, RoaringBitmap>,
-    batch_statuses: HashMap<Status, RoaringBitmap>,
-    batch_kinds: HashMap<Kind, RoaringBitmap>,
 }

 impl<'a> Dump<'a> {
@@ -35,72 +26,12 @@ impl<'a> Dump<'a> {
         Ok(Dump {
             index_scheduler,
             wtxn,
-            batch_to_task_mapping: HashMap::new(),
             indexes: HashMap::new(),
             statuses: HashMap::new(),
             kinds: HashMap::new(),
-            batch_indexes: HashMap::new(),
-            batch_statuses: HashMap::new(),
-            batch_kinds: HashMap::new(),
         })
     }

-    /// Register a new batch coming from a dump in the scheduler.
-    /// By taking a mutable ref we're pretty sure no one will ever import a dump while actix is running.
-    pub fn register_dumped_batch(&mut self, batch: Batch) -> Result<()> {
-        self.index_scheduler.queue.batches.all_batches.put(&mut self.wtxn, &batch.uid, &batch)?;
-        if let Some(enqueued_at) = batch.enqueued_at {
-            utils::insert_task_datetime(
-                &mut self.wtxn,
-                self.index_scheduler.queue.batches.enqueued_at,
-                enqueued_at.earliest,
-                batch.uid,
-            )?;
-            utils::insert_task_datetime(
-                &mut self.wtxn,
-                self.index_scheduler.queue.batches.enqueued_at,
-                enqueued_at.oldest,
-                batch.uid,
-            )?;
-        }
-        utils::insert_task_datetime(
-            &mut self.wtxn,
-            self.index_scheduler.queue.batches.started_at,
-            batch.started_at,
-            batch.uid,
-        )?;
-        if let Some(finished_at) = batch.finished_at {
-            utils::insert_task_datetime(
-                &mut self.wtxn,
-                self.index_scheduler.queue.batches.finished_at,
-                finished_at,
-                batch.uid,
-            )?;
-        }
-
-        for index in batch.stats.index_uids.keys() {
-            match self.batch_indexes.get_mut(index) {
-                Some(bitmap) => {
-                    bitmap.insert(batch.uid);
-                }
-                None => {
-                    let mut bitmap = RoaringBitmap::new();
-                    bitmap.insert(batch.uid);
-                    self.batch_indexes.insert(index.to_string(), bitmap);
-                }
-            };
-        }
-
-        for status in batch.stats.status.keys() {
-            self.batch_statuses.entry(*status).or_default().insert(batch.uid);
-        }
-        for kind in batch.stats.types.keys() {
-            self.batch_kinds.entry(*kind).or_default().insert(batch.uid);
-        }
-
-        Ok(())
-    }
-
     /// Register a new task coming from a dump in the scheduler.
     /// By taking a mutable ref we're pretty sure no one will ever import a dump while actix is running.
     pub fn register_dumped_task(
@@ -108,19 +39,14 @@ impl<'a> Dump<'a> {
         task: TaskDump,
         content_file: Option<Box<UpdateFile>>,
     ) -> Result<Task> {
-        let task_has_no_docs = matches!(task.kind, KindDump::DocumentImport { documents_count, .. } if documents_count == 0);
-
         let content_uuid = match content_file {
             Some(content_file) if task.status == Status::Enqueued => {
-                let (uuid, file) = self.index_scheduler.queue.create_update_file(false)?;
-                let mut writer = io::BufWriter::new(file);
+                let (uuid, mut file) = self.index_scheduler.queue.create_update_file(false)?;
+                let mut builder = DocumentsBatchBuilder::new(&mut file);
                 for doc in content_file {
-                    let doc = doc?;
-                    serde_json::to_writer(&mut writer, &doc).map_err(|e| {
-                        Error::from_milli(milli::InternalError::SerdeJson(e).into(), None)
-                    })?;
+                    builder.append_json_object(&doc?)?;
                 }
-                let file = writer.into_inner().map_err(|e| e.into_error())?;
+                builder.into_inner()?;
                 file.persist()?;

                 Some(uuid)
@@ -128,12 +54,6 @@ impl<'a> Dump<'a> {
             // If the task isn't `Enqueued` then just generate a recognisable `Uuid`
             // in case we try to open it later.
             _ if task.status != Status::Enqueued => Some(Uuid::nil()),
-            None if task.status == Status::Enqueued && task_has_no_docs => {
-                let (uuid, file) = self.index_scheduler.queue.create_update_file(false)?;
-                file.persist()?;
-
-                Some(uuid)
-            }
             _ => None,
         };
@@ -212,31 +132,10 @@ impl<'a> Dump<'a> {
                     KindWithContent::DumpCreation { keys, instance_uid }
                 }
                 KindDump::SnapshotCreation => KindWithContent::SnapshotCreation,
-                KindDump::Export { url, api_key, payload_size, indexes } => {
-                    KindWithContent::Export {
-                        url,
-                        api_key,
-                        payload_size,
-                        indexes: indexes
-                            .into_iter()
-                            .map(|(pattern, settings)| {
-                                Ok((
-                                    IndexUidPattern::try_from(pattern)
-                                        .map_err(|_| Error::CorruptedDump)?,
-                                    settings,
-                                ))
-                            })
-                            .collect::<Result<_, Error>>()?,
-                    }
-                }
-                KindDump::UpgradeDatabase { from } => KindWithContent::UpgradeDatabase { from },
             },
         };

         self.index_scheduler.queue.tasks.all_tasks.put(&mut self.wtxn, &task.uid, &task)?;
-        if let Some(batch_id) = task.batch_uid {
-            self.batch_to_task_mapping.entry(batch_id).or_default().insert(task.uid);
-        }

         for index in task.indexes() {
             match self.indexes.get_mut(index) {
@@ -286,14 +185,6 @@ impl<'a> Dump<'a> {
     /// Commit all the changes and exit the importing dump state
     pub fn finish(mut self) -> Result<()> {
-        for (batch_id, task_ids) in self.batch_to_task_mapping {
-            self.index_scheduler.queue.batch_to_tasks_mapping.put(
-                &mut self.wtxn,
-                &batch_id,
-                &task_ids,
-            )?;
-        }
-
         for (index, bitmap) in self.indexes {
             self.index_scheduler.queue.tasks.index_tasks.put(&mut self.wtxn, &index, &bitmap)?;
         }
@@ -304,16 +195,6 @@ impl<'a> Dump<'a> {
             self.index_scheduler.queue.tasks.put_kind(&mut self.wtxn, kind, &bitmap)?;
         }

-        for (index, bitmap) in self.batch_indexes {
-            self.index_scheduler.queue.batches.index_tasks.put(&mut self.wtxn, &index, &bitmap)?;
-        }
-        for (status, bitmap) in self.batch_statuses {
-            self.index_scheduler.queue.batches.put_status(&mut self.wtxn, status, &bitmap)?;
-        }
-        for (kind, bitmap) in self.batch_kinds {
-            self.index_scheduler.queue.batches.put_kind(&mut self.wtxn, kind, &bitmap)?;
-        }
-
         self.wtxn.commit()?;
         self.index_scheduler.scheduler.wake_up.signal();
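The two sides of the update-file hunk serialize documents differently: one streams each document through a `BufWriter` with `serde_json::to_writer`, the other goes through `DocumentsBatchBuilder`. A minimal sketch of the buffered-writer pattern, assuming only `serde_json` as a dependency (the newline separator is an assumption of this sketch, not a claim about the real file format):

use std::io::{self, Write};

// Stream documents as JSON through a buffered writer, then recover the
// inner writer, mirroring the hunk's `into_inner` + `into_error` calls.
fn write_documents<W: Write>(file: W, docs: &[serde_json::Value]) -> io::Result<W> {
    let mut writer = io::BufWriter::new(file);
    for doc in docs {
        serde_json::to_writer(&mut writer, doc)?; // serde_json::Error converts into io::Error
        writer.write_all(b"\n")?; // one document per line (sketch assumption)
    }
    writer.into_inner().map_err(|e| e.into_error())
}

fn main() -> io::Result<()> {
    let docs = vec![serde_json::json!({"id": 1}), serde_json::json!({"id": 2})];
    let out = write_documents(Vec::new(), &docs)?;
    assert_eq!(out, b"{\"id\":1}\n{\"id\":2}\n");
    Ok(())
}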
View File
@@ -2,7 +2,6 @@ use std::fmt::Display;

 use meilisearch_types::batches::BatchId;
 use meilisearch_types::error::{Code, ErrorCode};
-use meilisearch_types::milli::index::RollbackOutcome;
 use meilisearch_types::tasks::{Kind, Status};
 use meilisearch_types::{heed, milli};
 use thiserror::Error;
@@ -110,8 +109,6 @@ pub enum Error {
     InvalidIndexUid { index_uid: String },
     #[error("Task `{0}` not found.")]
     TaskNotFound(TaskId),
-    #[error("Task `{0}` does not contain any documents. Only `documentAdditionOrUpdate` tasks with the statuses `enqueued` or `processing` contain documents")]
-    TaskFileNotFound(TaskId),
     #[error("Batch `{0}` not found.")]
     BatchNotFound(BatchId),
     #[error("Query parameters to filter the tasks to delete are missing. Available query parameters are: `uids`, `indexUids`, `statuses`, `types`, `canceledBy`, `beforeEnqueuedAt`, `afterEnqueuedAt`, `beforeStartedAt`, `afterStartedAt`, `beforeFinishedAt`, `afterFinishedAt`.")]
@@ -130,8 +127,8 @@ pub enum Error {
             _ => format!("{error}")
         })]
     Milli { error: milli::Error, index_uid: Option<String> },
-    #[error("An unexpected crash occurred when processing the task: {0}")]
-    ProcessBatchPanicked(String),
+    #[error("An unexpected crash occurred when processing the task.")]
+    ProcessBatchPanicked,
     #[error(transparent)]
     FileStore(#[from] file_store::Error),
     #[error(transparent)]
@@ -150,29 +147,7 @@ pub enum Error {
     #[error("Corrupted task queue.")]
     CorruptedTaskQueue,
     #[error(transparent)]
-    DatabaseUpgrade(Box<Self>),
-    #[error(transparent)]
-    Export(Box<Self>),
-    #[error("Failed to export documents to remote server {code} ({type}): {message} <{link}>")]
-    FromRemoteWhenExporting { message: String, code: String, r#type: String, link: String },
-    #[error("Failed to rollback for index `{index}`: {rollback_outcome} ")]
-    RollbackFailed { index: String, rollback_outcome: RollbackOutcome },
-    #[error(transparent)]
-    UnrecoverableError(Box<Self>),
-    #[error("The index scheduler is in version v{}.{}.{}, but Meilisearch is in version v{}.{}.{}.\n - hint: start the correct version of Meilisearch, or consider updating your database. See also <https://www.meilisearch.com/docs/learn/update_and_migration/updating>",
-        index_scheduler_version.0, index_scheduler_version.1, index_scheduler_version.2,
-        package_version.0, package_version.1, package_version.2)]
-    IndexSchedulerVersionMismatch {
-        index_scheduler_version: (u32, u32, u32),
-        package_version: (u32, u32, u32),
-    },
-    #[error("Index `{index}` is in version v{}.{}.{}, but Meilisearch is in version v{}.{}.{}.\n - note: this is an internal error, please consider filing a bug report: <https://github.com/meilisearch/meilisearch/issues/new?template=bug_report.md>",
-        index_version.0, index_version.1, index_version.2, package_version.0, package_version.1, package_version.2)]
-    IndexVersionMismatch {
-        index: String,
-        index_version: (u32, u32, u32),
-        package_version: (u32, u32, u32),
-    },
+    TaskDatabaseUpdate(Box<Self>),
     #[error(transparent)]
     HeedTransaction(heed::Error),
@@ -212,29 +187,22 @@ impl Error {
             | Error::InvalidTaskCanceledBy { .. }
             | Error::InvalidIndexUid { .. }
             | Error::TaskNotFound(_)
-            | Error::TaskFileNotFound(_)
             | Error::BatchNotFound(_)
             | Error::TaskDeletionWithEmptyQuery
             | Error::TaskCancelationWithEmptyQuery
-            | Error::FromRemoteWhenExporting { .. }
             | Error::AbortedTask
             | Error::Dump(_)
             | Error::Heed(_)
             | Error::Milli { .. }
-            | Error::ProcessBatchPanicked(_)
+            | Error::ProcessBatchPanicked
             | Error::FileStore(_)
             | Error::IoError(_)
             | Error::Persist(_)
             | Error::FeatureNotEnabled(_)
-            | Error::Export(_)
             | Error::Anyhow(_) => true,
             Error::CreateBatch(_)
             | Error::CorruptedTaskQueue
-            | Error::DatabaseUpgrade(_)
-            | Error::UnrecoverableError(_)
-            | Error::IndexSchedulerVersionMismatch { .. }
-            | Error::IndexVersionMismatch { .. }
-            | Error::RollbackFailed { .. }
+            | Error::TaskDatabaseUpdate(_)
             | Error::HeedTransaction(_) => false,
             #[cfg(test)]
             Error::PlannedFailure => false,
@@ -279,7 +247,6 @@ impl ErrorCode for Error {
             Error::InvalidTaskCanceledBy { .. } => Code::InvalidTaskCanceledBy,
             Error::InvalidIndexUid { .. } => Code::InvalidIndexUid,
             Error::TaskNotFound(_) => Code::TaskNotFound,
-            Error::TaskFileNotFound(_) => Code::TaskFileNotFound,
             Error::BatchNotFound(_) => Code::BatchNotFound,
             Error::TaskDeletionWithEmptyQuery => Code::MissingTaskFilters,
             Error::TaskCancelationWithEmptyQuery => Code::MissingTaskFilters,
@@ -287,8 +254,7 @@ impl ErrorCode for Error {
             Error::NoSpaceLeftInTaskQueue => Code::NoSpaceLeftOnDevice,
             Error::Dump(e) => e.error_code(),
             Error::Milli { error, .. } => error.error_code(),
-            Error::ProcessBatchPanicked(_) => Code::Internal,
-            Error::FromRemoteWhenExporting { .. } => Code::Internal,
+            Error::ProcessBatchPanicked => Code::Internal,
             Error::Heed(e) => e.error_code(),
             Error::HeedTransaction(e) => e.error_code(),
             Error::FileStore(e) => e.error_code(),
@@ -300,12 +266,7 @@ impl ErrorCode for Error {
             Error::Anyhow(_) => Code::Internal,
             Error::CorruptedTaskQueue => Code::Internal,
             Error::CorruptedDump => Code::Internal,
-            Error::DatabaseUpgrade(_) => Code::Internal,
-            Error::Export(_) => Code::Internal,
-            Error::RollbackFailed { .. } => Code::Internal,
-            Error::UnrecoverableError(_) => Code::Internal,
-            Error::IndexSchedulerVersionMismatch { .. } => Code::Internal,
-            Error::IndexVersionMismatch { .. } => Code::Internal,
+            Error::TaskDatabaseUpdate(_) => Code::Internal,
             Error::CreateBatch(_) => Code::Internal,

             // This one should never be seen by the end user
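The `ProcessBatchPanicked` hunks show a payload being added so the captured panic message reaches the user instead of a fixed string. A reduced sketch of the two `thiserror` shapes (the `SchedulerError` name is hypothetical; `thiserror` itself appears in the Cargo.toml hunk above):

use thiserror::Error;

// Unit variant: the message is fixed at compile time.
// Tuple variant: `{0}` interpolates the captured panic message.
#[derive(Debug, Error)]
enum SchedulerError {
    #[error("An unexpected crash occurred when processing the task.")]
    Panicked,
    #[error("An unexpected crash occurred when processing the task: {0}")]
    PanickedWith(String),
}

fn main() {
    let e = SchedulerError::PanickedWith("index out of bounds".into());
    assert_eq!(
        e.to_string(),
        "An unexpected crash occurred when processing the task: index out of bounds"
    );
    println!("{}", SchedulerError::Panicked);
}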
View File
@@ -1,29 +1,18 @@
 use std::sync::{Arc, RwLock};

-use meilisearch_types::features::{InstanceTogglableFeatures, Network, RuntimeTogglableFeatures};
+use meilisearch_types::features::{InstanceTogglableFeatures, RuntimeTogglableFeatures};
 use meilisearch_types::heed::types::{SerdeJson, Str};
-use meilisearch_types::heed::{Database, Env, RwTxn, WithoutTls};
+use meilisearch_types::heed::{Database, Env, RwTxn};

 use crate::error::FeatureNotEnabledError;
 use crate::Result;

-/// The number of database used by features
-const NUMBER_OF_DATABASES: u32 = 1;
-/// Database const names for the `FeatureData`.
-mod db_name {
-    pub const EXPERIMENTAL_FEATURES: &str = "experimental-features";
-}
-
-mod db_keys {
-    pub const EXPERIMENTAL_FEATURES: &str = "experimental-features";
-    pub const NETWORK: &str = "network";
-}
+const EXPERIMENTAL_FEATURES: &str = "experimental-features";

 #[derive(Clone)]
 pub(crate) struct FeatureData {
     persisted: Database<Str, SerdeJson<RuntimeTogglableFeatures>>,
     runtime: Arc<RwLock<RuntimeTogglableFeatures>>,
-    network: Arc<RwLock<Network>>,
 }

 #[derive(Debug, Clone, Copy)]
@@ -92,88 +81,17 @@ impl RoFeatures {
             .into())
         }
     }
-
-    pub fn check_network(&self, disabled_action: &'static str) -> Result<()> {
-        if self.runtime.network {
-            Ok(())
-        } else {
-            Err(FeatureNotEnabledError {
-                disabled_action,
-                feature: "network",
-                issue_link: "https://github.com/orgs/meilisearch/discussions/805",
-            }
-            .into())
-        }
-    }
-
-    pub fn check_get_task_documents_route(&self) -> Result<()> {
-        if self.runtime.get_task_documents_route {
-            Ok(())
-        } else {
-            Err(FeatureNotEnabledError {
-                disabled_action: "Getting the documents of an enqueued task",
-                feature: "get task documents route",
-                issue_link: "https://github.com/orgs/meilisearch/discussions/808",
-            }
-            .into())
-        }
-    }
-
-    pub fn check_composite_embedders(&self, disabled_action: &'static str) -> Result<()> {
-        if self.runtime.composite_embedders {
-            Ok(())
-        } else {
-            Err(FeatureNotEnabledError {
-                disabled_action,
-                feature: "composite embedders",
-                issue_link: "https://github.com/orgs/meilisearch/discussions/816",
-            }
-            .into())
-        }
-    }
-
-    pub fn check_chat_completions(&self, disabled_action: &'static str) -> Result<()> {
-        if self.runtime.chat_completions {
-            Ok(())
-        } else {
-            Err(FeatureNotEnabledError {
-                disabled_action,
-                feature: "chat completions",
-                issue_link: "https://github.com/orgs/meilisearch/discussions/835",
-            }
-            .into())
-        }
-    }
-
-    pub fn check_multimodal(&self, disabled_action: &'static str) -> Result<()> {
-        if self.runtime.multimodal {
-            Ok(())
-        } else {
-            Err(FeatureNotEnabledError {
-                disabled_action,
-                feature: "multimodal",
-                issue_link: "https://github.com/orgs/meilisearch/discussions/846",
-            }
-            .into())
-        }
-    }
 }

 impl FeatureData {
-    pub(crate) const fn nb_db() -> u32 {
-        NUMBER_OF_DATABASES
-    }
-
-    pub fn new(
-        env: &Env<WithoutTls>,
-        wtxn: &mut RwTxn,
-        instance_features: InstanceTogglableFeatures,
-    ) -> Result<Self> {
-        let runtime_features_db =
-            env.create_database(wtxn, Some(db_name::EXPERIMENTAL_FEATURES))?;
+    pub fn new(env: &Env, instance_features: InstanceTogglableFeatures) -> Result<Self> {
+        let mut wtxn = env.write_txn()?;
+        let runtime_features_db = env.create_database(&mut wtxn, Some(EXPERIMENTAL_FEATURES))?;
+        wtxn.commit()?;

+        let txn = env.read_txn()?;
         let persisted_features: RuntimeTogglableFeatures =
-            runtime_features_db.get(wtxn, db_keys::EXPERIMENTAL_FEATURES)?.unwrap_or_default();
+            runtime_features_db.get(&txn, EXPERIMENTAL_FEATURES)?.unwrap_or_default();
         let InstanceTogglableFeatures { metrics, logs_route, contains_filter } = instance_features;
         let runtime = Arc::new(RwLock::new(RuntimeTogglableFeatures {
             metrics: metrics || persisted_features.metrics,
@@ -182,14 +100,7 @@ impl FeatureData {
             ..persisted_features
         }));

-        let network_db = runtime_features_db.remap_data_type::<SerdeJson<Network>>();
-        let network: Network = network_db.get(wtxn, db_keys::NETWORK)?.unwrap_or_default();
-
-        Ok(Self {
-            persisted: runtime_features_db,
-            runtime,
-            network: Arc::new(RwLock::new(network)),
-        })
+        Ok(Self { persisted: runtime_features_db, runtime })
     }

     pub fn put_runtime_features(
@@ -197,7 +108,7 @@ impl FeatureData {
         mut wtxn: RwTxn,
         features: RuntimeTogglableFeatures,
     ) -> Result<()> {
-        self.persisted.put(&mut wtxn, db_keys::EXPERIMENTAL_FEATURES, &features)?;
+        self.persisted.put(&mut wtxn, EXPERIMENTAL_FEATURES, &features)?;
         wtxn.commit()?;

         // safe to unwrap, the lock will only fail if:
@@ -218,21 +129,4 @@ impl FeatureData {
     pub fn features(&self) -> RoFeatures {
         RoFeatures::new(self)
     }
-
-    pub fn put_network(&self, mut wtxn: RwTxn, new_network: Network) -> Result<()> {
-        self.persisted.remap_data_type::<SerdeJson<Network>>().put(
-            &mut wtxn,
-            db_keys::NETWORK,
-            &new_network,
-        )?;
-        wtxn.commit()?;
-
-        let mut network = self.network.write().unwrap();
-        *network = new_network;
-        Ok(())
-    }
-
-    pub fn network(&self) -> Network {
-        Network::clone(&*self.network.read().unwrap())
-    }
 }
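The removed `put_network` shows one LMDB database serving two value types by remapping the data codec under a second key. A std-only sketch of that idea, with hypothetical `Features`/`Network` payloads (the `remotes` field is an assumption, the real `Network` fields are not shown in this diff):

use std::collections::HashMap;
use serde::{Deserialize, Serialize};

#[derive(Debug, Default, PartialEq, Serialize, Deserialize)]
struct Features { metrics: bool }

#[derive(Debug, Default, PartialEq, Serialize, Deserialize)]
struct Network { remotes: Vec<String> } // field name is an assumption

// One byte-oriented store, two typed accessors distinguished by key: the
// same shape as `remap_data_type::<SerdeJson<Network>>()` over one database.
#[derive(Default)]
struct Store(HashMap<&'static str, Vec<u8>>);

impl Store {
    fn put<T: Serialize>(&mut self, key: &'static str, value: &T) {
        self.0.insert(key, serde_json::to_vec(value).unwrap());
    }
    fn get<T: Default + for<'de> Deserialize<'de>>(&self, key: &str) -> T {
        self.0.get(key).map(|b| serde_json::from_slice(b).unwrap()).unwrap_or_default()
    }
}

fn main() {
    let mut store = Store::default();
    store.put("experimental-features", &Features { metrics: true });
    store.put("network", &Network { remotes: vec!["ms-0".into()] });
    assert_eq!(store.get::<Features>("experimental-features"), Features { metrics: true });
    assert_eq!(store.get::<Network>("network"), Network { remotes: vec!["ms-0".into()] });
}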
View File
@@ -1,7 +1,5 @@
 use std::collections::BTreeMap;
-use std::env::VarError;
 use std::path::Path;
-use std::str::FromStr;
 use std::time::Duration;

 use meilisearch_types::heed::{EnvClosingEvent, EnvFlags, EnvOpenOptions};
@@ -104,7 +102,7 @@ impl ReopenableIndex {
                 return Ok(());
             }
             map.unavailable.remove(&self.uuid);
-            map.create(&self.uuid, path, None, self.enable_mdb_writemap, self.map_size, false)?;
+            map.create(&self.uuid, path, None, self.enable_mdb_writemap, self.map_size)?;
         }
         Ok(())
     }
@@ -173,12 +171,11 @@ impl IndexMap {
         date: Option<(OffsetDateTime, OffsetDateTime)>,
         enable_mdb_writemap: bool,
         map_size: usize,
-        creation: bool,
     ) -> Result<Index> {
         if !matches!(self.get_unavailable(uuid), Missing) {
             panic!("Attempt to open an index that was unavailable");
         }
-        let index = create_or_open_index(path, date, enable_mdb_writemap, map_size, creation)?;
+        let index = create_or_open_index(path, date, enable_mdb_writemap, map_size)?;
         match self.available.insert(*uuid, index.clone()) {
             InsertionOutcome::InsertedNew => (),
             InsertionOutcome::Evicted(evicted_uuid, evicted_index) => {
@@ -302,31 +299,18 @@ fn create_or_open_index(
     date: Option<(OffsetDateTime, OffsetDateTime)>,
     enable_mdb_writemap: bool,
     map_size: usize,
-    creation: bool,
 ) -> Result<Index> {
-    let options = EnvOpenOptions::new();
-    let mut options = options.read_txn_without_tls();
+    let mut options = EnvOpenOptions::new();
     options.map_size(clamp_to_page_size(map_size));
-
-    // You can find more details about this experimental
-    // environment variable on the following GitHub discussion:
-    // <https://github.com/orgs/meilisearch/discussions/806>
-    let max_readers = match std::env::var("MEILI_EXPERIMENTAL_INDEX_MAX_READERS") {
-        Ok(value) => u32::from_str(&value).unwrap(),
-        Err(VarError::NotPresent) => 1024,
-        Err(VarError::NotUnicode(value)) => panic!(
-            "Invalid unicode for the `MEILI_EXPERIMENTAL_INDEX_MAX_READERS` env var: {value:?}"
-        ),
-    };
-    options.max_readers(max_readers);
+    options.max_readers(1024);
     if enable_mdb_writemap {
         unsafe { options.flags(EnvFlags::WRITE_MAP) };
     }

     if let Some((created, updated)) = date {
-        Ok(Index::new_with_creation_dates(options, path, created, updated, creation)?)
+        Ok(Index::new_with_creation_dates(options, path, created, updated)?)
     } else {
-        Ok(Index::new(options, path, creation)?)
+        Ok(Index::new(options, path)?)
     }
 }

@@ -334,7 +318,7 @@ fn create_or_open_index(
 #[cfg(test)]
 mod tests {

-    use meilisearch_types::heed::{Env, WithoutTls};
+    use meilisearch_types::heed::Env;
     use meilisearch_types::Index;
     use uuid::Uuid;

@@ -344,7 +328,7 @@ mod tests {
     use crate::IndexScheduler;

     impl IndexMapper {
-        fn test() -> (Self, Env<WithoutTls>, IndexSchedulerHandle) {
+        fn test() -> (Self, Env, IndexSchedulerHandle) {
             let (index_scheduler, handle) = IndexScheduler::test(true, vec![]);
             (index_scheduler.index_mapper, index_scheduler.env, handle)
         }
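The removed `MEILI_EXPERIMENTAL_INDEX_MAX_READERS` block distinguishes the three outcomes of `std::env::var`. A small std-only sketch of the same handling (the panic messages are paraphrased, and the default of 1024 is taken from the hunk itself):

use std::env::VarError;
use std::str::FromStr;

// Absent variable: fall back to 1024; present: must parse as u32;
// non-unicode: abort, mirroring the removed block.
fn max_readers_from_env(var: &str) -> u32 {
    match std::env::var(var) {
        Ok(value) => u32::from_str(&value)
            .unwrap_or_else(|_| panic!("Invalid `{var}` value: {value:?}")),
        Err(VarError::NotPresent) => 1024,
        Err(VarError::NotUnicode(value)) => {
            panic!("Invalid unicode for the `{var}` env var: {value:?}")
        }
    }
}

fn main() {
    // With the variable unset this resolves to the default.
    println!("{}", max_readers_from_env("MEILI_EXPERIMENTAL_INDEX_MAX_READERS"));
}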
View File
@@ -4,10 +4,8 @@ use std::time::Duration;
 use std::{fs, thread};

 use meilisearch_types::heed::types::{SerdeJson, Str};
-use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn, WithoutTls};
+use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
 use meilisearch_types::milli;
-use meilisearch_types::milli::database_stats::DatabaseStats;
-use meilisearch_types::milli::index::RollbackOutcome;
 use meilisearch_types::milli::update::IndexerConfig;
 use meilisearch_types::milli::{FieldDistribution, Index};
 use serde::{Deserialize, Serialize};
@@ -22,13 +20,8 @@ use crate::{Error, IndexBudget, IndexSchedulerOptions, Result};

 mod index_map;

-/// The number of database used by index mapper
-const NUMBER_OF_DATABASES: u32 = 2;
-/// Database const names for the `IndexMapper`.
-mod db_name {
-    pub const INDEX_MAPPING: &str = "index-mapping";
-    pub const INDEX_STATS: &str = "index-stats";
-}
+const INDEX_MAPPING: &str = "index-mapping";
+const INDEX_STATS: &str = "index-stats";

 /// Structure managing meilisearch's indexes.
 ///
@@ -100,25 +93,14 @@ pub enum IndexStatus {
 /// The statistics that can be computed from an `Index` object.
 #[derive(Serialize, Deserialize, Debug)]
 pub struct IndexStats {
-    /// Stats of the documents database.
-    #[serde(default)]
-    pub documents_database_stats: DatabaseStats,
-
-    #[serde(default, skip_serializing)]
-    pub number_of_documents: Option<u64>,
-
+    /// Number of documents in the index.
+    pub number_of_documents: u64,
     /// Size taken up by the index' DB, in bytes.
     ///
     /// This includes the size taken by both the used and free pages of the DB, and as the free pages
     /// are not returned to the disk after a deletion, this number is typically larger than
     /// `used_database_size` that only includes the size of the used pages.
     pub database_size: u64,
-    /// Number of embeddings in the index.
-    /// Option: retrocompatible with the stats of the pre-v1.13.0 versions of meilisearch
-    pub number_of_embeddings: Option<u64>,
-    /// Number of embedded documents in the index.
-    /// Option: retrocompatible with the stats of the pre-v1.13.0 versions of meilisearch
-    pub number_of_embedded_documents: Option<u64>,
     /// Size taken by the used pages of the index' DB, in bytes.
     ///
     /// As the DB backend does not return to the disk the pages that are not currently used by the DB,
@@ -143,12 +125,8 @@ impl IndexStats {
     ///
     /// - rtxn: a RO transaction for the index, obtained from `Index::read_txn()`.
     pub fn new(index: &Index, rtxn: &RoTxn) -> milli::Result<Self> {
-        let arroy_stats = index.arroy_stats(rtxn)?;
         Ok(IndexStats {
-            number_of_embeddings: Some(arroy_stats.number_of_embeddings),
-            number_of_embedded_documents: Some(arroy_stats.documents.len()),
-            documents_database_stats: index.documents_stats(rtxn)?.unwrap_or_default(),
-            number_of_documents: None,
+            number_of_documents: index.number_of_documents(rtxn)?,
             database_size: index.on_disk_size()?,
             used_database_size: index.used_size()?,
             primary_key: index.primary_key(rtxn)?.map(|s| s.to_string()),
@@ -160,20 +138,16 @@ impl IndexStats {
 }

 impl IndexMapper {
-    pub(crate) const fn nb_db() -> u32 {
-        NUMBER_OF_DATABASES
-    }
-
     pub fn new(
-        env: &Env<WithoutTls>,
+        env: &Env,
         wtxn: &mut RwTxn,
         options: &IndexSchedulerOptions,
         budget: IndexBudget,
     ) -> Result<Self> {
         Ok(Self {
             index_map: Arc::new(RwLock::new(IndexMap::new(budget.index_count))),
-            index_mapping: env.create_database(wtxn, Some(db_name::INDEX_MAPPING))?,
-            index_stats: env.create_database(wtxn, Some(db_name::INDEX_STATS))?,
+            index_mapping: env.create_database(wtxn, Some(INDEX_MAPPING))?,
+            index_stats: env.create_database(wtxn, Some(INDEX_STATS))?,
             base_path: options.indexes_path.clone(),
             index_base_map_size: budget.map_size,
             index_growth_amount: options.index_growth_amount,
@@ -215,7 +189,6 @@ impl IndexMapper {
                     date,
                     self.enable_mdb_writemap,
                     self.index_base_map_size,
-                    true,
                 )
                 .map_err(|e| Error::from_milli(e, Some(uuid.to_string())))?;
             let index_rtxn = index.read_txn()?;
@@ -414,7 +387,6 @@ impl IndexMapper {
                         None,
                         self.enable_mdb_writemap,
                         self.index_base_map_size,
-                        false,
                     )
                     .map_err(|e| Error::from_milli(e, Some(uuid.to_string())))?;
                 }
@@ -432,51 +404,6 @@ impl IndexMapper {
         Ok(index)
     }

-    pub fn rollback_index(
-        &self,
-        rtxn: &RoTxn,
-        name: &str,
-        to: (u32, u32, u32),
-    ) -> Result<RollbackOutcome> {
-        // remove any currently updating index to make sure that we aren't keeping a reference to the index somewhere
-        drop(self.currently_updating_index.write().unwrap().take());
-
-        let uuid = self
-            .index_mapping
-            .get(rtxn, name)?
-            .ok_or_else(|| Error::IndexNotFound(name.to_string()))?;
-
-        // take the lock to make sure noone is messing with the indexes while we rollback
-        // this will block any search or other operation, but we are rollbacking so this is probably acceptable.
-        let mut index_map = self.index_map.write().unwrap();
-
-        'close_index: loop {
-            match index_map.get(&uuid) {
-                Available(_) => {
-                    index_map.close_for_resize(&uuid, self.enable_mdb_writemap, 0);
-                    // index should now be `Closing`; try again
-                    continue;
-                }
-                // index already closed
-                Missing => break 'close_index,
-                // closing requested by this thread or another one; wait for closing to complete, then exit
-                Closing(closing_index) => {
-                    if closing_index.wait_timeout(Duration::from_secs(100)).is_none() {
-                        // release the lock so it doesn't get poisoned
-                        drop(index_map);
-                        panic!("cannot close index")
-                    }
-                    break;
-                }
-                BeingDeleted => return Err(Error::IndexNotFound(name.to_string())),
-            };
-        }
-
-        let index_path = self.base_path.join(uuid.to_string());
-        Index::rollback(milli::heed::EnvOpenOptions::new().read_txn_without_tls(), index_path, to)
-            .map_err(|err| crate::Error::from_milli(err, Some(name.to_string())))
-    }
-
     /// Attempts `f` for each index that exists in the index mapper.
     ///
     /// It is preferable to use this function rather than a loop that opens all indexes, as a way to avoid having all indexes opened,
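The `IndexStats` hunk above relies on serde to keep stats written by pre-v1.13.0 versions readable: new fields take `#[serde(default)]` and counters that may be absent become `Option`s. A reduced sketch of that retrocompatibility trick with `serde` and `serde_json` (field set trimmed to three fields):

use serde::Deserialize;

#[derive(Debug, Default, Deserialize)]
struct DatabaseStats { number_of_entries: u64 }

#[derive(Debug, Deserialize)]
struct IndexStats {
    // Missing from old payloads: falls back to `DatabaseStats::default()`.
    #[serde(default)]
    documents_database_stats: DatabaseStats,
    // `None` when the payload predates the field.
    number_of_embeddings: Option<u64>,
    database_size: u64,
}

fn main() {
    // A stats payload in the old shape still deserializes.
    let old = r#"{ "database_size": 4096 }"#;
    let stats: IndexStats = serde_json::from_str(old).unwrap();
    assert_eq!(stats.documents_database_stats.number_of_entries, 0);
    assert_eq!(stats.number_of_embeddings, None);
    assert_eq!(stats.database_size, 4096);
}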
View File
@@ -1,12 +1,11 @@
 use std::collections::BTreeSet;
 use std::fmt::Write;

-use meilisearch_types::batches::{Batch, BatchEnqueuedAt, BatchStats};
+use meilisearch_types::batches::Batch;
 use meilisearch_types::heed::types::{SerdeBincode, SerdeJson, Str};
 use meilisearch_types::heed::{Database, RoTxn};
 use meilisearch_types::milli::{CboRoaringBitmapCodec, RoaringBitmapCodec, BEU32};
 use meilisearch_types::tasks::{Details, Kind, Status, Task};
-use meilisearch_types::versioning;
 use roaring::RoaringBitmap;

 use crate::index_mapper::IndexMapper;
@@ -22,7 +21,6 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
         cleanup_enabled: _,
         processing_tasks,
         env,
-        version,
         queue,
         scheduler,
@@ -34,20 +32,12 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
         planned_failures: _,
         run_loop_iteration: _,
         embedders: _,
-        chat_settings: _,
     } = scheduler;

     let rtxn = env.read_txn().unwrap();

     let mut snap = String::new();

-    let indx_sched_version = version.get_version(&rtxn).unwrap();
-    let latest_version =
-        (versioning::VERSION_MAJOR, versioning::VERSION_MINOR, versioning::VERSION_PATCH);
-    if indx_sched_version != Some(latest_version) {
-        snap.push_str(&format!("index scheduler running on version {indx_sched_version:?}\n"));
-    }
-
     let processing = processing_tasks.read().unwrap().clone();
     snap.push_str(&format!("### Autobatching Enabled = {}\n", scheduler.autobatching_enabled));
     snap.push_str(&format!(
@@ -289,12 +279,6 @@ fn snapshot_details(d: &Details) -> String {
         Details::IndexSwap { swaps } => {
             format!("{{ swaps: {swaps:?} }}")
         }
-        Details::Export { url, api_key, payload_size, indexes } => {
-            format!("{{ url: {url:?}, api_key: {api_key:?}, payload_size: {payload_size:?}, indexes: {indexes:?} }}")
-        }
-        Details::UpgradeDatabase { from, to } => {
-            format!("{{ from: {from:?}, to: {to:?} }}")
-        }
     }
 }

@@ -342,41 +326,14 @@ pub fn snapshot_canceled_by(rtxn: &RoTxn, db: Database<BEU32, RoaringBitmapCodec
 pub fn snapshot_batch(batch: &Batch) -> String {
     let mut snap = String::new();
-    let Batch {
-        uid,
-        details,
-        stats,
-        embedder_stats,
-        started_at,
-        finished_at,
-        progress: _,
-        enqueued_at,
-        stop_reason,
-    } = batch;
-    let stats = BatchStats {
-        progress_trace: Default::default(),
-        internal_database_sizes: Default::default(),
-        write_channel_congestion: None,
-        ..stats.clone()
-    };
+    let Batch { uid, details, stats, started_at, finished_at, progress: _ } = batch;
     if let Some(finished_at) = finished_at {
         assert!(finished_at > started_at);
     }
-    let BatchEnqueuedAt { earliest, oldest } = enqueued_at.unwrap();
-    assert!(*started_at > earliest);
-    assert!(earliest >= oldest);
     snap.push('{');
     snap.push_str(&format!("uid: {uid}, "));
     snap.push_str(&format!("details: {}, ", serde_json::to_string(details).unwrap()));
-    snap.push_str(&format!("stats: {}, ", serde_json::to_string(&stats).unwrap()));
-    if !embedder_stats.skip_serializing() {
-        snap.push_str(&format!(
-            "embedder stats: {}, ",
-            serde_json::to_string(&embedder_stats).unwrap()
-        ));
-    }
-    snap.push_str(&format!("stop reason: {}, ", serde_json::to_string(&stop_reason).unwrap()));
+    snap.push_str(&format!("stats: {}, ", serde_json::to_string(stats).unwrap()));
     snap.push('}');
     snap
 }
@@ -389,8 +346,7 @@ pub fn snapshot_index_mapper(rtxn: &RoTxn, mapper: &IndexMapper) -> String {
         let stats = mapper.stats_of(rtxn, &name).unwrap();
         s.push_str(&format!(
             "{name}: {{ number_of_documents: {}, field_distribution: {:?} }}\n",
-            stats.documents_database_stats.number_of_entries(),
-            stats.field_distribution
+            stats.number_of_documents, stats.field_distribution
         ));
     }
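The removed `snapshot_batch` lines blank out run-dependent fields with struct-update syntax before serializing, so insta snapshots stay stable across runs. A std-only sketch of that pattern (`total_nb_tasks` is a hypothetical field; `progress_trace` and `write_channel_congestion` come from the hunk):

#[derive(Clone, Debug, Default)]
struct BatchStats {
    total_nb_tasks: u64,
    progress_trace: Vec<String>,
    write_channel_congestion: Option<u64>,
}

// Reset the volatile fields, keep everything else via `..stats.clone()`.
fn stable_for_snapshot(stats: &BatchStats) -> BatchStats {
    BatchStats {
        progress_trace: Default::default(),
        write_channel_congestion: None,
        ..stats.clone()
    }
}

fn main() {
    let stats = BatchStats {
        total_nb_tasks: 3,
        progress_trace: vec!["step".into()],
        write_channel_congestion: Some(1),
    };
    let snap = stable_for_snapshot(&stats);
    assert_eq!(snap.total_nb_tasks, 3);
    assert!(snap.progress_trace.is_empty());
    assert_eq!(snap.write_channel_congestion, None);
}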
View File
@@ -30,10 +30,8 @@ mod queue;
 mod scheduler;
 #[cfg(test)]
 mod test_utils;
-pub mod upgrade;
 mod utils;
 pub mod uuid_codec;
-pub mod versioning;

 pub type Result<T, E = Error> = std::result::Result<T, E>;
 pub type TaskId = u32;
@@ -51,37 +49,28 @@ pub use features::RoFeatures;
 use flate2::bufread::GzEncoder;
 use flate2::Compression;
 use meilisearch_types::batches::Batch;
-use meilisearch_types::features::{
-    ChatCompletionSettings, InstanceTogglableFeatures, Network, RuntimeTogglableFeatures,
-};
+use meilisearch_types::features::{InstanceTogglableFeatures, RuntimeTogglableFeatures};
 use meilisearch_types::heed::byteorder::BE;
-use meilisearch_types::heed::types::{DecodeIgnore, SerdeJson, Str, I128};
-use meilisearch_types::heed::{self, Database, Env, RoTxn, WithoutTls};
+use meilisearch_types::heed::types::I128;
+use meilisearch_types::heed::{self, Env, RoTxn};
+use meilisearch_types::milli::index::IndexEmbeddingConfig;
 use meilisearch_types::milli::update::IndexerConfig;
-use meilisearch_types::milli::vector::json_template::JsonTemplate;
-use meilisearch_types::milli::vector::{
-    Embedder, EmbedderOptions, RuntimeEmbedder, RuntimeEmbedders, RuntimeFragment,
-};
+use meilisearch_types::milli::vector::{Embedder, EmbedderOptions, EmbeddingConfigs};
 use meilisearch_types::milli::{self, Index};
 use meilisearch_types::task_view::TaskView;
 use meilisearch_types::tasks::{KindWithContent, Task};
-use milli::vector::db::IndexEmbeddingConfig;
 use processing::ProcessingTasks;
 pub use queue::Query;
 use queue::Queue;
 use roaring::RoaringBitmap;
 use scheduler::Scheduler;
 use time::OffsetDateTime;
-use versioning::Versioning;

 use crate::index_mapper::IndexMapper;
 use crate::utils::clamp_to_page_size;

 pub(crate) type BEI128 = I128<BE>;

-const TASK_SCHEDULER_SIZE_THRESHOLD_PERCENT_INT: u64 = 40;
-const CHAT_SETTINGS_DB_NAME: &str = "chat-settings";
-
 #[derive(Debug)]
 pub struct IndexSchedulerOptions {
     /// The path to the version file of Meilisearch.
@@ -131,40 +120,28 @@ pub struct IndexSchedulerOptions {
     pub batched_tasks_size_limit: u64,
     /// The experimental features enabled for this instance.
     pub instance_features: InstanceTogglableFeatures,
-    /// The experimental features enabled for this instance.
-    pub auto_upgrade: bool,
-    /// The maximal number of entries in the search query cache of an embedder.
-    ///
-    /// 0 disables the cache.
-    pub embedding_cache_cap: usize,
-    /// Snapshot compaction status.
-    pub experimental_no_snapshot_compaction: bool,
 }

 /// Structure which holds meilisearch's indexes and schedules the tasks
 /// to be performed on them.
 pub struct IndexScheduler {
     /// The LMDB environment which the DBs are associated with.
-    pub(crate) env: Env<WithoutTls>,
+    pub(crate) env: Env,

     /// The list of tasks currently processing
     pub(crate) processing_tasks: Arc<RwLock<ProcessingTasks>>,

-    /// A database containing only the version of the index-scheduler
-    pub version: versioning::Versioning,
     /// The queue containing both the tasks and the batches.
     pub queue: queue::Queue,
+    pub scheduler: scheduler::Scheduler,

     /// In charge of creating, opening, storing and returning indexes.
     pub(crate) index_mapper: IndexMapper,

     /// In charge of fetching and setting the status of experimental features.
     features: features::FeatureData,

-    /// Stores the custom chat prompts and other settings of the indexes.
-    pub(crate) chat_settings: Database<Str, SerdeJson<ChatCompletionSettings>>,
-
-    /// Everything related to the processing of the tasks
-    pub scheduler: scheduler::Scheduler,
-
     /// Whether we should automatically cleanup the task queue or not.
     pub(crate) cleanup_enabled: bool,
@@ -173,11 +150,6 @@ pub struct IndexScheduler {
     /// The Authorization header to send to the webhook URL.
     pub(crate) webhook_authorization_header: Option<String>,

-    /// A map to retrieve the runtime representation of an embedder depending on its configuration.
-    ///
-    /// This map may return the same embedder object for two different indexes or embedder settings,
-    /// but it will only do this if the embedder configuration options are the same, leading
-    /// to the same embeddings for the same input text.
     embedders: Arc<RwLock<HashMap<EmbedderOptions, Arc<Embedder>>>>,

     // ================= test
@@ -204,7 +176,6 @@ impl IndexScheduler {
         IndexScheduler {
             env: self.env.clone(),
             processing_tasks: self.processing_tasks.clone(),
-            version: self.version.clone(),
             queue: self.queue.private_clone(),
             scheduler: self.scheduler.private_clone(),
@@ -220,24 +191,13 @@ impl IndexScheduler {
             #[cfg(test)]
             run_loop_iteration: self.run_loop_iteration.clone(),
             features: self.features.clone(),
-            chat_settings: self.chat_settings,
         }
     }

-    pub(crate) const fn nb_db() -> u32 {
-        Versioning::nb_db()
-            + Queue::nb_db()
-            + IndexMapper::nb_db()
-            + features::FeatureData::nb_db()
-            + 1 // chat-prompts
-    }
-
     /// Create an index scheduler and start its run loop.
     #[allow(private_interfaces)] // because test_utils is private
     pub fn new(
         options: IndexSchedulerOptions,
-        auth_env: Env<WithoutTls>,
-        from_db_version: (u32, u32, u32),
         #[cfg(test)] test_breakpoint_sdr: crossbeam_channel::Sender<(test_utils::Breakpoint, bool)>,
         #[cfg(test)] planned_failures: Vec<(usize, test_utils::FailureLocation)>,
     ) -> Result<Self> {
@@ -268,30 +228,24 @@ impl IndexScheduler {
         };

         let env = unsafe {
-            let env_options = heed::EnvOpenOptions::new();
-            let mut env_options = env_options.read_txn_without_tls();
-            env_options
-                .max_dbs(Self::nb_db())
+            heed::EnvOpenOptions::new()
+                .max_dbs(19)
                 .map_size(budget.task_db_size)
                 .open(&options.tasks_path)
         }?;

-        // We **must** starts by upgrading the version because it'll also upgrade the required database before we can open them
-        let version = versioning::Versioning::new(&env, from_db_version)?;
+        let features = features::FeatureData::new(&env, options.instance_features)?;

         let mut wtxn = env.write_txn()?;
-        let features = features::FeatureData::new(&env, &mut wtxn, options.instance_features)?;
         let queue = Queue::new(&env, &mut wtxn, &options)?;
         let index_mapper = IndexMapper::new(&env, &mut wtxn, &options, budget)?;
-        let chat_settings = env.create_database(&mut wtxn, Some(CHAT_SETTINGS_DB_NAME))?;
         wtxn.commit()?;

         // allow unreachable_code to get rids of the warning in the case of a test build.
         let this = Self {
             processing_tasks: Arc::new(RwLock::new(ProcessingTasks::new())),
-            version,
             queue,
-            scheduler: Scheduler::new(&options, auth_env),
+            scheduler: Scheduler::new(&options),
             index_mapper,
             env,
@@ -307,17 +261,12 @@ impl IndexScheduler {
             #[cfg(test)]
             run_loop_iteration: Arc::new(RwLock::new(0)),
             features,
-            chat_settings,
         };

         this.run();
         Ok(this)
     }

-    fn read_txn(&self) -> Result<RoTxn<WithoutTls>> {
-        self.env.read_txn().map_err(|e| e.into())
-    }
-
     /// Return `Ok(())` if the index scheduler is able to access one of its database.
     pub fn health(&self) -> Result<()> {
         let rtxn = self.env.read_txn()?;
@@ -394,16 +343,15 @@ impl IndexScheduler {
         }
     }

+    pub fn read_txn(&self) -> Result<RoTxn> {
+        self.env.read_txn().map_err(|e| e.into())
+    }
+
     /// Start the run loop for the given index scheduler.
     ///
     /// This function will execute in a different thread and must be called
     /// only once per index scheduler.
     fn run(&self) {
-        // If the number of batched tasks is 0, we don't need to run the scheduler at all.
-        // It will never be able to process any tasks.
-        if self.scheduler.max_number_of_batched_tasks == 0 {
-            return;
-        }
         let run = self.private_clone();
         std::thread::Builder::new()
             .name(String::from("scheduler"))
@@ -418,12 +366,11 @@ impl IndexScheduler {
                 match ret {
                     Ok(Ok(TickOutcome::TickAgain(_))) => (),
                     Ok(Ok(TickOutcome::WaitForSignal)) => run.scheduler.wake_up.wait(),
-                    Ok(Ok(TickOutcome::StopProcessingForever)) => break,
                     Ok(Err(e)) => {
                         tracing::error!("{e}");
-                        // Wait when an irrecoverable error occurs.
+                        // Wait one second when an irrecoverable error occurs.
                         if !e.is_recoverable() {
-                            std::thread::sleep(Duration::from_secs(10));
+                            std::thread::sleep(Duration::from_secs(1));
                         }
                     }
                     Err(_panic) => {
@@ -450,17 +397,6 @@ impl IndexScheduler {
         Ok(self.env.non_free_pages_size()?)
     }

-    /// Return the maximum possible database size
-    pub fn max_size(&self) -> Result<u64> {
-        Ok(self.env.info().map_size as u64)
-    }
-
-    /// Return the max size of task allowed until the task queue stop receiving.
-    pub fn remaining_size_until_task_queue_stop(&self) -> Result<u64> {
-        Ok((self.env.info().map_size as u64 * TASK_SCHEDULER_SIZE_THRESHOLD_PERCENT_INT / 100)
-            .saturating_sub(self.used_size()?))
-    }
-
     /// Return the index corresponding to the name.
     ///
     /// * If the index wasn't opened before, the index will be opened.
@@ -475,14 +411,12 @@ impl IndexScheduler {
     /// If you need to fetch information from or perform an action on all indexes,
     /// see the `try_for_each_index` function.
     pub fn index(&self, name: &str) -> Result<Index> {
-        let rtxn = self.env.read_txn()?;
-        self.index_mapper.index(&rtxn, name)
+        self.index_mapper.index(&self.env.read_txn()?, name)
     }

     /// Return the boolean referring if index exists.
     pub fn index_exists(&self, name: &str) -> Result<bool> {
-        let rtxn = self.env.read_txn()?;
-        self.index_mapper.index_exists(&rtxn, name)
+        self.index_mapper.index_exists(&self.env.read_txn()?, name)
     }

     /// Return the name of all indexes without opening them.
@@ -511,7 +445,7 @@ impl IndexScheduler {
     /// Returns the total number of indexes available for the specified filter.
     /// And a `Vec` of the index_uid + its stats
-    pub fn paginated_indexes_stats(
+    pub fn get_paginated_indexes_stats(
         &self,
         filters: &meilisearch_auth::AuthFilter,
         from: usize,
@@ -552,31 +486,12 @@ impl IndexScheduler {
         ret.map(|ret| (total, ret))
     }

-    /// Returns the total number of chat workspaces available ~~for the specified filter~~.
-    /// And a `Vec` of the workspace_uids
-    pub fn paginated_chat_workspace_uids(
-        &self,
-        from: usize,
-        limit: usize,
-    ) -> Result<(usize, Vec<String>)> {
-        let rtxn = self.read_txn()?;
-        let total = self.chat_settings.len(&rtxn)?;
-        let mut iter = self.chat_settings.iter(&rtxn)?.skip(from);
-        iter.by_ref()
-            .take(limit)
-            .map(|ret| ret.map_err(Error::from))
-            .map(|ret| ret.map(|(uid, _)| uid.to_string()))
-            .collect::<Result<Vec<_>, Error>>()
-            .map(|ret| (total as usize, ret))
-    }
-
     /// The returned structure contains:
     /// 1. The name of the property being observed can be `statuses`, `types`, or `indexes`.
     /// 2. The name of the specific data related to the property can be `enqueued` for the `statuses`, `settingsUpdate` for the `types`, or the name of the index for the `indexes`, for example.
     /// 3. The number of times the properties appeared.
     pub fn get_stats(&self) -> Result<BTreeMap<String, BTreeMap<String, u64>>> {
-        let rtxn = self.read_txn()?;
-        self.queue.get_stats(&rtxn, &self.processing_tasks.read().unwrap())
+        self.queue.get_stats(&self.read_txn()?, &self.processing_tasks.read().unwrap())
     }

     // Return true if there is at least one task that is processing.
@@ -679,10 +594,9 @@ impl IndexScheduler {
         task_id: Option<TaskId>,
         dry_run: bool,
     ) -> Result<Task> {
-        // if the task doesn't delete or cancel anything and 40% of the task queue is full, we must refuse to enqueue the incoming task
-        if !matches!(&kind, KindWithContent::TaskDeletion { tasks, .. } | KindWithContent::TaskCancelation { tasks, .. } if !tasks.is_empty())
-            && (self.env.non_free_pages_size()? * 100) / self.env.info().map_size as u64
-                > TASK_SCHEDULER_SIZE_THRESHOLD_PERCENT_INT
+        // if the task doesn't delete anything and 50% of the task queue is full, we must refuse to enqueue the incomming task
+        if !matches!(&kind, KindWithContent::TaskDeletion { tasks, .. } if !tasks.is_empty())
+            && (self.env.non_free_pages_size()? * 100) / self.env.info().map_size as u64 > 40
         {
             return Err(Error::NoSpaceLeftInTaskQueue);
         }
@@ -751,7 +665,7 @@ impl IndexScheduler {
         written: usize,
     }

-    impl Read for TaskReader<'_, '_> {
+    impl<'a, 'b> Read for TaskReader<'a, 'b> {
         fn read(&mut self, mut buf: &mut [u8]) -> std::io::Result<usize> {
             if self.buffer.is_empty() {
                 match self.tasks.next() {
@@ -840,62 +754,40 @@ impl IndexScheduler {
         Ok(())
     }

-    pub fn put_network(&self, network: Network) -> Result<()> {
-        let wtxn = self.env.write_txn().map_err(Error::HeedTransaction)?;
-        self.features.put_network(wtxn, network)?;
-        Ok(())
-    }
-
-    pub fn network(&self) -> Network {
-        self.features.network()
-    }
-
+    // TODO: consider using a type alias or a struct embedder/template
     pub fn embedders(
         &self,
         index_uid: String,
         embedding_configs: Vec<IndexEmbeddingConfig>,
-    ) -> Result<RuntimeEmbedders> {
+    ) -> Result<EmbeddingConfigs> {
         let res: Result<_> = embedding_configs
             .into_iter()
             .map(
                 |IndexEmbeddingConfig {
                      name,
                      config: milli::vector::EmbeddingConfig { embedder_options, prompt, quantized },
-                     fragments,
-                 }|
-                 -> Result<(String, Arc<RuntimeEmbedder>)> {
-                    let document_template = prompt
-                        .try_into()
-                        .map_err(meilisearch_types::milli::Error::from)
-                        .map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
-
-                    let fragments = fragments
-                        .into_inner()
-                        .into_iter()
-                        .map(|fragment| {
-                            let value = embedder_options.fragment(&fragment.name).unwrap();
-                            let template = JsonTemplate::new(value.clone()).unwrap();
-                            RuntimeFragment { name: fragment.name, id: fragment.id, template }
-                        })
-                        .collect();
+                     ..
+                 }| {
+                    let prompt = Arc::new(
+                        prompt
+                            .try_into()
+                            .map_err(meilisearch_types::milli::Error::from)
+                            .map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?,
+                    );
                     // optimistically return existing embedder
                     {
                         let embedders = self.embedders.read().unwrap();
                         if let Some(embedder) = embedders.get(&embedder_options) {
-                            let runtime = Arc::new(RuntimeEmbedder::new(
-                                embedder.clone(),
-                                document_template,
-                                fragments,
-                                quantized.unwrap_or_default(),
-                            ));
-
-                            return Ok((name, runtime));
+                            return Ok((
+                                name,
+                                (embedder.clone(), prompt, quantized.unwrap_or_default()),
+                            ));
                         }
                     }

                     // add missing embedder
                     let embedder = Arc::new(
-                        Embedder::new(embedder_options.clone(), self.scheduler.embedding_cache_cap)
+                        Embedder::new(embedder_options.clone())
                             .map_err(meilisearch_types::milli::vector::Error::from)
                             .map_err(|err| {
                                 Error::from_milli(err.into(), Some(index_uid.clone()))
@ -905,44 +797,11 @@ impl IndexScheduler {
let mut embedders = self.embedders.write().unwrap(); let mut embedders = self.embedders.write().unwrap();
embedders.insert(embedder_options, embedder.clone()); embedders.insert(embedder_options, embedder.clone());
} }
Ok((name, (embedder, prompt, quantized.unwrap_or_default())))
let runtime = Arc::new(RuntimeEmbedder::new(
embedder.clone(),
document_template,
fragments,
quantized.unwrap_or_default(),
));
Ok((name, runtime))
}, },
) )
.collect(); .collect();
res.map(RuntimeEmbedders::new) res.map(EmbeddingConfigs::new)
}
pub fn chat_settings(&self, uid: &str) -> Result<Option<ChatCompletionSettings>> {
let rtxn = self.env.read_txn()?;
self.chat_settings.get(&rtxn, uid).map_err(Into::into)
}
/// Return true if chat workspace exists.
pub fn chat_workspace_exists(&self, name: &str) -> Result<bool> {
let rtxn = self.env.read_txn()?;
Ok(self.chat_settings.remap_data_type::<DecodeIgnore>().get(&rtxn, name)?.is_some())
}
pub fn put_chat_settings(&self, uid: &str, settings: &ChatCompletionSettings) -> Result<()> {
let mut wtxn = self.env.write_txn()?;
self.chat_settings.put(&mut wtxn, uid, settings)?;
wtxn.commit()?;
Ok(())
}
pub fn delete_chat_settings(&self, uid: &str) -> Result<bool> {
let mut wtxn = self.env.write_txn()?;
let deleted = self.chat_settings.delete(&mut wtxn, uid)?;
wtxn.commit()?;
Ok(deleted)
} }
} }
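Apart from the return-type change, both versions of `embedders` keep the same caching discipline around `self.embedders`: an optimistic lookup under a read lock, and only on a miss an expensive build followed by an insertion under a write lock. A self-contained sketch of that read-then-write pattern; `Options` and `Embedder` here are simplified stand-ins, not the real milli types:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

#[derive(Clone, PartialEq, Eq, Hash)]
struct Options(String); // stand-in for the real embedder options

struct Embedder; // stand-in for the real (expensive to build) embedder

struct Cache {
    embedders: RwLock<HashMap<Options, Arc<Embedder>>>,
}

impl Cache {
    fn get_or_create(&self, options: Options) -> Arc<Embedder> {
        // Optimistic path: a cheap shared read lock when the embedder already exists.
        {
            let embedders = self.embedders.read().unwrap();
            if let Some(embedder) = embedders.get(&options) {
                return embedder.clone();
            }
        }
        // Slow path: build outside any lock, then insert under an exclusive write lock.
        // `or_insert` keeps the first value if another thread raced us here.
        let embedder = Arc::new(Embedder);
        let mut embedders = self.embedders.write().unwrap();
        embedders.entry(options).or_insert(embedder).clone()
    }
}

fn main() {
    let cache = Cache { embedders: RwLock::new(HashMap::new()) };
    let a = cache.get_or_create(Options("small".into()));
    let b = cache.get_or_create(Options("small".into()));
    assert!(Arc::ptr_eq(&a, &b)); // the second call hit the read-lock fast path
}
```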
@@ -954,8 +813,6 @@ pub enum TickOutcome {
    TickAgain(u64),
    /// The scheduler should wait for an external signal before attempting another `tick`.
    WaitForSignal,
-   /// The scheduler exits the run-loop and will never process tasks again
-   StopProcessingForever,
}

/// How many indexes we can afford to have open simultaneously.
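`TickOutcome` tells the scheduler's run loop what to do after a `tick`; the newer side adds a variant for leaving the loop permanently. A hedged sketch of how a run loop might branch on it (the enum mirrors the one above, but the loop body and signal handling are invented for illustration):

```rust
enum TickOutcome {
    TickAgain(u64),
    WaitForSignal,
    StopProcessingForever,
}

fn run_loop(mut tick: impl FnMut() -> TickOutcome, mut wait: impl FnMut()) {
    loop {
        match tick() {
            // A batch was processed; more tasks may remain, so tick again immediately.
            TickOutcome::TickAgain(_remaining) => continue,
            // Nothing to do; park until a new task is registered.
            TickOutcome::WaitForSignal => wait(),
            // The variant removed in this diff: leave the loop for good.
            TickOutcome::StopProcessingForever => break,
        }
    }
}

fn main() {
    let mut outcomes = vec![TickOutcome::TickAgain(1), TickOutcome::StopProcessingForever].into_iter();
    run_loop(move || outcomes.next().unwrap(), || {});
}
```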


@@ -1,6 +1,8 @@
+use std::borrow::Cow;
use std::sync::Arc;

-use meilisearch_types::milli::progress::{AtomicSubStep, NamedStep, Progress, ProgressView};
+use enum_iterator::Sequence;
+use meilisearch_types::milli::progress::{AtomicSubStep, NamedStep, Progress, ProgressView, Step};
use meilisearch_types::milli::{make_atomic_progress, make_enum_progress};
use roaring::RoaringBitmap;
@@ -64,17 +66,9 @@ make_enum_progress! {
    }
}

-make_enum_progress! {
-    pub enum FinalizingIndexStep {
-        Committing,
-        ComputingStats,
-    }
-}
-
make_enum_progress! {
    pub enum TaskCancelationProgress {
        RetrievingTasks,
-       CancelingUpgrade,
        UpdatingTasks,
    }
}
@@ -103,9 +97,7 @@ make_enum_progress! {
    pub enum DumpCreationProgress {
        StartTheDumpCreation,
        DumpTheApiKeys,
-       DumpTheChatCompletionSettings,
        DumpTheTasks,
-       DumpTheBatches,
        DumpTheIndexes,
        DumpTheExperimentalFeatures,
        CompressTheDump,
@@ -176,19 +168,36 @@ make_enum_progress! {
    }
}

-make_enum_progress! {
-    pub enum Export {
-        EnsuringCorrectnessOfTheTarget,
-        ExportingTheSettings,
-        ExportingTheDocuments,
-    }
-}
-
make_atomic_progress!(Task alias AtomicTaskStep => "task" );
make_atomic_progress!(Document alias AtomicDocumentStep => "document" );
-make_atomic_progress!(Index alias AtomicIndexStep => "index" );
make_atomic_progress!(Batch alias AtomicBatchStep => "batch" );
make_atomic_progress!(UpdateFile alias AtomicUpdateFileStep => "update file" );

+pub struct VariableNameStep {
+    name: String,
+    current: u32,
+    total: u32,
+}
+
+impl VariableNameStep {
+    pub fn new(name: impl Into<String>, current: u32, total: u32) -> Self {
+        Self { name: name.into(), current, total }
+    }
+}
+
+impl Step for VariableNameStep {
+    fn name(&self) -> Cow<'static, str> {
+        self.name.clone().into()
+    }
+
+    fn current(&self) -> u32 {
+        self.current
+    }
+
+    fn total(&self) -> u32 {
+        self.total
+    }
+}

#[cfg(test)]
mod test {
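The head-side `VariableNameStep` above is a `Step` implementation whose display name is decided at runtime instead of by an enum variant. A self-contained sketch of the same shape; the `Step` trait is redeclared locally here to keep the example standalone, while the real one lives in `meilisearch_types::milli::progress`:

```rust
use std::borrow::Cow;

// Local redeclaration of the trait shape used above; the real trait
// lives in meilisearch_types::milli::progress.
trait Step {
    fn name(&self) -> Cow<'static, str>;
    fn current(&self) -> u32;
    fn total(&self) -> u32;
}

struct VariableNameStep {
    name: String,
    current: u32,
    total: u32,
}

impl Step for VariableNameStep {
    fn name(&self) -> Cow<'static, str> {
        self.name.clone().into()
    }
    fn current(&self) -> u32 {
        self.current
    }
    fn total(&self) -> u32 {
        self.total
    }
}

fn main() {
    // E.g. one step per index while dumping: "dumping index movies (2/10)".
    let step = VariableNameStep { name: "dumping index movies".into(), current: 2, total: 10 };
    println!("{} ({}/{})", step.name(), step.current(), step.total());
}
```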


@@ -1,9 +1,8 @@
-use std::collections::HashSet;
use std::ops::{Bound, RangeBounds};

use meilisearch_types::batches::{Batch, BatchId};
use meilisearch_types::heed::types::{DecodeIgnore, SerdeBincode, SerdeJson, Str};
-use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn, WithoutTls};
+use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
use meilisearch_types::milli::{CboRoaringBitmapCodec, RoaringBitmapCodec, BEU32};
use meilisearch_types::tasks::{Kind, Status};
use roaring::{MultiOps, RoaringBitmap};

@@ -11,14 +10,9 @@ use time::OffsetDateTime;
use super::{Query, Queue};
use crate::processing::ProcessingTasks;
-use crate::utils::{
-    insert_task_datetime, keep_ids_within_datetimes, map_bound,
-    remove_n_tasks_datetime_earlier_than, remove_task_datetime, ProcessingBatch,
-};
+use crate::utils::{insert_task_datetime, keep_ids_within_datetimes, map_bound, ProcessingBatch};
use crate::{Error, Result, BEI128};

-/// The number of databases used by the batch queue
-const NUMBER_OF_DATABASES: u32 = 7;
-
/// Database const names for the `IndexScheduler`.
mod db_name {
    pub const ALL_BATCHES: &str = "all-batches";

@@ -62,11 +56,7 @@ impl BatchQueue {
    }
}

-   pub(crate) const fn nb_db() -> u32 {
-       NUMBER_OF_DATABASES
-   }
-
-   pub(super) fn new(env: &Env<WithoutTls>, wtxn: &mut RwTxn) -> Result<Self> {
+   pub(super) fn new(env: &Env, wtxn: &mut RwTxn) -> Result<Self> {
        Ok(Self {
            all_batches: env.create_database(wtxn, Some(db_name::ALL_BATCHES))?,
            status: env.create_database(wtxn, Some(db_name::BATCH_STATUS))?,
@@ -169,8 +159,6 @@ impl BatchQueue {
    }

    pub(crate) fn write_batch(&self, wtxn: &mut RwTxn, batch: ProcessingBatch) -> Result<()> {
-       let old_batch = self.all_batches.get(wtxn, &batch.uid)?;
-
        self.all_batches.put(
            wtxn,
            &batch.uid,

@@ -179,90 +167,34 @@ impl BatchQueue {
                progress: None,
                details: batch.details,
                stats: batch.stats,
-               embedder_stats: batch.embedder_stats.as_ref().into(),
                started_at: batch.started_at,
                finished_at: batch.finished_at,
-               enqueued_at: batch.enqueued_at,
-               stop_reason: batch.reason.to_string(),
            },
        )?;

-       // Update the statuses
-       if let Some(ref old_batch) = old_batch {
-           for status in old_batch.stats.status.keys() {
-               self.update_status(wtxn, *status, |bitmap| {
-                   bitmap.remove(batch.uid);
-               })?;
-           }
-       }
        for status in batch.statuses {
            self.update_status(wtxn, status, |bitmap| {
                bitmap.insert(batch.uid);
            })?;
        }

-       // Update the kinds / types
-       if let Some(ref old_batch) = old_batch {
-           let kinds: HashSet<_> = old_batch.stats.types.keys().cloned().collect();
-           for kind in kinds.difference(&batch.kinds) {
-               self.update_kind(wtxn, *kind, |bitmap| {
-                   bitmap.remove(batch.uid);
-               })?;
-           }
-       }
        for kind in batch.kinds {
            self.update_kind(wtxn, kind, |bitmap| {
                bitmap.insert(batch.uid);
            })?;
        }

-       // Update the indexes
-       if let Some(ref old_batch) = old_batch {
-           let indexes: HashSet<_> = old_batch.stats.index_uids.keys().cloned().collect();
-           for index in indexes.difference(&batch.indexes) {
-               self.update_index(wtxn, index, |bitmap| {
-                   bitmap.remove(batch.uid);
-               })?;
-           }
-       }
        for index in batch.indexes {
            self.update_index(wtxn, &index, |bitmap| {
                bitmap.insert(batch.uid);
            })?;
        }

-       // Update the enqueued_at: we cannot retrieve the previous enqueued at from the previous batch, and
-       // must instead go through the db looking for it. We cannot look at the tasks contained in this batch either
-       // because they may have been removed.
-       // What we know, though, is that the task date is from before the enqueued_at, and max two timestamps have been written
-       // to the DB per batch.
-       if let Some(ref old_batch) = old_batch {
-           if let Some(enqueued_at) = old_batch.enqueued_at {
-               remove_task_datetime(wtxn, self.enqueued_at, enqueued_at.earliest, old_batch.uid)?;
-               remove_task_datetime(wtxn, self.enqueued_at, enqueued_at.oldest, old_batch.uid)?;
-           } else {
-               // If we don't have the enqueued at in the batch it means the database comes from the v1.12
-               // and we still need to find the date by scrolling the database
-               remove_n_tasks_datetime_earlier_than(
-                   wtxn,
-                   self.enqueued_at,
-                   old_batch.started_at,
-                   old_batch.stats.total_nb_tasks.clamp(1, 2) as usize,
-                   old_batch.uid,
-               )?;
-           }
-       }
-       // A finished batch MUST contain at least one task and have an enqueued_at
-       let enqueued_at = batch.enqueued_at.as_ref().unwrap();
-       insert_task_datetime(wtxn, self.enqueued_at, enqueued_at.earliest, batch.uid)?;
-       insert_task_datetime(wtxn, self.enqueued_at, enqueued_at.oldest, batch.uid)?;
+       if let Some(enqueued_at) = batch.oldest_enqueued_at {
+           insert_task_datetime(wtxn, self.enqueued_at, enqueued_at, batch.uid)?;
+       }
+       if let Some(enqueued_at) = batch.earliest_enqueued_at {
+           insert_task_datetime(wtxn, self.enqueued_at, enqueued_at, batch.uid)?;
+       }

-       // Update the started at and finished at
-       if let Some(ref old_batch) = old_batch {
-           remove_task_datetime(wtxn, self.started_at, old_batch.started_at, old_batch.uid)?;
-           if let Some(finished_at) = old_batch.finished_at {
-               remove_task_datetime(wtxn, self.finished_at, finished_at, old_batch.uid)?;
-           }
-       }
        insert_task_datetime(wtxn, self.started_at, batch.started_at, batch.uid)?;
        insert_task_datetime(wtxn, self.finished_at, batch.finished_at.unwrap(), batch.uid)?;
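The deleted side of `write_batch` keeps the `enqueued_at`/`started_at`/`finished_at` indexes consistent when a batch is rewritten: every `(timestamp, uid)` entry written for the old version is removed before the new ones are inserted, otherwise stale keys accumulate. A minimal in-memory sketch of that remove-then-insert discipline, with a `BTreeMap` standing in for the heed database:

```rust
use std::collections::{BTreeMap, BTreeSet};

type DatetimeIndex = BTreeMap<i128, BTreeSet<u32>>; // timestamp -> batch uids

fn remove_entry(db: &mut DatetimeIndex, ts: i128, uid: u32) {
    if let Some(uids) = db.get_mut(&ts) {
        uids.remove(&uid);
        if uids.is_empty() {
            db.remove(&ts); // drop the key once no uid references it
        }
    }
}

fn insert_entry(db: &mut DatetimeIndex, ts: i128, uid: u32) {
    db.entry(ts).or_default().insert(uid);
}

fn main() {
    let mut started_at = DatetimeIndex::new();
    insert_entry(&mut started_at, 1_000, 7); // first write of batch 7
    // Rewriting batch 7 with a new started_at: remove the stale key first.
    remove_entry(&mut started_at, 1_000, 7);
    insert_entry(&mut started_at, 2_000, 7);
    assert_eq!(started_at.len(), 1); // no stale (1_000, 7) entry left behind
}
```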


@@ -102,34 +102,30 @@ fn query_batches_simple() {
        .unwrap();
    assert_eq!(batches.len(), 1);
    batches[0].started_at = OffsetDateTime::UNIX_EPOCH;
-   assert!(batches[0].enqueued_at.is_some());
-   batches[0].enqueued_at = None;
    // Insta cannot snapshot our batches because the batch stats contains an enum as key: https://github.com/mitsuhiko/insta/issues/689
    let batch = serde_json::to_string_pretty(&batches[0]).unwrap();
-   snapshot!(batch, @r###"
+   snapshot!(batch, @r#"
    {
      "uid": 0,
      "details": {
        "primaryKey": "mouse"
      },
      "stats": {
        "totalNbTasks": 1,
        "status": {
          "processing": 1
        },
        "types": {
          "indexCreation": 1
        },
        "indexUids": {
          "catto": 1
        }
      },
-     "startedAt": "1970-01-01T00:00:00Z",
-     "finishedAt": null,
-     "enqueuedAt": null,
-     "stopReason": "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task."
-   }
-   "###);
+     "startedAt": "1970-01-01T00:00:00Z",
+     "finishedAt": null
+   }
+   "#);

    let query = Query { statuses: Some(vec![Status::Enqueued]), ..Default::default() };
    let (batches, _) = index_scheduler


@@ -8,12 +8,11 @@ mod tasks_test;
mod test;

use std::collections::BTreeMap;
-use std::fs::File as StdFile;
use std::time::Duration;

use file_store::FileStore;
use meilisearch_types::batches::BatchId;
-use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn, WithoutTls};
+use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
use meilisearch_types::milli::{CboRoaringBitmapCodec, BEU32};
use meilisearch_types::tasks::{Kind, KindWithContent, Status, Task};
use roaring::RoaringBitmap;

@@ -21,16 +20,14 @@ use time::format_description::well_known::Rfc3339;
use time::OffsetDateTime;
use uuid::Uuid;

-pub(crate) use self::batches::BatchQueue;
-pub(crate) use self::tasks::TaskQueue;
+use self::batches::BatchQueue;
+use self::tasks::TaskQueue;
use crate::processing::ProcessingTasks;
use crate::utils::{
    check_index_swap_validity, filter_out_references_to_newer_tasks, ProcessingBatch,
};
use crate::{Error, IndexSchedulerOptions, Result, TaskId};

-/// The number of databases used by the queue itself
-const NUMBER_OF_DATABASES: u32 = 1;
-
/// Database const names for the `IndexScheduler`.
mod db_name {
    pub const BATCH_TO_TASKS_MAPPING: &str = "batch-to-tasks-mapping";

@@ -151,13 +148,9 @@ impl Queue {
    }
}

-   pub(crate) const fn nb_db() -> u32 {
-       tasks::TaskQueue::nb_db() + batches::BatchQueue::nb_db() + NUMBER_OF_DATABASES
-   }
-
    /// Create an index scheduler and start its run loop.
    pub(crate) fn new(
-       env: &Env<WithoutTls>,
+       env: &Env,
        wtxn: &mut RwTxn,
        options: &IndexSchedulerOptions,
    ) -> Result<Self> {

@@ -217,11 +210,6 @@ impl Queue {
    }
}

-   /// Opens and returns the task's content File.
-   pub fn update_file(&self, uuid: Uuid) -> file_store::Result<StdFile> {
-       self.file_store.get_update(uuid)
-   }
-
    /// Delete a file from the index scheduler.
    ///
    /// Counterpart to the [`create_update_file`](IndexScheduler::create_update_file) method.

@@ -292,6 +280,8 @@ impl Queue {
            return Ok(task);
        }

+       // Get rid of the mutability.
+       let task = task;
        self.tasks.register(wtxn, &task)?;

        Ok(task)
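The added `let task = task;` is the standard rebinding idiom for freezing a `mut` binding once the mutation phase is over, so the rest of the function cannot modify the value by accident:

```rust
fn main() {
    let mut task = vec![1, 2];
    task.push(3); // mutation phase
    let task = task; // rebinding: same value, now immutable
    // task.push(4); // would no longer compile
    assert_eq!(task, [1, 2, 3]);
}
```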


@@ -1,5 +1,6 @@
---
source: crates/index-scheduler/src/queue/batches_test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:

@@ -48,8 +49,8 @@ catto: { number_of_documents: 0, field_distribution: {} }
[timestamp] [1,2,3,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
-1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 3 of type `taskCancelation` that cannot be batched with any other task.", }
+0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
+1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]


@@ -1,5 +1,6 @@
---
source: crates/index-scheduler/src/queue/batches_test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:

@@ -47,9 +48,9 @@ whalo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [2,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
-1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
-2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
+0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
+1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, }
+2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]


@@ -1,10 +1,11 @@
---
source: crates/index-scheduler/src/queue/batches_test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch Some(1):
[1,]
-{uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
+{uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}

@@ -42,7 +43,7 @@ catto: { number_of_documents: 0, field_distribution: {} }
[timestamp] [0,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
+0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]


@@ -1,5 +1,6 @@
---
source: crates/index-scheduler/src/queue/batches_test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:

@@ -47,9 +48,9 @@ doggo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [2,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
-1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
-2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
+0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
+1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
+2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]


@@ -1,5 +1,6 @@
---
source: crates/index-scheduler/src/queue/batches_test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:

@@ -52,10 +53,10 @@ doggo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [3,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
-1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
-2 {uid: 2, details: {"swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexSwap":1},"indexUids":{}}, stop reason: "created batch containing only task with id 2 of type `indexSwap` that cannot be batched with any other task.", }
-3 {uid: 3, details: {"swaps":[{"indexes":["catto","whalo"]}]}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexSwap":1},"indexUids":{}}, stop reason: "created batch containing only task with id 3 of type `indexSwap` that cannot be batched with any other task.", }
+0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
+1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
+2 {uid: 2, details: {"swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexSwap":1},"indexUids":{}}, }
+3 {uid: 3, details: {"swaps":[{"indexes":["catto","whalo"]}]}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexSwap":1},"indexUids":{}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]


@@ -1,5 +1,6 @@
---
source: crates/index-scheduler/src/queue/tasks_test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:

@@ -48,8 +49,8 @@ catto: { number_of_documents: 0, field_distribution: {} }
[timestamp] [1,2,3,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
-1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 3 of type `taskCancelation` that cannot be batched with any other task.", }
+0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
+1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]


@@ -1,5 +1,6 @@
---
source: crates/index-scheduler/src/queue/tasks_test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:

@@ -47,9 +48,9 @@ whalo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [2,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
-1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
-2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
+0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
+1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, }
+2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]


@@ -1,5 +1,6 @@
---
source: crates/index-scheduler/src/queue/tasks_test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:

@@ -47,9 +48,9 @@ doggo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [2,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
-1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
-2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
+0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
+1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
+2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]


@@ -1,7 +1,7 @@
use std::ops::{Bound, RangeBounds};

use meilisearch_types::heed::types::{DecodeIgnore, SerdeBincode, SerdeJson, Str};
-use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn, WithoutTls};
+use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
use meilisearch_types::milli::{CboRoaringBitmapCodec, RoaringBitmapCodec, BEU32};
use meilisearch_types::tasks::{Kind, Status, Task};
use roaring::{MultiOps, RoaringBitmap};

@@ -9,17 +9,12 @@ use time::OffsetDateTime;
use super::{Query, Queue};
use crate::processing::ProcessingTasks;
-use crate::utils::{
-    self, insert_task_datetime, keep_ids_within_datetimes, map_bound, remove_task_datetime,
-};
+use crate::utils::{self, insert_task_datetime, keep_ids_within_datetimes, map_bound};
use crate::{Error, Result, TaskId, BEI128};

-/// The number of databases used by the task queue
-const NUMBER_OF_DATABASES: u32 = 8;
-
/// Database const names for the `IndexScheduler`.
mod db_name {
    pub const ALL_TASKS: &str = "all-tasks";
    pub const STATUS: &str = "status";
    pub const KIND: &str = "kind";
    pub const INDEX_TASKS: &str = "index-tasks";

@@ -64,11 +59,7 @@ impl TaskQueue {
    }
}

-   pub(crate) const fn nb_db() -> u32 {
-       NUMBER_OF_DATABASES
-   }
-
-   pub(crate) fn new(env: &Env<WithoutTls>, wtxn: &mut RwTxn) -> Result<Self> {
+   pub(super) fn new(env: &Env, wtxn: &mut RwTxn) -> Result<Self> {
        Ok(Self {
            all_tasks: env.create_database(wtxn, Some(db_name::ALL_TASKS))?,
            status: env.create_database(wtxn, Some(db_name::STATUS))?,
@@ -99,14 +90,12 @@ impl TaskQueue {
    pub(crate) fn update_task(&self, wtxn: &mut RwTxn, task: &Task) -> Result<()> {
        let old_task = self.get_task(wtxn, task.uid)?.ok_or(Error::CorruptedTaskQueue)?;
-       let reprocessing = old_task.status != Status::Enqueued;

        debug_assert!(old_task != *task);
        debug_assert_eq!(old_task.uid, task.uid);
+       debug_assert!(old_task.batch_uid.is_none() && task.batch_uid.is_some());

-       // If we're processing a task that failed it may already contain a batch_uid
        debug_assert!(
-           reprocessing || (old_task.batch_uid.is_none() && task.batch_uid.is_some()),
+           old_task.batch_uid.is_none() && task.batch_uid.is_some(),
            "\n==> old: {old_task:?}\n==> new: {task:?}"
        );

@@ -133,25 +122,13 @@ impl TaskQueue {
            "Cannot update a task's enqueued_at time"
        );
        if old_task.started_at != task.started_at {
-           assert!(
-               reprocessing || old_task.started_at.is_none(),
-               "Cannot update a task's started_at time"
-           );
-           if let Some(started_at) = old_task.started_at {
-               remove_task_datetime(wtxn, self.started_at, started_at, task.uid)?;
-           }
+           assert!(old_task.started_at.is_none(), "Cannot update a task's started_at time");
            if let Some(started_at) = task.started_at {
                insert_task_datetime(wtxn, self.started_at, started_at, task.uid)?;
            }
        }
        if old_task.finished_at != task.finished_at {
-           assert!(
-               reprocessing || old_task.finished_at.is_none(),
-               "Cannot update a task's finished_at time"
-           );
-           if let Some(finished_at) = old_task.finished_at {
-               remove_task_datetime(wtxn, self.finished_at, finished_at, task.uid)?;
-           }
+           assert!(old_task.finished_at.is_none(), "Cannot update a task's finished_at time");
            if let Some(finished_at) = task.finished_at {
                insert_task_datetime(wtxn, self.finished_at, finished_at, task.uid)?;
            }
        }
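On the newer side of this hunk the timestamp assertions are gated on `reprocessing` because a task that already left the `enqueued` state (e.g. one picked up again after a failed upgrade) legitimately carries old timestamps, which must also be removed from the datetime index before the new ones are written. A condensed sketch of that guard; the struct is simplified and the datetime removal is only indicated in a comment:

```rust
struct Task {
    status_enqueued: bool,
    started_at: Option<i64>,
}

fn update_started_at(old: &Task, new: &Task) {
    // A task is being reprocessed if it already left the `enqueued` state.
    let reprocessing = !old.status_enqueued;
    if old.started_at != new.started_at {
        // On first processing the old value must still be unset.
        assert!(reprocessing || old.started_at.is_none(), "Cannot update a task's started_at time");
        // When reprocessing, the stale datetime key would be removed here
        // (remove_task_datetime in the diff) before the new one is inserted.
    }
}

fn main() {
    let old = Task { status_enqueued: true, started_at: None };
    let new = Task { status_enqueued: false, started_at: Some(42) };
    update_started_at(&old, &new); // first processing: allowed
}
```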
@@ -315,7 +292,7 @@ impl Queue {
        if let Some(batch_uids) = batch_uids {
            let mut batch_tasks = RoaringBitmap::new();
            for batch_uid in batch_uids {
-               if processing_batch.as_ref().is_some_and(|batch| batch.uid == *batch_uid) {
+               if processing_batch.as_ref().map_or(false, |batch| batch.uid == *batch_uid) {
                    batch_tasks |= &**processing_tasks;
                } else {
                    batch_tasks |= self.tasks_in_batch(rtxn, *batch_uid)?;
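The only change in this hunk is stylistic: the clippy-preferred `Option::is_some_and` replaces `map_or(false, …)`. The two spellings are equivalent:

```rust
fn main() {
    let batch: Option<u32> = Some(3);
    let target = 3u32;
    // Older spelling...
    let a = batch.as_ref().map_or(false, |uid| *uid == target);
    // ...and the equivalent form used on the newer side.
    let b = batch.as_ref().is_some_and(|uid| *uid == target);
    assert_eq!(a, b);
}
```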


@@ -165,7 +165,6 @@ fn test_disable_auto_deletion_of_tasks() {
    let (index_scheduler, mut handle) = IndexScheduler::test_with_custom_config(vec![], |config| {
        config.cleanup_enabled = false;
        config.max_number_of_tasks = 2;
-       None
    });

    index_scheduler

@@ -229,7 +228,6 @@ fn test_disable_auto_deletion_of_tasks() {
fn test_auto_deletion_of_tasks() {
    let (index_scheduler, mut handle) = IndexScheduler::test_with_custom_config(vec![], |config| {
        config.max_number_of_tasks = 2;
-       None
    });

    index_scheduler

@@ -326,8 +324,7 @@ fn test_auto_deletion_of_tasks() {
fn test_task_queue_is_full() {
    let (index_scheduler, mut handle) = IndexScheduler::test_with_custom_config(vec![], |config| {
        // that's the minimum map size possible
-       config.task_db_size = 1048576 * 3;
-       None
+       config.task_db_size = 1048576;
    });

    index_scheduler

@@ -364,7 +361,7 @@ fn test_task_queue_is_full() {
    // we won't be able to test this error in an integration test thus as a best effort test I still ensure the error return the expected error code
    snapshot!(format!("{:?}", result.error_code()), @"NoSpaceLeftOnDevice");

-   // Even the task deletion and cancelation that don't delete anything should be refused
+   // Even the task deletion that doesn't delete anything shouldn't be accepted
    let result = index_scheduler
        .register(
            KindWithContent::TaskDeletion { query: S("test"), tasks: RoaringBitmap::new() },

@@ -373,39 +370,10 @@ fn test_task_queue_is_full() {
        )
        .unwrap_err();
    snapshot!(result, @"Meilisearch cannot receive write operations because the limit of the task database has been reached. Please delete tasks to continue performing write operations.");
-
-   let result = index_scheduler
-       .register(
-           KindWithContent::TaskCancelation { query: S("test"), tasks: RoaringBitmap::new() },
-           None,
-           false,
-       )
-       .unwrap_err();
-   snapshot!(result, @"Meilisearch cannot receive write operations because the limit of the task database has been reached. Please delete tasks to continue performing write operations.");
    // we won't be able to test this error in an integration test thus as a best effort test I still ensure the error return the expected error code
    snapshot!(format!("{:?}", result.error_code()), @"NoSpaceLeftOnDevice");

-   // But a task cancelation that cancels something should work
-   index_scheduler
-       .register(
-           KindWithContent::TaskCancelation { query: S("test"), tasks: (0..100).collect() },
-           None,
-           false,
-       )
-       .unwrap();
-   handle.advance_one_successful_batch();
-
-   // But we should still be forbidden from enqueuing new tasks
-   let result = index_scheduler
-       .register(
-           KindWithContent::IndexCreation { index_uid: S("doggo"), primary_key: None },
-           None,
-           false,
-       )
-       .unwrap_err();
-   snapshot!(result, @"Meilisearch cannot receive write operations because the limit of the task database has been reached. Please delete tasks to continue performing write operations.");
-
-   // And a task deletion that deletes something should work
+   // But a task deletion that deletes something should work
    index_scheduler
        .register(
            KindWithContent::TaskDeletion { query: S("test"), tasks: (0..100).collect() },


@@ -7,7 +7,10 @@ The main function of the autobatcher is [`next_autobatch`].
use std::ops::ControlFlow::{self, Break, Continue};

-use meilisearch_types::tasks::{BatchStopReason, PrimaryKeyMismatchReason, TaskId};
+use meilisearch_types::milli::update::IndexDocumentsMethod::{
+    self, ReplaceDocuments, UpdateDocuments,
+};
+use meilisearch_types::tasks::TaskId;

use crate::KindWithContent;

@@ -16,11 +19,19 @@ use crate::KindWithContent;
///
/// Only the non-prioritised tasks that can be grouped in a batch have a corresponding [`AutobatchKind`]
enum AutobatchKind {
-   DocumentImport { allow_index_creation: bool, primary_key: Option<String> },
+   DocumentImport {
+       method: IndexDocumentsMethod,
+       allow_index_creation: bool,
+       primary_key: Option<String>,
+   },
    DocumentEdition,
-   DocumentDeletion { by_filter: bool },
+   DocumentDeletion {
+       by_filter: bool,
+   },
    DocumentClear,
-   Settings { allow_index_creation: bool },
+   Settings {
+       allow_index_creation: bool,
+   },
    IndexCreation,
    IndexDeletion,
    IndexUpdate,

@@ -49,8 +60,11 @@ impl From<KindWithContent> for AutobatchKind {
    fn from(kind: KindWithContent) -> Self {
        match kind {
            KindWithContent::DocumentAdditionOrUpdate {
-               allow_index_creation, primary_key, ..
-           } => AutobatchKind::DocumentImport { allow_index_creation, primary_key },
+               method,
+               allow_index_creation,
+               primary_key,
+               ..
+           } => AutobatchKind::DocumentImport { method, allow_index_creation, primary_key },
            KindWithContent::DocumentEdition { .. } => AutobatchKind::DocumentEdition,
            KindWithContent::DocumentDeletion { .. } => {
                AutobatchKind::DocumentDeletion { by_filter: false }

@@ -71,8 +85,6 @@ impl From<KindWithContent> for AutobatchKind {
            KindWithContent::TaskCancelation { .. }
            | KindWithContent::TaskDeletion { .. }
            | KindWithContent::DumpCreation { .. }
-           | KindWithContent::Export { .. }
-           | KindWithContent::UpgradeDatabase { .. }
            | KindWithContent::SnapshotCreation => {
                panic!("The autobatcher should never be called with tasks that don't apply to an index.")
            }
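Both versions of the autobatcher communicate through `std::ops::ControlFlow`: `Continue(batch)` means the task was absorbed and the next task can be tried, while `Break(batch)` means the batch must be closed (the newer side additionally carries a `BatchStopReason` in the `Break` payload). A reduced sketch of that accumulation loop over plain integers:

```rust
use std::ops::ControlFlow::{self, Break, Continue};

// Toy accumulator: batch consecutive even ids together, close on the first odd one.
fn accumulate(mut batch: Vec<u32>, id: u32) -> ControlFlow<Vec<u32>, Vec<u32>> {
    if id % 2 == 0 {
        batch.push(id);
        Continue(batch)
    } else {
        Break(batch) // the odd id is left for the next batch
    }
}

fn main() {
    let mut batch = vec![0];
    for id in [2, 4, 5, 6] {
        match accumulate(batch, id) {
            Continue(b) => batch = b,
            Break(b) => {
                batch = b;
                break;
            }
        }
    }
    assert_eq!(batch, [0, 2, 4]); // 5 closed the batch; 6 was never reached
}
```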
@@ -86,6 +98,7 @@ pub enum BatchKind {
        ids: Vec<TaskId>,
    },
    DocumentOperation {
+       method: IndexDocumentsMethod,
        allow_index_creation: bool,
        primary_key: Option<String>,
        operation_ids: Vec<TaskId>,
@@ -147,48 +160,23 @@ impl BatchKind {
    // TODO use an AutoBatchKind as input
    pub fn new(
        task_id: TaskId,
-       kind_with_content: KindWithContent,
+       kind: KindWithContent,
        primary_key: Option<&str>,
-   ) -> (ControlFlow<(BatchKind, BatchStopReason), BatchKind>, bool) {
+   ) -> (ControlFlow<BatchKind, BatchKind>, bool) {
        use AutobatchKind as K;

-       let kind = kind_with_content.as_kind();
-
-       match AutobatchKind::from(kind_with_content) {
-           K::IndexCreation => (
-               Break((
-                   BatchKind::IndexCreation { id: task_id },
-                   BatchStopReason::TaskCannotBeBatched { kind, id: task_id },
-               )),
-               true,
-           ),
-           K::IndexDeletion => (
-               Break((
-                   BatchKind::IndexDeletion { ids: vec![task_id] },
-                   BatchStopReason::IndexDeletion { id: task_id },
-               )),
-               false,
-           ),
-           K::IndexUpdate => (
-               Break((
-                   BatchKind::IndexUpdate { id: task_id },
-                   BatchStopReason::TaskCannotBeBatched { kind, id: task_id },
-               )),
-               false,
-           ),
-           K::IndexSwap => (
-               Break((
-                   BatchKind::IndexSwap { id: task_id },
-                   BatchStopReason::TaskCannotBeBatched { kind, id: task_id },
-               )),
-               false,
-           ),
+       match AutobatchKind::from(kind) {
+           K::IndexCreation => (Break(BatchKind::IndexCreation { id: task_id }), true),
+           K::IndexDeletion => (Break(BatchKind::IndexDeletion { ids: vec![task_id] }), false),
+           K::IndexUpdate => (Break(BatchKind::IndexUpdate { id: task_id }), false),
+           K::IndexSwap => (Break(BatchKind::IndexSwap { id: task_id }), false),
            K::DocumentClear => (Continue(BatchKind::DocumentClear { ids: vec![task_id] }), false),
-           K::DocumentImport { allow_index_creation, primary_key: pk }
+           K::DocumentImport { method, allow_index_creation, primary_key: pk }
                if primary_key.is_none() || pk.is_none() || primary_key == pk.as_deref() =>
            {
                (
                    Continue(BatchKind::DocumentOperation {
+                       method,
                        allow_index_creation,
                        primary_key: pk,
                        operation_ids: vec![task_id],

@@ -197,28 +185,16 @@ impl BatchKind {
                )
            }
            // if the primary key set in the task was different than ours we should stop and make this batch fail asap.
-           K::DocumentImport { allow_index_creation, primary_key: pk } => (
-               Break((
-                   BatchKind::DocumentOperation {
-                       allow_index_creation,
-                       primary_key: pk.clone(),
-                       operation_ids: vec![task_id],
-                   },
-                   BatchStopReason::PrimaryKeyIndexMismatch {
-                       id: task_id,
-                       in_index: primary_key.unwrap().to_owned(),
-                       in_task: pk.unwrap(),
-                   },
-               )),
+           K::DocumentImport { method, allow_index_creation, primary_key } => (
+               Break(BatchKind::DocumentOperation {
+                   method,
+                   allow_index_creation,
+                   primary_key,
+                   operation_ids: vec![task_id],
+               }),
                allow_index_creation,
            ),
-           K::DocumentEdition => (
-               Break((
-                   BatchKind::DocumentEdition { id: task_id },
-                   BatchStopReason::TaskCannotBeBatched { kind, id: task_id },
-               )),
-               false,
-           ),
+           K::DocumentEdition => (Break(BatchKind::DocumentEdition { id: task_id }), false),
            K::DocumentDeletion { by_filter: includes_by_filter } => (
                Continue(BatchKind::DocumentDeletion {
                    deletion_ids: vec![task_id],
@@ -238,71 +214,54 @@ impl BatchKind {
    /// To ease the writing of the code. `true` can be returned when you don't need to create an index
    /// but false can't be returned if you need to create an index.
    #[rustfmt::skip]
-   fn accumulate(self, id: TaskId, kind_with_content: KindWithContent, index_already_exists: bool, primary_key: Option<&str>) -> ControlFlow<(BatchKind, BatchStopReason), BatchKind> {
+   fn accumulate(self, id: TaskId, kind: AutobatchKind, index_already_exists: bool, primary_key: Option<&str>) -> ControlFlow<BatchKind, BatchKind> {
        use AutobatchKind as K;

-       let kind = kind_with_content.as_kind();
-       let autobatch_kind = AutobatchKind::from(kind_with_content);
-
-       let pk: Option<String> = match (self.primary_key(), autobatch_kind.primary_key(), primary_key) {
-           // 1. If the incoming task doesn't interact with the primary key -> we can continue
-           (batch_pk, None | Some(None), _) => {
-               batch_pk.flatten().map(ToOwned::to_owned)
-           },
-           // 2.1 If we already have a primary-key ->
-           // 2.1.1 If the task we're trying to accumulate has a pk it must be equal to our primary key
-           (_batch_pk, Some(Some(task_pk)), Some(index_pk)) => if task_pk == index_pk {
-               Some(task_pk.to_owned())
-           } else {
-               return Break((self, BatchStopReason::PrimaryKeyMismatch {
-                   id,
-                   reason: PrimaryKeyMismatchReason::TaskPrimaryKeyDifferFromIndexPrimaryKey {
-                       task_pk: task_pk.to_owned(),
-                       index_pk: index_pk.to_owned(),
-                   },
-               }))
-           },
-           // 2.2 If we don't have a primary-key ->
-           // 2.2.2 If the batch is set to Some(None), the task should be too
-           (Some(None), Some(Some(task_pk)), None) => return Break((self, BatchStopReason::PrimaryKeyMismatch {
-               id,
-               reason: PrimaryKeyMismatchReason::CannotInterfereWithPrimaryKeyGuessing {
-                   task_pk: task_pk.to_owned(),
-               },
-           })),
-           (Some(Some(batch_pk)), Some(Some(task_pk)), None) => if task_pk == batch_pk {
-               Some(task_pk.to_owned())
-           } else {
-               let batch_pk = batch_pk.to_owned();
-               let task_pk = task_pk.to_owned();
-               return Break((self, BatchStopReason::PrimaryKeyMismatch {
-                   id,
-                   reason: PrimaryKeyMismatchReason::TaskPrimaryKeyDifferFromCurrentBatchPrimaryKey {
-                       batch_pk,
-                       task_pk
-                   },
-               }))
-           },
-           (None, Some(Some(task_pk)), None) => Some(task_pk.to_owned())
-       };
-
-       match (self, autobatch_kind) {
+       match (self, kind) {
            // We don't batch any of these operations
-           (this, K::IndexCreation | K::IndexUpdate | K::IndexSwap | K::DocumentEdition) => Break((this, BatchStopReason::TaskCannotBeBatched { kind, id })),
+           (this, K::IndexCreation | K::IndexUpdate | K::IndexSwap | K::DocumentEdition) => Break(this),
            // We must not batch tasks that don't have the same index creation rights if the index doesn't already exist.
            (this, kind) if !index_already_exists && this.allow_index_creation() == Some(false) && kind.allow_index_creation() == Some(true) => {
-               Break((this, BatchStopReason::IndexCreationMismatch { id }))
+               Break(this)
            },
+           // NOTE: We need to negate the whole condition since we're checking if we need to break instead of continue.
+           // I wrote it this way because it's easier to understand than the other way around.
+           (this, kind) if !(
+               // 1. If both tasks don't interact with the primary key -> we can continue
+               (this.primary_key().is_none() && kind.primary_key().is_none()) ||
+               // 2. Else ->
+               (
+                   // 2.1 If we already have a primary-key ->
+                   (
+                       primary_key.is_some() &&
+                       // 2.1.1 If the task we're trying to accumulate has a pk it must be equal to our primary key
+                       // 2.1.2 If the task doesn't have a primary-key -> we can continue
+                       kind.primary_key().map_or(true, |pk| pk == primary_key)
+                   ) ||
+                   // 2.2 If we don't have a primary-key ->
+                   (
+                       // 2.2.1 If both the batch and the task have a primary key they should be equal
+                       // 2.2.2 If the batch is set to Some(None), the task should be too
+                       // 2.2.3 If the batch is set to None -> we can continue
+                       this.primary_key().zip(kind.primary_key()).map_or(true, |(this, kind)| this == kind)
+                   )
+               )
+           ) // closing the negation
+           => {
+               Break(this)
+           },
            // The index deletion can batch with everything but must stop after
            (
                BatchKind::DocumentClear { mut ids }
                | BatchKind::DocumentDeletion { deletion_ids: mut ids, includes_by_filter: _ }
-               | BatchKind::DocumentOperation { allow_index_creation: _, primary_key: _, operation_ids: mut ids }
+               | BatchKind::DocumentOperation { method: _, allow_index_creation: _, primary_key: _, operation_ids: mut ids }
                | BatchKind::Settings { allow_index_creation: _, settings_ids: mut ids },
                K::IndexDeletion,
            ) => {
                ids.push(id);
-               Break((BatchKind::IndexDeletion { ids }, BatchStopReason::IndexDeletion { id }))
+               Break(BatchKind::IndexDeletion { ids })
            }
            (
                BatchKind::ClearAndSettings { settings_ids: mut ids, allow_index_creation: _, mut other },

@@ -310,7 +269,7 @@ impl BatchKind {
            ) => {
                ids.push(id);
                ids.append(&mut other);
-               Break((BatchKind::IndexDeletion { ids }, BatchStopReason::IndexDeletion { id }))
+               Break(BatchKind::IndexDeletion { ids })
            }

            (
@ -323,37 +282,51 @@ impl BatchKind {
( (
this @ BatchKind::DocumentClear { .. }, this @ BatchKind::DocumentClear { .. },
K::DocumentImport { .. } | K::Settings { .. }, K::DocumentImport { .. } | K::Settings { .. },
) => Break((this, BatchStopReason::DocumentOperationWithSettings { id })), ) => Break(this),
( (
BatchKind::DocumentOperation { allow_index_creation: _, primary_key: _, mut operation_ids }, BatchKind::DocumentOperation { method: _, allow_index_creation: _, primary_key: _, mut operation_ids },
K::DocumentClear, K::DocumentClear,
) => { ) => {
operation_ids.push(id); operation_ids.push(id);
Continue(BatchKind::DocumentClear { ids: operation_ids }) Continue(BatchKind::DocumentClear { ids: operation_ids })
} }
-        // we can autobatch different kind of document operations and mix replacements with updates
+        // we can autobatch the same kind of document additions / updates
         (
-            BatchKind::DocumentOperation { allow_index_creation, primary_key: _, mut operation_ids },
-            K::DocumentImport { primary_key: pk, .. },
+            BatchKind::DocumentOperation { method: ReplaceDocuments, allow_index_creation, primary_key: _, mut operation_ids },
+            K::DocumentImport { method: ReplaceDocuments, primary_key: pk, .. },
         ) => {
             operation_ids.push(id);
             Continue(BatchKind::DocumentOperation {
+                method: ReplaceDocuments,
                 allow_index_creation,
                 operation_ids,
                 primary_key: pk,
             })
         }
+        (
+            BatchKind::DocumentOperation { method: UpdateDocuments, allow_index_creation, primary_key: _, mut operation_ids },
+            K::DocumentImport { method: UpdateDocuments, primary_key: pk, .. },
+        ) => {
+            operation_ids.push(id);
+            Continue(BatchKind::DocumentOperation {
+                method: UpdateDocuments,
+                allow_index_creation,
+                primary_key: pk,
+                operation_ids,
+            })
+        }
         (
-            BatchKind::DocumentOperation { allow_index_creation, primary_key, mut operation_ids },
+            BatchKind::DocumentOperation { method, allow_index_creation, primary_key, mut operation_ids },
             K::DocumentDeletion { by_filter: false },
         ) => {
             operation_ids.push(id);
             Continue(BatchKind::DocumentOperation {
+                method,
                 allow_index_creation,
                 primary_key,
                 operation_ids,
             })
         }
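This hunk is the behavioural core of the diff: the right-hand (method-carrying) side only merges an import into a batch whose `method` matches, while the left-hand side merges replacements and updates freely. A minimal sketch of the two merge predicates, with hypothetical names (`Method`, `can_merge_*`) rather than the real `BatchKind` machinery:

```rust
// Hedged sketch of the rule difference, not the actual autobatcher code.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Method {
    ReplaceDocuments,
    UpdateDocuments,
}

// Right-hand side of the diff: an import only joins a batch of the same kind.
fn can_merge_same_kind_only(batch: Method, task: Method) -> bool {
    batch == task
}

// Left-hand side of the diff: replacements and updates mix into one batch.
fn can_merge_mixed(_batch: Method, _task: Method) -> bool {
    true
}

fn main() {
    use Method::*;
    assert!(!can_merge_same_kind_only(ReplaceDocuments, UpdateDocuments));
    assert!(can_merge_mixed(ReplaceDocuments, UpdateDocuments));
}
```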
         // We can't batch a document operation with a delete by filter
@@ -361,12 +334,19 @@ impl BatchKind {
             this @ BatchKind::DocumentOperation { .. },
             K::DocumentDeletion { by_filter: true },
         ) => {
-            Break((this, BatchStopReason::DocumentOperationWithDeletionByFilter { id }))
+            Break(this)
         }
+        // but we can't autobatch documents if it's not the same kind
+        // this match branch MUST be AFTER the previous one
+        (
+            this @ BatchKind::DocumentOperation { .. },
+            K::DocumentImport { .. },
+        ) => Break(this),
         (
             this @ BatchKind::DocumentOperation { .. },
             K::Settings { .. },
-        ) => Break((this, BatchStopReason::DocumentOperationWithSettings { id })),
+        ) => Break(this),
         (BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter: _ }, K::DocumentClear) => {
             deletion_ids.push(id);
@@ -376,15 +356,16 @@ impl BatchKind {
         (
             this @ BatchKind::DocumentDeletion { deletion_ids: _, includes_by_filter: true },
             K::DocumentImport { .. }
-        ) => Break((this, BatchStopReason::DeletionByFilterWithDocumentOperation { id })),
+        ) => Break(this),
         // we can autobatch the deletion and import if the index already exists
         (
             BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter: false },
-            K::DocumentImport { allow_index_creation, primary_key }
+            K::DocumentImport { method, allow_index_creation, primary_key }
         ) if index_already_exists => {
             deletion_ids.push(id);
             Continue(BatchKind::DocumentOperation {
+                method,
                 allow_index_creation,
                 primary_key,
                 operation_ids: deletion_ids,
@@ -393,28 +374,29 @@ impl BatchKind {
         // we can autobatch the deletion and import if both can't create an index
         (
             BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter: false },
-            K::DocumentImport { allow_index_creation, primary_key }
+            K::DocumentImport { method, allow_index_creation, primary_key }
         ) if !allow_index_creation => {
             deletion_ids.push(id);
             Continue(BatchKind::DocumentOperation {
+                method,
                 allow_index_creation,
                 primary_key,
                 operation_ids: deletion_ids,
             })
         }
-        // we can't autobatch a deletion and an import if the index does not exist but would be created by an addition
+        // we can't autobatch a deletion and an import if the index does not exists but would be created by an addition
         (
             this @ BatchKind::DocumentDeletion { .. },
             K::DocumentImport { .. }
         ) => {
-            Break((this, BatchStopReason::IndexCreationMismatch { id }))
+            Break(this)
         }
         (BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter }, K::DocumentDeletion { by_filter }) => {
             deletion_ids.push(id);
             Continue(BatchKind::DocumentDeletion { deletion_ids, includes_by_filter: includes_by_filter | by_filter })
         }
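The deletion-merging arm just above keeps a single `includes_by_filter` flag for the whole batch, OR-ing in each new task's `by_filter`: the flag latches to `true` as soon as any accumulated deletion used a filter. A small sketch of that fold, with a hypothetical helper name:

```rust
// Hedged sketch: latch a batch-wide flag over the per-task `by_filter` values.
fn batch_includes_by_filter(by_filter_flags: &[bool]) -> bool {
    by_filter_flags.iter().fold(false, |acc, flag| acc | *flag)
}

fn main() {
    assert!(!batch_includes_by_filter(&[false, false, false]));
    assert!(batch_includes_by_filter(&[false, true, false])); // one filtered deletion is enough
}
```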
-        (this @ BatchKind::DocumentDeletion { .. }, K::Settings { .. }) => Break((this, BatchStopReason::DocumentOperationWithSettings { id })),
+        (this @ BatchKind::DocumentDeletion { .. }, K::Settings { .. }) => Break(this),
         (
             BatchKind::Settings { settings_ids, allow_index_creation },
@@ -427,7 +409,7 @@ impl BatchKind {
         (
             this @ BatchKind::Settings { .. },
             K::DocumentImport { .. } | K::DocumentDeletion { .. },
-        ) => Break((this, BatchStopReason::SettingsWithDocumentOperation { id })),
+        ) => Break(this),
         (
             BatchKind::Settings { mut settings_ids, allow_index_creation },
             K::Settings { .. },
@@ -450,7 +432,7 @@ impl BatchKind {
                 allow_index_creation,
             })
         }
-        (this @ BatchKind::ClearAndSettings { .. }, K::DocumentImport { .. }) => Break((this, BatchStopReason::SettingsWithDocumentOperation { id })),
+        (this @ BatchKind::ClearAndSettings { .. }, K::DocumentImport { .. }) => Break(this),
         (
             BatchKind::ClearAndSettings {
                 mut other,
@@ -506,7 +488,7 @@ pub fn autobatch(
     enqueued: Vec<(TaskId, KindWithContent)>,
     index_already_exists: bool,
     primary_key: Option<&str>,
-) -> Option<(BatchKind, bool, Option<BatchStopReason>)> {
+) -> Option<(BatchKind, bool)> {
     let mut enqueued = enqueued.into_iter();
     let (id, kind) = enqueued.next()?;
@@ -515,22 +497,18 @@ pub fn autobatch(
     let (mut acc, must_create_index) = match BatchKind::new(id, kind, primary_key) {
         (Continue(acc), create) => (acc, create),
-        (Break((acc, batch_stop_reason)), create) => {
-            return Some((acc, create, Some(batch_stop_reason)))
-        }
+        (Break(acc), create) => return Some((acc, create)),
     };
     // if an index has been created in the previous step we can consider it as existing.
     index_exist |= must_create_index;
-    for (id, kind_with_content) in enqueued {
-        acc = match acc.accumulate(id, kind_with_content, index_exist, primary_key) {
+    for (id, kind) in enqueued {
+        acc = match acc.accumulate(id, kind.into(), index_exist, primary_key) {
             Continue(acc) => acc,
-            Break((acc, batch_stop_reason)) => {
-                return Some((acc, must_create_index, Some(batch_stop_reason)))
-            }
+            Break(acc) => return Some((acc, must_create_index)),
         };
     }
-    Some((acc, must_create_index, None))
+    Some((acc, must_create_index))
 }
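The signature change is the point of this hunk: on the left-hand side `autobatch` also reports *why* accumulation stopped (`None` meaning the entire queue fit into one batch), while the right-hand side only returns the batch and the index-creation flag. A hedged sketch of a caller consuming the richer shape, with simplified stand-in types rather than the real `BatchKind`/`BatchStopReason`:

```rust
// Hypothetical, simplified stand-ins; only the tuple shape matches the diff.
#[derive(Debug)]
struct BatchKind;
#[derive(Debug)]
struct BatchStopReason;

fn handle(result: Option<(BatchKind, bool, Option<BatchStopReason>)>) {
    match result {
        None => println!("nothing enqueued, no batch to build"),
        Some((kind, must_create_index, stop_reason)) => {
            // `stop_reason` is `None` when the whole queue fit in one batch.
            println!("batch: {kind:?}, create index: {must_create_index}, stopped by: {stop_reason:?}");
        }
    }
}

fn main() {
    handle(Some((BatchKind, true, None)));
}
```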


@@ -1,7 +1,7 @@
 use meilisearch_types::milli::update::IndexDocumentsMethod::{
     self, ReplaceDocuments, UpdateDocuments,
 };
-use meilisearch_types::tasks::{BatchStopReason, IndexSwap, KindWithContent};
+use meilisearch_types::tasks::{IndexSwap, KindWithContent};
 use uuid::Uuid;
 use self::autobatcher::{autobatch, BatchKind};
@@ -20,7 +20,7 @@ fn autobatch_from(
     index_already_exists: bool,
     primary_key: Option<&str>,
     input: impl IntoIterator<Item = KindWithContent>,
-) -> Option<(BatchKind, bool, Option<BatchStopReason>)> {
+) -> Option<(BatchKind, bool)> {
     autobatch(
         input.into_iter().enumerate().map(|(id, kind)| (id as TaskId, kind)).collect(),
         index_already_exists,
@@ -92,304 +92,304 @@ fn idx_swap() -> KindWithContent {
 fn autobatch_simple_operation_together() {
     // we can autobatch one or multiple `ReplaceDocuments` together.
     // if the index exists.
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp( ReplaceDocuments, false , None), doc_imp(ReplaceDocuments, false , None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp( ReplaceDocuments, false , None), doc_imp(ReplaceDocuments, false , None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
     // if it doesn't exists.
-    debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
-    debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, Some(IndexCreationMismatch { id: 1 })))");
+    debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
     // we can autobatch one or multiple `UpdateDocuments` together.
     // if the index exists.
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
     // if it doesn't exists.
-    debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
-    debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
     // we can autobatch one or multiple DocumentDeletion together
-    debug_snapshot!(autobatch_from(true, None, [doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_del(), doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: false }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_del(), doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: false }, false))");
-    debug_snapshot!(autobatch_from(false,None, [doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
-    debug_snapshot!(autobatch_from(false,None, [doc_del(), doc_del(), doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: false }, false, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_del(), doc_del(), doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: false }, false))");
     // we can autobatch one or multiple DocumentDeletionByFilter together
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_del_fil(), doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: true }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_del_fil(), doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(false,None, [doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(false,None, [doc_del_fil(), doc_del_fil(), doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: true }, false, None))");
+    debug_snapshot!(autobatch_from(false,None, [doc_del_fil(), doc_del_fil(), doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: true }, false))");
     // we can autobatch one or multiple Settings together
-    debug_snapshot!(autobatch_from(true, None, [settings(true)]), @"Some((Settings { allow_index_creation: true, settings_ids: [0] }, true, None))");
+    debug_snapshot!(autobatch_from(true, None, [settings(true)]), @"Some((Settings { allow_index_creation: true, settings_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [settings(true), settings(true), settings(true)]), @"Some((Settings { allow_index_creation: true, settings_ids: [0, 1, 2] }, true, None))");
+    debug_snapshot!(autobatch_from(true, None, [settings(true), settings(true), settings(true)]), @"Some((Settings { allow_index_creation: true, settings_ids: [0, 1, 2] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [settings(false)]), @"Some((Settings { allow_index_creation: false, settings_ids: [0] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [settings(false)]), @"Some((Settings { allow_index_creation: false, settings_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [settings(false), settings(false), settings(false)]), @"Some((Settings { allow_index_creation: false, settings_ids: [0, 1, 2] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [settings(false), settings(false), settings(false)]), @"Some((Settings { allow_index_creation: false, settings_ids: [0, 1, 2] }, false))");
-    debug_snapshot!(autobatch_from(false,None, [settings(true)]), @"Some((Settings { allow_index_creation: true, settings_ids: [0] }, true, None))");
+    debug_snapshot!(autobatch_from(false,None, [settings(true)]), @"Some((Settings { allow_index_creation: true, settings_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(false,None, [settings(true), settings(true), settings(true)]), @"Some((Settings { allow_index_creation: true, settings_ids: [0, 1, 2] }, true, None))");
+    debug_snapshot!(autobatch_from(false,None, [settings(true), settings(true), settings(true)]), @"Some((Settings { allow_index_creation: true, settings_ids: [0, 1, 2] }, true))");
-    debug_snapshot!(autobatch_from(false,None, [settings(false)]), @"Some((Settings { allow_index_creation: false, settings_ids: [0] }, false, None))");
+    debug_snapshot!(autobatch_from(false,None, [settings(false)]), @"Some((Settings { allow_index_creation: false, settings_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(false,None, [settings(false), settings(false), settings(false)]), @"Some((Settings { allow_index_creation: false, settings_ids: [0, 1, 2] }, false, None))");
+    debug_snapshot!(autobatch_from(false,None, [settings(false), settings(false), settings(false)]), @"Some((Settings { allow_index_creation: false, settings_ids: [0, 1, 2] }, false))");
     // We can autobatch document addition with document deletion
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true, None))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true, None))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))");
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, None), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))");
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, None), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, None), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, None), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true, None))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true, None))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
     // And the other way around
-    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
-    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, true, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, true, Some("catto"))]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
-    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
-    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+    debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false, None))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
     // But we can't autobatch document addition with document deletion by filter
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, Some(DocumentOperationWithDeletionByFilter { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, Some(DocumentOperationWithDeletionByFilter { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
-    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"###);
+    debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))");
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))");
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, Some(DocumentOperationWithDeletionByFilter { id: 1 })))");
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, Some(DocumentOperationWithDeletionByFilter { id: 1 })))");
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
-    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"###);
+    debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
     // And the other way around
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-    debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))");
+    debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
 }
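Every expectation in these tests is an inline snapshot: the string after `@` is the exact compact `Debug` rendering the macro asserts against, which is why each side of the diff rewrites only the trailing tuple, never the test logic. A minimal sketch of the pattern, assuming `debug_snapshot!` is roughly a thin wrapper over insta's `assert_snapshot!` of a `{:?}` rendering (hypothetical example values, not taken from the suite above):

```rust
// Hedged sketch of an inline debug snapshot, assuming the insta crate.
use insta::assert_snapshot;

fn main() {
    let batch = (vec![0u32, 1, 2], false);
    // The string after `@` must match the compact Debug rendering exactly.
    assert_snapshot!(format!("{:?}", batch), @"([0, 1, 2], false)");
}
```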
#[test] #[test]
fn simple_different_document_operations_autobatch_together() { fn simple_document_operation_dont_autobatch_with_other() {
// addition and updates with deletion by filter can't batch together // addition, updates and deletion by filter can't batch together
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))"); debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))"); debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"); debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithDeletionByFilter { id: 1 })))"); debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))"); debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(DeletionByFilterWithDocumentOperation { id: 1 })))"); debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_create()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(TaskCannotBeBatched { kind: IndexCreation, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_create()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_create()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(TaskCannotBeBatched { kind: IndexCreation, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_create()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_create()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false, Some(TaskCannotBeBatched { kind: IndexCreation, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_create()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_create()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(TaskCannotBeBatched { kind: IndexCreation, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_create()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_update()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(TaskCannotBeBatched { kind: IndexUpdate, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_update()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_update()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(TaskCannotBeBatched { kind: IndexUpdate, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_update()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_update()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false, Some(TaskCannotBeBatched { kind: IndexUpdate, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_update()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_update()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(TaskCannotBeBatched { kind: IndexUpdate, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_update()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(TaskCannotBeBatched { kind: IndexSwap, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(TaskCannotBeBatched { kind: IndexSwap, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_swap()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false, Some(TaskCannotBeBatched { kind: IndexSwap, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_swap()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_swap()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false, Some(TaskCannotBeBatched { kind: IndexSwap, id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_swap()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
}
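These snapshots assert the autobatcher's return value: on the removed side a triple of batch kind, an "index creation allowed" flag, and an optional stop reason; on the added side only the pair. A minimal sketch of that shape, with hypothetical types standing in for the scheduler's real ones:

```rust
// Hypothetical simplification of the tuple the `debug_snapshot!` strings encode;
// these are NOT Meilisearch's actual types, just the asserted shape.
#[derive(Debug)]
#[allow(dead_code)]
enum BatchKind {
    DocumentOperation { allow_index_creation: bool, operation_ids: Vec<u32> },
    IndexDeletion { ids: Vec<u32> },
}

#[derive(Debug)]
enum StopReason {
    TaskCannotBeBatched { kind: &'static str, id: u32 },
}

fn main() {
    // Mirrors `Some((DocumentOperation { .. }, true, Some(TaskCannotBeBatched { .. })))`.
    let batch: Option<(BatchKind, bool, Option<StopReason>)> = Some((
        BatchKind::DocumentOperation { allow_index_creation: true, operation_ids: vec![0] },
        true,
        Some(StopReason::TaskCannotBeBatched { kind: "IndexCreation", id: 1 }),
    ));
    println!("{batch:?}");
}
```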
#[test]
fn document_addition_doesnt_batch_with_settings() {
// simple case
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
// multiple settings and doc addition
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), settings(true), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, Some(DocumentOperationWithSettings { id: 2 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), settings(true), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), settings(true), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, Some(DocumentOperationWithSettings { id: 2 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), settings(true), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
// addition and setting unordered
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_imp(UpdateDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_imp(UpdateDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
// Doesn't batch with other forbidden operations
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_create()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_create()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_create()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_create()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_update()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_update()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_update()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_update()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_swap()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_swap()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_swap()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_swap()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
}
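The rule this test exercises: document additions keep accumulating into one batch until a non-batchable task (here, a settings update) is met, at which point batching stops. A hedged sketch under illustrative names (not the scheduler's actual API):

```rust
// Fold consecutive document imports into one batch; a settings task ends it.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Task { DocImport, Settings }

fn batch_doc_imports(tasks: &[Task]) -> (Vec<usize>, Option<usize>) {
    let mut batched = Vec::new();
    for (id, task) in tasks.iter().enumerate() {
        match task {
            Task::DocImport => batched.push(id),
            // A settings task terminates the document batch; report where we stopped.
            Task::Settings => return (batched, Some(id)),
        }
    }
    (batched, None)
}

fn main() {
    let (ops, stopped_at) = batch_doc_imports(&[Task::DocImport, Task::DocImport, Task::Settings]);
    assert_eq!((ops, stopped_at), (vec![0, 1], Some(2))); // matches `operation_ids: [0, 1]` above
}
```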
#[test]
fn clear_and_additions() {
// these two don't need to batch
-debug_snapshot!(autobatch_from(true, None, [doc_clr(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentClear { ids: [0] }, false, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_clr(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentClear { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_clr(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentClear { ids: [0] }, false, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_clr(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentClear { ids: [0] }, false))");
// Basic use case
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), doc_clr()]), @"Some((DocumentClear { ids: [0, 1, 2] }, true, None))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), doc_clr()]), @"Some((DocumentClear { ids: [0, 1, 2] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_clr()]), @"Some((DocumentClear { ids: [0, 1, 2] }, true, None))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_clr()]), @"Some((DocumentClear { ids: [0, 1, 2] }, true))");
// This batch kind doesn't mix with other document addition
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), doc_clr(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentClear { ids: [0, 1, 2] }, true, Some(DocumentOperationWithSettings { id: 3 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), doc_clr(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentClear { ids: [0, 1, 2] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_clr(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentClear { ids: [0, 1, 2] }, true, Some(DocumentOperationWithSettings { id: 3 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_clr(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentClear { ids: [0, 1, 2] }, true))");
// But you can batch multiple clears together
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), doc_clr(), doc_clr(), doc_clr()]), @"Some((DocumentClear { ids: [0, 1, 2, 3, 4] }, true, None))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), doc_clr(), doc_clr(), doc_clr()]), @"Some((DocumentClear { ids: [0, 1, 2, 3, 4] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_clr(), doc_clr(), doc_clr()]), @"Some((DocumentClear { ids: [0, 1, 2, 3, 4] }, true, None))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_clr(), doc_clr(), doc_clr()]), @"Some((DocumentClear { ids: [0, 1, 2, 3, 4] }, true))");
}
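A sketch of the semantics these snapshots suggest (inferred, not the real implementation): a `DocumentClear` absorbs every preceding addition, since clearing afterwards would make those writes dead work, and consecutive clears merge; but an addition enqueued after a clear starts a new batch.

```rust
// Illustrative rule only: accumulate imports, then clears; stop once an
// import follows a clear.
fn clear_batch_ids(kinds: &[&str]) -> Vec<usize> {
    let mut ids = Vec::new();
    let mut seen_clear = false;
    for (id, kind) in kinds.iter().enumerate() {
        match *kind {
            "doc_clr" => { seen_clear = true; ids.push(id) }
            // An addition after a clear belongs to the next batch instead.
            "doc_imp" if !seen_clear => ids.push(id),
            _ => break,
        }
    }
    ids
}

fn main() {
    assert_eq!(clear_batch_ids(&["doc_imp", "doc_imp", "doc_clr", "doc_imp"]), vec![0, 1, 2]);
    assert_eq!(clear_batch_ids(&["doc_imp", "doc_imp", "doc_clr", "doc_clr", "doc_clr"]), vec![0, 1, 2, 3, 4]);
}
```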
#[test]
fn clear_and_additions_and_settings() {
// A clear doesn't need to autobatch the settings that happen AFTER it, since there are no documents left
-debug_snapshot!(autobatch_from(true, None, [doc_clr(), settings(true)]), @"Some((DocumentClear { ids: [0] }, false, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_clr(), settings(true)]), @"Some((DocumentClear { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [settings(true), doc_clr(), settings(true)]), @"Some((ClearAndSettings { other: [1], allow_index_creation: true, settings_ids: [0, 2] }, true, None))");
+debug_snapshot!(autobatch_from(true, None, [settings(true), doc_clr(), settings(true)]), @"Some((ClearAndSettings { other: [1], allow_index_creation: true, settings_ids: [0, 2] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_clr()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_clr()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_clr()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_clr()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
}
#[test]
fn anything_and_index_deletion() {
// The `IndexDeletion` doesn't batch with anything that happens AFTER.
-debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_del()]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_del()]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_del_fil()]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_del_fil()]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_clr()]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_clr()]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [idx_del(), settings(true)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(true, None, [idx_del(), settings(true)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(true, None, [idx_del(), settings(false)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(true, None, [idx_del(), settings(false)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_del()]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_del()]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_del_fil()]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_del_fil()]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_clr()]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_clr()]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [idx_del(), settings(true)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(false,None, [idx_del(), settings(true)]), @"Some((IndexDeletion { ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [idx_del(), settings(false)]), @"Some((IndexDeletion { ids: [0] }, false, Some(IndexDeletion { id: 0 })))");
+debug_snapshot!(autobatch_from(false,None, [idx_del(), settings(false)]), @"Some((IndexDeletion { ids: [0] }, false))");
// The index deletion can accept almost any type of `BatchKind` and transform it to an `IndexDeletion`.
// First, the basic cases
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_clr(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_clr(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(true, None, [settings(true), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [settings(true), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(true, None, [settings(false), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [settings(false), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(false,None, [doc_del(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_del(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(false,None, [doc_del_fil(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_del_fil(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(false,None, [doc_clr(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_clr(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(false,None, [settings(true), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [settings(true), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(false,None, [settings(false), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false, Some(IndexDeletion { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [settings(false), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
}
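The swallow rule these cases encode, sketched under illustrative names (not the scheduler's API): an `IndexDeletion` folds every task already accumulated for the index into its own batch, since their work would be discarded anyway, while nothing enqueued after it joins.

```rust
// Batch tasks until (and including) an index deletion; stop there.
fn batch_until_index_deletion(task_kinds: &[&str]) -> Vec<usize> {
    let mut ids = Vec::new();
    for (id, kind) in task_kinds.iter().enumerate() {
        ids.push(id);
        if *kind == "idx_del" {
            // Everything before the deletion is swallowed; nothing after joins.
            return ids;
        }
    }
    ids
}

fn main() {
    // Mirrors `[doc_imp(..), idx_del()]` => `IndexDeletion { ids: [0, 1] }`.
    assert_eq!(batch_until_index_deletion(&["doc_imp", "idx_del"]), vec![0, 1]);
    // Mirrors `[idx_del(), doc_imp(..)]` => `IndexDeletion { ids: [0] }`.
    assert_eq!(batch_until_index_deletion(&["idx_del", "doc_imp"]), vec![0]);
}
```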
#[test]
fn allowed_and_disallowed_index_creation() {
// `DocumentImport` tasks that are allowed to create the index can't be mixed with those that aren't, except if the index already exists.
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, Some(IndexCreationMismatch { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
-debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))");
+debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false, None))");
+debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
-debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(DocumentOperationWithSettings { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false, Some(IndexCreationMismatch { id: 1 })))");
+debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
// batch deletion and addition
-debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false, Some(IndexCreationMismatch { id: 1 })))");
+debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
-debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false, Some(IndexCreationMismatch { id: 1 })))");
+debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
-debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false, Some(IndexCreationMismatch { id: 1 })))");
+debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
-debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false, Some(IndexCreationMismatch { id: 1 })))");
+debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
}
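A hedged sketch of the gate tested above: when the index does not yet exist, a task that may create it cannot be batched after one that may not, because the batch-level `allow_index_creation` flag would no longer describe every task it contains. Once the index exists the flag is moot and the tasks merge. Function and parameter names are illustrative:

```rust
// Simplified compatibility check for the index-creation flag.
fn can_batch(index_exists: bool, first_allows_creation: bool, next_allows_creation: bool) -> bool {
    index_exists || first_allows_creation == next_allows_creation
}

fn main() {
    assert!(can_batch(true, false, true));   // index exists: mixing is fine
    assert!(!can_batch(false, false, true)); // index missing: mismatch stops the batch
    assert!(can_batch(false, true, true));   // same flag: always batchable
}
```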
#[test]
fn autobatch_primary_key() {
// ==> If I have a pk
// With a single update
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, None))"); debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true, None))"###); debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true, Some(PrimaryKeyIndexMismatch { id: 0, in_index: "id", in_task: "other" })))"###); debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
// With multiple updates
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))");
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true, None))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1, 2] }, true, None))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(PrimaryKeyMismatch { id: 1, reason: TaskPrimaryKeyDifferFromIndexPrimaryKey { task_pk: "other", index_pk: "id" } })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other"))]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(PrimaryKeyMismatch { id: 1, reason: TaskPrimaryKeyDifferFromIndexPrimaryKey { task_pk: "other", index_pk: "id" } })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(PrimaryKeyMismatch { id: 1, reason: TaskPrimaryKeyDifferFromIndexPrimaryKey { task_pk: "other", index_pk: "id" } })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true, None))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true, None))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1, 2] }, true, None))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true, Some(PrimaryKeyMismatch { id: 1, reason: TaskPrimaryKeyDifferFromIndexPrimaryKey { task_pk: "other", index_pk: "id" } })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true, Some(PrimaryKeyMismatch { id: 1, reason: TaskPrimaryKeyDifferFromIndexPrimaryKey { task_pk: "other", index_pk: "id" } })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true, Some(PrimaryKeyMismatch { id: 1, reason: TaskPrimaryKeyDifferFromIndexPrimaryKey { task_pk: "other", index_pk: "id" } })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true, Some(PrimaryKeyIndexMismatch { id: 0, in_index: "id", in_task: "other" })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true, Some(PrimaryKeyIndexMismatch { id: 0, in_index: "id", in_task: "other" })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true, Some(PrimaryKeyIndexMismatch { id: 0, in_index: "id", in_task: "other" })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true, Some(PrimaryKeyIndexMismatch { id: 0, in_index: "id", in_task: "other" })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true, Some(PrimaryKeyIndexMismatch { id: 0, in_index: "id", in_task: "other" })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true, Some(PrimaryKeyIndexMismatch { id: 0, in_index: "id", in_task: "other" })))"###);
+debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
// ==> If I don't have a pk // ==> If I don't have a pk
// With a single update // With a single update
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, None))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true, None))"###);
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true, None))"###);
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
// With multiple updates
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true, None))");
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true, Some(PrimaryKeyMismatch { id: 1, reason: CannotInterfereWithPrimaryKeyGuessing { task_pk: "id" } })))"###);
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id"))]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
-debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true, None))"###);
+debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
}
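A sketch of the primary-key compatibility rule these snapshots encode (simplified; the two sides of this diff distinguish more mismatch reasons than this, such as `CannotInterfereWithPrimaryKeyGuessing`): a task joins the batch only if its declared key agrees with the index's key, or with the key the batch is already committed to.

```rust
// Illustrative check only, not the scheduler's real logic.
fn pk_compatible(index_pk: Option<&str>, batch_pk: Option<&str>, task_pk: Option<&str>) -> bool {
    match (task_pk, index_pk.or(batch_pk)) {
        // The task declares no key: it inherits whatever key is in force.
        (None, _) => true,
        // The task names a key: it must match the one already in force.
        (Some(t), Some(current)) => t == current,
        // First key ever seen: it becomes the batch's key.
        (Some(_), None) => true,
    }
}

fn main() {
    assert!(pk_compatible(Some("id"), None, Some("id")));
    assert!(!pk_compatible(Some("id"), None, Some("other"))); // the mismatch cases above
    assert!(pk_compatible(None, Some("id"), None));
}
```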

View File

@ -1,10 +1,9 @@
use std::fmt;
-use std::io::ErrorKind;
use meilisearch_types::heed::RoTxn;
use meilisearch_types::milli::update::IndexDocumentsMethod;
use meilisearch_types::settings::{Settings, Unchecked};
-use meilisearch_types::tasks::{BatchStopReason, Kind, KindWithContent, Status, Task};
+use meilisearch_types::tasks::{Kind, KindWithContent, Status, Task};
use roaring::RoaringBitmap;
use uuid::Uuid;
@ -48,18 +47,11 @@ pub(crate) enum Batch {
IndexSwap {
task: Task,
},
-Export {
-task: Task,
-},
-UpgradeDatabase {
-tasks: Vec<Task>,
-},
}
#[derive(Debug)]
pub(crate) enum DocumentOperation {
-Replace(Uuid),
-Update(Uuid),
+Add(Uuid),
Delete(Vec<String>),
}
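The two shapes this hunk swaps between, sketched side by side: on the removed side the replace/update distinction lives in the operation itself; on the added side a single `Add` variant carries the content file and the method is hoisted into the surrounding `IndexOperation::DocumentOperation` (see the next hunk). The `Uuid` alias below is a placeholder so the sketch is self-contained:

```rust
type Uuid = u128; // stand-in for the content-file id in the real code

#[allow(dead_code)]
enum OperationWithMethod {   // removed side: method encoded per operation
    Replace(Uuid),
    Update(Uuid),
    Delete(Vec<String>),
}

enum OperationMethodHoisted { // added side: one Add variant, method stored elsewhere
    Add(Uuid),
    Delete(Vec<String>),
}

fn main() {
    // Match sites shrink accordingly: one arm instead of two.
    let op = OperationMethodHoisted::Add(42);
    match op {
        OperationMethodHoisted::Add(file) => println!("queue content file {file}"),
        OperationMethodHoisted::Delete(ids) => println!("delete {} ids", ids.len()),
    }
}
```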
@ -69,6 +61,7 @@ pub(crate) enum IndexOperation {
DocumentOperation {
index_uid: String,
primary_key: Option<String>,
+method: IndexDocumentsMethod,
operations: Vec<DocumentOperation>,
tasks: Vec<Task>,
},
@ -107,13 +100,11 @@ impl Batch {
Batch::TaskCancelation { task, .. }
| Batch::Dump(task)
| Batch::IndexCreation { task, .. }
-| Batch::Export { task }
| Batch::IndexUpdate { task, .. } => {
RoaringBitmap::from_sorted_iter(std::iter::once(task.uid)).unwrap()
}
Batch::SnapshotCreation(tasks)
| Batch::TaskDeletions(tasks)
-| Batch::UpgradeDatabase { tasks }
| Batch::IndexDeletion { tasks, .. } => {
RoaringBitmap::from_iter(tasks.iter().map(|task| task.uid))
}
@ -147,8 +138,6 @@ impl Batch {
| TaskDeletions(_)
| SnapshotCreation(_)
| Dump(_)
-| Export { .. }
-| UpgradeDatabase { .. }
| IndexSwap { .. } => None,
IndexOperation { op, .. } => Some(op.index_uid()),
IndexCreation { index_uid, .. }
@ -173,8 +162,6 @@ impl fmt::Display for Batch {
Batch::IndexUpdate { .. } => f.write_str("IndexUpdate")?,
Batch::IndexDeletion { .. } => f.write_str("IndexDeletion")?,
Batch::IndexSwap { .. } => f.write_str("IndexSwap")?,
-Batch::Export { .. } => f.write_str("Export")?,
-Batch::UpgradeDatabase { .. } => f.write_str("UpgradeDatabase")?,
};
match index_uid {
Some(name) => f.write_fmt(format_args!(" on {name:?} from tasks: {tasks:?}")),
@ -261,7 +248,7 @@ impl IndexScheduler {
_ => unreachable!(), _ => unreachable!(),
} }
} }
BatchKind::DocumentOperation { operation_ids, .. } => { BatchKind::DocumentOperation { method, operation_ids, .. } => {
let tasks = self.queue.get_existing_tasks_for_processing_batch( let tasks = self.queue.get_existing_tasks_for_processing_batch(
rtxn, rtxn,
current_batch, current_batch,
@ -283,17 +270,9 @@ impl IndexScheduler {
for task in tasks.iter() {
match task.kind {
-KindWithContent::DocumentAdditionOrUpdate {
-content_file, method, ..
-} => match method {
-IndexDocumentsMethod::ReplaceDocuments => {
-operations.push(DocumentOperation::Replace(content_file))
-}
-IndexDocumentsMethod::UpdateDocuments => {
-operations.push(DocumentOperation::Update(content_file))
-}
-_ => unreachable!("Unknown document merging method"),
-},
+KindWithContent::DocumentAdditionOrUpdate { content_file, .. } => {
+operations.push(DocumentOperation::Add(content_file));
+}
KindWithContent::DocumentDeletion { ref documents_ids, .. } => {
operations.push(DocumentOperation::Delete(documents_ids.clone()));
}
@ -305,6 +284,7 @@ impl IndexScheduler {
op: IndexOperation::DocumentOperation {
index_uid,
primary_key,
+method,
operations,
tasks,
},
@ -430,13 +410,11 @@ impl IndexScheduler {
}
/// Create the next batch to be processed;
-/// 0. We get the *last* task to cancel.
-/// 1. We get the tasks to upgrade.
-/// 2. We get the *next* task to delete.
-/// 3. We get the *next* export to process.
-/// 4. We get the *next* snapshot to process.
-/// 5. We get the *next* dump to process.
-/// 6. We get the *next* tasks to process for a specific index.
+/// 1. We get the *last* task to cancel.
+/// 2. We get the *next* task to delete.
+/// 3. We get the *next* snapshot to process.
+/// 4. We get the *next* dump to process.
+/// 5. We get the *next* tasks to process for a specific index.
#[tracing::instrument(level = "trace", skip(self, rtxn), target = "indexing::scheduler")]
pub(crate) fn create_next_batch(
&self,
@ -449,99 +427,42 @@ impl IndexScheduler {
let mut current_batch = ProcessingBatch::new(batch_id);
let enqueued = &self.queue.tasks.get_status(rtxn, Status::Enqueued)?;
-let count_total_enqueued = enqueued.len();
-let failed = &self.queue.tasks.get_status(rtxn, Status::Failed)?;
-// 0. we get the last task to cancel.
let to_cancel = self.queue.tasks.get_kind(rtxn, Kind::TaskCancelation)? & enqueued;
+// 1. we get the last task to cancel.
if let Some(task_id) = to_cancel.max() {
let mut task =
self.queue.tasks.get_task(rtxn, task_id)?.ok_or(Error::CorruptedTaskQueue)?;
current_batch.processing(Some(&mut task));
-current_batch.reason(BatchStopReason::TaskCannotBeBatched {
-kind: Kind::TaskCancelation,
-id: task_id,
-});
return Ok(Some((Batch::TaskCancelation { task }, current_batch)));
}
// 1. We upgrade the instance
// There shouldn't be multiple upgrade tasks but just in case we're going to batch all of them at the same time
let upgrade = self.queue.tasks.get_kind(rtxn, Kind::UpgradeDatabase)? & (enqueued | failed);
if !upgrade.is_empty() {
let mut tasks = self.queue.tasks.get_existing_tasks(rtxn, upgrade)?;
// In the case of an upgrade database batch, we want to find back the original batch that tried processing it
// and re-use its id
if let Some(batch_uid) = tasks.last().unwrap().batch_uid {
current_batch.uid = batch_uid;
}
current_batch.processing(&mut tasks);
current_batch
.reason(BatchStopReason::TaskKindCannotBeBatched { kind: Kind::UpgradeDatabase });
return Ok(Some((Batch::UpgradeDatabase { tasks }, current_batch)));
}
// check the version of the scheduler here.
// if the version is not the current, refuse to batch any additional task.
let version = self.version.get_version(rtxn)?;
let package_version = (
meilisearch_types::versioning::VERSION_MAJOR,
meilisearch_types::versioning::VERSION_MINOR,
meilisearch_types::versioning::VERSION_PATCH,
);
if version != Some(package_version) {
return Err(Error::UnrecoverableError(Box::new(
Error::IndexSchedulerVersionMismatch {
index_scheduler_version: version.unwrap_or((1, 12, 0)),
package_version,
},
)));
}
// 2. we get the next task to delete // 2. we get the next task to delete
let to_delete = self.queue.tasks.get_kind(rtxn, Kind::TaskDeletion)? & enqueued; let to_delete = self.queue.tasks.get_kind(rtxn, Kind::TaskDeletion)? & enqueued;
if !to_delete.is_empty() { if !to_delete.is_empty() {
let mut tasks = self.queue.tasks.get_existing_tasks(rtxn, to_delete)?; let mut tasks = self.queue.tasks.get_existing_tasks(rtxn, to_delete)?;
current_batch.processing(&mut tasks); current_batch.processing(&mut tasks);
current_batch
.reason(BatchStopReason::TaskKindCannotBeBatched { kind: Kind::TaskDeletion });
return Ok(Some((Batch::TaskDeletions(tasks), current_batch))); return Ok(Some((Batch::TaskDeletions(tasks), current_batch)));
} }
// 3. we batch the export. // 3. we batch the snapshot.
let to_export = self.queue.tasks.get_kind(rtxn, Kind::Export)? & enqueued;
if !to_export.is_empty() {
let task_id = to_export.iter().next().expect("There must be at least one export task");
let mut task = self.queue.tasks.get_task(rtxn, task_id)?.unwrap();
current_batch.processing([&mut task]);
current_batch.reason(BatchStopReason::TaskKindCannotBeBatched { kind: Kind::Export });
return Ok(Some((Batch::Export { task }, current_batch)));
}
// 4. we batch the snapshot.
let to_snapshot = self.queue.tasks.get_kind(rtxn, Kind::SnapshotCreation)? & enqueued; let to_snapshot = self.queue.tasks.get_kind(rtxn, Kind::SnapshotCreation)? & enqueued;
if !to_snapshot.is_empty() { if !to_snapshot.is_empty() {
let mut tasks = self.queue.tasks.get_existing_tasks(rtxn, to_snapshot)?; let mut tasks = self.queue.tasks.get_existing_tasks(rtxn, to_snapshot)?;
current_batch.processing(&mut tasks); current_batch.processing(&mut tasks);
current_batch
.reason(BatchStopReason::TaskKindCannotBeBatched { kind: Kind::SnapshotCreation });
return Ok(Some((Batch::SnapshotCreation(tasks), current_batch))); return Ok(Some((Batch::SnapshotCreation(tasks), current_batch)));
} }
// 5. we batch the dumps. // 4. we batch the dumps.
let to_dump = self.queue.tasks.get_kind(rtxn, Kind::DumpCreation)? & enqueued; let to_dump = self.queue.tasks.get_kind(rtxn, Kind::DumpCreation)? & enqueued;
if let Some(to_dump) = to_dump.min() { if let Some(to_dump) = to_dump.min() {
let mut task = let mut task =
self.queue.tasks.get_task(rtxn, to_dump)?.ok_or(Error::CorruptedTaskQueue)?; self.queue.tasks.get_task(rtxn, to_dump)?.ok_or(Error::CorruptedTaskQueue)?;
current_batch.processing(Some(&mut task)); current_batch.processing(Some(&mut task));
current_batch.reason(BatchStopReason::TaskCannotBeBatched {
kind: Kind::DumpCreation,
id: task.uid,
});
return Ok(Some((Batch::Dump(task), current_batch))); return Ok(Some((Batch::Dump(task), current_batch)));
} }
// 6. We make a batch from the unprioritised tasks. Start by taking the next enqueued task. // 5. We make a batch from the unprioritised tasks. Start by taking the next enqueued task.
let task_id = if let Some(task_id) = enqueued.min() { task_id } else { return Ok(None) }; let task_id = if let Some(task_id) = enqueued.min() { task_id } else { return Ok(None) };
let mut task = let mut task =
self.queue.tasks.get_task(rtxn, task_id)?.ok_or(Error::CorruptedTaskQueue)?; self.queue.tasks.get_task(rtxn, task_id)?.ok_or(Error::CorruptedTaskQueue)?;
@ -555,10 +476,6 @@ impl IndexScheduler {
} else { } else {
assert!(matches!(&task.kind, KindWithContent::IndexSwap { swaps } if swaps.is_empty())); assert!(matches!(&task.kind, KindWithContent::IndexSwap { swaps } if swaps.is_empty()));
current_batch.processing(Some(&mut task)); current_batch.processing(Some(&mut task));
current_batch.reason(BatchStopReason::TaskCannotBeBatched {
kind: Kind::IndexSwap,
id: task.uid,
});
return Ok(Some((Batch::IndexSwap { task }, current_batch))); return Ok(Some((Batch::IndexSwap { task }, current_batch)));
}; };
@ -580,14 +497,9 @@ impl IndexScheduler {
1 1
}; };
let mut stop_reason = BatchStopReason::default();
let mut enqueued = Vec::new(); let mut enqueued = Vec::new();
let mut total_size: u64 = 0; let mut total_size: u64 = 0;
for task_id in index_tasks.into_iter() { for task_id in index_tasks.into_iter().take(tasks_limit) {
if enqueued.len() >= tasks_limit {
stop_reason = BatchStopReason::ReachedTaskLimit { task_limit: tasks_limit };
break;
}
let task = self let task = self
.queue .queue
.tasks .tasks
@ -595,35 +507,20 @@ impl IndexScheduler {
.and_then(|task| task.ok_or(Error::CorruptedTaskQueue))?; .and_then(|task| task.ok_or(Error::CorruptedTaskQueue))?;
if let Some(uuid) = task.content_uuid() { if let Some(uuid) = task.content_uuid() {
let content_size = match self.queue.file_store.compute_size(uuid) { let content_size = self.queue.file_store.compute_size(uuid)?;
Ok(content_size) => content_size,
Err(file_store::Error::IoError(err)) if err.kind() == ErrorKind::NotFound => 0,
Err(otherwise) => return Err(otherwise.into()),
};
total_size = total_size.saturating_add(content_size); total_size = total_size.saturating_add(content_size);
} }
let size_limit = self.scheduler.batched_tasks_size_limit; if total_size > self.scheduler.batched_tasks_size_limit && !enqueued.is_empty() {
if total_size > size_limit && !enqueued.is_empty() {
stop_reason = BatchStopReason::ReachedSizeLimit { size_limit, size: total_size };
break; break;
} }
enqueued.push((task.uid, task.kind)); enqueued.push((task.uid, task.kind));
} }
stop_reason.replace_unspecified({ if let Some((batchkind, create_index)) =
if enqueued.len() == count_total_enqueued as usize {
BatchStopReason::ExhaustedEnqueuedTasks
} else {
BatchStopReason::ExhaustedEnqueuedTasksForIndex { index: index_name.to_owned() }
}
});
if let Some((batchkind, create_index, autobatch_stop_reason)) =
autobatcher::autobatch(enqueued, index_already_exists, primary_key.as_deref()) autobatcher::autobatch(enqueued, index_already_exists, primary_key.as_deref())
{ {
current_batch.reason(autobatch_stop_reason.unwrap_or(stop_reason));
return Ok(self return Ok(self
.create_next_batch_index( .create_next_batch_index(
rtxn, rtxn,
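Both sides of create_next_batch drain the queue in the fixed priority order its doc comment lists (cancelations, then deletions, snapshots, dumps, and finally per-index tasks; the removed side also slots in upgrades and exports). A minimal, self-contained sketch of that selection policy, using a hypothetical TaskKind enum rather than Meilisearch's actual types:

// Hypothetical, simplified task kinds ordered by scheduling priority.
#[derive(Debug, PartialEq, Clone, Copy)]
enum TaskKind {
    TaskCancelation,
    TaskDeletion,
    SnapshotCreation,
    DumpCreation,
    IndexUpdate,
}

// Pick the kind to batch next: scan the priority list and return the first
// kind with at least one enqueued task, mirroring the early returns above.
fn next_batch_kind(enqueued: &[TaskKind]) -> Option<TaskKind> {
    const PRIORITY: [TaskKind; 5] = [
        TaskKind::TaskCancelation,
        TaskKind::TaskDeletion,
        TaskKind::SnapshotCreation,
        TaskKind::DumpCreation,
        TaskKind::IndexUpdate,
    ];
    PRIORITY.into_iter().find(|kind| enqueued.contains(kind))
}

fn main() {
    let queue = [TaskKind::IndexUpdate, TaskKind::DumpCreation];
    // The dump wins even though the index update was enqueued first.
    assert_eq!(next_batch_kind(&queue), Some(TaskKind::DumpCreation));
}

Scanning a static priority list keeps administrative tasks from starving behind long indexing queues, which is why both sides of this diff keep the same overall shape.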

View File

@@ -4,10 +4,8 @@ mod autobatcher_test;
 mod create_batch;
 mod process_batch;
 mod process_dump_creation;
-mod process_export;
 mod process_index_operation;
 mod process_snapshot_creation;
-mod process_upgrade;
 #[cfg(test)]
 mod test;
 #[cfg(test)]
@@ -21,12 +19,9 @@ use std::path::PathBuf;
 use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
 use std::sync::Arc;

-use convert_case::{Case, Casing as _};
 use meilisearch_types::error::ResponseError;
-use meilisearch_types::heed::{Env, WithoutTls};
 use meilisearch_types::milli;
 use meilisearch_types::tasks::Status;
-use process_batch::ProcessBatchInfo;
 use rayon::current_num_threads;
 use rayon::iter::{IntoParallelIterator, ParallelIterator};
 use roaring::RoaringBitmap;
@@ -75,18 +70,10 @@ pub struct Scheduler {
     pub(crate) snapshots_path: PathBuf,

     /// The path to the folder containing the auth LMDB env.
-    pub(crate) auth_env: Env<WithoutTls>,
+    pub(crate) auth_path: PathBuf,

     /// The path to the version file of Meilisearch.
     pub(crate) version_file_path: PathBuf,
-
-    /// The maximal number of entries in the search query cache of an embedder.
-    ///
-    /// 0 disables the cache.
-    pub(crate) embedding_cache_cap: usize,
-
-    /// Snapshot compaction status.
-    pub(crate) experimental_no_snapshot_compaction: bool,
 }

 impl Scheduler {
@@ -99,14 +86,12 @@ impl Scheduler {
             batched_tasks_size_limit: self.batched_tasks_size_limit,
             dumps_path: self.dumps_path.clone(),
             snapshots_path: self.snapshots_path.clone(),
-            auth_env: self.auth_env.clone(),
+            auth_path: self.auth_path.clone(),
             version_file_path: self.version_file_path.clone(),
-            embedding_cache_cap: self.embedding_cache_cap,
-            experimental_no_snapshot_compaction: self.experimental_no_snapshot_compaction,
         }
     }

-    pub fn new(options: &IndexSchedulerOptions, auth_env: Env<WithoutTls>) -> Scheduler {
+    pub fn new(options: &IndexSchedulerOptions) -> Scheduler {
         Scheduler {
             must_stop_processing: MustStopProcessing::default(),
             // we want to start the loop right away in case meilisearch was ctrl+Ced while processing things
@@ -116,10 +101,8 @@ impl Scheduler {
             batched_tasks_size_limit: options.batched_tasks_size_limit,
             dumps_path: options.dumps_path.clone(),
             snapshots_path: options.snapshots_path.clone(),
-            auth_env,
+            auth_path: options.auth_path.clone(),
             version_file_path: options.version_file_path.clone(),
-            embedding_cache_cap: options.embedding_cache_cap,
-            experimental_no_snapshot_compaction: options.experimental_no_snapshot_compaction,
         }
     }
 }
@@ -182,41 +165,13 @@ impl IndexScheduler {
         let processing_batch = &mut processing_batch;
         let progress = progress.clone();
         std::thread::scope(|s| {
-            let p = progress.clone();
             let handle = std::thread::Builder::new()
                 .name(String::from("batch-operation"))
                 .spawn_scoped(s, move || {
-                    cloned_index_scheduler.process_batch(batch, processing_batch, p)
+                    cloned_index_scheduler.process_batch(batch, processing_batch, progress)
                 })
                 .unwrap();

-            match handle.join() {
-                Ok(ret) => {
-                    if ret.is_err() {
-                        if let Ok(progress_view) =
-                            serde_json::to_string(&progress.as_progress_view())
-                        {
-                            tracing::warn!("Batch failed while doing: {progress_view}")
-                        }
-                    }
-                    ret
-                }
-                Err(panic) => {
-                    if let Ok(progress_view) =
-                        serde_json::to_string(&progress.as_progress_view())
-                    {
-                        tracing::warn!("Batch failed while doing: {progress_view}")
-                    }
-                    let msg = match panic.downcast_ref::<&'static str>() {
-                        Some(s) => *s,
-                        None => match panic.downcast_ref::<String>() {
-                            Some(s) => &s[..],
-                            None => "Box<dyn Any>",
-                        },
-                    };
-                    Err(Error::ProcessBatchPanicked(msg.to_string()))
-                }
-            }
+            handle.join().unwrap_or(Err(Error::ProcessBatchPanicked))
         })
     };
@@ -228,19 +183,16 @@ impl IndexScheduler {
         progress.update_progress(BatchProgress::WritingTasksToDisk);
         processing_batch.finished();
-        let mut stop_scheduler_forever = false;
         let mut wtxn = self.env.write_txn().map_err(Error::HeedTransaction)?;
         let mut canceled = RoaringBitmap::new();
-        let mut process_batch_info = ProcessBatchInfo::default();

         match res {
-            Ok((tasks, info)) => {
+            Ok(tasks) => {
                 #[cfg(test)]
                 self.breakpoint(crate::test_utils::Breakpoint::ProcessBatchSucceeded);

                 let (task_progress, task_progress_obj) = AtomicTaskStep::new(tasks.len() as u32);
                 progress.update_progress(task_progress_obj);
-                process_batch_info = info;
                 let mut success = 0;
                 let mut failure = 0;
                 let mut canceled_by = None;
@@ -269,7 +221,7 @@ impl IndexScheduler {
                     self.queue
                         .tasks
                         .update_task(&mut wtxn, &task)
-                        .map_err(|e| Error::UnrecoverableError(Box::new(e)))?;
+                        .map_err(|e| Error::TaskDatabaseUpdate(Box::new(e)))?;
                 }
                 if let Some(canceled_by) = canceled_by {
                     self.queue.tasks.canceled_by.put(&mut wtxn, &canceled_by, &canceled)?;
@@ -320,12 +272,6 @@ impl IndexScheduler {
                 let (task_progress, task_progress_obj) = AtomicTaskStep::new(ids.len() as u32);
                 progress.update_progress(task_progress_obj);

-                if matches!(err, Error::DatabaseUpgrade(_)) {
-                    tracing::error!(
-                        "Upgrade task failed, tasks won't be processed until the following issue is fixed: {err}"
-                    );
-                    stop_scheduler_forever = true;
-                }
                 let error: ResponseError = err.into();
                 for id in ids.iter() {
                     task_progress.fetch_add(1, Ordering::Relaxed);
@@ -333,7 +279,7 @@ impl IndexScheduler {
                         .queue
                         .tasks
                         .get_task(&wtxn, id)
-                        .map_err(|e| Error::UnrecoverableError(Box::new(e)))?
+                        .map_err(|e| Error::TaskDatabaseUpdate(Box::new(e)))?
                         .ok_or(Error::CorruptedTaskQueue)?;
                     task.status = Status::Failed;
                     task.error = Some(error.clone());
@@ -350,67 +296,13 @@ impl IndexScheduler {
                     self.queue
                         .tasks
                         .update_task(&mut wtxn, &task)
-                        .map_err(|e| Error::UnrecoverableError(Box::new(e)))?;
+                        .map_err(|e| Error::TaskDatabaseUpdate(Box::new(e)))?;
                 }
             }
         }

         // We must re-add the canceled task so they're part of the same batch.
         ids |= canceled;

-        let ProcessBatchInfo { congestion, pre_commit_dabases_sizes, post_commit_dabases_sizes } =
-            process_batch_info;
-
-        processing_batch.stats.progress_trace =
-            progress.accumulated_durations().into_iter().map(|(k, v)| (k, v.into())).collect();
-        processing_batch.stats.write_channel_congestion = congestion.map(|congestion| {
-            let mut congestion_info = serde_json::Map::new();
-            congestion_info.insert("attempts".into(), congestion.attempts.into());
-            congestion_info.insert("blocking_attempts".into(), congestion.blocking_attempts.into());
-            congestion_info.insert("blocking_ratio".into(), congestion.congestion_ratio().into());
-            congestion_info
-        });
-        processing_batch.stats.internal_database_sizes = pre_commit_dabases_sizes
-            .iter()
-            .flat_map(|(dbname, pre_size)| {
-                post_commit_dabases_sizes
-                    .get(dbname)
-                    .map(|post_size| {
-                        use std::cmp::Ordering::{Equal, Greater, Less};
-                        use byte_unit::Byte;
-                        use byte_unit::UnitType::Binary;
-
-                        let post = Byte::from_u64(*post_size as u64).get_appropriate_unit(Binary);
-                        let diff_size = post_size.abs_diff(*pre_size) as u64;
-                        let diff = Byte::from_u64(diff_size).get_appropriate_unit(Binary);
-                        let sign = match post_size.cmp(pre_size) {
-                            Equal => return None,
-                            Greater => "+",
-                            Less => "-",
-                        };
-
-                        Some((
-                            dbname.to_case(Case::Camel),
-                            format!("{post:#.2} ({sign}{diff:#.2})").into(),
-                        ))
-                    })
-                    .into_iter()
-                    .flatten()
-            })
-            .collect();
-
-        if let Some(congestion) = congestion {
-            tracing::debug!(
-                "Channel congestion metrics - Attempts: {}, Blocked attempts: {} ({:.1}% congestion)",
-                congestion.attempts,
-                congestion.blocking_attempts,
-                congestion.congestion_ratio(),
-            );
-        }
-
-        tracing::debug!("call trace: {:?}", progress.accumulated_durations());
-
         self.queue.write_batch(&mut wtxn, processing_batch, &ids)?;

         #[cfg(test)]
@@ -434,7 +326,7 @@ impl IndexScheduler {
                 .queue
                 .tasks
                 .get_task(&rtxn, id)
-                .map_err(|e| Error::UnrecoverableError(Box::new(e)))?
+                .map_err(|e| Error::TaskDatabaseUpdate(Box::new(e)))?
                 .ok_or(Error::CorruptedTaskQueue)?;
             if let Err(e) = self.queue.delete_persisted_task_data(&task) {
                 tracing::error!(
@@ -452,10 +344,6 @@ impl IndexScheduler {
         #[cfg(test)]
         self.breakpoint(crate::test_utils::Breakpoint::AfterProcessing);

-        if stop_scheduler_forever {
-            Ok(TickOutcome::StopProcessingForever)
-        } else {
-            Ok(TickOutcome::TickAgain(processed_tasks))
-        }
+        Ok(TickOutcome::TickAgain(processed_tasks))
    }
 }
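The removed branch above recovers a human-readable message from a panicking batch thread by downcasting the payload returned from JoinHandle::join, which is how Error::ProcessBatchPanicked carries a message argument on that side. A standalone sketch of that std-only pattern (the panic_message helper is illustrative, not part of the scheduler):

use std::any::Any;

// A panic payload is a `Box<dyn Any + Send>`; panics raised with `panic!`
// carry either a `&'static str` or a formatted `String`.
fn panic_message(panic: Box<dyn Any + Send>) -> String {
    match panic.downcast_ref::<&'static str>() {
        Some(s) => (*s).to_string(),
        None => match panic.downcast_ref::<String>() {
            Some(s) => s.clone(),
            None => "Box<dyn Any>".to_string(),
        },
    }
}

fn main() {
    let handle: std::thread::JoinHandle<()> =
        std::thread::spawn(|| panic!("index scheduler went boom"));
    if let Err(panic) = handle.join() {
        // Prints: worker panicked: index scheduler went boom
        eprintln!("worker panicked: {}", panic_message(panic));
    }
}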

View File

@@ -1,38 +1,23 @@
 use std::collections::{BTreeSet, HashMap, HashSet};
-use std::panic::{catch_unwind, AssertUnwindSafe};
 use std::sync::atomic::Ordering;

-use meilisearch_types::batches::{BatchEnqueuedAt, BatchId};
+use meilisearch_types::batches::BatchId;
 use meilisearch_types::heed::{RoTxn, RwTxn};
-use meilisearch_types::milli::progress::{Progress, VariableNameStep};
-use meilisearch_types::milli::{self, ChannelCongestion};
-use meilisearch_types::tasks::{Details, IndexSwap, Kind, KindWithContent, Status, Task};
-use meilisearch_types::versioning::{VERSION_MAJOR, VERSION_MINOR, VERSION_PATCH};
+use meilisearch_types::milli::progress::Progress;
+use meilisearch_types::milli::{self};
+use meilisearch_types::tasks::{Details, IndexSwap, KindWithContent, Status, Task};
 use milli::update::Settings as MilliSettings;
 use roaring::RoaringBitmap;

 use super::create_batch::Batch;
 use crate::processing::{
-    AtomicBatchStep, AtomicTaskStep, CreateIndexProgress, DeleteIndexProgress, FinalizingIndexStep,
+    AtomicBatchStep, AtomicTaskStep, CreateIndexProgress, DeleteIndexProgress,
     InnerSwappingTwoIndexes, SwappingTheIndexes, TaskCancelationProgress, TaskDeletionProgress,
-    UpdateIndexProgress,
-};
-use crate::utils::{
-    self, remove_n_tasks_datetime_earlier_than, remove_task_datetime, swap_index_uid_in_task,
-    ProcessingBatch,
+    UpdateIndexProgress, VariableNameStep,
 };
+use crate::utils::{self, swap_index_uid_in_task, ProcessingBatch};
 use crate::{Error, IndexScheduler, Result, TaskId};

-#[derive(Debug, Default)]
-pub struct ProcessBatchInfo {
-    /// The write channel congestion. None when unavailable: settings update.
-    pub congestion: Option<ChannelCongestion>,
-    /// The sizes of the different databases before starting the indexation.
-    pub pre_commit_dabases_sizes: indexmap::IndexMap<&'static str, usize>,
-    /// The sizes of the different databases after commiting the indexation.
-    pub post_commit_dabases_sizes: indexmap::IndexMap<&'static str, usize>,
-}
-
 impl IndexScheduler {
     /// Apply the operation associated with the given batch.
     ///
@@ -46,7 +31,7 @@ impl IndexScheduler {
         batch: Batch,
         current_batch: &mut ProcessingBatch,
         progress: Progress,
-    ) -> Result<(Vec<Task>, ProcessBatchInfo)> {
+    ) -> Result<Vec<Task>> {
         #[cfg(test)]
         {
             self.maybe_fail(crate::test_utils::FailureLocation::InsideProcessBatch)?;
@@ -87,7 +72,7 @@ impl IndexScheduler {

                 canceled_tasks.push(task);

-                Ok((canceled_tasks, ProcessBatchInfo::default()))
+                Ok(canceled_tasks)
             }
             Batch::TaskDeletions(mut tasks) => {
                 // 1. Retrieve the tasks that matched the query at enqueue-time.
@@ -126,14 +111,10 @@ impl IndexScheduler {
                         _ => unreachable!(),
                     }
                 }
-                Ok((tasks, ProcessBatchInfo::default()))
+                Ok(tasks)
             }
-            Batch::SnapshotCreation(tasks) => self
-                .process_snapshot(progress, tasks)
-                .map(|tasks| (tasks, ProcessBatchInfo::default())),
-            Batch::Dump(task) => self
-                .process_dump_creation(progress, task)
-                .map(|tasks| (tasks, ProcessBatchInfo::default())),
+            Batch::SnapshotCreation(tasks) => self.process_snapshot(progress, tasks),
+            Batch::Dump(task) => self.process_dump_creation(progress, task),
             Batch::IndexOperation { op, must_create_index } => {
                 let index_uid = op.index_uid().to_string();
                 let index = if must_create_index {
@@ -145,33 +126,14 @@ impl IndexScheduler {
                     self.index_mapper.index(&rtxn, &index_uid)?
                 };

-                let mut index_wtxn = index.write_txn()?;
-
-                let index_version = index.get_version(&index_wtxn)?.unwrap_or((1, 12, 0));
-                let package_version = (VERSION_MAJOR, VERSION_MINOR, VERSION_PATCH);
-                if index_version != package_version {
-                    return Err(Error::IndexVersionMismatch {
-                        index: index_uid,
-                        index_version,
-                        package_version,
-                    });
-                }
-
                 // the index operation can take a long time, so save this handle to make it available to the search for the duration of the tick
                 self.index_mapper
                     .set_currently_updating_index(Some((index_uid.clone(), index.clone())));

-                let pre_commit_dabases_sizes = index.database_sizes(&index_wtxn)?;
-                let (tasks, congestion) = self.apply_index_operation(
-                    &mut index_wtxn,
-                    &index,
-                    op,
-                    &progress,
-                    current_batch.embedder_stats.clone(),
-                )?;
+                let mut index_wtxn = index.write_txn()?;
+                let tasks = self.apply_index_operation(&mut index_wtxn, &index, op, progress)?;

                 {
-                    progress.update_progress(FinalizingIndexStep::Committing);
                     let span = tracing::trace_span!(target: "indexing::scheduler", "commit");
                     let _entered = span.enter();
@@ -182,15 +144,12 @@ impl IndexScheduler {
                 // stats of the index. Since the tasks have already been processed and
                 // this is a non-critical operation. If it fails, we should not fail
                 // the entire batch.
-                let mut post_commit_dabases_sizes = None;
                 let res = || -> Result<()> {
-                    progress.update_progress(FinalizingIndexStep::ComputingStats);
                     let index_rtxn = index.read_txn()?;
                     let stats = crate::index_mapper::IndexStats::new(&index, &index_rtxn)
                         .map_err(|e| Error::from_milli(e, Some(index_uid.to_string())))?;
                     let mut wtxn = self.env.write_txn()?;
                     self.index_mapper.store_stats_of(&mut wtxn, &index_uid, &stats)?;
-                    post_commit_dabases_sizes = Some(index.database_sizes(&index_rtxn)?);
                     wtxn.commit()?;
                     Ok(())
                 }();
@@ -203,16 +162,7 @@ impl IndexScheduler {
                     ),
                 }

-                let info = ProcessBatchInfo {
-                    congestion,
-                    // In case we fail to the get post-commit sizes we decide
-                    // that nothing changed and use the pre-commit sizes.
-                    post_commit_dabases_sizes: post_commit_dabases_sizes
-                        .unwrap_or_else(|| pre_commit_dabases_sizes.clone()),
-                    pre_commit_dabases_sizes,
-                };
-
-                Ok((tasks, info))
+                Ok(tasks)
             }
             Batch::IndexCreation { index_uid, primary_key, task } => {
                 progress.update_progress(CreateIndexProgress::CreatingTheIndex);
@@ -243,12 +193,10 @@ impl IndexScheduler {
                 );
                 builder.set_primary_key(primary_key);
                 let must_stop_processing = self.scheduler.must_stop_processing.clone();

                 builder
                     .execute(
-                        &|| must_stop_processing.get(),
-                        &progress,
-                        current_batch.embedder_stats.clone(),
+                        |indexing_step| tracing::debug!(update = ?indexing_step),
+                        || must_stop_processing.get(),
                     )
                     .map_err(|e| Error::from_milli(e, Some(index_uid.to_string())))?;
                 index_wtxn.commit()?;
@@ -282,7 +230,7 @@ impl IndexScheduler {
                     ),
                 }

-                Ok((vec![task], ProcessBatchInfo::default()))
+                Ok(vec![task])
             }
             Batch::IndexDeletion { index_uid, index_has_been_created, mut tasks } => {
                 progress.update_progress(DeleteIndexProgress::DeletingTheIndex);
@@ -316,9 +264,7 @@ impl IndexScheduler {
                     };
                 }

-                // Here we could also show that all the internal database sizes goes to 0
-                // but it would mean opening the index and that's costly.
-                Ok((tasks, ProcessBatchInfo::default()))
+                Ok(tasks)
             }
             Batch::IndexSwap { mut task } => {
                 progress.update_progress(SwappingTheIndexes::EnsuringCorrectnessOfTheSwap);
@@ -351,7 +297,7 @@ impl IndexScheduler {
                 }
                 progress.update_progress(SwappingTheIndexes::SwappingTheIndexes);
                 for (step, swap) in swaps.iter().enumerate() {
-                    progress.update_progress(VariableNameStep::<SwappingTheIndexes>::new(
+                    progress.update_progress(VariableNameStep::new(
                         format!("swapping index {} and {}", swap.indexes.0, swap.indexes.1),
                         step as u32,
                         swaps.len() as u32,
@@ -366,79 +312,7 @@ impl IndexScheduler {
                 }
                 wtxn.commit()?;
                 task.status = Status::Succeeded;
-                Ok((vec![task], ProcessBatchInfo::default()))
-            }
-            Batch::Export { mut task } => {
-                let KindWithContent::Export { url, api_key, payload_size, indexes } = &task.kind
-                else {
-                    unreachable!()
-                };
-                let ret = catch_unwind(AssertUnwindSafe(|| {
-                    self.process_export(
-                        url,
-                        api_key.as_deref(),
-                        payload_size.as_ref(),
-                        indexes,
-                        progress,
-                    )
-                }));
-                let stats = match ret {
-                    Ok(Ok(stats)) => stats,
-                    Ok(Err(Error::AbortedTask)) => return Err(Error::AbortedTask),
-                    Ok(Err(e)) => return Err(Error::Export(Box::new(e))),
-                    Err(e) => {
-                        let msg = match e.downcast_ref::<&'static str>() {
-                            Some(s) => *s,
-                            None => match e.downcast_ref::<String>() {
-                                Some(s) => &s[..],
-                                None => "Box<dyn Any>",
-                            },
-                        };
-                        return Err(Error::Export(Box::new(Error::ProcessBatchPanicked(
-                            msg.to_string(),
-                        ))));
-                    }
-                };
-                task.status = Status::Succeeded;
-                if let Some(Details::Export { indexes, .. }) = task.details.as_mut() {
-                    *indexes = stats;
-                }
-                Ok((vec![task], ProcessBatchInfo::default()))
-            }
-            Batch::UpgradeDatabase { mut tasks } => {
-                let KindWithContent::UpgradeDatabase { from } = tasks.last().unwrap().kind else {
-                    unreachable!();
-                };
-                let ret = catch_unwind(AssertUnwindSafe(|| self.process_upgrade(from, progress)));
-                match ret {
-                    Ok(Ok(())) => (),
-                    Ok(Err(Error::AbortedTask)) => return Err(Error::AbortedTask),
-                    Ok(Err(e)) => return Err(Error::DatabaseUpgrade(Box::new(e))),
-                    Err(e) => {
-                        let msg = match e.downcast_ref::<&'static str>() {
-                            Some(s) => *s,
-                            None => match e.downcast_ref::<String>() {
-                                Some(s) => &s[..],
-                                None => "Box<dyn Any>",
-                            },
-                        };
-                        return Err(Error::DatabaseUpgrade(Box::new(Error::ProcessBatchPanicked(
-                            msg.to_string(),
-                        ))));
-                    }
-                }
-                for task in tasks.iter_mut() {
-                    task.status = Status::Succeeded;
-                    // Since this task can be retried we must reset its error status
-                    task.error = None;
-                }
-                Ok((tasks, ProcessBatchInfo::default()))
+                Ok(vec![task])
             }
         }
     }
@@ -522,6 +396,7 @@ impl IndexScheduler {
         to_delete_tasks -= &enqueued_tasks;

         // 2. We now have a list of tasks to delete, delete them
+
         let mut affected_indexes = HashSet::new();
         let mut affected_statuses = HashSet::new();
         let mut affected_kinds = HashSet::new();
@@ -618,51 +493,9 @@ impl IndexScheduler {
             tasks -= &to_delete_tasks;

             // We must remove the batch entirely
             if tasks.is_empty() {
-                if let Some(batch) = self.queue.batches.get_batch(wtxn, batch_id)? {
-                    if let Some(BatchEnqueuedAt { earliest, oldest }) = batch.enqueued_at {
-                        remove_task_datetime(
-                            wtxn,
-                            self.queue.batches.enqueued_at,
-                            earliest,
-                            batch_id,
-                        )?;
-                        remove_task_datetime(
-                            wtxn,
-                            self.queue.batches.enqueued_at,
-                            oldest,
-                            batch_id,
-                        )?;
-                    } else {
-                        // If we don't have the enqueued at in the batch it means the database comes from the v1.12
-                        // and we still need to find the date by scrolling the database
-                        remove_n_tasks_datetime_earlier_than(
-                            wtxn,
-                            self.queue.batches.enqueued_at,
-                            batch.started_at,
-                            batch.stats.total_nb_tasks.clamp(1, 2) as usize,
-                            batch_id,
-                        )?;
-                    }
-                    remove_task_datetime(
-                        wtxn,
-                        self.queue.batches.started_at,
-                        batch.started_at,
-                        batch_id,
-                    )?;
-                    if let Some(finished_at) = batch.finished_at {
-                        remove_task_datetime(
-                            wtxn,
-                            self.queue.batches.finished_at,
-                            finished_at,
-                            batch_id,
-                        )?;
-                    }
-
-                    self.queue.batches.all_batches.delete(wtxn, &batch_id)?;
-                    self.queue.batch_to_tasks_mapping.delete(wtxn, &batch_id)?;
-                }
+                self.queue.batches.all_batches.delete(wtxn, &batch_id)?;
+                self.queue.batch_to_tasks_mapping.delete(wtxn, &batch_id)?;
             }

             // Anyway, we must remove the batch from all its reverse indexes.
             // The only way to do that is to check
@@ -714,81 +547,17 @@ impl IndexScheduler {
         progress: &Progress,
     ) -> Result<Vec<Task>> {
         progress.update_progress(TaskCancelationProgress::RetrievingTasks);
-        let mut tasks_to_cancel = RoaringBitmap::new();
-
-        let enqueued_tasks = &self.queue.tasks.get_status(rtxn, Status::Enqueued)?;
-
-        // 0. Check if any upgrade task was matched.
-        // If so, we cancel all the failed or enqueued upgrade tasks.
-        let upgrade_tasks = &self.queue.tasks.get_kind(rtxn, Kind::UpgradeDatabase)?;
-        let is_canceling_upgrade = !matched_tasks.is_disjoint(upgrade_tasks);
-        if is_canceling_upgrade {
-            let failed_tasks = self.queue.tasks.get_status(rtxn, Status::Failed)?;
-            tasks_to_cancel |= upgrade_tasks & (enqueued_tasks | failed_tasks);
-        }

         // 1. Remove from this list the tasks that we are not allowed to cancel
         //    Notice that only the _enqueued_ ones are cancelable and we should
         //    have already aborted the indexation of the _processing_ ones
-        tasks_to_cancel |= enqueued_tasks & matched_tasks;
+        let cancelable_tasks = self.queue.tasks.get_status(rtxn, Status::Enqueued)?;
+        let tasks_to_cancel = cancelable_tasks & matched_tasks;

-        // 2. If we're canceling an upgrade, attempt the rollback
-        if let Some(latest_upgrade_task) = (&tasks_to_cancel & upgrade_tasks).max() {
-            progress.update_progress(TaskCancelationProgress::CancelingUpgrade);
-
-            let task = self.queue.tasks.get_task(rtxn, latest_upgrade_task)?.unwrap();
-            let Some(Details::UpgradeDatabase { from, to }) = task.details else {
-                unreachable!("wrong details for upgrade task {latest_upgrade_task}")
-            };
-
-            // check that we are rollbacking an upgrade to the current Meilisearch
-            let bin_major: u32 = meilisearch_types::versioning::VERSION_MAJOR;
-            let bin_minor: u32 = meilisearch_types::versioning::VERSION_MINOR;
-            let bin_patch: u32 = meilisearch_types::versioning::VERSION_PATCH;
-
-            if to == (bin_major, bin_minor, bin_patch) {
-                tracing::warn!(
-                    "Rollbacking from v{}.{}.{} to v{}.{}.{}",
-                    to.0,
-                    to.1,
-                    to.2,
-                    from.0,
-                    from.1,
-                    from.2
-                );
-                let ret = catch_unwind(std::panic::AssertUnwindSafe(|| {
-                    self.process_rollback(from, progress)
-                }));
-
-                match ret {
-                    Ok(Ok(())) => {}
-                    Ok(Err(err)) => return Err(Error::DatabaseUpgrade(Box::new(err))),
-                    Err(e) => {
-                        let msg = match e.downcast_ref::<&'static str>() {
-                            Some(s) => *s,
-                            None => match e.downcast_ref::<String>() {
-                                Some(s) => &s[..],
-                                None => "Box<dyn Any>",
-                            },
-                        };
-                        return Err(Error::DatabaseUpgrade(Box::new(Error::ProcessBatchPanicked(
-                            msg.to_string(),
-                        ))));
-                    }
-                }
-            } else {
-                tracing::debug!(
-                    "Not rollbacking an upgrade targetting the earlier version v{}.{}.{}",
-                    bin_major,
-                    bin_minor,
-                    bin_patch
-                )
-            }
-        }
-
-        // 3. We now have a list of tasks to cancel, cancel them
         let (task_progress, progress_obj) = AtomicTaskStep::new(tasks_to_cancel.len() as u32);
         progress.update_progress(progress_obj);

+        // 2. We now have a list of tasks to cancel, cancel them
         let mut tasks = self.queue.tasks.get_existing_tasks(
             rtxn,
             tasks_to_cancel.iter().inspect(|_| {
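On the added side, picking the tasks to cancel reduces to one bitmap intersection: only ids that are both enqueued and matched survive. A minimal sketch with made-up task ids, using the same roaring crate:

use roaring::RoaringBitmap;

fn main() {
    // Made-up task ids: everything still enqueued, and everything matched by
    // the cancelation query (id 42 already finished, so it is not enqueued).
    let enqueued: RoaringBitmap = (0u32..10).collect();
    let matched: RoaringBitmap = [3, 7, 42].into_iter().collect();

    // Same shape as `cancelable_tasks & matched_tasks` above: only tasks that
    // are both enqueued and matched may still be canceled.
    let to_cancel = &enqueued & &matched;

    assert!(to_cancel.contains(3) && to_cancel.contains(7));
    assert!(!to_cancel.contains(42));
}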

View File

@@ -1,12 +1,11 @@
-use std::collections::BTreeMap;
 use std::fs::File;
 use std::io::BufWriter;
 use std::sync::atomic::Ordering;

 use dump::IndexMetadata;
 use meilisearch_types::milli::constants::RESERVED_VECTORS_FIELD_NAME;
-use meilisearch_types::milli::index::EmbeddingsWithMetadata;
-use meilisearch_types::milli::progress::{Progress, VariableNameStep};
+use meilisearch_types::milli::documents::{obkv_to_object, DocumentsBatchReader};
+use meilisearch_types::milli::progress::Progress;
 use meilisearch_types::milli::vector::parsed_vectors::{ExplicitVectors, VectorOrArrayOfVectors};
 use meilisearch_types::milli::{self};
 use meilisearch_types::tasks::{Details, KindWithContent, Status, Task};
@@ -14,7 +13,7 @@ use time::macros::format_description;
 use time::OffsetDateTime;

 use crate::processing::{
-    AtomicBatchStep, AtomicDocumentStep, AtomicTaskStep, DumpCreationProgress,
+    AtomicDocumentStep, AtomicTaskStep, DumpCreationProgress, VariableNameStep,
 };
 use crate::{Error, IndexScheduler, Result};
@@ -44,16 +43,7 @@ impl IndexScheduler {
         let rtxn = self.env.read_txn()?;

-        // 2. dump the chat completion settings
-        // TODO should I skip the export if the chat completion has been disabled?
-        progress.update_progress(DumpCreationProgress::DumpTheChatCompletionSettings);
-        let mut dump_chat_completion_settings = dump.create_chat_completions_settings()?;
-        for result in self.chat_settings.iter(&rtxn)? {
-            let (name, chat_settings) = result?;
-            dump_chat_completion_settings.push_settings(name, &chat_settings)?;
-        }
-
-        // 3. dump the tasks
+        // 2. dump the tasks
         progress.update_progress(DumpCreationProgress::DumpTheTasks);
         let mut dump_tasks = dump.create_tasks_queue()?;
@@ -82,16 +72,9 @@ impl IndexScheduler {
                 t.started_at = Some(started_at);
                 t.finished_at = Some(finished_at);
             }

-            // Patch the task to remove the batch uid, because as of v1.12.5 batches are not persisted.
-            // This prevent from referencing *future* batches not actually associated with the task.
-            //
-            // See <https://github.com/meilisearch/meilisearch/issues/5247> for details.
-            t.batch_uid = None;
-
             let mut dump_content_file = dump_tasks.push_task(&t.into())?;

-            // 3.1. Dump the `content_file` associated with the task if there is one and the task is not finished yet.
+            // 2.1. Dump the `content_file` associated with the task if there is one and the task is not finished yet.
             if let Some(content_file) = content_file {
                 if self.scheduler.must_stop_processing.get() {
                     return Err(Error::AbortedTask);
@@ -99,15 +82,19 @@ impl IndexScheduler {
                 if status == Status::Enqueued {
                     let content_file = self.queue.file_store.get_update(content_file)?;

-                    for document in
-                        serde_json::de::Deserializer::from_reader(content_file).into_iter()
-                    {
-                        let document = document.map_err(|e| {
-                            Error::from_milli(milli::InternalError::SerdeJson(e).into(), None)
-                        })?;
-                        dump_content_file.push_document(&document)?;
-                    }
+                    let reader = DocumentsBatchReader::from_reader(content_file)
+                        .map_err(|e| Error::from_milli(e.into(), None))?;
+
+                    let (mut cursor, documents_batch_index) = reader.into_cursor_and_fields_index();
+                    while let Some(doc) =
+                        cursor.next_document().map_err(|e| Error::from_milli(e.into(), None))?
+                    {
+                        dump_content_file.push_document(
+                            &obkv_to_object(doc, &documents_batch_index)
+                                .map_err(|e| Error::from_milli(e, None))?,
+                        )?;
+                    }

                     dump_content_file.flush()?;
                 }
             }
@@ -115,49 +102,12 @@ impl IndexScheduler {
         }
         dump_tasks.flush()?;

-        // 4. dump the batches
-        progress.update_progress(DumpCreationProgress::DumpTheBatches);
-        let mut dump_batches = dump.create_batches_queue()?;
-
-        let (atomic_batch_progress, update_batch_progress) =
-            AtomicBatchStep::new(self.queue.batches.all_batches.len(&rtxn)? as u32);
-        progress.update_progress(update_batch_progress);
-
-        for ret in self.queue.batches.all_batches.iter(&rtxn)? {
-            if self.scheduler.must_stop_processing.get() {
-                return Err(Error::AbortedTask);
-            }
-
-            let (_, mut b) = ret?;
-            // In the case we're dumping ourselves we want to be marked as finished
-            // to not loop over ourselves indefinitely.
-            if b.uid == task.uid {
-                let finished_at = OffsetDateTime::now_utc();
-
-                // We're going to fake the date because we don't know if everything is going to go well.
-                // But we need to dump the task as finished and successful.
-                // If something fail everything will be set appropriately in the end.
-                let mut statuses = BTreeMap::new();
-                statuses.insert(Status::Succeeded, b.stats.total_nb_tasks);
-                b.stats.status = statuses;
-                b.finished_at = Some(finished_at);
-            }
-
-            dump_batches.push_batch(&b)?;
-            atomic_batch_progress.fetch_add(1, Ordering::Relaxed);
-        }
-        dump_batches.flush()?;
-
-        // 5. Dump the indexes
+        // 3. Dump the indexes
         progress.update_progress(DumpCreationProgress::DumpTheIndexes);
         let nb_indexes = self.index_mapper.index_mapping.len(&rtxn)? as u32;
         let mut count = 0;
         let () = self.index_mapper.try_for_each_index(&rtxn, |uid, index| -> Result<()> {
-            progress.update_progress(VariableNameStep::<DumpCreationProgress>::new(
-                uid.to_string(),
-                count,
-                nb_indexes,
-            ));
+            progress.update_progress(VariableNameStep::new(uid.to_string(), count, nb_indexes));
             count += 1;

             let rtxn = index.read_txn()?;
@@ -175,6 +125,10 @@ impl IndexScheduler {
             let fields_ids_map = index.fields_ids_map(&rtxn)?;
             let all_fields: Vec<_> = fields_ids_map.iter().map(|(id, _)| id).collect();

+            let embedding_configs = index
+                .embedding_configs(&rtxn)
+                .map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
+            let decompression_dictionary = index.document_decompression_dictionary(&rtxn)?;

             let nb_documents = index
                 .number_of_documents(&rtxn)
@@ -182,16 +136,21 @@ impl IndexScheduler {
                 as u32;
             let (atomic, update_document_progress) = AtomicDocumentStep::new(nb_documents);
             progress.update_progress(update_document_progress);
+            let doc_alloc = bumpalo::Bump::new();
             let documents = index
-                .all_documents(&rtxn)
+                .all_compressed_documents(&rtxn)
                 .map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;

-            // 5.1. Dump the documents
+            // 3.1. Dump the documents
             for ret in documents {
                 if self.scheduler.must_stop_processing.get() {
                     return Err(Error::AbortedTask);
                 }

                 let (id, doc) = ret.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
+                let doc = match decompression_dictionary.as_ref() {
+                    Some(dict) => doc.decompress_into_bump(&doc_alloc, dict)?,
+                    None => doc.as_non_compressed(),
+                };

                 let mut document = milli::obkv_to_json(&all_fields, &fields_ids_map, doc)
                     .map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
@@ -228,21 +187,16 @@ impl IndexScheduler {
                     return Err(Error::from_milli(user_err, Some(uid.to_string())));
                 };

-                for (
-                    embedder_name,
-                    EmbeddingsWithMetadata { embeddings, regenerate, has_fragments },
-                ) in embeddings
-                {
+                for (embedder_name, embeddings) in embeddings {
+                    let user_provided = embedding_configs
+                        .iter()
+                        .find(|conf| conf.name == embedder_name)
+                        .is_some_and(|conf| conf.user_provided.contains(id));
                     let embeddings = ExplicitVectors {
                         embeddings: Some(VectorOrArrayOfVectors::from_array_of_vectors(
                             embeddings,
                         )),
-                        regenerate: regenerate &&
-                            // Meilisearch does not handle well dumps with fragments, because as the fragments
-                            // are marked as user-provided,
-                            // all embeddings would be regenerated on any settings change or document update.
-                            // To prevent this, we mark embeddings has non regenerate in this case.
-                            !has_fragments,
+                        regenerate: !user_provided,
                     };
                     vectors.insert(embedder_name, serde_json::to_value(embeddings).unwrap());
                 }
@@ -252,7 +206,7 @@ impl IndexScheduler {
                 atomic.fetch_add(1, Ordering::Relaxed);
             }

-            // 5.2. Dump the settings
+            // 3.2. Dump the settings
             let settings = meilisearch_types::settings::settings(
                 index,
                 &rtxn,
@@ -263,12 +217,10 @@ impl IndexScheduler {
             Ok(())
         })?;

-        // 6. Dump experimental feature settings
+        // 4. Dump experimental feature settings
         progress.update_progress(DumpCreationProgress::DumpTheExperimentalFeatures);
         let features = self.features().runtime_features();
         dump.create_experimental_features(features)?;
-        let network = self.network();
-        dump.create_network(network)?;

         let dump_uid = started_at.format(format_description!(
             "[year repr:full][month repr:numerical][day padding:zero]-[hour padding:zero][minute padding:zero][second padding:zero][subsecond digits:3]"

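The added side reads each stored document through an optional shared dictionary (document_decompression_dictionary / decompress_into_bump), which is the experimental compression feature this branch introduces. A rough standalone sketch of the underlying idea using the zstd crate's bulk API directly; the dictionary bytes and document are made up, and this is not Meilisearch's actual wrapper code:

// Round-trip one JSON document through zstd dictionary compression. A raw
// byte slice acts as the dictionary here; Meilisearch trains a real one from
// sampled documents instead (the zstd crate exposes zstd::dict::from_samples
// for that). Everything below is illustrative only.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Substrings shared by many documents make a useful raw dictionary.
    let dict = br#"{"id":,"title":"movie ""#;

    let doc = br#"{"id":1337,"title":"movie 1337"}"#;
    let compressed = zstd::bulk::Compressor::with_dictionary(3, dict)?.compress(doc)?;
    let decompressed =
        zstd::bulk::Decompressor::with_dictionary(dict)?.decompress(&compressed, doc.len())?;
    assert_eq!(&decompressed[..], &doc[..]);
    Ok(())
}

Because the dictionary captures byte patterns shared across an index's documents, each small document compresses far better than it would alone; the trade-off is that every read must go through the matching dictionary, which is why the dump code above fetches it once per index.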
View File

@@ -1,377 +0,0 @@
use std::collections::BTreeMap;
use std::io::{self, Write as _};
use std::sync::atomic;
use std::time::Duration;
use backoff::ExponentialBackoff;
use byte_unit::Byte;
use flate2::write::GzEncoder;
use flate2::Compression;
use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::milli::constants::RESERVED_VECTORS_FIELD_NAME;
use meilisearch_types::milli::index::EmbeddingsWithMetadata;
use meilisearch_types::milli::progress::{Progress, VariableNameStep};
use meilisearch_types::milli::update::{request_threads, Setting};
use meilisearch_types::milli::vector::parsed_vectors::{ExplicitVectors, VectorOrArrayOfVectors};
use meilisearch_types::milli::{self, obkv_to_json, Filter, InternalError};
use meilisearch_types::settings::{self, SecretPolicy};
use meilisearch_types::tasks::{DetailsExportIndexSettings, ExportIndexSettings};
use serde::Deserialize;
use ureq::{json, Response};
use super::MustStopProcessing;
use crate::processing::AtomicDocumentStep;
use crate::{Error, IndexScheduler, Result};
impl IndexScheduler {
pub(super) fn process_export(
&self,
base_url: &str,
api_key: Option<&str>,
payload_size: Option<&Byte>,
indexes: &BTreeMap<IndexUidPattern, ExportIndexSettings>,
progress: Progress,
) -> Result<BTreeMap<IndexUidPattern, DetailsExportIndexSettings>> {
#[cfg(test)]
self.maybe_fail(crate::test_utils::FailureLocation::ProcessExport)?;
let indexes: Vec<_> = self
.index_names()?
.into_iter()
.flat_map(|uid| {
indexes
.iter()
.find(|(pattern, _)| pattern.matches_str(&uid))
.map(|(pattern, settings)| (pattern, uid, settings))
})
.collect();
let mut output = BTreeMap::new();
let agent = ureq::AgentBuilder::new().timeout(Duration::from_secs(5)).build();
let must_stop_processing = self.scheduler.must_stop_processing.clone();
for (i, (_pattern, uid, export_settings)) in indexes.iter().enumerate() {
if must_stop_processing.get() {
return Err(Error::AbortedTask);
}
progress.update_progress(VariableNameStep::<ExportIndex>::new(
format!("Exporting index `{uid}`"),
i as u32,
indexes.len() as u32,
));
let ExportIndexSettings { filter, override_settings } = export_settings;
let index = self.index(uid)?;
let index_rtxn = index.read_txn()?;
let bearer = api_key.map(|api_key| format!("Bearer {api_key}"));
// First, check if the index already exists
let url = format!("{base_url}/indexes/{uid}");
let response = retry(&must_stop_processing, || {
let mut request = agent.get(&url);
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
request.send_bytes(Default::default()).map_err(into_backoff_error)
});
let index_exists = match response {
Ok(response) => response.status() == 200,
Err(Error::FromRemoteWhenExporting { code, .. }) if code == "index_not_found" => {
false
}
Err(e) => return Err(e),
};
let primary_key = index
.primary_key(&index_rtxn)
.map_err(|e| Error::from_milli(e.into(), Some(uid.to_string())))?;
// Create the index
if !index_exists {
let url = format!("{base_url}/indexes");
retry(&must_stop_processing, || {
let mut request = agent.post(&url);
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
let index_param = json!({ "uid": uid, "primaryKey": primary_key });
request.send_json(&index_param).map_err(into_backoff_error)
})?;
}
// Patch the index primary key
if index_exists && *override_settings {
let url = format!("{base_url}/indexes/{uid}");
retry(&must_stop_processing, || {
let mut request = agent.patch(&url);
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
let index_param = json!({ "primaryKey": primary_key });
request.send_json(&index_param).map_err(into_backoff_error)
})?;
}
// Send the index settings
if !index_exists || *override_settings {
let mut settings =
settings::settings(&index, &index_rtxn, SecretPolicy::RevealSecrets)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
// Remove the experimental chat setting if not enabled
if self.features().check_chat_completions("exporting chat settings").is_err() {
settings.chat = Setting::NotSet;
}
// Retry logic for sending settings
let url = format!("{base_url}/indexes/{uid}/settings");
retry(&must_stop_processing, || {
let mut request = agent.patch(&url);
if let Some(bearer) = bearer.as_ref() {
request = request.set("Authorization", bearer);
}
request.send_json(settings.clone()).map_err(into_backoff_error)
})?;
}
let filter = filter
.as_ref()
.map(Filter::from_json)
.transpose()
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?
.flatten();
let filter_universe = filter
.map(|f| f.evaluate(&index_rtxn, &index))
.transpose()
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
let whole_universe = index
.documents_ids(&index_rtxn)
.map_err(|e| Error::from_milli(e.into(), Some(uid.to_string())))?;
let universe = filter_universe.unwrap_or(whole_universe);
let fields_ids_map = index.fields_ids_map(&index_rtxn)?;
let all_fields: Vec<_> = fields_ids_map.iter().map(|(id, _)| id).collect();
// We don't need to keep this one alive as we will
// spawn many threads to process the documents
drop(index_rtxn);
let total_documents = universe.len() as u32;
let (step, progress_step) = AtomicDocumentStep::new(total_documents);
progress.update_progress(progress_step);
output.insert(
IndexUidPattern::new_unchecked(uid.clone()),
DetailsExportIndexSettings {
settings: (*export_settings).clone(),
matched_documents: Some(total_documents as u64),
},
);
let limit = payload_size.map(|ps| ps.as_u64() as usize).unwrap_or(20 * 1024 * 1024); // defaults to 20 MiB
let documents_url = format!("{base_url}/indexes/{uid}/documents");
let results = request_threads()
.broadcast(|ctx| {
let index_rtxn = index
.read_txn()
.map_err(|e| Error::from_milli(e.into(), Some(uid.to_string())))?;
let mut buffer = Vec::new();
let mut tmp_buffer = Vec::new();
let mut compressed_buffer = Vec::new();
for (i, docid) in universe.iter().enumerate() {
if i % ctx.num_threads() != ctx.index() {
continue;
}
let document = index
.document(&index_rtxn, docid)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
let mut document = obkv_to_json(&all_fields, &fields_ids_map, document)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
// TODO definitely factorize this code
'inject_vectors: {
let embeddings = index
.embeddings(&index_rtxn, docid)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
if embeddings.is_empty() {
break 'inject_vectors;
}
let vectors = document
.entry(RESERVED_VECTORS_FIELD_NAME)
.or_insert(serde_json::Value::Object(Default::default()));
let serde_json::Value::Object(vectors) = vectors else {
return Err(Error::from_milli(
milli::Error::UserError(
milli::UserError::InvalidVectorsMapType {
document_id: {
if let Ok(Some(Ok(index))) = index
.external_id_of(
&index_rtxn,
std::iter::once(docid),
)
.map(|it| it.into_iter().next())
{
index
} else {
format!("internal docid={docid}")
}
},
value: vectors.clone(),
},
),
Some(uid.to_string()),
));
};
for (
embedder_name,
EmbeddingsWithMetadata { embeddings, regenerate, has_fragments },
) in embeddings
{
let embeddings = ExplicitVectors {
embeddings: Some(
VectorOrArrayOfVectors::from_array_of_vectors(embeddings),
),
regenerate: regenerate &&
// Meilisearch does not handle well dumps with fragments, because as the fragments
// are marked as user-provided,
// all embeddings would be regenerated on any settings change or document update.
// To prevent this, we mark embeddings has non regenerate in this case.
!has_fragments,
};
vectors.insert(
embedder_name,
serde_json::to_value(embeddings).unwrap(),
);
}
}
tmp_buffer.clear();
serde_json::to_writer(&mut tmp_buffer, &document)
.map_err(milli::InternalError::from)
.map_err(|e| Error::from_milli(e.into(), Some(uid.to_string())))?;
// Make sure we put at least one document in the buffer even
// though we might go above the buffer limit before sending
if !buffer.is_empty() && buffer.len() + tmp_buffer.len() > limit {
// We compress the documents before sending them
let mut encoder =
GzEncoder::new(&mut compressed_buffer, Compression::default());
encoder
.write_all(&buffer)
.map_err(|e| Error::from_milli(e.into(), Some(uid.clone())))?;
encoder
.finish()
.map_err(|e| Error::from_milli(e.into(), Some(uid.clone())))?;
retry(&must_stop_processing, || {
let mut request = agent.post(&documents_url);
request = request.set("Content-Type", "application/x-ndjson");
request = request.set("Content-Encoding", "gzip");
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
request.send_bytes(&compressed_buffer).map_err(into_backoff_error)
})?;
buffer.clear();
compressed_buffer.clear();
}
buffer.extend_from_slice(&tmp_buffer);
if i > 0 && i % 100 == 0 {
step.fetch_add(100, atomic::Ordering::Relaxed);
}
}
retry(&must_stop_processing, || {
let mut request = agent.post(&documents_url);
request = request.set("Content-Type", "application/x-ndjson");
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
request.send_bytes(&buffer).map_err(into_backoff_error)
})?;
Ok(())
})
.map_err(|e| {
Error::from_milli(
milli::Error::InternalError(InternalError::PanicInThreadPool(e)),
Some(uid.to_string()),
)
})?;
for result in results {
result?;
}
step.store(total_documents, atomic::Ordering::Relaxed);
}
Ok(output)
}
}
fn retry<F>(must_stop_processing: &MustStopProcessing, send_request: F) -> Result<ureq::Response>
where
F: Fn() -> Result<ureq::Response, backoff::Error<ureq::Error>>,
{
match backoff::retry(ExponentialBackoff::default(), || {
if must_stop_processing.get() {
return Err(backoff::Error::Permanent(ureq::Error::Status(
u16::MAX,
// 444: Connection Closed Without Response
Response::new(444, "Abort", "Aborted task").unwrap(),
)));
}
send_request()
}) {
Ok(response) => Ok(response),
Err(backoff::Error::Permanent(e)) => Err(ureq_error_into_error(e)),
Err(backoff::Error::Transient { err, retry_after: _ }) => Err(ureq_error_into_error(err)),
}
}
fn into_backoff_error(err: ureq::Error) -> backoff::Error<ureq::Error> {
match err {
// Those code status must trigger an automatic retry
// <https://www.restapitutorial.com/advanced/responses/retries>
ureq::Error::Status(408 | 429 | 500 | 502 | 503 | 504, _) => {
backoff::Error::Transient { err, retry_after: None }
}
ureq::Error::Status(_, _) => backoff::Error::Permanent(err),
ureq::Error::Transport(_) => backoff::Error::Transient { err, retry_after: None },
}
}
/// Converts a `ureq::Error` into an `Error`.
fn ureq_error_into_error(error: ureq::Error) -> Error {
#[derive(Deserialize)]
struct MeiliError {
message: String,
code: String,
r#type: String,
link: String,
}
match error {
// This is a workaround to handle task abortion - the error propagation path
// makes it difficult to cleanly surface the abortion at this level.
ureq::Error::Status(u16::MAX, _) => Error::AbortedTask,
ureq::Error::Status(_, response) => match response.into_json() {
Ok(MeiliError { message, code, r#type, link }) => {
Error::FromRemoteWhenExporting { message, code, r#type, link }
}
Err(e) => e.into(),
},
ureq::Error::Transport(transport) => io::Error::new(io::ErrorKind::Other, transport).into(),
}
}
enum ExportIndex {}
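
To make the buffering logic above easier to follow in isolation, here is a minimal, self-contained sketch of the flush-before-overflow batching, with a hypothetical `send` callback standing in for the retried HTTP POST. Unlike the export code above, which sends the final batch uncompressed, this sketch gzips every batch for uniformity:

```rust
use std::io::Write;

use flate2::write::GzEncoder;
use flate2::Compression;

/// Streams NDJSON documents in gzip-compressed batches of at most `limit`
/// uncompressed bytes, keeping at least one document per batch so a single
/// oversized document still goes through.
fn send_in_batches<E>(
    documents: &[serde_json::Value],
    limit: usize,
    mut send: impl FnMut(&[u8]) -> Result<(), E>,
) -> Result<(), E> {
    let mut buffer = Vec::new();
    for document in documents {
        let mut line = serde_json::to_vec(document).expect("serializable document");
        line.push(b'\n');
        // Flush before the buffer would exceed the limit, never when empty.
        if !buffer.is_empty() && buffer.len() + line.len() > limit {
            let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
            encoder.write_all(&buffer).expect("writing to a Vec cannot fail");
            send(&encoder.finish().expect("gzip finish"))?;
            buffer.clear();
        }
        buffer.extend_from_slice(&line);
    }
    // Send the last, possibly partial, batch.
    let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
    encoder.write_all(&buffer).expect("writing to a Vec cannot fail");
    send(&encoder.finish().expect("gzip finish"))
}
```
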

View File

@ -1,13 +1,11 @@
-use std::sync::Arc;
use bumpalo::collections::CollectIn;
use bumpalo::Bump;
use meilisearch_types::heed::RwTxn;
use meilisearch_types::milli::documents::PrimaryKey;
-use meilisearch_types::milli::progress::{EmbedderStats, Progress};
+use meilisearch_types::milli::progress::Progress;
use meilisearch_types::milli::update::new::indexer::{self, UpdateByFunction};
use meilisearch_types::milli::update::DocumentAdditionResult;
-use meilisearch_types::milli::{self, ChannelCongestion, Filter};
+use meilisearch_types::milli::{self, Filter, ThreadPoolNoAbortBuilder};
use meilisearch_types::settings::apply_settings_to_builder;
use meilisearch_types::tasks::{Details, KindWithContent, Status, Task};
use meilisearch_types::Index;
@ -26,7 +24,7 @@ impl IndexScheduler {
/// The list of processed tasks.
#[tracing::instrument(
    level = "trace",
-    skip(self, index_wtxn, index, progress, embedder_stats),
+    skip(self, index_wtxn, index, progress),
    target = "indexing::scheduler"
)]
pub(crate) fn apply_index_operation<'i>(
@ -34,10 +32,10 @@ impl IndexScheduler {
    index_wtxn: &mut RwTxn<'i>,
    index: &'i Index,
    operation: IndexOperation,
-    progress: &Progress,
-    embedder_stats: Arc<EmbedderStats>,
-) -> Result<(Vec<Task>, Option<ChannelCongestion>)> {
+    progress: Progress,
+) -> Result<Vec<Task>> {
    let indexer_alloc = Bump::new();
    let started_processing_at = std::time::Instant::now();
    let must_stop_processing = self.scheduler.must_stop_processing.clone();
@ -62,23 +60,25 @@ impl IndexScheduler {
        };
    }
-    Ok((tasks, None))
+    Ok(tasks)
}
-IndexOperation::DocumentOperation { index_uid, primary_key, operations, mut tasks } => {
+IndexOperation::DocumentOperation {
+    index_uid,
+    primary_key,
+    method,
+    operations,
+    mut tasks,
+} => {
    progress.update_progress(DocumentOperationProgress::RetrievingConfig);
    // TODO: at some point, for better efficiency we might want to reuse the bumpalo for successive batches.
    // this is made difficult by the fact we're doing private clones of the index scheduler and sending it
    // to a fresh thread.
    let mut content_files = Vec::new();
    for operation in &operations {
-        match operation {
-            DocumentOperation::Replace(content_uuid)
-            | DocumentOperation::Update(content_uuid) => {
-                let content_file = self.queue.file_store.get_update(*content_uuid)?;
-                let mmap = unsafe { memmap2::Mmap::map(&content_file)? };
-                content_files.push(mmap);
-            }
-            _ => (),
-        }
+        if let DocumentOperation::Add(content_uuid) = operation {
+            let content_file = self.queue.file_store.get_update(*content_uuid)?;
+            let mmap = unsafe { memmap2::Mmap::map(&content_file)? };
+            content_files.push(mmap);
+        }
    }
@ -87,24 +87,17 @@ impl IndexScheduler {
    let mut new_fields_ids_map = db_fields_ids_map.clone();
    let mut content_files_iter = content_files.iter();
-    let mut indexer = indexer::DocumentOperation::new();
+    let mut indexer = indexer::DocumentOperation::new(method);
    let embedders = index
-        .embedding_configs()
        .embedding_configs(index_wtxn)
-        .map_err(|e| Error::from_milli(e.into(), Some(index_uid.clone())))?;
+        .map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?;
    let embedders = self.embedders(index_uid.clone(), embedders)?;
    for operation in operations {
        match operation {
-            DocumentOperation::Replace(_content_uuid) => {
+            DocumentOperation::Add(_content_uuid) => {
                let mmap = content_files_iter.next().unwrap();
                indexer
-                    .replace_documents(mmap)
-                    .map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?;
-            }
-            DocumentOperation::Update(_content_uuid) => {
-                let mmap = content_files_iter.next().unwrap();
-                indexer
-                    .update_documents(mmap)
+                    .add_documents(mmap)
                    .map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?;
            }
            DocumentOperation::Delete(document_ids) => {
@ -117,8 +110,18 @@ impl IndexScheduler {
        }
    }
+    let local_pool;
    let indexer_config = self.index_mapper.indexer_config();
-    let pool = &indexer_config.thread_pool;
+    let pool = match &indexer_config.thread_pool {
+        Some(pool) => pool,
+        None => {
+            local_pool = ThreadPoolNoAbortBuilder::new()
+                .thread_name(|i| format!("indexing-thread-{i}"))
+                .build()
+                .unwrap();
+            &local_pool
+        }
+    };
    progress.update_progress(DocumentOperationProgress::ComputingDocumentChanges);
    let (document_changes, operation_stats, primary_key) = indexer
@ -166,25 +169,21 @@ impl IndexScheduler {
    }
    progress.update_progress(DocumentOperationProgress::Indexing);
-    let mut congestion = None;
    if tasks.iter().any(|res| res.error.is_none()) {
-        congestion = Some(
-            indexer::index(
-                index_wtxn,
-                index,
-                pool,
-                indexer_config.grenad_parameters(),
-                &db_fields_ids_map,
-                new_fields_ids_map,
-                primary_key,
-                &document_changes,
-                embedders,
-                &|| must_stop_processing.get(),
-                progress,
-                &embedder_stats,
-            )
-            .map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?,
-        );
+        indexer::index(
+            index_wtxn,
+            index,
+            pool,
+            indexer_config.grenad_parameters(),
+            &db_fields_ids_map,
+            new_fields_ids_map,
+            primary_key,
+            &document_changes,
+            embedders,
+            &|| must_stop_processing.get(),
+            &progress,
+        )
+        .map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?;
        let addition = DocumentAdditionResult {
            indexed_documents: candidates_count,
@ -196,7 +195,7 @@ impl IndexScheduler {
        tracing::info!(indexing_result = ?addition, processed_in = ?started_processing_at.elapsed(), "document indexing done");
    }
-    Ok((tasks, congestion))
+    Ok(tasks)
}
IndexOperation::DocumentEdition { index_uid, mut task } => {
    progress.update_progress(DocumentEditionProgress::RetrievingConfig);
@ -244,7 +243,7 @@ impl IndexScheduler {
        edited_documents: Some(0),
    });
-    return Ok((vec![task], None));
+    return Ok(vec![task]);
}
let rtxn = index.read_txn()?;
@ -259,10 +258,19 @@ impl IndexScheduler {
let result_count = Ok((candidates.len(), candidates.len())) as Result<_>;
-let mut congestion = None;
if task.error.is_none() {
+    let local_pool;
    let indexer_config = self.index_mapper.indexer_config();
-    let pool = &indexer_config.thread_pool;
+    let pool = match &indexer_config.thread_pool {
+        Some(pool) => pool,
+        None => {
+            local_pool = ThreadPoolNoAbortBuilder::new()
+                .thread_name(|i| format!("indexing-thread-{i}"))
+                .build()
+                .unwrap();
+            &local_pool
+        }
+    };
    let candidates_count = candidates.len();
    progress.update_progress(DocumentEditionProgress::ComputingDocumentChanges);
@ -275,29 +283,25 @@ impl IndexScheduler {
    })
    .unwrap()?;
    let embedders = index
-        .embedding_configs()
        .embedding_configs(index_wtxn)
-        .map_err(|err| Error::from_milli(err.into(), Some(index_uid.clone())))?;
+        .map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
    let embedders = self.embedders(index_uid.clone(), embedders)?;
    progress.update_progress(DocumentEditionProgress::Indexing);
-    congestion = Some(
-        indexer::index(
-            index_wtxn,
-            index,
-            pool,
-            indexer_config.grenad_parameters(),
-            &db_fields_ids_map,
-            new_fields_ids_map,
-            None, // cannot change primary key in DocumentEdition
-            &document_changes,
-            embedders,
-            &|| must_stop_processing.get(),
-            progress,
-            &embedder_stats,
-        )
-        .map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?,
-    );
+    indexer::index(
+        index_wtxn,
+        index,
+        pool,
+        indexer_config.grenad_parameters(),
+        &db_fields_ids_map,
+        new_fields_ids_map,
+        None, // cannot change primary key in DocumentEdition
+        &document_changes,
+        embedders,
+        &|| must_stop_processing.get(),
+        &progress,
+    )
+    .map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
    let addition = DocumentAdditionResult {
        indexed_documents: candidates_count,
@ -333,7 +337,7 @@ impl IndexScheduler {
    }
}
-Ok((vec![task], congestion))
+Ok(vec![task])
}
IndexOperation::DocumentDeletion { mut tasks, index_uid } => {
    progress.update_progress(DocumentDeletionProgress::RetrievingConfig);
@ -400,7 +404,7 @@ impl IndexScheduler {
}
if to_delete.is_empty() {
-    return Ok((tasks, None));
+    return Ok(tasks);
}
let rtxn = index.read_txn()?;
@ -414,10 +418,19 @@ impl IndexScheduler {
PrimaryKey::new_or_insert(primary_key, &mut new_fields_ids_map)
    .map_err(|err| Error::from_milli(err.into(), Some(index_uid.clone())))?;
-let mut congestion = None;
if !tasks.iter().all(|res| res.error.is_some()) {
+    let local_pool;
    let indexer_config = self.index_mapper.indexer_config();
-    let pool = &indexer_config.thread_pool;
+    let pool = match &indexer_config.thread_pool {
+        Some(pool) => pool,
+        None => {
+            local_pool = ThreadPoolNoAbortBuilder::new()
+                .thread_name(|i| format!("indexing-thread-{i}"))
+                .build()
+                .unwrap();
+            &local_pool
+        }
+    };
    progress.update_progress(DocumentDeletionProgress::DeleteDocuments);
    let mut indexer = indexer::DocumentDeletion::new();
@ -425,29 +438,25 @@ impl IndexScheduler {
    indexer.delete_documents_by_docids(to_delete);
    let document_changes = indexer.into_changes(&indexer_alloc, primary_key);
    let embedders = index
-        .embedding_configs()
        .embedding_configs(index_wtxn)
-        .map_err(|err| Error::from_milli(err.into(), Some(index_uid.clone())))?;
+        .map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
    let embedders = self.embedders(index_uid.clone(), embedders)?;
    progress.update_progress(DocumentDeletionProgress::Indexing);
-    congestion = Some(
-        indexer::index(
-            index_wtxn,
-            index,
-            pool,
-            indexer_config.grenad_parameters(),
-            &db_fields_ids_map,
-            new_fields_ids_map,
-            None, // document deletion never changes primary key
-            &document_changes,
-            embedders,
-            &|| must_stop_processing.get(),
-            progress,
-            &embedder_stats,
-        )
-        .map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?,
-    );
+    indexer::index(
+        index_wtxn,
+        index,
+        pool,
+        indexer_config.grenad_parameters(),
+        &db_fields_ids_map,
+        new_fields_ids_map,
+        None, // document deletion never changes primary key
+        &document_changes,
+        embedders,
+        &|| must_stop_processing.get(),
+        &progress,
+    )
+    .map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
    let addition = DocumentAdditionResult {
        indexed_documents: candidates_count,
@ -459,7 +468,7 @@ impl IndexScheduler {
        tracing::info!(indexing_result = ?addition, processed_in = ?started_processing_at.elapsed(), "document indexing done");
    }
-    Ok((tasks, congestion))
+    Ok(tasks)
}
IndexOperation::Settings { index_uid, settings, mut tasks } => {
    progress.update_progress(SettingsProgress::RetrievingAndMergingTheSettings);
@ -477,11 +486,14 @@ impl IndexScheduler {
    }
    progress.update_progress(SettingsProgress::ApplyTheSettings);
-    let congestion = builder
-        .execute(&|| must_stop_processing.get(), progress, embedder_stats)
+    builder
+        .execute(
+            |indexing_step| tracing::debug!(update = ?indexing_step),
+            || must_stop_processing.get(),
+        )
        .map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
-    Ok((tasks, congestion))
+    Ok(tasks)
}
IndexOperation::DocumentClearAndSetting {
    index_uid,
@ -489,28 +501,26 @@ impl IndexScheduler {
    settings,
    settings_tasks,
} => {
-    let (mut import_tasks, _congestion) = self.apply_index_operation(
+    let mut import_tasks = self.apply_index_operation(
        index_wtxn,
        index,
        IndexOperation::DocumentClear {
            index_uid: index_uid.clone(),
            tasks: cleared_tasks,
        },
-        progress,
-        embedder_stats.clone(),
+        progress.clone(),
    )?;
-    let (settings_tasks, _congestion) = self.apply_index_operation(
+    let settings_tasks = self.apply_index_operation(
        index_wtxn,
        index,
        IndexOperation::Settings { index_uid, settings, tasks: settings_tasks },
        progress,
-        embedder_stats,
    )?;
    let mut tasks = settings_tasks;
    tasks.append(&mut import_tasks);
-    Ok((tasks, None))
+    Ok(tasks)
}
}
}
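
The repeated `let local_pool; ... match` dance on the `+` side of this diff is a borrow-extension idiom: the fallback binding is declared before the `match` so a reference to it can outlive the `match` arms. A minimal sketch of the same idiom with rayon (the pool type here is a stand-in; the diff itself uses milli's `ThreadPoolNoAbortBuilder`):

```rust
use rayon::ThreadPool;

/// Borrows the configured pool when there is one, otherwise lazily builds a
/// fallback pool whose lifetime is pinned to the enclosing scope by the
/// deferred `local_pool` binding.
fn run_with_pool(configured: Option<&ThreadPool>, job: impl FnOnce() + Send) {
    let local_pool; // declared first so the borrow below outlives the match
    let pool = match configured {
        Some(pool) => pool,
        None => {
            local_pool = rayon::ThreadPoolBuilder::new()
                .thread_name(|i| format!("indexing-thread-{i}"))
                .build()
                .expect("failed to build the fallback thread pool");
            &local_pool
        }
    };
    pool.install(job);
}
```
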

View File

@ -3,11 +3,12 @@ use std::fs;
use std::sync::atomic::Ordering;
use meilisearch_types::heed::CompactionOption;
-use meilisearch_types::milli::progress::{Progress, VariableNameStep};
+use meilisearch_types::milli::progress::Progress;
+use meilisearch_types::milli::{self};
use meilisearch_types::tasks::{Status, Task};
use meilisearch_types::{compression, VERSION_FILE_NAME};
-use crate::processing::{AtomicUpdateFileStep, SnapshotCreationProgress};
+use crate::processing::{AtomicUpdateFileStep, SnapshotCreationProgress, VariableNameStep};
use crate::{Error, IndexScheduler, Result};
impl IndexScheduler {
@ -27,7 +28,7 @@ impl IndexScheduler {
// 2. Snapshot the index-scheduler LMDB env
//
-// When we call copy_to_path, LMDB opens a read transaction by itself,
+// When we call copy_to_file, LMDB opens a read transaction by itself,
// we can't provide our own. It is an issue as we would like to know
// the update files to copy but new ones can be enqueued between the copy
// of the env and the new transaction we open to retrieve the enqueued tasks.
@ -41,12 +42,7 @@ impl IndexScheduler {
progress.update_progress(SnapshotCreationProgress::SnapshotTheIndexScheduler);
let dst = temp_snapshot_dir.path().join("tasks");
fs::create_dir_all(&dst)?;
-let compaction_option = if self.scheduler.experimental_no_snapshot_compaction {
-    CompactionOption::Disabled
-} else {
-    CompactionOption::Enabled
-};
-self.env.copy_to_path(dst.join("data.mdb"), compaction_option)?;
+self.env.copy_to_file(dst.join("data.mdb"), CompactionOption::Enabled)?;
// 2.2 Create a read transaction on the index-scheduler
let rtxn = self.env.read_txn()?;
@ -78,14 +74,12 @@ impl IndexScheduler {
for (i, result) in index_mapping.iter(&rtxn)?.enumerate() {
    let (name, uuid) = result?;
-    progress.update_progress(VariableNameStep::<SnapshotCreationProgress>::new(
-        name, i as u32, nb_indexes,
-    ));
+    progress.update_progress(VariableNameStep::new(name, i as u32, nb_indexes));
    let index = self.index_mapper.index(&rtxn, name)?;
    let dst = temp_snapshot_dir.path().join("indexes").join(uuid.to_string());
    fs::create_dir_all(&dst)?;
    index
-        .copy_to_path(dst.join("data.mdb"), compaction_option)
+        .copy_to_file(dst.join("data.mdb"), CompactionOption::Enabled)
        .map_err(|e| Error::from_milli(e, Some(name.to_string())))?;
}
@ -95,7 +89,14 @@ impl IndexScheduler {
progress.update_progress(SnapshotCreationProgress::SnapshotTheApiKeys);
let dst = temp_snapshot_dir.path().join("auth");
fs::create_dir_all(&dst)?;
-self.scheduler.auth_env.copy_to_path(dst.join("data.mdb"), compaction_option)?;
+// TODO We can't use the open_auth_store_env function here but we should
+let auth = unsafe {
+    milli::heed::EnvOpenOptions::new()
+        .map_size(1024 * 1024 * 1024) // 1 GiB
+        .max_dbs(2)
+        .open(&self.scheduler.auth_path)
+}?;
+auth.copy_to_file(dst.join("data.mdb"), CompactionOption::Enabled)?;
// 5. Copy and tarball the flat snapshot
progress.update_progress(SnapshotCreationProgress::CreateTheTarball);
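
The `-` side of the hunks above gates LMDB compaction behind an `experimental_no_snapshot_compaction` flag: compaction rewrites the B-tree into a dense file, trading extra CPU and I/O while the snapshot is taken for a smaller copy. A sketch of that toggle, where the flag name is taken from the diff but the helper itself is hypothetical:

```rust
use meilisearch_types::heed::CompactionOption;

// Hypothetical helper mirroring `experimental_no_snapshot_compaction`.
fn compaction_option(no_compaction: bool) -> CompactionOption {
    // Enabled: smaller snapshot, slower to produce.
    // Disabled: raw copy of the environment, faster but larger.
    if no_compaction {
        CompactionOption::Disabled
    } else {
        CompactionOption::Enabled
    }
}
```
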

View File

@ -1,92 +0,0 @@
use meilisearch_types::milli;
use meilisearch_types::milli::progress::{Progress, VariableNameStep};
use crate::{Error, IndexScheduler, Result};
impl IndexScheduler {
pub(super) fn process_upgrade(
&self,
db_version: (u32, u32, u32),
progress: Progress,
) -> Result<()> {
#[cfg(test)]
self.maybe_fail(crate::test_utils::FailureLocation::ProcessUpgrade)?;
let indexes = self.index_names()?;
for (i, uid) in indexes.iter().enumerate() {
let must_stop_processing = self.scheduler.must_stop_processing.clone();
if must_stop_processing.get() {
return Err(Error::AbortedTask);
}
progress.update_progress(VariableNameStep::<UpgradeIndex>::new(
format!("Upgrading index `{uid}`"),
i as u32,
indexes.len() as u32,
));
let index = self.index(uid)?;
let mut index_wtxn = index.write_txn()?;
let regen_stats = milli::update::upgrade::upgrade(
&mut index_wtxn,
&index,
db_version,
|| must_stop_processing.get(),
progress.clone(),
)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
if regen_stats {
let stats = crate::index_mapper::IndexStats::new(&index, &index_wtxn)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
index_wtxn.commit()?;
// Release wtxn as soon as possible because it stops us from registering tasks
let mut index_schd_wtxn = self.env.write_txn()?;
self.index_mapper.store_stats_of(&mut index_schd_wtxn, uid, &stats)?;
index_schd_wtxn.commit()?;
} else {
index_wtxn.commit()?;
}
}
Ok(())
}
pub fn process_rollback(&self, db_version: (u32, u32, u32), progress: &Progress) -> Result<()> {
let mut wtxn = self.env.write_txn()?;
tracing::info!(?db_version, "roll back index scheduler version");
self.version.set_version(&mut wtxn, db_version)?;
let db_path = self.scheduler.version_file_path.parent().unwrap();
wtxn.commit()?;
let indexes = self.index_names()?;
tracing::info!("roll backing all indexes");
for (i, uid) in indexes.iter().enumerate() {
progress.update_progress(VariableNameStep::<UpgradeIndex>::new(
format!("Rollbacking index `{uid}`"),
i as u32,
indexes.len() as u32,
));
let index_schd_rtxn = self.env.read_txn()?;
let rollback_outcome =
self.index_mapper.rollback_index(&index_schd_rtxn, uid, db_version)?;
if !rollback_outcome.succeeded() {
return Err(crate::Error::RollbackFailed { index: uid.clone(), rollback_outcome });
}
}
tracing::info!(?db_path, ?db_version, "roll back version file");
meilisearch_types::versioning::create_version_file(
db_path,
db_version.0,
db_version.1,
db_version.2,
)?;
Ok(())
}
}
enum UpgradeIndex {}
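
Both `process_upgrade` and `process_rollback` above hinge on comparing `(major, minor, patch)` tuples, for which Rust's lexicographic tuple ordering matches semantic-version ordering. A toy sketch of that version gate (the action strings are illustrative only, not the scheduler's API):

```rust
use std::cmp::Ordering;

/// Compare the on-disk version against the binary's version and decide
/// which direction, if any, the database must move.
fn upgrade_action(db: (u32, u32, u32), bin: (u32, u32, u32)) -> &'static str {
    match db.cmp(&bin) {
        Ordering::Less => "upgrade each index, then bump the version file",
        Ordering::Equal => "nothing to do",
        Ordering::Greater => "roll back indexes to the binary's version",
    }
}
```
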

View File

@ -1,17 +0,0 @@
---
source: crates/index-scheduler/src/scheduler/test.rs
expression: config.embedder_options
---
{
"Rest": {
"api_key": "My super secret",
"distribution": null,
"dimensions": 4,
"url": "http://localhost:7777",
"request": "{{text}}",
"search_fragments": {},
"indexing_fragments": {},
"response": "{{embedding}}",
"headers": {}
}
}
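
These `.snap` files serialize an embedder-options enum, and the `{"Rest": {...}}` and `{"HuggingFace": {...}}` shapes are exactly what serde's default externally tagged enum representation produces. A reduced, hypothetical mirror of that structure, not the real milli type:

```rust
use serde::Serialize;

// Serde's default (externally tagged) representation wraps each variant's
// fields in an object keyed by the variant name.
#[derive(Serialize)]
enum EmbedderOptions {
    Rest { url: String, request: String, response: String },
    HuggingFace { model: String, revision: Option<String> },
}

fn main() {
    let options = EmbedderOptions::Rest {
        url: "http://localhost:7777".into(),
        request: "{{text}}".into(),
        response: "{{embedding}}".into(),
    };
    // Prints: {"Rest": {"url": ..., "request": ..., "response": ...}}
    println!("{}", serde_json::to_string_pretty(&options).unwrap());
}
```
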

View File

@ -1,12 +0,0 @@
---
source: crates/index-scheduler/src/scheduler/test_embedders.rs
expression: simple_hf_config.embedder_options
---
{
"HuggingFace": {
"model": "sentence-transformers/all-MiniLM-L6-v2",
"revision": "e4ce9877abf3edfe10b0d82785e83bdcb973e22e",
"distribution": null,
"pooling": "useModel"
}
}

View File

@ -1,15 +0,0 @@
---
source: crates/index-scheduler/src/scheduler/test_embedders.rs
expression: doc
---
{
"doggo": "Intel",
"breed": "beagle",
"_vectors": {
"noise": [
0.1,
0.2,
0.3
]
}
}

View File

@ -1,15 +0,0 @@
---
source: crates/index-scheduler/src/scheduler/test_embedders.rs
expression: doc
---
{
"doggo": "kefir",
"breed": "patou",
"_vectors": {
"noise": [
0.1,
0.2,
0.3
]
}
}
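
Snapshot files like the two above are generated by the insta crate: the `source` and `expression` headers record where the assertion ran and what it serialized. A minimal sketch of such a test, assuming insta's `json` feature; the test name and document are illustrative:

```rust
#[test]
fn document_with_injected_vectors() {
    let doc = serde_json::json!({
        "doggo": "kefir",
        "breed": "patou",
        "_vectors": { "noise": [0.1, 0.2, 0.3] },
    });
    // On first run this writes a `.snap` file; later runs diff against it.
    insta::assert_json_snapshot!(doc);
}
```
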

View File

@ -1,17 +1,12 @@
---
source: crates/index-scheduler/src/scheduler/test_embedders.rs
-expression: fakerest_config.embedder_options
+expression: simple_hf_config.embedder_options
+snapshot_kind: text
---
{
-  "Rest": {
-    "api_key": "My super secret",
-    "distribution": null,
-    "dimensions": 384,
-    "url": "http://localhost:7777",
-    "request": "{{text}}",
-    "search_fragments": {},
-    "indexing_fragments": {},
-    "response": "{{embedding}}",
-    "headers": {}
+  "HuggingFace": {
+    "model": "sentence-transformers/all-MiniLM-L6-v2",
+    "revision": "e4ce9877abf3edfe10b0d82785e83bdcb973e22e",
+    "distribution": null
  }
}

View File

@ -1,5 +1,6 @@
---
source: crates/index-scheduler/src/scheduler/test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:
@ -39,7 +40,7 @@ catto [0,]
[timestamp] [0,1,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":0,"matchedTasks":1,"canceledTasks":1,"originalFilter":"test_query"}, stats: {"totalNbTasks":2,"status":{"succeeded":1,"canceled":1},"types":{"documentAdditionOrUpdate":1,"taskCancelation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 1 of type `taskCancelation` that cannot be batched with any other task.", }
+0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":0,"matchedTasks":1,"canceledTasks":1,"originalFilter":"test_query"}, stats: {"totalNbTasks":2,"status":{"succeeded":1,"canceled":1},"types":{"documentAdditionOrUpdate":1,"taskCancelation":1},"indexUids":{"catto":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,1,]

View File

@ -1,10 +1,11 @@
---
source: crates/index-scheduler/src/scheduler/test.rs
+snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch Some(1):
[1,]
-{uid: 1, details: {"receivedDocuments":1,"indexedDocuments":null}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"beavero":1}}, stop reason: "batched all enqueued tasks for index `beavero`", }
+{uid: 1, details: {"receivedDocuments":1,"indexedDocuments":null}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"beavero":1}}, }
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "catto", primary_key: None, method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 1, allow_index_creation: true }}
@ -46,7 +47,7 @@ catto: { number_of_documents: 1, field_distribution: {"id": 1} }
[timestamp] [0,]
----------------------------------------------------------------------
### All Batches:
-0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"catto":1}}, stop reason: "batched all enqueued tasks for index `catto`", }
+0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"catto":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

Some files were not shown because too many files have changed in this diff Show More