mirror of
https://github.com/meilisearch/meilisearch.git
synced 2025-11-22 12:46:53 +00:00
Compare commits
1 Commits
prototype-
...
introduce-
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
d14790824d |
@@ -1,29 +1,27 @@
|
||||
---
|
||||
name: New feature issue
|
||||
about: ⚠️ Should only be used by the internal Meili team ⚠️
|
||||
name: New sprint issue
|
||||
about: ⚠️ Should only be used by the engine team ⚠️
|
||||
title: ''
|
||||
labels: 'impacts docs, impacts integrations'
|
||||
labels: 'missing usage in PRD, impacts docs'
|
||||
assignees: ''
|
||||
|
||||
---
|
||||
|
||||
Related product team resources: [PRD]() (_internal only_)
|
||||
Related product discussion:
|
||||
|
||||
## Motivation
|
||||
|
||||
<!---Copy/paste the information in PRD or briefly detail the product motivation. Ask product team if any hesitation.-->
|
||||
|
||||
## Usage
|
||||
|
||||
<!---Link to the public part of the PRD, or to the related product discussion for experimental features-->
|
||||
|
||||
TBD
|
||||
|
||||
## TODO
|
||||
|
||||
<!---If necessary, create a list with technical/product steps-->
|
||||
|
||||
### Are you modifying a database?
|
||||
|
||||
- [ ] If not, add the `no db change` label to your PR, and you're good to merge.
|
||||
- [ ] If yes, add the `db change` label to your PR. You'll receive a message explaining you what to do.
|
||||
|
||||
### Reminders when modifying the API
|
||||
|
||||
- [ ] Update the openAPI file with utoipa:
|
||||
@@ -34,6 +32,18 @@ TBD
|
||||
- [ ] Once everything is done, start Meilisearch with the swagger flag: `cargo run --features swagger`, open `http://localhost:7700/scalar` on your browser, and ensure everything works as expected.
|
||||
- For more info, refer to [this presentation](https://pitch.com/v/generating-the-openapi-file-jrn3nh).
|
||||
|
||||
### Reminders when modifying a database
|
||||
|
||||
- [ ] If modifying the API keys => Reach out to the team we've not planned anything for that yet
|
||||
- [ ] If modifying a milli index =>
|
||||
- [ ] You can add the upgrade code to go from the previous to the latest version [here](https://github.com/meilisearch/meilisearch/blob/main/crates/milli/src/update/upgrade/mod.rs)
|
||||
- [ ] Don't forget to return `true` to update the stats if needed
|
||||
- [ ] Did the read read part requires a change? (gettings documents, search, facet search etc)
|
||||
The index must **always** be able to read the old version of the database. Make everything with that in mind
|
||||
and if the old and new database are not compatible you may need to match on the version of the database and keep both the old and new code. (The version of the index is stored in the main database under the `VERSION_KEY` key)
|
||||
- [ ] If modifying the task queue or the index-mapper =>
|
||||
- [ ] Update
|
||||
|
||||
### Reminders when modifying the Setting API
|
||||
|
||||
<!--- Special steps to remind when adding a new index setting -->
|
||||
@@ -52,5 +62,5 @@ TBD
|
||||
|
||||
## Impacted teams
|
||||
|
||||
<!---Ping the related teams. Ask on Slack if any hesitation-->
|
||||
<!---@meilisearch/docs-team and @meilisearch/integration-team when there is any API change, e.g. settings addition-->
|
||||
<!---Ping the related teams. Ask for the engine manager if any hesitation-->
|
||||
<!---@meilisearch/docs-team when there is any API change, e.g. settings addition-->
|
||||
1
.github/dependabot.yml
vendored
1
.github/dependabot.yml
vendored
@@ -7,5 +7,6 @@ updates:
|
||||
schedule:
|
||||
interval: "monthly"
|
||||
labels:
|
||||
- 'skip changelog'
|
||||
- 'dependencies'
|
||||
rebase-strategy: disabled
|
||||
|
||||
16
.github/pull_request_template.md
vendored
16
.github/pull_request_template.md
vendored
@@ -1,16 +0,0 @@
|
||||
## Related issue
|
||||
|
||||
Fixes #...
|
||||
|
||||
## Requirements
|
||||
|
||||
⚠️ Ensure the following requirements before merging ⚠️
|
||||
- [ ] Automated tests have been added.
|
||||
- [ ] If some tests cannot be automated, manual rigorous tests should be applied.
|
||||
- [ ] ⚠️ If there is any change in the DB:
|
||||
- [ ] Test that any impacted DB still works as expected after using `--experimental-dumpless-upgrade` on a DB created with the last released Meilisearch
|
||||
- [ ] Test that during the upgrade, **search is still available** (artificially make the upgrade longer if needed)
|
||||
- [ ] Set the `db change` label.
|
||||
- [ ] If necessary, the feature have been tested in the Cloud production environment (with [prototypes](./documentation/prototypes.md)) and the Cloud UI is ready.
|
||||
- [ ] If necessary, the [documentation](https://github.com/meilisearch/documentation) related to the implemented feature in the PR is ready.
|
||||
- [ ] If necessary, the [integrations](https://github.com/meilisearch/integration-guides) related to the implemented feature in the PR are ready.
|
||||
29
.github/release-draft-template.yml
vendored
29
.github/release-draft-template.yml
vendored
@@ -1,29 +0,0 @@
|
||||
name-template: 'v$RESOLVED_VERSION'
|
||||
tag-template: 'v$RESOLVED_VERSION'
|
||||
exclude-labels:
|
||||
- 'skip changelog'
|
||||
version-resolver:
|
||||
minor:
|
||||
labels:
|
||||
- 'enhancement'
|
||||
default: patch
|
||||
categories:
|
||||
- title: '⚠️ Breaking changes'
|
||||
label: 'breaking-change'
|
||||
- title: '🚀 Enhancements'
|
||||
label: 'enhancement'
|
||||
- title: '🐛 Bug Fixes'
|
||||
label: 'bug'
|
||||
- title: '🔒 Security'
|
||||
label: 'security'
|
||||
- title: '⚙️ Maintenance/misc'
|
||||
label:
|
||||
- 'dependencies'
|
||||
- 'maintenance'
|
||||
- 'documentation'
|
||||
template: |
|
||||
$CHANGES
|
||||
|
||||
❤️ Huge thanks to our contributors: $CONTRIBUTORS.
|
||||
no-changes-template: 'Changes are coming soon 😎'
|
||||
sort-direction: 'ascending'
|
||||
22
.github/templates/dependency-issue.md
vendored
22
.github/templates/dependency-issue.md
vendored
@@ -1,22 +0,0 @@
|
||||
This issue is about updating Meilisearch dependencies:
|
||||
- [ ] Update Meilisearch dependencies with the help of `cargo +nightly udeps --all-targets` (remove unused dependencies) and `cargo upgrade` (upgrade dependencies versions) - ⚠️ Some repositories may contain subdirectories (like heed, charabia, or deserr). Take care of updating these in the main crate as well. This won't be done automatically by `cargo upgrade`.
|
||||
- [ ] [deserr](https://github.com/meilisearch/deserr)
|
||||
- [ ] [charabia](https://github.com/meilisearch/charabia/)
|
||||
- [ ] [heed](https://github.com/meilisearch/heed/)
|
||||
- [ ] [roaring-rs](https://github.com/RoaringBitmap/roaring-rs/)
|
||||
- [ ] [obkv](https://github.com/meilisearch/obkv)
|
||||
- [ ] [grenad](https://github.com/meilisearch/grenad/)
|
||||
- [ ] [arroy](https://github.com/meilisearch/arroy/)
|
||||
- [ ] [segment](https://github.com/meilisearch/segment)
|
||||
- [ ] [bumparaw-collections](https://github.com/meilisearch/bumparaw-collections)
|
||||
- [ ] [bbqueue](https://github.com/meilisearch/bbqueue)
|
||||
- [ ] Finally, [Meilisearch](https://github.com/meilisearch/MeiliSearch)
|
||||
- [ ] If new Rust versions have been released, update the minimal Rust version in use at Meilisearch:
|
||||
- [ ] in this [GitHub Action file](https://github.com/meilisearch/meilisearch/blob/main/.github/workflows/test-suite.yml), by changing the `toolchain` field of the `rustfmt` job to the latest available nightly (of the day before or the current day).
|
||||
- [ ] in every [GitHub Action files](https://github.com/meilisearch/meilisearch/blob/main/.github/workflows), by changing all the `dtolnay/rust-toolchain@` references to use the latest stable version.
|
||||
- [ ] in this [`rust-toolchain.toml`](https://github.com/meilisearch/meilisearch/blob/main/rust-toolchain.toml), by changing the `channel` field to the latest stable version.
|
||||
- [ ] in the [Dockerfile](https://github.com/meilisearch/meilisearch/blob/main/Dockerfile), by changing the base image to `rust:<target_rust_version>-alpine<alpine_version>`. Check that the image exists on [Dockerhub](https://hub.docker.com/_/rust/tags?page=1&name=alpine). Also, build and run the image to check everything still works!
|
||||
|
||||
⚠️ This issue should be prioritized to avoid any deprecation and vulnerability issues.
|
||||
|
||||
The GitHub action dependencies are managed by [Dependabot](https://github.com/meilisearch/meilisearch/blob/main/.github/dependabot.yml), so no need to update them when solving this issue.
|
||||
39
.github/workflows/bench-manual.yml
vendored
39
.github/workflows/bench-manual.yml
vendored
@@ -1,27 +1,28 @@
|
||||
name: Bench (manual)
|
||||
|
||||
on:
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
workload:
|
||||
description: "The path to the workloads to execute (workloads/...)"
|
||||
required: true
|
||||
default: "workloads/movies.json"
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
workload:
|
||||
description: 'The path to the workloads to execute (workloads/...)'
|
||||
required: true
|
||||
default: 'workloads/movies.json'
|
||||
|
||||
env:
|
||||
WORKLOAD_NAME: ${{ github.event.inputs.workload }}
|
||||
WORKLOAD_NAME: ${{ github.event.inputs.workload }}
|
||||
|
||||
jobs:
|
||||
benchmarks:
|
||||
name: Run and upload benchmarks
|
||||
runs-on: benchmarks
|
||||
timeout-minutes: 180 # 3h
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
with:
|
||||
profile: minimal
|
||||
benchmarks:
|
||||
name: Run and upload benchmarks
|
||||
runs-on: benchmarks
|
||||
timeout-minutes: 180 # 3h
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
- name: Run benchmarks - workload ${WORKLOAD_NAME} - branch ${{ github.ref }} - commit ${{ github.sha }}
|
||||
run: |
|
||||
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Manual [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- ${WORKLOAD_NAME}
|
||||
|
||||
- name: Run benchmarks - workload ${WORKLOAD_NAME} - branch ${{ github.ref }} - commit ${{ github.sha }}
|
||||
run: |
|
||||
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Manual [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- ${WORKLOAD_NAME}
|
||||
|
||||
136
.github/workflows/bench-pr.yml
vendored
136
.github/workflows/bench-pr.yml
vendored
@@ -1,82 +1,82 @@
|
||||
name: Bench (PR)
|
||||
on:
|
||||
issue_comment:
|
||||
types: [created]
|
||||
issue_comment:
|
||||
types: [created]
|
||||
|
||||
permissions:
|
||||
issues: write
|
||||
issues: write
|
||||
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
|
||||
jobs:
|
||||
run-benchmarks-on-comment:
|
||||
if: startsWith(github.event.comment.body, '/bench')
|
||||
name: Run and upload benchmarks
|
||||
runs-on: benchmarks
|
||||
timeout-minutes: 180 # 3h
|
||||
steps:
|
||||
- name: Check permissions
|
||||
id: permission
|
||||
env:
|
||||
PR_AUTHOR: ${{github.event.issue.user.login }}
|
||||
COMMENT_AUTHOR: ${{github.event.comment.user.login }}
|
||||
REPOSITORY: ${{github.repository}}
|
||||
PR_ID: ${{github.event.issue.number}}
|
||||
run: |
|
||||
PR_REPOSITORY=$(gh api /repos/"$REPOSITORY"/pulls/"$PR_ID" --jq .head.repo.full_name)
|
||||
if $(gh api /repos/"$REPOSITORY"/collaborators/"$PR_AUTHOR"/permission --jq .user.permissions.push)
|
||||
then
|
||||
echo "::notice title=Authentication success::PR author authenticated"
|
||||
else
|
||||
echo "::error title=Authentication error::PR author doesn't have push permission on this repository"
|
||||
exit 1
|
||||
fi
|
||||
if $(gh api /repos/"$REPOSITORY"/collaborators/"$COMMENT_AUTHOR"/permission --jq .user.permissions.push)
|
||||
then
|
||||
echo "::notice title=Authentication success::Comment author authenticated"
|
||||
else
|
||||
echo "::error title=Authentication error::Comment author doesn't have push permission on this repository"
|
||||
exit 1
|
||||
fi
|
||||
if [ "$PR_REPOSITORY" = "$REPOSITORY" ]
|
||||
then
|
||||
echo "::notice title=Authentication success::PR started from main repository"
|
||||
else
|
||||
echo "::error title=Authentication error::PR started from a fork"
|
||||
exit 1
|
||||
fi
|
||||
run-benchmarks-on-comment:
|
||||
if: startsWith(github.event.comment.body, '/bench')
|
||||
name: Run and upload benchmarks
|
||||
runs-on: benchmarks
|
||||
timeout-minutes: 180 # 3h
|
||||
steps:
|
||||
- name: Check permissions
|
||||
id: permission
|
||||
env:
|
||||
PR_AUTHOR: ${{github.event.issue.user.login }}
|
||||
COMMENT_AUTHOR: ${{github.event.comment.user.login }}
|
||||
REPOSITORY: ${{github.repository}}
|
||||
PR_ID: ${{github.event.issue.number}}
|
||||
run: |
|
||||
PR_REPOSITORY=$(gh api /repos/"$REPOSITORY"/pulls/"$PR_ID" --jq .head.repo.full_name)
|
||||
if $(gh api /repos/"$REPOSITORY"/collaborators/"$PR_AUTHOR"/permission --jq .user.permissions.push)
|
||||
then
|
||||
echo "::notice title=Authentication success::PR author authenticated"
|
||||
else
|
||||
echo "::error title=Authentication error::PR author doesn't have push permission on this repository"
|
||||
exit 1
|
||||
fi
|
||||
if $(gh api /repos/"$REPOSITORY"/collaborators/"$COMMENT_AUTHOR"/permission --jq .user.permissions.push)
|
||||
then
|
||||
echo "::notice title=Authentication success::Comment author authenticated"
|
||||
else
|
||||
echo "::error title=Authentication error::Comment author doesn't have push permission on this repository"
|
||||
exit 1
|
||||
fi
|
||||
if [ "$PR_REPOSITORY" = "$REPOSITORY" ]
|
||||
then
|
||||
echo "::notice title=Authentication success::PR started from main repository"
|
||||
else
|
||||
echo "::error title=Authentication error::PR started from a fork"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
- name: Check for Command
|
||||
id: command
|
||||
uses: xt0rted/slash-command-action@v2
|
||||
with:
|
||||
command: bench
|
||||
reaction-type: "rocket"
|
||||
repo-token: ${{ env.GH_TOKEN }}
|
||||
- name: Check for Command
|
||||
id: command
|
||||
uses: xt0rted/slash-command-action@v2
|
||||
with:
|
||||
command: bench
|
||||
reaction-type: "rocket"
|
||||
repo-token: ${{ env.GH_TOKEN }}
|
||||
|
||||
- uses: xt0rted/pull-request-comment-branch@v3
|
||||
id: comment-branch
|
||||
with:
|
||||
repo_token: ${{ env.GH_TOKEN }}
|
||||
- uses: xt0rted/pull-request-comment-branch@v3
|
||||
id: comment-branch
|
||||
with:
|
||||
repo_token: ${{ env.GH_TOKEN }}
|
||||
|
||||
- uses: actions/checkout@v5
|
||||
if: success()
|
||||
with:
|
||||
fetch-depth: 0 # fetch full history to be able to get main commit sha
|
||||
ref: ${{ steps.comment-branch.outputs.head_ref }}
|
||||
- uses: actions/checkout@v3
|
||||
if: success()
|
||||
with:
|
||||
fetch-depth: 0 # fetch full history to be able to get main commit sha
|
||||
ref: ${{ steps.comment-branch.outputs.head_ref }}
|
||||
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
with:
|
||||
profile: minimal
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
- name: Run benchmarks on PR ${{ github.event.issue.id }}
|
||||
run: |
|
||||
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" \
|
||||
--dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" \
|
||||
--reason "[Comment](${{ github.event.comment.html_url }}) on [#${{ github.event.issue.number }}](${{ github.event.issue.html_url }})" \
|
||||
-- ${{ steps.command.outputs.command-arguments }} > benchlinks.txt
|
||||
- name: Run benchmarks on PR ${{ github.event.issue.id }}
|
||||
run: |
|
||||
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" \
|
||||
--dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" \
|
||||
--reason "[Comment](${{ github.event.comment.html_url }}) on [#${{ github.event.issue.number }}](${{ github.event.issue.html_url }})" \
|
||||
-- ${{ steps.command.outputs.command-arguments }} > benchlinks.txt
|
||||
|
||||
- name: Send comment in PR
|
||||
run: |
|
||||
gh pr comment ${{github.event.issue.number}} --body-file benchlinks.txt
|
||||
- name: Send comment in PR
|
||||
run: |
|
||||
gh pr comment ${{github.event.issue.number}} --body-file benchlinks.txt
|
||||
|
||||
33
.github/workflows/bench-push-indexing.yml
vendored
33
.github/workflows/bench-push-indexing.yml
vendored
@@ -1,22 +1,23 @@
|
||||
name: Indexing bench (push)
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
|
||||
jobs:
|
||||
benchmarks:
|
||||
name: Run and upload benchmarks
|
||||
runs-on: benchmarks
|
||||
timeout-minutes: 180 # 3h
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
with:
|
||||
profile: minimal
|
||||
benchmarks:
|
||||
name: Run and upload benchmarks
|
||||
runs-on: benchmarks
|
||||
timeout-minutes: 180 # 3h
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
# Run benchmarks
|
||||
- name: Run benchmarks - Dataset ${BENCH_NAME} - Branch main - Commit ${{ github.sha }}
|
||||
run: |
|
||||
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Push on `main` [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- workloads/*.json
|
||||
|
||||
# Run benchmarks
|
||||
- name: Run benchmarks - Dataset ${BENCH_NAME} - Branch main - Commit ${{ github.sha }}
|
||||
run: |
|
||||
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Push on `main` [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- workloads/*.json
|
||||
|
||||
10
.github/workflows/benchmarks-manual.yml
vendored
10
.github/workflows/benchmarks-manual.yml
vendored
@@ -4,9 +4,9 @@ on:
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
dataset_name:
|
||||
description: "The name of the dataset used to benchmark (search_songs, search_wiki, search_geo or indexing)"
|
||||
description: 'The name of the dataset used to benchmark (search_songs, search_wiki, search_geo or indexing)'
|
||||
required: false
|
||||
default: "search_songs"
|
||||
default: 'search_songs'
|
||||
|
||||
env:
|
||||
BENCH_NAME: ${{ github.event.inputs.dataset_name }}
|
||||
@@ -17,8 +17,8 @@ jobs:
|
||||
runs-on: benchmarks
|
||||
timeout-minutes: 4320 # 72h
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
@@ -67,7 +67,7 @@ jobs:
|
||||
out_dir: critcmp_results
|
||||
|
||||
# Helper
|
||||
- name: "README: compare with another benchmark"
|
||||
- name: 'README: compare with another benchmark'
|
||||
run: |
|
||||
echo "${{ steps.file.outputs.basename }}.json has just been pushed."
|
||||
echo 'How to compare this benchmark with another one?'
|
||||
|
||||
4
.github/workflows/benchmarks-pr.yml
vendored
4
.github/workflows/benchmarks-pr.yml
vendored
@@ -44,7 +44,7 @@ jobs:
|
||||
exit 1
|
||||
fi
|
||||
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
@@ -61,7 +61,7 @@ jobs:
|
||||
with:
|
||||
repo_token: ${{ env.GH_TOKEN }}
|
||||
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
if: success()
|
||||
with:
|
||||
fetch-depth: 0 # fetch full history to be able to get main commit sha
|
||||
|
||||
@@ -15,8 +15,8 @@ jobs:
|
||||
runs-on: benchmarks
|
||||
timeout-minutes: 4320 # 72h
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
@@ -69,7 +69,7 @@ jobs:
|
||||
run: telegraf --config https://eu-central-1-1.aws.cloud2.influxdata.com/api/v2/telegrafs/08b52e34a370b000 --once --debug
|
||||
|
||||
# Helper
|
||||
- name: "README: compare with another benchmark"
|
||||
- name: 'README: compare with another benchmark'
|
||||
run: |
|
||||
echo "${{ steps.file.outputs.basename }}.json has just been pushed."
|
||||
echo 'How to compare this benchmark with another one?'
|
||||
|
||||
@@ -14,8 +14,8 @@ jobs:
|
||||
name: Run and upload benchmarks
|
||||
runs-on: benchmarks
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
@@ -68,7 +68,7 @@ jobs:
|
||||
run: telegraf --config https://eu-central-1-1.aws.cloud2.influxdata.com/api/v2/telegrafs/08b52e34a370b000 --once --debug
|
||||
|
||||
# Helper
|
||||
- name: "README: compare with another benchmark"
|
||||
- name: 'README: compare with another benchmark'
|
||||
run: |
|
||||
echo "${{ steps.file.outputs.basename }}.json has just been pushed."
|
||||
echo 'How to compare this benchmark with another one?'
|
||||
|
||||
@@ -14,8 +14,8 @@ jobs:
|
||||
name: Run and upload benchmarks
|
||||
runs-on: benchmarks
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
@@ -68,7 +68,7 @@ jobs:
|
||||
run: telegraf --config https://eu-central-1-1.aws.cloud2.influxdata.com/api/v2/telegrafs/08b52e34a370b000 --once --debug
|
||||
|
||||
# Helper
|
||||
- name: "README: compare with another benchmark"
|
||||
- name: 'README: compare with another benchmark'
|
||||
run: |
|
||||
echo "${{ steps.file.outputs.basename }}.json has just been pushed."
|
||||
echo 'How to compare this benchmark with another one?'
|
||||
|
||||
@@ -14,8 +14,8 @@ jobs:
|
||||
name: Run and upload benchmarks
|
||||
runs-on: benchmarks
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
@@ -68,7 +68,7 @@ jobs:
|
||||
run: telegraf --config https://eu-central-1-1.aws.cloud2.influxdata.com/api/v2/telegrafs/08b52e34a370b000 --once --debug
|
||||
|
||||
# Helper
|
||||
- name: "README: compare with another benchmark"
|
||||
- name: 'README: compare with another benchmark'
|
||||
run: |
|
||||
echo "${{ steps.file.outputs.basename }}.json has just been pushed."
|
||||
echo 'How to compare this benchmark with another one?'
|
||||
|
||||
57
.github/workflows/db-change-comments.yml
vendored
57
.github/workflows/db-change-comments.yml
vendored
@@ -1,57 +0,0 @@
|
||||
name: Comment when db change labels are added
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
types: [labeled]
|
||||
|
||||
env:
|
||||
MESSAGE: |
|
||||
### Hello, I'm a bot 🤖
|
||||
|
||||
You are receiving this message because you declared that this PR make changes to the Meilisearch database.
|
||||
Depending on the nature of the change, additional actions might be required on your part. The following sections detail the additional actions depending on the nature of the change, please copy the relevant section in the description of your PR, and make sure to perform the required actions.
|
||||
|
||||
Thank you for contributing to Meilisearch :heart:
|
||||
|
||||
## This PR makes forward-compatible changes
|
||||
|
||||
*Forward-compatible changes are changes to the database such that databases created in an older version of Meilisearch are still valid in the new version of Meilisearch. They usually represent additive changes, like adding a new optional attribute or setting.*
|
||||
|
||||
- [ ] Detail the change to the DB format and why they are forward compatible
|
||||
- [ ] Forward-compatibility: A database created before this PR and using the features touched by this PR was able to be opened by a Meilisearch produced by the code of this PR.
|
||||
|
||||
|
||||
## This PR makes breaking changes
|
||||
|
||||
*Breaking changes are changes to the database such that databases created in an older version of Meilisearch need changes to remain valid in the new version of Meilisearch. This typically happens when the way to store the data changed (change of database, new required key, etc). This can also happen due to breaking changes in the API of an experimental feature. ⚠️ This kind of changes are more difficult to achieve safely, so proceed with caution and test dumpless upgrade right before merging the PR.*
|
||||
|
||||
- [ ] Detail the changes to the DB format,
|
||||
- [ ] which are compatible, and why
|
||||
- [ ] which are not compatible, why, and how they will be fixed up in the upgrade
|
||||
- [ ] /!\ Ensure all the read operations still work!
|
||||
- If the change happened in milli, you may need to check the version of the database before doing any read operation
|
||||
- If the change happened in the index-scheduler, make sure the new code can immediately read the old database
|
||||
- If the change happened in the meilisearch-auth database, reach out to the team; we don't know yet how to handle these changes
|
||||
- [ ] Write the code to go from the old database to the new one
|
||||
- If the change happened in milli, the upgrade function should be written and called [here](https://github.com/meilisearch/meilisearch/blob/3fd86e8d76d7d468b0095d679adb09211ca3b6c0/crates/milli/src/update/upgrade/mod.rs#L24-L47)
|
||||
- If the change happened in the index-scheduler, we've never done it yet, but the right place to do it should be [here](https://github.com/meilisearch/meilisearch/blob/3fd86e8d76d7d468b0095d679adb09211ca3b6c0/crates/index-scheduler/src/scheduler/process_upgrade/mod.rs#L13)
|
||||
- [ ] Write an integration test [here](https://github.com/meilisearch/meilisearch/blob/main/crates/meilisearch/tests/upgrade/mod.rs) ensuring you can read the old database, upgrade to the new database, and read the new database as expected
|
||||
|
||||
|
||||
jobs:
|
||||
add-comment:
|
||||
runs-on: ubuntu-latest
|
||||
if: github.event.label.name == 'db change'
|
||||
steps:
|
||||
- name: Add comment
|
||||
uses: actions/github-script@v7
|
||||
with:
|
||||
github-token: ${{ secrets.GITHUB_TOKEN }}
|
||||
script: |
|
||||
const message = process.env.MESSAGE;
|
||||
github.rest.issues.createComment({
|
||||
issue_number: context.issue.number,
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
body: message
|
||||
})
|
||||
28
.github/workflows/db-change-missing.yml
vendored
28
.github/workflows/db-change-missing.yml
vendored
@@ -1,28 +0,0 @@
|
||||
name: Check db change labels
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
types: [opened, synchronize, reopened, labeled, unlabeled]
|
||||
|
||||
jobs:
|
||||
check-labels:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v5
|
||||
- name: Check db change labels
|
||||
id: check_labels
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
run: |
|
||||
URL=/repos/meilisearch/meilisearch/pulls/${{ github.event.pull_request.number }}/labels
|
||||
echo ${{ github.event.pull_request.number }}
|
||||
echo $URL
|
||||
LABELS=$(gh api -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" /repos/${{ github.repository }}/issues/${{ github.event.pull_request.number }}/labels -q .[].name)
|
||||
echo "Labels: $LABELS"
|
||||
if [[ ! "$LABELS" =~ "db change" && ! "$LABELS" =~ "no db change" ]]; then
|
||||
echo "::error::Pull request must contain either the 'db change' or 'no db change' label."
|
||||
exit 1
|
||||
else
|
||||
echo "The label is set"
|
||||
fi
|
||||
4
.github/workflows/dependency-issue.yml
vendored
4
.github/workflows/dependency-issue.yml
vendored
@@ -13,9 +13,9 @@ jobs:
|
||||
ISSUE_TEMPLATE: issue-template.md
|
||||
GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
- name: Download the issue template
|
||||
run: curl -s https://raw.githubusercontent.com/meilisearch/meilisearch/main/.github/templates/dependency-issue.md > $ISSUE_TEMPLATE
|
||||
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/dependency-issue.md > $ISSUE_TEMPLATE
|
||||
- name: Create issue
|
||||
run: |
|
||||
gh issue create \
|
||||
|
||||
38
.github/workflows/flaky-tests.yml
vendored
38
.github/workflows/flaky-tests.yml
vendored
@@ -3,28 +3,28 @@ name: Look for flaky tests
|
||||
on:
|
||||
workflow_dispatch:
|
||||
schedule:
|
||||
- cron: '0 4 * * *' # Every day at 4:00AM
|
||||
- cron: "0 12 * * FRI" # Every Friday at 12:00PM
|
||||
|
||||
jobs:
|
||||
flaky:
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
# Use ubuntu-22.04 to compile with glibc 2.35
|
||||
image: ubuntu:22.04
|
||||
# Use ubuntu-20.04 to compile with glibc 2.28
|
||||
image: ubuntu:20.04
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update && apt-get install -y curl
|
||||
apt-get install build-essential -y
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- name: Install cargo-flaky
|
||||
run: cargo install cargo-flaky
|
||||
- name: Run cargo flaky in the dumps
|
||||
run: cd crates/dump; cargo flaky -i 100 --release
|
||||
- name: Run cargo flaky in the index-scheduler
|
||||
run: cd crates/index-scheduler; cargo flaky -i 100 --release
|
||||
- name: Run cargo flaky in the auth
|
||||
run: cd crates/meilisearch-auth; cargo flaky -i 100 --release
|
||||
- name: Run cargo flaky in meilisearch
|
||||
run: cd crates/meilisearch; cargo flaky -i 100 --release
|
||||
- uses: actions/checkout@v3
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update && apt-get install -y curl
|
||||
apt-get install build-essential -y
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
- name: Install cargo-flaky
|
||||
run: cargo install cargo-flaky
|
||||
- name: Run cargo flaky in the dumps
|
||||
run: cd crates/dump; cargo flaky -i 100 --release
|
||||
- name: Run cargo flaky in the index-scheduler
|
||||
run: cd crates/index-scheduler; cargo flaky -i 100 --release
|
||||
- name: Run cargo flaky in the auth
|
||||
run: cd crates/meilisearch-auth; cargo flaky -i 100 --release
|
||||
- name: Run cargo flaky in meilisearch
|
||||
run: cd crates/meilisearch; cargo flaky -i 100 --release
|
||||
|
||||
4
.github/workflows/fuzzer-indexing.yml
vendored
4
.github/workflows/fuzzer-indexing.yml
vendored
@@ -11,8 +11,8 @@ jobs:
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 4320 # 72h
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
|
||||
|
||||
4
.github/workflows/latest-git-tag.yml
vendored
4
.github/workflows/latest-git-tag.yml
vendored
@@ -10,7 +10,7 @@ jobs:
|
||||
name: Check the version validity
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
- name: Check release validity
|
||||
if: github.event_name == 'release'
|
||||
run: bash .github/scripts/check-release.sh
|
||||
@@ -19,7 +19,7 @@ jobs:
|
||||
runs-on: ubuntu-latest
|
||||
needs: check-version
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
- uses: rickstaa/action-create-tag@v1
|
||||
with:
|
||||
tag: "latest"
|
||||
|
||||
192
.github/workflows/milestone-workflow.yml
vendored
Normal file
192
.github/workflows/milestone-workflow.yml
vendored
Normal file
@@ -0,0 +1,192 @@
|
||||
name: Milestone's workflow
|
||||
|
||||
# /!\ No git flow are handled here
|
||||
|
||||
# For each Milestone created (not opened!), and if the release is NOT a patch release (only the patch changed)
|
||||
# - the roadmap issue is created, see https://github.com/meilisearch/engine-team/blob/main/issue-templates/roadmap-issue.md
|
||||
# - the changelog issue is created, see https://github.com/meilisearch/engine-team/blob/main/issue-templates/changelog-issue.md
|
||||
|
||||
# For each Milestone closed
|
||||
# - the `release_version` label is created
|
||||
# - this label is applied to all issues/PRs in the Milestone
|
||||
|
||||
on:
|
||||
milestone:
|
||||
types: [created, closed]
|
||||
|
||||
env:
|
||||
MILESTONE_VERSION: ${{ github.event.milestone.title }}
|
||||
MILESTONE_URL: ${{ github.event.milestone.html_url }}
|
||||
MILESTONE_DUE_ON: ${{ github.event.milestone.due_on }}
|
||||
GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
|
||||
jobs:
|
||||
|
||||
# -----------------
|
||||
# MILESTONE CREATED
|
||||
# -----------------
|
||||
|
||||
get-release-version:
|
||||
if: github.event.action == 'created'
|
||||
runs-on: ubuntu-latest
|
||||
outputs:
|
||||
is-patch: ${{ steps.check-patch.outputs.is-patch }}
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- name: Check if this release is a patch release only
|
||||
id: check-patch
|
||||
run: |
|
||||
echo version: $MILESTONE_VERSION
|
||||
if [[ $MILESTONE_VERSION =~ ^v[0-9]+\.[0-9]+\.0$ ]]; then
|
||||
echo 'This is NOT a patch release'
|
||||
echo "is-patch=false" >> $GITHUB_OUTPUT
|
||||
elif [[ $MILESTONE_VERSION =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
|
||||
echo 'This is a patch release'
|
||||
echo "is-patch=true" >> $GITHUB_OUTPUT
|
||||
else
|
||||
echo "Not a valid format of release, check the Milestone's title."
|
||||
echo 'Should be vX.Y.Z'
|
||||
exit 1
|
||||
fi
|
||||
|
||||
create-roadmap-issue:
|
||||
needs: get-release-version
|
||||
# Create the roadmap issue if the release is not only a patch release
|
||||
if: github.event.action == 'created' && needs.get-release-version.outputs.is-patch == 'false'
|
||||
runs-on: ubuntu-latest
|
||||
env:
|
||||
ISSUE_TEMPLATE: issue-template.md
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- name: Download the issue template
|
||||
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/roadmap-issue.md > $ISSUE_TEMPLATE
|
||||
- name: Replace all empty occurrences in the templates
|
||||
run: |
|
||||
# Replace all <<version>> occurrences
|
||||
sed -i "s/<<version>>/$MILESTONE_VERSION/g" $ISSUE_TEMPLATE
|
||||
|
||||
# Replace all <<milestone_id>> occurrences
|
||||
milestone_id=$(echo $MILESTONE_URL | cut -d '/' -f 7)
|
||||
sed -i "s/<<milestone_id>>/$milestone_id/g" $ISSUE_TEMPLATE
|
||||
|
||||
# Replace release date if exists
|
||||
if [[ ! -z $MILESTONE_DUE_ON ]]; then
|
||||
date=$(echo $MILESTONE_DUE_ON | cut -d 'T' -f 1)
|
||||
sed -i "s/Release date\: 20XX-XX-XX/Release date\: $date/g" $ISSUE_TEMPLATE
|
||||
fi
|
||||
- name: Create the issue
|
||||
run: |
|
||||
gh issue create \
|
||||
--title "$MILESTONE_VERSION ROADMAP" \
|
||||
--label 'epic,impacts docs,impacts integrations,impacts cloud' \
|
||||
--body-file $ISSUE_TEMPLATE \
|
||||
--milestone $MILESTONE_VERSION
|
||||
|
||||
create-changelog-issue:
|
||||
needs: get-release-version
|
||||
# Create the changelog issue if the release is not only a patch release
|
||||
if: github.event.action == 'created' && needs.get-release-version.outputs.is-patch == 'false'
|
||||
runs-on: ubuntu-latest
|
||||
env:
|
||||
ISSUE_TEMPLATE: issue-template.md
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- name: Download the issue template
|
||||
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/changelog-issue.md > $ISSUE_TEMPLATE
|
||||
- name: Replace all empty occurrences in the templates
|
||||
run: |
|
||||
# Replace all <<version>> occurrences
|
||||
sed -i "s/<<version>>/$MILESTONE_VERSION/g" $ISSUE_TEMPLATE
|
||||
|
||||
# Replace all <<milestone_id>> occurrences
|
||||
milestone_id=$(echo $MILESTONE_URL | cut -d '/' -f 7)
|
||||
sed -i "s/<<milestone_id>>/$milestone_id/g" $ISSUE_TEMPLATE
|
||||
- name: Create the issue
|
||||
run: |
|
||||
gh issue create \
|
||||
--title "Create release changelogs for $MILESTONE_VERSION" \
|
||||
--label 'impacts docs,documentation' \
|
||||
--body-file $ISSUE_TEMPLATE \
|
||||
--milestone $MILESTONE_VERSION \
|
||||
--assignee curquiza
|
||||
|
||||
create-update-version-issue:
|
||||
needs: get-release-version
|
||||
# Create the update-version issue even if the release is a patch release
|
||||
if: github.event.action == 'created'
|
||||
runs-on: ubuntu-latest
|
||||
env:
|
||||
ISSUE_TEMPLATE: issue-template.md
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- name: Download the issue template
|
||||
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/update-version-issue.md > $ISSUE_TEMPLATE
|
||||
- name: Create the issue
|
||||
run: |
|
||||
gh issue create \
|
||||
--title "Update version in Cargo.toml for $MILESTONE_VERSION" \
|
||||
--label 'maintenance' \
|
||||
--body-file $ISSUE_TEMPLATE \
|
||||
--milestone $MILESTONE_VERSION
|
||||
|
||||
create-update-openapi-issue:
|
||||
needs: get-release-version
|
||||
# Create the openAPI issue if the release is not only a patch release
|
||||
if: github.event.action == 'created' && needs.get-release-version.outputs.is-patch == 'false'
|
||||
runs-on: ubuntu-latest
|
||||
env:
|
||||
ISSUE_TEMPLATE: issue-template.md
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- name: Download the issue template
|
||||
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/update-openapi-issue.md > $ISSUE_TEMPLATE
|
||||
- name: Create the issue
|
||||
run: |
|
||||
gh issue create \
|
||||
--title "Update Open API file for $MILESTONE_VERSION" \
|
||||
--label 'maintenance' \
|
||||
--body-file $ISSUE_TEMPLATE \
|
||||
--milestone $MILESTONE_VERSION
|
||||
|
||||
# ----------------
|
||||
# MILESTONE CLOSED
|
||||
# ----------------
|
||||
|
||||
create-release-label:
|
||||
if: github.event.action == 'closed'
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- name: Create the ${{ env.MILESTONE_VERSION }} label
|
||||
run: |
|
||||
label_description="PRs/issues solved in $MILESTONE_VERSION"
|
||||
if [[ ! -z $MILESTONE_DUE_ON ]]; then
|
||||
date=$(echo $MILESTONE_DUE_ON | cut -d 'T' -f 1)
|
||||
label_description="$label_description released on $date"
|
||||
fi
|
||||
|
||||
gh api repos/meilisearch/meilisearch/labels \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-f name="$MILESTONE_VERSION" \
|
||||
-f description="$label_description" \
|
||||
-f color='ff5ba3'
|
||||
|
||||
labelize-all-milestone-content:
|
||||
if: github.event.action == 'closed'
|
||||
needs: create-release-label
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- name: Add label ${{ env.MILESTONE_VERSION }} to all PRs in the Milestone
|
||||
run: |
|
||||
prs=$(gh pr list --search milestone:"$MILESTONE_VERSION" --limit 1000 --state all --json number --template '{{range .}}{{tablerow (printf "%v" .number)}}{{end}}')
|
||||
for pr in $prs; do
|
||||
gh pr edit $pr --add-label $MILESTONE_VERSION
|
||||
done
|
||||
- name: Add label ${{ env.MILESTONE_VERSION }} to all issues in the Milestone
|
||||
run: |
|
||||
issues=$(gh issue list --search milestone:"$MILESTONE_VERSION" --limit 1000 --state all --json number --template '{{range .}}{{tablerow (printf "%v" .number)}}{{end}}')
|
||||
for issue in $issues; do
|
||||
gh issue edit $issue --add-label $MILESTONE_VERSION
|
||||
done
|
||||
44
.github/workflows/publish-apt-brew-pkg.yml
vendored
44
.github/workflows/publish-apt-brew-pkg.yml
vendored
@@ -9,7 +9,7 @@ jobs:
|
||||
name: Check the version validity
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
- name: Check release validity
|
||||
run: bash .github/scripts/check-release.sh
|
||||
|
||||
@@ -18,28 +18,28 @@ jobs:
|
||||
runs-on: ubuntu-latest
|
||||
needs: check-version
|
||||
container:
|
||||
# Use ubuntu-22.04 to compile with glibc 2.35
|
||||
image: ubuntu:22.04
|
||||
# Use ubuntu-20.04 to compile with glibc 2.28
|
||||
image: ubuntu:20.04
|
||||
steps:
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update && apt-get install -y curl
|
||||
apt-get install build-essential -y
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- name: Install cargo-deb
|
||||
run: cargo install cargo-deb
|
||||
- uses: actions/checkout@v5
|
||||
- name: Build deb package
|
||||
run: cargo deb -p meilisearch -o target/debian/meilisearch.deb
|
||||
- name: Upload debian pkg to release
|
||||
uses: svenstaro/upload-release-action@2.11.2
|
||||
with:
|
||||
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
file: target/debian/meilisearch.deb
|
||||
asset_name: meilisearch.deb
|
||||
tag: ${{ github.ref }}
|
||||
- name: Upload debian pkg to apt repository
|
||||
run: curl -F package=@target/debian/meilisearch.deb https://${{ secrets.GEMFURY_PUSH_TOKEN }}@push.fury.io/meilisearch/
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update && apt-get install -y curl
|
||||
apt-get install build-essential -y
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
- name: Install cargo-deb
|
||||
run: cargo install cargo-deb
|
||||
- uses: actions/checkout@v3
|
||||
- name: Build deb package
|
||||
run: cargo deb -p meilisearch -o target/debian/meilisearch.deb
|
||||
- name: Upload debian pkg to release
|
||||
uses: svenstaro/upload-release-action@2.7.0
|
||||
with:
|
||||
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
file: target/debian/meilisearch.deb
|
||||
asset_name: meilisearch.deb
|
||||
tag: ${{ github.ref }}
|
||||
- name: Upload debian pkg to apt repository
|
||||
run: curl -F package=@target/debian/meilisearch.deb https://${{ secrets.GEMFURY_PUSH_TOKEN }}@push.fury.io/meilisearch/
|
||||
|
||||
homebrew:
|
||||
name: Bump Homebrew formula
|
||||
|
||||
@@ -1,9 +1,9 @@
|
||||
name: Publish assets to GitHub release
|
||||
name: Publish binaries to GitHub release
|
||||
|
||||
on:
|
||||
workflow_dispatch:
|
||||
schedule:
|
||||
- cron: "0 2 * * *" # Every day at 2:00am
|
||||
- cron: '0 2 * * *' # Every day at 2:00am
|
||||
release:
|
||||
types: [published]
|
||||
|
||||
@@ -11,9 +11,9 @@ jobs:
|
||||
check-version:
|
||||
name: Check the version validity
|
||||
runs-on: ubuntu-latest
|
||||
# No need to check the version for dry run (cron or workflow_dispatch)
|
||||
# No need to check the version for dry run (cron)
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
# Check if the tag has the v<nmumber>.<number>.<number> format.
|
||||
# If yes, it means we are publishing an official release.
|
||||
# If no, we are releasing a RC, so no need to check the version.
|
||||
@@ -37,26 +37,26 @@ jobs:
|
||||
runs-on: ubuntu-latest
|
||||
needs: check-version
|
||||
container:
|
||||
# Use ubuntu-22.04 to compile with glibc 2.35
|
||||
image: ubuntu:22.04
|
||||
# Use ubuntu-20.04 to compile with glibc 2.28
|
||||
image: ubuntu:20.04
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update && apt-get install -y curl
|
||||
apt-get install build-essential -y
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- name: Build
|
||||
run: cargo build --release --locked
|
||||
# No need to upload binaries for dry run (cron or workflow_dispatch)
|
||||
- name: Upload binaries to release
|
||||
if: github.event_name == 'release'
|
||||
uses: svenstaro/upload-release-action@2.11.2
|
||||
with:
|
||||
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
file: target/release/meilisearch
|
||||
asset_name: meilisearch-linux-amd64
|
||||
tag: ${{ github.ref }}
|
||||
- uses: actions/checkout@v3
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update && apt-get install -y curl
|
||||
apt-get install build-essential -y
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
- name: Build
|
||||
run: cargo build --release --locked
|
||||
# No need to upload binaries for dry run (cron)
|
||||
- name: Upload binaries to release
|
||||
if: github.event_name == 'release'
|
||||
uses: svenstaro/upload-release-action@2.7.0
|
||||
with:
|
||||
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
file: target/release/meilisearch
|
||||
asset_name: meilisearch-linux-amd64
|
||||
tag: ${{ github.ref }}
|
||||
|
||||
publish-macos-windows:
|
||||
name: Publish binary for ${{ matrix.os }}
|
||||
@@ -74,19 +74,19 @@ jobs:
|
||||
artifact_name: meilisearch.exe
|
||||
asset_name: meilisearch-windows-amd64.exe
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- name: Build
|
||||
run: cargo build --release --locked
|
||||
# No need to upload binaries for dry run (cron or workflow_dispatch)
|
||||
- name: Upload binaries to release
|
||||
if: github.event_name == 'release'
|
||||
uses: svenstaro/upload-release-action@2.11.2
|
||||
with:
|
||||
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
file: target/release/${{ matrix.artifact_name }}
|
||||
asset_name: ${{ matrix.asset_name }}
|
||||
tag: ${{ github.ref }}
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
- name: Build
|
||||
run: cargo build --release --locked
|
||||
# No need to upload binaries for dry run (cron)
|
||||
- name: Upload binaries to release
|
||||
if: github.event_name == 'release'
|
||||
uses: svenstaro/upload-release-action@2.7.0
|
||||
with:
|
||||
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
file: target/release/${{ matrix.artifact_name }}
|
||||
asset_name: ${{ matrix.asset_name }}
|
||||
tag: ${{ github.ref }}
|
||||
|
||||
publish-macos-apple-silicon:
|
||||
name: Publish binary for macOS silicon
|
||||
@@ -99,9 +99,9 @@ jobs:
|
||||
asset_name: meilisearch-macos-apple-silicon
|
||||
steps:
|
||||
- name: Checkout repository
|
||||
uses: actions/checkout@v5
|
||||
uses: actions/checkout@v3
|
||||
- name: Installing Rust toolchain
|
||||
uses: dtolnay/rust-toolchain@1.89
|
||||
uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
target: ${{ matrix.target }}
|
||||
@@ -111,9 +111,9 @@ jobs:
|
||||
command: build
|
||||
args: --release --target ${{ matrix.target }}
|
||||
- name: Upload the binary to release
|
||||
# No need to upload binaries for dry run (cron or workflow_dispatch)
|
||||
# No need to upload binaries for dry run (cron)
|
||||
if: github.event_name == 'release'
|
||||
uses: svenstaro/upload-release-action@2.11.2
|
||||
uses: svenstaro/upload-release-action@2.7.0
|
||||
with:
|
||||
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
file: target/${{ matrix.target }}/release/meilisearch
|
||||
@@ -127,8 +127,8 @@ jobs:
|
||||
env:
|
||||
DEBIAN_FRONTEND: noninteractive
|
||||
container:
|
||||
# Use ubuntu-22.04 to compile with glibc 2.35
|
||||
image: ubuntu:22.04
|
||||
# Use ubuntu-20.04 to compile with glibc 2.28
|
||||
image: ubuntu:20.04
|
||||
strategy:
|
||||
matrix:
|
||||
include:
|
||||
@@ -136,7 +136,7 @@ jobs:
|
||||
asset_name: meilisearch-linux-aarch64
|
||||
steps:
|
||||
- name: Checkout repository
|
||||
uses: actions/checkout@v5
|
||||
uses: actions/checkout@v3
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update -y && apt upgrade -y
|
||||
@@ -148,7 +148,7 @@ jobs:
|
||||
add-apt-repository "deb [arch=$(dpkg --print-architecture)] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
|
||||
apt-get update -y && apt-get install -y docker-ce
|
||||
- name: Installing Rust toolchain
|
||||
uses: dtolnay/rust-toolchain@1.89
|
||||
uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
target: ${{ matrix.target }}
|
||||
@@ -176,37 +176,11 @@ jobs:
|
||||
- name: List target output files
|
||||
run: ls -lR ./target
|
||||
- name: Upload the binary to release
|
||||
# No need to upload binaries for dry run (cron or workflow_dispatch)
|
||||
# No need to upload binaries for dry run (cron)
|
||||
if: github.event_name == 'release'
|
||||
uses: svenstaro/upload-release-action@2.11.2
|
||||
uses: svenstaro/upload-release-action@2.7.0
|
||||
with:
|
||||
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
file: target/${{ matrix.target }}/release/meilisearch
|
||||
asset_name: ${{ matrix.asset_name }}
|
||||
tag: ${{ github.ref }}
|
||||
|
||||
publish-openapi-file:
|
||||
name: Publish OpenAPI file
|
||||
needs: check-version
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v5
|
||||
- name: Setup Rust
|
||||
uses: actions-rs/toolchain@v1
|
||||
with:
|
||||
toolchain: stable
|
||||
override: true
|
||||
- name: Generate OpenAPI file
|
||||
run: |
|
||||
cd crates/openapi-generator
|
||||
cargo run --release -- --pretty --output ../../meilisearch.json
|
||||
- name: Upload OpenAPI to Release
|
||||
# No need to upload for dry run (cron or workflow_dispatch)
|
||||
if: github.event_name == 'release'
|
||||
uses: svenstaro/upload-release-action@2.11.2
|
||||
with:
|
||||
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
|
||||
file: ./meilisearch.json
|
||||
asset_name: meilisearch-openapi.json
|
||||
tag: ${{ github.ref }}
|
||||
38
.github/workflows/publish-docker-images.yml
vendored
38
.github/workflows/publish-docker-images.yml
vendored
@@ -16,10 +16,8 @@ on:
|
||||
jobs:
|
||||
docker:
|
||||
runs-on: docker
|
||||
permissions:
|
||||
id-token: write # This is needed to use Cosign in keyless mode
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
|
||||
# If we are running a cron or manual job ('schedule' or 'workflow_dispatch' event), it means we are publishing the `nightly` tag, so not considered stable.
|
||||
# If we have pushed a tag, and the tag has the v<nmumber>.<number>.<number> format, it means we are publishing an official release, so considered stable.
|
||||
@@ -64,9 +62,6 @@ jobs:
|
||||
- name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@v3
|
||||
|
||||
- name: Install cosign
|
||||
uses: sigstore/cosign-installer@d7543c93d881b35a8faa02e8e3605f69b7a1ce62 # tag=v3.10.0
|
||||
|
||||
- name: Login to Docker Hub
|
||||
uses: docker/login-action@v3
|
||||
with:
|
||||
@@ -90,7 +85,6 @@ jobs:
|
||||
|
||||
- name: Build and push
|
||||
uses: docker/build-push-action@v6
|
||||
id: build-and-push
|
||||
with:
|
||||
push: true
|
||||
platforms: linux/amd64,linux/arm64
|
||||
@@ -100,17 +94,6 @@ jobs:
|
||||
COMMIT_DATE=${{ steps.build-metadata.outputs.date }}
|
||||
GIT_TAG=${{ github.ref_name }}
|
||||
|
||||
- name: Sign the images with GitHub OIDC Token
|
||||
env:
|
||||
DIGEST: ${{ steps.build-and-push.outputs.digest }}
|
||||
TAGS: ${{ steps.meta.outputs.tags }}
|
||||
run: |
|
||||
images=""
|
||||
for tag in ${TAGS}; do
|
||||
images+="${tag}@${DIGEST} "
|
||||
done
|
||||
cosign sign --yes ${images}
|
||||
|
||||
# /!\ Don't touch this without checking with Cloud team
|
||||
- name: Send CI information to Cloud team
|
||||
# Do not send if nightly build (i.e. 'schedule' or 'workflow_dispatch' event)
|
||||
@@ -121,22 +104,3 @@ jobs:
|
||||
repository: meilisearch/meilisearch-cloud
|
||||
event-type: cloud-docker-build
|
||||
client-payload: '{ "meilisearch_version": "${{ github.ref_name }}", "stable": "${{ steps.check-tag-format.outputs.stable }}" }'
|
||||
|
||||
# Send notification to Swarmia to notify of a deployment: https://app.swarmia.com
|
||||
# - name: 'Setup jq'
|
||||
# uses: dcarbone/install-jq-action
|
||||
# - name: Send deployment to Swarmia
|
||||
# if: github.event_name == 'push' && success()
|
||||
# run: |
|
||||
# JSON_STRING=$( jq --null-input --compact-output \
|
||||
# --arg version "${{ github.ref_name }}" \
|
||||
# --arg appName "meilisearch" \
|
||||
# --arg environment "production" \
|
||||
# --arg commitSha "${{ github.sha }}" \
|
||||
# --arg repositoryFullName "${{ github.repository }}" \
|
||||
# '{"version": $version, "appName": $appName, "environment": $environment, "commitSha": $commitSha, "repositoryFullName": $repositoryFullName}' )
|
||||
|
||||
# curl -H "Authorization: ${{ secrets.SWARMIA_DEPLOYMENTS_AUTHORIZATION }}" \
|
||||
# -H "Content-Type: application/json" \
|
||||
# -d "$JSON_STRING" \
|
||||
# https://hook.swarmia.com/deployments
|
||||
|
||||
20
.github/workflows/release-drafter.yml
vendored
20
.github/workflows/release-drafter.yml
vendored
@@ -1,20 +0,0 @@
|
||||
name: Release Drafter
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
pull-requests: write
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
|
||||
jobs:
|
||||
update_release_draft:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: release-drafter/release-drafter@v6
|
||||
with:
|
||||
config-name: release-draft-template.yml
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.RELEASE_DRAFTER_TOKEN }}
|
||||
62
.github/workflows/sdks-tests.yml
vendored
62
.github/workflows/sdks-tests.yml
vendored
@@ -9,7 +9,7 @@ on:
|
||||
required: false
|
||||
default: nightly
|
||||
schedule:
|
||||
- cron: '0 6 * * *' # Every day at 6:00am
|
||||
- cron: "0 6 * * MON" # Every Monday at 6:00AM
|
||||
|
||||
env:
|
||||
MEILI_MASTER_KEY: 'masterKey'
|
||||
@@ -22,7 +22,7 @@ jobs:
|
||||
outputs:
|
||||
docker-image: ${{ steps.define-image.outputs.docker-image }}
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
- name: Define the Docker image we need to use
|
||||
id: define-image
|
||||
run: |
|
||||
@@ -46,13 +46,13 @@ jobs:
|
||||
MEILISEARCH_VERSION: ${{ needs.define-docker-image.outputs.docker-image }}
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-dotnet
|
||||
- name: Setup .NET Core
|
||||
uses: actions/setup-dotnet@v5
|
||||
uses: actions/setup-dotnet@v4
|
||||
with:
|
||||
dotnet-version: "8.0.x"
|
||||
dotnet-version: "6.0.x"
|
||||
- name: Install dependencies
|
||||
run: dotnet restore
|
||||
- name: Build
|
||||
@@ -75,7 +75,7 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-dart
|
||||
- uses: dart-lang/setup-dart@v1
|
||||
@@ -100,10 +100,10 @@ jobs:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- name: Set up Go
|
||||
uses: actions/setup-go@v6
|
||||
uses: actions/setup-go@v5
|
||||
with:
|
||||
go-version: stable
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-go
|
||||
- name: Get dependencies
|
||||
@@ -114,7 +114,7 @@ jobs:
|
||||
dep ensure
|
||||
fi
|
||||
- name: Run integration tests
|
||||
run: go test --race -v ./integration
|
||||
run: go test -v ./...
|
||||
|
||||
meilisearch-java-tests:
|
||||
needs: define-docker-image
|
||||
@@ -129,19 +129,19 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-java
|
||||
- name: Set up Java
|
||||
uses: actions/setup-java@v5
|
||||
uses: actions/setup-java@v4
|
||||
with:
|
||||
java-version: 17
|
||||
distribution: 'temurin'
|
||||
java-version: 8
|
||||
distribution: 'zulu'
|
||||
cache: gradle
|
||||
- name: Grant execute permission for gradlew
|
||||
run: chmod +x gradlew
|
||||
- name: Build and run unit and integration tests
|
||||
run: ./gradlew build integrationTest --info
|
||||
run: ./gradlew build integrationTest
|
||||
|
||||
meilisearch-js-tests:
|
||||
needs: define-docker-image
|
||||
@@ -156,11 +156,11 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-js
|
||||
- name: Setup node
|
||||
uses: actions/setup-node@v5
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
cache: 'yarn'
|
||||
- name: Install dependencies
|
||||
@@ -191,7 +191,7 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-php
|
||||
- name: Install PHP
|
||||
@@ -220,11 +220,11 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-python
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@v6
|
||||
uses: actions/setup-python@v5
|
||||
- name: Install pipenv
|
||||
uses: dschep/install-pipenv-action@v1
|
||||
- name: Install dependencies
|
||||
@@ -245,7 +245,7 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-ruby
|
||||
- name: Set up Ruby 3
|
||||
@@ -270,7 +270,7 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-rust
|
||||
- name: Build
|
||||
@@ -291,7 +291,7 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-swift
|
||||
- name: Run tests
|
||||
@@ -314,11 +314,11 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-js-plugins
|
||||
- name: Setup node
|
||||
uses: actions/setup-node@v5
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
cache: yarn
|
||||
- name: Install dependencies
|
||||
@@ -344,23 +344,15 @@ jobs:
|
||||
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
|
||||
ports:
|
||||
- '7700:7700'
|
||||
env:
|
||||
RAILS_VERSION: '7.0'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-rails
|
||||
- name: Install SQLite dependencies
|
||||
run: sudo apt-get update && sudo apt-get install -y libsqlite3-dev
|
||||
- name: Set up Ruby
|
||||
- name: Set up Ruby 3
|
||||
uses: ruby/setup-ruby@v1
|
||||
with:
|
||||
ruby-version: 3
|
||||
bundler-cache: true
|
||||
- name: Start MongoDB
|
||||
uses: supercharge/mongodb-github-action@1.12.0
|
||||
with:
|
||||
mongodb-version: 8.0
|
||||
- name: Run tests
|
||||
run: bundle exec rspec
|
||||
|
||||
@@ -377,7 +369,7 @@ jobs:
|
||||
ports:
|
||||
- '7700:7700'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
repository: meilisearch/meilisearch-symfony
|
||||
- name: Install PHP
|
||||
|
||||
64
.github/workflows/test-suite.yml
vendored
64
.github/workflows/test-suite.yml
vendored
@@ -3,10 +3,14 @@ name: Test suite
|
||||
on:
|
||||
workflow_dispatch:
|
||||
schedule:
|
||||
# Every day at 5:00am
|
||||
# Everyday at 5:00am
|
||||
- cron: "0 5 * * *"
|
||||
pull_request:
|
||||
merge_group:
|
||||
push:
|
||||
# trying and staging branches are for Bors config
|
||||
branches:
|
||||
- trying
|
||||
- staging
|
||||
|
||||
env:
|
||||
CARGO_TERM_COLOR: always
|
||||
@@ -15,21 +19,21 @@ env:
|
||||
|
||||
jobs:
|
||||
test-linux:
|
||||
name: Tests on ubuntu-22.04
|
||||
name: Tests on ubuntu-20.04
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
# Use ubuntu-22.04 to compile with glibc 2.35
|
||||
image: ubuntu:22.04
|
||||
# Use ubuntu-20.04 to compile with glibc 2.28
|
||||
image: ubuntu:20.04
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update && apt-get install -y curl
|
||||
apt-get install build-essential -y
|
||||
- name: Setup test with Rust stable
|
||||
uses: dtolnay/rust-toolchain@1.89
|
||||
uses: dtolnay/rust-toolchain@1.81
|
||||
- name: Cache dependencies
|
||||
uses: Swatinem/rust-cache@v2.8.0
|
||||
uses: Swatinem/rust-cache@v2.7.7
|
||||
- name: Run cargo check without any default features
|
||||
uses: actions-rs/cargo@v1
|
||||
with:
|
||||
@@ -49,10 +53,10 @@ jobs:
|
||||
matrix:
|
||||
os: [macos-13, windows-2022]
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
- name: Cache dependencies
|
||||
uses: Swatinem/rust-cache@v2.8.0
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
uses: Swatinem/rust-cache@v2.7.7
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
- name: Run cargo check without any default features
|
||||
uses: actions-rs/cargo@v1
|
||||
with:
|
||||
@@ -68,16 +72,16 @@ jobs:
|
||||
name: Tests almost all features
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
# Use ubuntu-22.04 to compile with glibc 2.35
|
||||
image: ubuntu:22.04
|
||||
# Use ubuntu-20.04 to compile with glibc 2.28
|
||||
image: ubuntu:20.04
|
||||
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update
|
||||
apt-get install --assume-yes build-essential curl
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
- name: Run cargo build with almost all features
|
||||
run: |
|
||||
cargo build --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda,test-ollama)"
|
||||
@@ -91,7 +95,7 @@ jobs:
|
||||
env:
|
||||
MEILI_TEST_OLLAMA_SERVER: "http://localhost:11434"
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v1
|
||||
- name: Install Ollama
|
||||
run: |
|
||||
curl -fsSL https://ollama.com/install.sh | sudo -E sh
|
||||
@@ -121,15 +125,15 @@ jobs:
|
||||
name: Test disabled tokenization
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
image: ubuntu:22.04
|
||||
image: ubuntu:20.04
|
||||
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update
|
||||
apt-get install --assume-yes build-essential curl
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
- name: Run cargo tree without default features and check lindera is not present
|
||||
run: |
|
||||
if cargo tree -f '{p} {f}' -e normal --no-default-features | grep -qz lindera; then
|
||||
@@ -145,17 +149,17 @@ jobs:
|
||||
name: Run tests in debug
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
# Use ubuntu-22.04 to compile with glibc 2.35
|
||||
image: ubuntu:22.04
|
||||
# Use ubuntu-20.04 to compile with glibc 2.28
|
||||
image: ubuntu:20.04
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: actions/checkout@v3
|
||||
- name: Install needed dependencies
|
||||
run: |
|
||||
apt-get update && apt-get install -y curl
|
||||
apt-get install build-essential -y
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
- name: Cache dependencies
|
||||
uses: Swatinem/rust-cache@v2.8.0
|
||||
uses: Swatinem/rust-cache@v2.7.7
|
||||
- name: Run tests in debug
|
||||
uses: actions-rs/cargo@v1
|
||||
with:
|
||||
@@ -166,13 +170,13 @@ jobs:
|
||||
name: Run Clippy
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
components: clippy
|
||||
- name: Cache dependencies
|
||||
uses: Swatinem/rust-cache@v2.8.0
|
||||
uses: Swatinem/rust-cache@v2.7.7
|
||||
- name: Run cargo clippy
|
||||
uses: actions-rs/cargo@v1
|
||||
with:
|
||||
@@ -183,15 +187,15 @@ jobs:
|
||||
name: Run Rustfmt
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
toolchain: nightly-2024-07-09
|
||||
override: true
|
||||
components: rustfmt
|
||||
- name: Cache dependencies
|
||||
uses: Swatinem/rust-cache@v2.8.0
|
||||
uses: Swatinem/rust-cache@v2.7.7
|
||||
- name: Run cargo fmt
|
||||
# Since we never ran the `build.rs` script in the benchmark directory we are missing one auto-generated import file.
|
||||
# Since we want to trigger (and fail) this action as fast as possible, instead of building the benchmark crate
|
||||
|
||||
@@ -4,7 +4,7 @@ on:
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
new_version:
|
||||
description: "The new version (vX.Y.Z)"
|
||||
description: 'The new version (vX.Y.Z)'
|
||||
required: true
|
||||
|
||||
env:
|
||||
@@ -17,8 +17,8 @@ jobs:
|
||||
name: Update version in Cargo.toml
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v5
|
||||
- uses: dtolnay/rust-toolchain@1.89
|
||||
- uses: actions/checkout@v3
|
||||
- uses: dtolnay/rust-toolchain@1.81
|
||||
with:
|
||||
profile: minimal
|
||||
- name: Install sd
|
||||
@@ -41,4 +41,5 @@ jobs:
|
||||
--title "Update version for the next release ($NEW_VERSION) in Cargo.toml" \
|
||||
--body '⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.' \
|
||||
--label 'skip changelog' \
|
||||
--milestone $NEW_VERSION \
|
||||
--base $GITHUB_REF_NAME
|
||||
|
||||
13
.gitignore
vendored
13
.gitignore
vendored
@@ -5,27 +5,18 @@
|
||||
**/*.json_lines
|
||||
**/*.rs.bk
|
||||
/*.mdb
|
||||
/*.ms
|
||||
/data.ms
|
||||
/snapshots
|
||||
/dumps
|
||||
/bench
|
||||
/_xtask_benchmark.ms
|
||||
/benchmarks
|
||||
.DS_Store
|
||||
|
||||
# Snapshots
|
||||
## ... large
|
||||
*.full.snap
|
||||
## ... unreviewed
|
||||
## ... unreviewed
|
||||
*.snap.new
|
||||
## ... pending
|
||||
*.pending-snap
|
||||
|
||||
# Tmp files
|
||||
.tmp*
|
||||
|
||||
# Database snapshot
|
||||
crates/meilisearch/db.snapshot
|
||||
|
||||
# Fuzzcheck data for the facet indexing fuzz test
|
||||
crates/milli/fuzz/update::facet::incremental::fuzz::fuzz/
|
||||
|
||||
@@ -57,17 +57,9 @@ This command will be triggered to each PR as a requirement for merging it.
|
||||
You can set the `LINDERA_CACHE` environment variable to speed up your successive builds by up to 2 minutes.
|
||||
It'll store some built artifacts in the directory of your choice.
|
||||
|
||||
We recommend using the `$HOME/.cache/meili/lindera` directory:
|
||||
We recommend using the standard `$HOME/.cache/lindera` directory:
|
||||
```sh
|
||||
export LINDERA_CACHE=$HOME/.cache/meili/lindera
|
||||
```
|
||||
|
||||
You can set the `MILLI_BENCH_DATASETS_PATH` environment variable to further speed up your builds.
|
||||
It'll store some big files used for the benchmarks in the directory of your choice.
|
||||
|
||||
We recommend using the `$HOME/.cache/meili/benches` directory:
|
||||
```sh
|
||||
export MILLI_BENCH_DATASETS_PATH=$HOME/.cache/meili/benches
|
||||
export LINDERA_CACHE=$HOME/.cache/lindera
|
||||
```
|
||||
|
||||
Furthermore, you can improve incremental compilation by setting the `MEILI_NO_VERGEN` environment variable.
|
||||
@@ -103,23 +95,6 @@ Meilisearch follows the [cargo xtask](https://github.com/matklad/cargo-xtask) wo
|
||||
|
||||
Run `cargo xtask --help` from the root of the repository to find out what is available.
|
||||
|
||||
#### Update the openAPI file if the API changed
|
||||
|
||||
To update the openAPI file in the code, see [sprint_issue.md](https://github.com/meilisearch/meilisearch/blob/main/.github/ISSUE_TEMPLATE/sprint_issue.md#reminders-when-modifying-the-api).
|
||||
|
||||
If you want to generate OpenAPI file manually:
|
||||
|
||||
With swagger:
|
||||
- Starts Meilisearch with the `swagger` feature flag: `cargo run --features swagger`
|
||||
- On a browser, open the following URL: http://localhost:7700/scalar
|
||||
- Click the « Download openAPI file »
|
||||
|
||||
With the internal crate:
|
||||
```bash
|
||||
cd crates/openapi-generator
|
||||
cargo run --release -- --pretty --output meilisearch.json
|
||||
```
|
||||
|
||||
### Logging
|
||||
|
||||
Meilisearch uses [`tracing`](https://lib.rs/crates/tracing) for logging purposes. Tracing logs are structured and can be displayed as JSON to the end user, so prefer passing arguments as fields rather than interpolating them in the message.
|
||||
@@ -170,39 +145,28 @@ Some notes on GitHub PRs:
|
||||
- The PR title should be accurate and descriptive of the changes.
|
||||
- [Convert your PR as a draft](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/changing-the-stage-of-a-pull-request) if your changes are a work in progress: no one will review it until you pass your PR as ready for review.<br>
|
||||
The draft PRs are recommended when you want to show that you are working on something and make your work visible.
|
||||
- The branch related to the PR must be **up-to-date with `main`** before merging. Fortunately, this project uses [GitHub Merge Queues](https://github.blog/news-insights/product-news/github-merge-queue-is-generally-available/) to automatically enforce this requirement without the PR author having to rebase manually.
|
||||
- The branch related to the PR must be **up-to-date with `main`** before merging. Fortunately, this project uses [Bors](https://github.com/bors-ng/bors-ng) to automatically enforce this requirement without the PR author having to rebase manually.
|
||||
|
||||
## Merging PRs
|
||||
|
||||
This project uses GitHub Merge Queues that helps us manage pull requests merging.
|
||||
|
||||
Before merging a PR, the maintainer should ensure the following requirements are met
|
||||
- Automated tests have been added.
|
||||
- If some tests cannot be automated, manual rigorous tests should be applied.
|
||||
- ⚠️ If there is an change in the DB: it's mandatory to manually test the `--experimental-dumpless-upgrade` on a DB of the previous Meilisearch minor version (e.g. v1.13 for the v1.14 release).
|
||||
- If necessary, the feature have been tested in the Cloud production environment (with [prototypes](./documentation/prototypes.md)) and the Cloud UI is ready.
|
||||
- If necessary, the [documentation](https://github.com/meilisearch/documentation) related to the implemented feature in the PR is ready.
|
||||
- If necessary, the [integrations](https://github.com/meilisearch/integration-guides) related to the implemented feature in the PR are ready.
|
||||
|
||||
## Publish Process (for internal team only)
|
||||
## Release Process (for internal team only)
|
||||
|
||||
Meilisearch tools follow the [Semantic Versioning Convention](https://semver.org/).
|
||||
|
||||
### How to publish a new release
|
||||
### Automation to rebase and Merge the PRs
|
||||
|
||||
The full Meilisearch release process is described in [this guide](./documentation/release.md).
|
||||
This project integrates a bot that helps us manage pull requests merging.<br>
|
||||
_[Read more about this](https://github.com/meilisearch/integration-guides/blob/main/resources/bors.md)._
|
||||
|
||||
### How to Publish a new Release
|
||||
|
||||
The full Meilisearch release process is described in [this guide](https://github.com/meilisearch/engine-team/blob/main/resources/meilisearch-release.md). Please follow it carefully before doing any release.
|
||||
|
||||
### How to publish a prototype
|
||||
|
||||
Depending on the developed feature, you might need to provide a prototyped version of Meilisearch to make it easier to test by the users.
|
||||
|
||||
This happens in two steps:
|
||||
- [Release the prototype](./documentation/prototypes.md#how-to-publish-a-prototype)
|
||||
- [Communicate about it](./documentation/prototypes.md#communication)
|
||||
|
||||
### How to implement and publish an experimental feature
|
||||
|
||||
Here is our [guidelines and process](./documentation/experimental-features.md) to implement and publish an experimental feature.
|
||||
- [Release the prototype](https://github.com/meilisearch/engine-team/blob/main/resources/prototypes.md#how-to-publish-a-prototype)
|
||||
- [Communicate about it](https://github.com/meilisearch/engine-team/blob/main/resources/prototypes.md#communication)
|
||||
|
||||
### Release assets
|
||||
|
||||
|
||||
4127
Cargo.lock
generated
4127
Cargo.lock
generated
File diff suppressed because it is too large
Load Diff
@@ -19,11 +19,10 @@ members = [
|
||||
"crates/tracing-trace",
|
||||
"crates/xtask",
|
||||
"crates/build-info",
|
||||
"crates/openapi-generator",
|
||||
]
|
||||
|
||||
[workspace.package]
|
||||
version = "1.22.1"
|
||||
version = "1.13.0"
|
||||
authors = [
|
||||
"Quentin de Quelen <quentin@dequelen.me>",
|
||||
"Clément Renault <clement@meilisearch.com>",
|
||||
@@ -37,12 +36,6 @@ license = "MIT"
|
||||
[profile.release]
|
||||
codegen-units = 1
|
||||
|
||||
# We now compile heed without the NDEBUG define for better performance.
|
||||
# However, we still enable debug assertions for a better detection of
|
||||
# disk corruption on the cloud or in OSS.
|
||||
[profile.release.package.heed]
|
||||
debug-assertions = true
|
||||
|
||||
[profile.dev.package.flate2]
|
||||
opt-level = 3
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
# Compile
|
||||
FROM rust:1.89-alpine3.20 AS compiler
|
||||
FROM rust:1.81.0-alpine3.20 AS compiler
|
||||
|
||||
RUN apk add -q --no-cache build-base openssl-dev
|
||||
|
||||
|
||||
8
LICENSE
8
LICENSE
@@ -19,11 +19,3 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE.
|
||||
|
||||
---
|
||||
|
||||
🔒 Meilisearch Enterprise Edition (EE)
|
||||
|
||||
Certain parts of this codebase are not licensed under the MIT license and governed by the Business Source License 1.1.
|
||||
|
||||
See the LICENSE-EE file for details.
|
||||
|
||||
67
LICENSE-EE
67
LICENSE-EE
@@ -1,67 +0,0 @@
|
||||
Business Source License 1.1 – Adapted for Meili SAS
|
||||
This license is based on the Business Source License version 1.1, as published by MariaDB Corporation Ab.
|
||||
|
||||
Parameters
|
||||
|
||||
Licensor: Meili SAS
|
||||
|
||||
Licensed Work: Any file explicitly marked as “Enterprise Edition (EE)” or “governed by the Business Source License” residing in enterprise_editions modules/folders.
|
||||
|
||||
Additional Use Grant:
|
||||
You may use, modify, and distribute the Licensed Work for non-production purposes only, such as testing, development, or evaluation.
|
||||
|
||||
Production use of the Licensed Work requires a commercial license agreement with Meilisearch. Contact bonjour@meilisearch.com for licensing.
|
||||
|
||||
Change License: MIT
|
||||
|
||||
Change Date: Four years from the date the Licensed Work is published.
|
||||
|
||||
This License does not apply to any code outside of the Licensed Work, which remains under the MIT license.
|
||||
|
||||
For information about alternative licensing arrangements for the Licensed Work,
|
||||
please contact bonjour@meilisearch.com or sales@meilisearch.com.
|
||||
|
||||
Notice
|
||||
|
||||
Business Source License 1.1
|
||||
|
||||
Terms
|
||||
|
||||
The Licensor hereby grants you the right to copy, modify, create derivative
|
||||
works, redistribute, and make non-production use of the Licensed Work. The
|
||||
Licensor may make an Additional Use Grant, above, permitting limited production use.
|
||||
|
||||
Effective on the Change Date, or the fourth anniversary of the first publicly
|
||||
available distribution of a specific version of the Licensed Work under this
|
||||
License, whichever comes first, the Licensor hereby grants you rights under
|
||||
the terms of the Change License, and the rights granted in the paragraph
|
||||
above terminate.
|
||||
|
||||
If your use of the Licensed Work does not comply with the requirements
|
||||
currently in effect as described in this License, you must purchase a
|
||||
commercial license from the Licensor, its affiliated entities, or authorized
|
||||
resellers, or you must refrain from using the Licensed Work.
|
||||
|
||||
All copies of the original and modified Licensed Work, and derivative works
|
||||
of the Licensed Work, are subject to this License. This License applies
|
||||
separately for each version of the Licensed Work and the Change Date may vary
|
||||
for each version of the Licensed Work released by Licensor.
|
||||
|
||||
You must conspicuously display this License on each original or modified copy
|
||||
of the Licensed Work. If you receive the Licensed Work in original or
|
||||
modified form from a third party, the terms and conditions set forth in this
|
||||
License apply to your use of that work.
|
||||
|
||||
Any use of the Licensed Work in violation of this License will automatically
|
||||
terminate your rights under this License for the current and all other
|
||||
versions of the Licensed Work.
|
||||
|
||||
This License does not grant you any right in any trademark or logo of
|
||||
Licensor or its affiliates (provided that you may use a trademark or logo of
|
||||
Licensor as expressly required by this License).
|
||||
|
||||
TO THE EXTENT PERMITTED BY APPLICABLE LAW, THE LICENSED WORK IS PROVIDED ON
|
||||
AN "AS IS" BASIS. LICENSOR HEREBY DISCLAIMS ALL WARRANTIES AND CONDITIONS,
|
||||
EXPRESS OR IMPLIED, INCLUDING (WITHOUT LIMITATION) WARRANTIES OF
|
||||
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, AND
|
||||
TITLE.
|
||||
30
README.md
30
README.md
@@ -20,7 +20,7 @@
|
||||
<p align="center">
|
||||
<a href="https://deps.rs/repo/github/meilisearch/meilisearch"><img src="https://deps.rs/repo/github/meilisearch/meilisearch/status.svg" alt="Dependency status"></a>
|
||||
<a href="https://github.com/meilisearch/meilisearch/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-informational" alt="License"></a>
|
||||
<a href="https://github.com/meilisearch/meilisearch/queue"><img alt="Merge Queues enabled" src="https://img.shields.io/badge/Merge_Queues-enabled-%2357cf60?logo=github"></a>
|
||||
<a href="https://ms-bors.herokuapp.com/repositories/52"><img src="https://bors.tech/images/badge_small.svg" alt="Bors enabled"></a>
|
||||
</p>
|
||||
|
||||
<p align="center">⚡ A lightning-fast search engine that fits effortlessly into your apps, websites, and workflow 🔍</p>
|
||||
@@ -41,7 +41,7 @@
|
||||
- [**Movies**](https://where2watch.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=organization) — An application to help you find streaming platforms to watch movies using [hybrid search](https://www.meilisearch.com/solutions/hybrid-search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos).
|
||||
- [**Ecommerce**](https://ecommerce.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Ecommerce website using disjunctive [facets](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos), range and rating filtering, and pagination.
|
||||
- [**Songs**](https://music.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search through 47 million of songs.
|
||||
- [**SaaS**](https://saas.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search for contacts, deals, and companies in this [multi-tenant](https://www.meilisearch.com/docs/learn/security/multitenancy_tenant_tokens?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) CRM application.
|
||||
- [**SaaS**](https://saas.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search for contacts, deals, and companies in this [multi-tenant](https://www.meilisearch.com/docs/learn/security/multitenancy_tenant_tokens?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) CRM application.
|
||||
|
||||
See the list of all our example apps in our [demos repository](https://github.com/meilisearch/demos).
|
||||
|
||||
@@ -89,26 +89,6 @@ We also offer a wide range of dedicated guides to all Meilisearch features, such
|
||||
|
||||
Finally, for more in-depth information, refer to our articles explaining fundamental Meilisearch concepts such as [documents](https://www.meilisearch.com/docs/learn/core_concepts/documents?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=advanced) and [indexes](https://www.meilisearch.com/docs/learn/core_concepts/indexes?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=advanced).
|
||||
|
||||
## 🧾 Editions & Licensing
|
||||
|
||||
Meilisearch is available in two editions:
|
||||
|
||||
### 🧪 Community Edition (CE)
|
||||
|
||||
- Fully open source under the [MIT license](./LICENSE)
|
||||
- Core search engine with fast and relevant full-text, semantic or hybrid search
|
||||
- Free to use for anyone, including commercial usage
|
||||
|
||||
### 🏢 Enterprise Edition (EE)
|
||||
|
||||
- Includes advanced features such as:
|
||||
- Sharding
|
||||
- Governed by a [commercial license](./LICENSE-EE) or the [Business Source License 1.1](https://mariadb.com/bsl11)
|
||||
- Not allowed in production without a commercial agreement with Meilisearch.
|
||||
- You may use, modify, and distribute the Licensed Work for non-production purposes only, such as testing, development, or evaluation.
|
||||
|
||||
Want access to Enterprise features? → Contact us at [sales@meilisearch.com](maito:sales@meilisearch.com).
|
||||
|
||||
## 📊 Telemetry
|
||||
|
||||
Meilisearch collects **anonymized** user data to help us improve our product. You can [deactivate this](https://www.meilisearch.com/docs/learn/what_is_meilisearch/telemetry?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=telemetry#how-to-disable-data-collection) whenever you want.
|
||||
@@ -119,9 +99,9 @@ If you want to know more about the kind of data we collect and what we use it fo
|
||||
|
||||
## 📫 Get in touch!
|
||||
|
||||
Meilisearch is a search engine created by [Meili](https://www.meilisearch.com/careers), a software development company headquartered in France and with team members all over the world. Want to know more about us? [Check out our blog!](https://blog.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=contact)
|
||||
Meilisearch is a search engine created by [Meili]([https://www.welcometothejungle.com/en/companies/meilisearch](https://www.meilisearch.com/careers)), a software development company headquartered in France and with team members all over the world. Want to know more about us? [Check out our blog!](https://blog.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=contact)
|
||||
|
||||
🗞 [Subscribe to our newsletter](https://share-eu1.hsforms.com/1LN5N0x_GQgq7ss7tXmSykwfg3aq) if you don't want to miss any updates! We promise we won't clutter your mailbox: we only send one edition every two months.
|
||||
🗞 [Subscribe to our newsletter](https://meilisearch.us2.list-manage.com/subscribe?u=27870f7b71c908a8b359599fb&id=79582d828e) if you don't want to miss any updates! We promise we won't clutter your mailbox: we only send one edition every two months.
|
||||
|
||||
💌 Want to make a suggestion or give feedback? Here are some of the channels where you can reach us:
|
||||
|
||||
@@ -139,6 +119,6 @@ Meilisearch is, and will always be, open-source! If you want to contribute to th
|
||||
|
||||
Meilisearch releases and their associated binaries are available on the project's [releases page](https://github.com/meilisearch/meilisearch/releases).
|
||||
|
||||
The binaries are versioned following [SemVer conventions](https://semver.org/). To know more, read our [versioning policy](./documentation/versioning-policy.md).
|
||||
The binaries are versioned following [SemVer conventions](https://semver.org/). To know more, read our [versioning policy](https://github.com/meilisearch/engine-team/blob/main/resources/versioning-policy.md).
|
||||
|
||||
Differently from the binaries, crates in this repository are not currently available on [crates.io](https://crates.io/) and do not follow [SemVer conventions](https://semver.org).
|
||||
|
||||
@@ -1501,300 +1501,6 @@
|
||||
"title": "Task queue latency",
|
||||
"type": "timeseries"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"uid": "${DS_PROMETHEUS}"
|
||||
},
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"color": {
|
||||
"mode": "palette-classic"
|
||||
},
|
||||
"custom": {
|
||||
"axisBorderShow": false,
|
||||
"axisCenteredZero": false,
|
||||
"axisColorMode": "text",
|
||||
"axisLabel": "",
|
||||
"axisPlacement": "auto",
|
||||
"barAlignment": 0,
|
||||
"drawStyle": "line",
|
||||
"fillOpacity": 15,
|
||||
"gradientMode": "none",
|
||||
"hideFrom": {
|
||||
"legend": false,
|
||||
"tooltip": false,
|
||||
"viz": false
|
||||
},
|
||||
"insertNulls": false,
|
||||
"lineInterpolation": "linear",
|
||||
"lineWidth": 1,
|
||||
"pointSize": 5,
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"showPoints": "never",
|
||||
"spanNulls": false,
|
||||
"stacking": {
|
||||
"group": "A",
|
||||
"mode": "none"
|
||||
},
|
||||
"thresholdsStyle": {
|
||||
"mode": "off"
|
||||
}
|
||||
},
|
||||
"mappings": [],
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "green"
|
||||
},
|
||||
{
|
||||
"color": "red",
|
||||
"value": 80
|
||||
}
|
||||
]
|
||||
},
|
||||
"unit": "none"
|
||||
},
|
||||
"overrides": []
|
||||
},
|
||||
"gridPos": {
|
||||
"h": 11,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 51
|
||||
},
|
||||
"id": 29,
|
||||
"interval": "5s",
|
||||
"options": {
|
||||
"legend": {
|
||||
"calcs": [],
|
||||
"displayMode": "list",
|
||||
"placement": "right",
|
||||
"showLegend": true
|
||||
},
|
||||
"tooltip": {
|
||||
"mode": "single",
|
||||
"sort": "none"
|
||||
}
|
||||
},
|
||||
"pluginVersion": "8.1.4",
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"uid": "${DS_PROMETHEUS}"
|
||||
},
|
||||
"editorMode": "builder",
|
||||
"exemplar": true,
|
||||
"expr": "meilisearch_task_queue_used_size{instance=\"$instance\", job=\"$job\"}",
|
||||
"interval": "",
|
||||
"legendFormat": "{{value}} ",
|
||||
"range": true,
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"title": "Task queue used size in bytes",
|
||||
"type": "timeseries"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"uid": "${DS_PROMETHEUS}"
|
||||
},
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"color": {
|
||||
"mode": "palette-classic"
|
||||
},
|
||||
"custom": {
|
||||
"axisBorderShow": false,
|
||||
"axisCenteredZero": false,
|
||||
"axisColorMode": "text",
|
||||
"axisLabel": "",
|
||||
"axisPlacement": "auto",
|
||||
"barAlignment": 0,
|
||||
"drawStyle": "line",
|
||||
"fillOpacity": 15,
|
||||
"gradientMode": "none",
|
||||
"hideFrom": {
|
||||
"legend": false,
|
||||
"tooltip": false,
|
||||
"viz": false
|
||||
},
|
||||
"insertNulls": false,
|
||||
"lineInterpolation": "linear",
|
||||
"lineWidth": 1,
|
||||
"pointSize": 5,
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"showPoints": "never",
|
||||
"spanNulls": false,
|
||||
"stacking": {
|
||||
"group": "A",
|
||||
"mode": "none"
|
||||
},
|
||||
"thresholdsStyle": {
|
||||
"mode": "off"
|
||||
}
|
||||
},
|
||||
"mappings": [],
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "green"
|
||||
},
|
||||
{
|
||||
"color": "red",
|
||||
"value": 80
|
||||
}
|
||||
]
|
||||
},
|
||||
"unit": "none"
|
||||
},
|
||||
"overrides": []
|
||||
},
|
||||
"gridPos": {
|
||||
"h": 11,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 51
|
||||
},
|
||||
"id": 29,
|
||||
"interval": "5s",
|
||||
"options": {
|
||||
"legend": {
|
||||
"calcs": [],
|
||||
"displayMode": "list",
|
||||
"placement": "right",
|
||||
"showLegend": true
|
||||
},
|
||||
"tooltip": {
|
||||
"mode": "single",
|
||||
"sort": "none"
|
||||
}
|
||||
},
|
||||
"pluginVersion": "8.1.4",
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"uid": "${DS_PROMETHEUS}"
|
||||
},
|
||||
"editorMode": "builder",
|
||||
"exemplar": true,
|
||||
"expr": "meilisearch_task_queue_size_until_stop_registering{instance=\"$instance\", job=\"$job\"}",
|
||||
"interval": "",
|
||||
"legendFormat": "{{value}} ",
|
||||
"range": true,
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"title": "Task queue available size until it stop receiving tasks.",
|
||||
"type": "timeseries"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"uid": "${DS_PROMETHEUS}"
|
||||
},
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"color": {
|
||||
"mode": "palette-classic"
|
||||
},
|
||||
"custom": {
|
||||
"axisBorderShow": false,
|
||||
"axisCenteredZero": false,
|
||||
"axisColorMode": "text",
|
||||
"axisLabel": "",
|
||||
"axisPlacement": "auto",
|
||||
"barAlignment": 0,
|
||||
"drawStyle": "line",
|
||||
"fillOpacity": 15,
|
||||
"gradientMode": "none",
|
||||
"hideFrom": {
|
||||
"legend": false,
|
||||
"tooltip": false,
|
||||
"viz": false
|
||||
},
|
||||
"insertNulls": false,
|
||||
"lineInterpolation": "linear",
|
||||
"lineWidth": 1,
|
||||
"pointSize": 5,
|
||||
"scaleDistribution": {
|
||||
"type": "linear"
|
||||
},
|
||||
"showPoints": "never",
|
||||
"spanNulls": false,
|
||||
"stacking": {
|
||||
"group": "A",
|
||||
"mode": "none"
|
||||
},
|
||||
"thresholdsStyle": {
|
||||
"mode": "off"
|
||||
}
|
||||
},
|
||||
"mappings": [],
|
||||
"thresholds": {
|
||||
"mode": "absolute",
|
||||
"steps": [
|
||||
{
|
||||
"color": "green"
|
||||
},
|
||||
{
|
||||
"color": "red",
|
||||
"value": 80
|
||||
}
|
||||
]
|
||||
},
|
||||
"unit": "none"
|
||||
},
|
||||
"overrides": []
|
||||
},
|
||||
"gridPos": {
|
||||
"h": 11,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 51
|
||||
},
|
||||
"id": 29,
|
||||
"interval": "5s",
|
||||
"options": {
|
||||
"legend": {
|
||||
"calcs": [],
|
||||
"displayMode": "list",
|
||||
"placement": "right",
|
||||
"showLegend": true
|
||||
},
|
||||
"tooltip": {
|
||||
"mode": "single",
|
||||
"sort": "none"
|
||||
}
|
||||
},
|
||||
"pluginVersion": "8.1.4",
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"uid": "${DS_PROMETHEUS}"
|
||||
},
|
||||
"editorMode": "builder",
|
||||
"exemplar": true,
|
||||
"expr": "meilisearch_task_queue_max_size{instance=\"$instance\", job=\"$job\"}",
|
||||
"interval": "",
|
||||
"legendFormat": "{{value}} ",
|
||||
"range": true,
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"title": "Task queue maximum possible size",
|
||||
"type": "stat"
|
||||
},
|
||||
{
|
||||
"collapsed": true,
|
||||
"datasource": {
|
||||
|
||||
Binary file not shown.
|
Before Width: | Height: | Size: 578 KiB |
11
bors.toml
Normal file
11
bors.toml
Normal file
@@ -0,0 +1,11 @@
|
||||
status = [
|
||||
'Tests on ubuntu-20.04',
|
||||
'Tests on macos-13',
|
||||
'Tests on windows-2022',
|
||||
'Run Clippy',
|
||||
'Run Rustfmt',
|
||||
'Run tests in debug',
|
||||
]
|
||||
pr_status = ['Milestone Check']
|
||||
# 3 hours timeout
|
||||
timeout-sec = 10800
|
||||
@@ -11,27 +11,27 @@ edition.workspace = true
|
||||
license.workspace = true
|
||||
|
||||
[dependencies]
|
||||
anyhow = "1.0.98"
|
||||
bumpalo = "3.18.1"
|
||||
anyhow = "1.0.95"
|
||||
bumpalo = "3.16.0"
|
||||
csv = "1.3.1"
|
||||
memmap2 = "0.9.7"
|
||||
memmap2 = "0.9.5"
|
||||
milli = { path = "../milli" }
|
||||
mimalloc = { version = "0.1.47", default-features = false }
|
||||
serde_json = { version = "1.0.140", features = ["preserve_order"] }
|
||||
tempfile = "3.20.0"
|
||||
mimalloc = { version = "0.1.43", default-features = false }
|
||||
serde_json = { version = "1.0.135", features = ["preserve_order"] }
|
||||
tempfile = "3.15.0"
|
||||
|
||||
[dev-dependencies]
|
||||
criterion = { version = "0.6.0", features = ["html_reports"] }
|
||||
criterion = { version = "0.5.1", features = ["html_reports"] }
|
||||
rand = "0.8.5"
|
||||
rand_chacha = "0.3.1"
|
||||
roaring = "0.10.12"
|
||||
roaring = "0.10.10"
|
||||
|
||||
[build-dependencies]
|
||||
anyhow = "1.0.98"
|
||||
bytes = "1.10.1"
|
||||
convert_case = "0.8.0"
|
||||
flate2 = "1.1.2"
|
||||
reqwest = { version = "0.12.20", features = ["blocking", "rustls-tls"], default-features = false }
|
||||
anyhow = "1.0.95"
|
||||
bytes = "1.9.0"
|
||||
convert_case = "0.6.0"
|
||||
flate2 = "1.0.35"
|
||||
reqwest = { version = "0.12.12", features = ["blocking", "rustls-tls"], default-features = false }
|
||||
|
||||
[features]
|
||||
default = ["milli/all-tokenizations"]
|
||||
@@ -51,11 +51,3 @@ harness = false
|
||||
[[bench]]
|
||||
name = "indexing"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "sort"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "filter_starts_with"
|
||||
harness = false
|
||||
|
||||
@@ -1,66 +0,0 @@
|
||||
mod datasets_paths;
|
||||
mod utils;
|
||||
|
||||
use criterion::{criterion_group, criterion_main};
|
||||
use milli::update::Settings;
|
||||
use milli::FilterableAttributesRule;
|
||||
use utils::Conf;
|
||||
|
||||
#[cfg(not(windows))]
|
||||
#[global_allocator]
|
||||
static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
|
||||
|
||||
fn base_conf(builder: &mut Settings) {
|
||||
let displayed_fields = ["geonameid", "name"].iter().map(|s| s.to_string()).collect();
|
||||
builder.set_displayed_fields(displayed_fields);
|
||||
|
||||
let filterable_fields =
|
||||
["name"].iter().map(|s| FilterableAttributesRule::Field(s.to_string())).collect();
|
||||
builder.set_filterable_fields(filterable_fields);
|
||||
}
|
||||
|
||||
#[rustfmt::skip]
|
||||
const BASE_CONF: Conf = Conf {
|
||||
dataset: datasets_paths::SMOL_ALL_COUNTRIES,
|
||||
dataset_format: "jsonl",
|
||||
queries: &[
|
||||
"",
|
||||
],
|
||||
configure: base_conf,
|
||||
primary_key: Some("geonameid"),
|
||||
..Conf::BASE
|
||||
};
|
||||
|
||||
fn filter_starts_with(c: &mut criterion::Criterion) {
|
||||
#[rustfmt::skip]
|
||||
let confs = &[
|
||||
utils::Conf {
|
||||
group_name: "1 letter",
|
||||
filter: Some("name STARTS WITH e"),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "2 letters",
|
||||
filter: Some("name STARTS WITH es"),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "3 letters",
|
||||
filter: Some("name STARTS WITH est"),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "6 letters",
|
||||
filter: Some("name STARTS WITH estoni"),
|
||||
..BASE_CONF
|
||||
}
|
||||
];
|
||||
|
||||
utils::run_benches(c, confs);
|
||||
}
|
||||
|
||||
criterion_group!(benches, filter_starts_with);
|
||||
criterion_main!(benches);
|
||||
@@ -11,8 +11,8 @@ use milli::heed::{EnvOpenOptions, RwTxn};
|
||||
use milli::progress::Progress;
|
||||
use milli::update::new::indexer;
|
||||
use milli::update::{IndexerConfig, Settings};
|
||||
use milli::vector::RuntimeEmbedders;
|
||||
use milli::{FilterableAttributesRule, Index};
|
||||
use milli::vector::EmbeddingConfigs;
|
||||
use milli::Index;
|
||||
use rand::seq::SliceRandom;
|
||||
use rand_chacha::rand_core::SeedableRng;
|
||||
use roaring::RoaringBitmap;
|
||||
@@ -35,8 +35,7 @@ fn setup_dir(path: impl AsRef<Path>) {
|
||||
fn setup_index() -> Index {
|
||||
let path = "benches.mmdb";
|
||||
setup_dir(path);
|
||||
let options = EnvOpenOptions::new();
|
||||
let mut options = options.read_txn_without_tls();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(100 * 1024 * 1024 * 1024); // 100 GB
|
||||
options.max_readers(100);
|
||||
Index::new(options, path, true).unwrap()
|
||||
@@ -58,14 +57,13 @@ fn setup_settings<'t>(
|
||||
let searchable_fields = searchable_fields.iter().map(|s| s.to_string()).collect();
|
||||
builder.set_searchable_fields(searchable_fields);
|
||||
|
||||
let filterable_fields =
|
||||
filterable_fields.iter().map(|s| FilterableAttributesRule::Field(s.to_string())).collect();
|
||||
let filterable_fields = filterable_fields.iter().map(|s| s.to_string()).collect();
|
||||
builder.set_filterable_fields(filterable_fields);
|
||||
|
||||
let sortable_fields = sortable_fields.iter().map(|s| s.to_string()).collect();
|
||||
builder.set_sortable_fields(sortable_fields);
|
||||
|
||||
builder.execute(&|| false, &Progress::default(), Default::default()).unwrap();
|
||||
builder.execute(|_| (), || false).unwrap();
|
||||
}
|
||||
|
||||
fn setup_index_with_settings(
|
||||
@@ -154,7 +152,6 @@ fn indexing_songs_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -167,10 +164,9 @@ fn indexing_songs_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -222,7 +218,6 @@ fn reindexing_songs_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -235,10 +230,9 @@ fn reindexing_songs_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -268,7 +262,6 @@ fn reindexing_songs_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -281,10 +274,9 @@ fn reindexing_songs_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -338,7 +330,6 @@ fn deleting_songs_in_batches_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -351,10 +342,9 @@ fn deleting_songs_in_batches_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -416,7 +406,6 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -429,10 +418,9 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -462,7 +450,6 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -475,10 +462,9 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -504,7 +490,6 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -517,10 +502,9 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -573,7 +557,6 @@ fn indexing_songs_without_faceted_numbers(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -586,10 +569,9 @@ fn indexing_songs_without_faceted_numbers(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -641,7 +623,6 @@ fn indexing_songs_without_faceted_fields(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -654,10 +635,9 @@ fn indexing_songs_without_faceted_fields(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -709,7 +689,6 @@ fn indexing_wiki(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -722,10 +701,9 @@ fn indexing_wiki(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -776,7 +754,6 @@ fn reindexing_wiki(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -789,10 +766,9 @@ fn reindexing_wiki(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -822,7 +798,6 @@ fn reindexing_wiki(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -835,10 +810,9 @@ fn reindexing_wiki(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -891,7 +865,6 @@ fn deleting_wiki_in_batches_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -904,10 +877,9 @@ fn deleting_wiki_in_batches_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -969,7 +941,6 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -982,10 +953,9 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1016,7 +986,6 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1029,10 +998,9 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1059,7 +1027,6 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1072,10 +1039,9 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1127,7 +1093,6 @@ fn indexing_movies_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1140,10 +1105,9 @@ fn indexing_movies_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1194,7 +1158,6 @@ fn reindexing_movies_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1207,10 +1170,9 @@ fn reindexing_movies_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1240,7 +1202,6 @@ fn reindexing_movies_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1253,10 +1214,9 @@ fn reindexing_movies_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1309,7 +1269,6 @@ fn deleting_movies_in_batches_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1322,10 +1281,9 @@ fn deleting_movies_in_batches_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1371,10 +1329,9 @@ fn delete_documents_from_ids(index: Index, document_ids_to_delete: Vec<RoaringBi
|
||||
new_fields_ids_map,
|
||||
Some(primary_key),
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1424,7 +1381,6 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1437,10 +1393,9 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1470,7 +1425,6 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1483,10 +1437,9 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1512,7 +1465,6 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1525,10 +1477,9 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1603,7 +1554,6 @@ fn indexing_nested_movies_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1616,10 +1566,9 @@ fn indexing_nested_movies_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1695,7 +1644,6 @@ fn deleting_nested_movies_in_batches_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1708,10 +1656,9 @@ fn deleting_nested_movies_in_batches_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1779,7 +1726,6 @@ fn indexing_nested_movies_without_faceted_fields(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1792,10 +1738,9 @@ fn indexing_nested_movies_without_faceted_fields(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1847,7 +1792,6 @@ fn indexing_geo(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1860,10 +1804,9 @@ fn indexing_geo(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1914,7 +1857,6 @@ fn reindexing_geo(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1927,10 +1869,9 @@ fn reindexing_geo(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1960,7 +1901,6 @@ fn reindexing_geo(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -1973,10 +1913,9 @@ fn reindexing_geo(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -2029,7 +1968,6 @@ fn deleting_geo_in_batches_default(c: &mut Criterion) {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -2042,10 +1980,9 @@ fn deleting_geo_in_batches_default(c: &mut Criterion) {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
|
||||
@@ -3,7 +3,6 @@ mod utils;
|
||||
|
||||
use criterion::{criterion_group, criterion_main};
|
||||
use milli::update::Settings;
|
||||
use milli::FilterableAttributesRule;
|
||||
use utils::Conf;
|
||||
|
||||
#[cfg(not(windows))]
|
||||
@@ -22,10 +21,8 @@ fn base_conf(builder: &mut Settings) {
|
||||
["name", "alternatenames", "elevation"].iter().map(|s| s.to_string()).collect();
|
||||
builder.set_searchable_fields(searchable_fields);
|
||||
|
||||
let filterable_fields = ["_geo", "population", "elevation"]
|
||||
.iter()
|
||||
.map(|s| FilterableAttributesRule::Field(s.to_string()))
|
||||
.collect();
|
||||
let filterable_fields =
|
||||
["_geo", "population", "elevation"].iter().map(|s| s.to_string()).collect();
|
||||
builder.set_filterable_fields(filterable_fields);
|
||||
|
||||
let sortable_fields =
|
||||
|
||||
@@ -3,7 +3,6 @@ mod utils;
|
||||
|
||||
use criterion::{criterion_group, criterion_main};
|
||||
use milli::update::Settings;
|
||||
use milli::FilterableAttributesRule;
|
||||
use utils::Conf;
|
||||
|
||||
#[cfg(not(windows))]
|
||||
@@ -23,7 +22,7 @@ fn base_conf(builder: &mut Settings) {
|
||||
|
||||
let faceted_fields = ["released-timestamp", "duration-float", "genre", "country", "artist"]
|
||||
.iter()
|
||||
.map(|s| FilterableAttributesRule::Field(s.to_string()))
|
||||
.map(|s| s.to_string())
|
||||
.collect();
|
||||
builder.set_filterable_fields(faceted_fields);
|
||||
}
|
||||
|
||||
@@ -1,114 +0,0 @@
|
||||
//! This benchmark module is used to compare the performance of sorting documents in /search VS /documents
|
||||
//!
|
||||
//! The tests/benchmarks were designed in the context of a query returning only 20 documents.
|
||||
|
||||
mod datasets_paths;
|
||||
mod utils;
|
||||
|
||||
use criterion::{criterion_group, criterion_main};
|
||||
use milli::update::Settings;
|
||||
use utils::Conf;
|
||||
|
||||
#[cfg(not(windows))]
|
||||
#[global_allocator]
|
||||
static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
|
||||
|
||||
fn base_conf(builder: &mut Settings) {
|
||||
let displayed_fields =
|
||||
["geonameid", "name", "asciiname", "alternatenames", "_geo", "population"]
|
||||
.iter()
|
||||
.map(|s| s.to_string())
|
||||
.collect();
|
||||
builder.set_displayed_fields(displayed_fields);
|
||||
|
||||
let sortable_fields =
|
||||
["_geo", "name", "population", "elevation", "timezone", "modification-date"]
|
||||
.iter()
|
||||
.map(|s| s.to_string())
|
||||
.collect();
|
||||
builder.set_sortable_fields(sortable_fields);
|
||||
}
|
||||
|
||||
#[rustfmt::skip]
|
||||
const BASE_CONF: Conf = Conf {
|
||||
dataset: datasets_paths::SMOL_ALL_COUNTRIES,
|
||||
dataset_format: "jsonl",
|
||||
configure: base_conf,
|
||||
primary_key: Some("geonameid"),
|
||||
queries: &[""],
|
||||
offsets: &[
|
||||
Some((0, 20)), // The most common query in the real world
|
||||
Some((0, 500)), // A query that ranges over many documents
|
||||
Some((980, 20)), // The worst query that could happen in the real world
|
||||
Some((800_000, 20)) // The worst query
|
||||
],
|
||||
get_documents: true,
|
||||
..Conf::BASE
|
||||
};
|
||||
|
||||
fn bench_sort(c: &mut criterion::Criterion) {
|
||||
#[rustfmt::skip]
|
||||
let confs = &[
|
||||
utils::Conf {
|
||||
group_name: "without sort",
|
||||
sort: None,
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "sort on many different values",
|
||||
sort: Some(vec!["name:asc"]),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "sort on many similar values",
|
||||
sort: Some(vec!["timezone:desc"]),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "sort on many similar then different values",
|
||||
sort: Some(vec!["timezone:desc", "name:asc"]),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "sort on many different then similar values",
|
||||
sort: Some(vec!["timezone:desc", "name:asc"]),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "geo sort",
|
||||
sample_size: Some(10),
|
||||
sort: Some(vec!["_geoPoint(45.4777599, 9.1967508):asc"]),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "sort on many similar values then geo sort",
|
||||
sample_size: Some(50),
|
||||
sort: Some(vec!["timezone:desc", "_geoPoint(45.4777599, 9.1967508):asc"]),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "sort on many different values then geo sort",
|
||||
sample_size: Some(50),
|
||||
sort: Some(vec!["name:desc", "_geoPoint(45.4777599, 9.1967508):asc"]),
|
||||
..BASE_CONF
|
||||
},
|
||||
|
||||
utils::Conf {
|
||||
group_name: "sort on many fields",
|
||||
sort: Some(vec!["population:asc", "name:asc", "elevation:asc", "timezone:asc"]),
|
||||
..BASE_CONF
|
||||
},
|
||||
];
|
||||
|
||||
utils::run_benches(c, confs);
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_sort);
|
||||
criterion_main!(benches);
|
||||
@@ -9,12 +9,11 @@ use anyhow::Context;
|
||||
use bumpalo::Bump;
|
||||
use criterion::BenchmarkId;
|
||||
use memmap2::Mmap;
|
||||
use milli::documents::sort::recursive_sort;
|
||||
use milli::heed::EnvOpenOptions;
|
||||
use milli::progress::Progress;
|
||||
use milli::update::new::indexer;
|
||||
use milli::update::{IndexerConfig, Settings};
|
||||
use milli::vector::RuntimeEmbedders;
|
||||
use milli::vector::EmbeddingConfigs;
|
||||
use milli::{Criterion, Filter, Index, Object, TermsMatchingStrategy};
|
||||
use serde_json::Value;
|
||||
|
||||
@@ -36,12 +35,6 @@ pub struct Conf<'a> {
|
||||
pub configure: fn(&mut Settings),
|
||||
pub filter: Option<&'a str>,
|
||||
pub sort: Option<Vec<&'a str>>,
|
||||
/// set to skip documents (offset, limit)
|
||||
pub offsets: &'a [Option<(usize, usize)>],
|
||||
/// enable if you want to bench getting documents without querying
|
||||
pub get_documents: bool,
|
||||
/// configure the benchmark sample size
|
||||
pub sample_size: Option<usize>,
|
||||
/// enable or disable the optional words on the query
|
||||
pub optional_words: bool,
|
||||
/// primary key, if there is None we'll auto-generate docids for every documents
|
||||
@@ -59,9 +52,6 @@ impl Conf<'_> {
|
||||
configure: |_| (),
|
||||
filter: None,
|
||||
sort: None,
|
||||
offsets: &[None],
|
||||
get_documents: false,
|
||||
sample_size: None,
|
||||
optional_words: true,
|
||||
primary_key: None,
|
||||
};
|
||||
@@ -75,8 +65,7 @@ pub fn base_setup(conf: &Conf) -> Index {
|
||||
}
|
||||
create_dir_all(conf.database_name).unwrap();
|
||||
|
||||
let options = EnvOpenOptions::new();
|
||||
let mut options = options.read_txn_without_tls();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(100 * 1024 * 1024 * 1024); // 100 GB
|
||||
options.max_readers(100);
|
||||
let index = Index::new(options, conf.database_name, true).unwrap();
|
||||
@@ -100,7 +89,7 @@ pub fn base_setup(conf: &Conf) -> Index {
|
||||
|
||||
(conf.configure)(&mut builder);
|
||||
|
||||
builder.execute(&|| false, &Progress::default(), Default::default()).unwrap();
|
||||
builder.execute(|_| (), || false).unwrap();
|
||||
wtxn.commit().unwrap();
|
||||
|
||||
let config = IndexerConfig::default();
|
||||
@@ -123,7 +112,6 @@ pub fn base_setup(conf: &Conf) -> Index {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -136,10 +124,9 @@ pub fn base_setup(conf: &Conf) -> Index {
|
||||
new_fields_ids_map,
|
||||
primary_key,
|
||||
&document_changes,
|
||||
RuntimeEmbedders::default(),
|
||||
EmbeddingConfigs::default(),
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -156,79 +143,25 @@ pub fn run_benches(c: &mut criterion::Criterion, confs: &[Conf]) {
|
||||
let file_name = Path::new(conf.dataset).file_name().and_then(|f| f.to_str()).unwrap();
|
||||
let name = format!("{}: {}", file_name, conf.group_name);
|
||||
let mut group = c.benchmark_group(&name);
|
||||
if let Some(sample_size) = conf.sample_size {
|
||||
group.sample_size(sample_size);
|
||||
}
|
||||
|
||||
for &query in conf.queries {
|
||||
for offset in conf.offsets {
|
||||
let parameter = match offset {
|
||||
None => query.to_string(),
|
||||
Some((offset, limit)) => format!("{query}[{offset}:{limit}]"),
|
||||
};
|
||||
group.bench_with_input(
|
||||
BenchmarkId::from_parameter(parameter),
|
||||
&query,
|
||||
|b, &query| {
|
||||
b.iter(|| {
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
let mut search = index.search(&rtxn);
|
||||
search
|
||||
.query(query)
|
||||
.terms_matching_strategy(TermsMatchingStrategy::default());
|
||||
if let Some(filter) = conf.filter {
|
||||
let filter = Filter::from_str(filter).unwrap().unwrap();
|
||||
search.filter(filter);
|
||||
}
|
||||
if let Some(sort) = &conf.sort {
|
||||
let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
|
||||
search.sort_criteria(sort);
|
||||
}
|
||||
if let Some((offset, limit)) = offset {
|
||||
search.offset(*offset).limit(*limit);
|
||||
}
|
||||
|
||||
let _ids = search.execute().unwrap();
|
||||
});
|
||||
},
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
if conf.get_documents {
|
||||
for offset in conf.offsets {
|
||||
let parameter = match offset {
|
||||
None => String::from("get_documents"),
|
||||
Some((offset, limit)) => format!("get_documents[{offset}:{limit}]"),
|
||||
};
|
||||
group.bench_with_input(BenchmarkId::from_parameter(parameter), &(), |b, &()| {
|
||||
b.iter(|| {
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
if let Some(sort) = &conf.sort {
|
||||
let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
|
||||
let all_docs = index.documents_ids(&rtxn).unwrap();
|
||||
let facet_sort =
|
||||
recursive_sort(&index, &rtxn, sort, &all_docs).unwrap();
|
||||
let iter = facet_sort.iter().unwrap();
|
||||
if let Some((offset, limit)) = offset {
|
||||
let _results = iter.skip(*offset).take(*limit).collect::<Vec<_>>();
|
||||
} else {
|
||||
let _results = iter.collect::<Vec<_>>();
|
||||
}
|
||||
} else {
|
||||
let all_docs = index.documents_ids(&rtxn).unwrap();
|
||||
if let Some((offset, limit)) = offset {
|
||||
let _results =
|
||||
all_docs.iter().skip(*offset).take(*limit).collect::<Vec<_>>();
|
||||
} else {
|
||||
let _results = all_docs.iter().collect::<Vec<_>>();
|
||||
}
|
||||
}
|
||||
});
|
||||
group.bench_with_input(BenchmarkId::from_parameter(query), &query, |b, &query| {
|
||||
b.iter(|| {
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
let mut search = index.search(&rtxn);
|
||||
search.query(query).terms_matching_strategy(TermsMatchingStrategy::default());
|
||||
if let Some(filter) = conf.filter {
|
||||
let filter = Filter::from_str(filter).unwrap().unwrap();
|
||||
search.filter(filter);
|
||||
}
|
||||
if let Some(sort) = &conf.sort {
|
||||
let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
|
||||
search.sort_criteria(sort);
|
||||
}
|
||||
let _ids = search.execute().unwrap();
|
||||
});
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
group.finish();
|
||||
|
||||
index.prepare_for_closing().wait();
|
||||
|
||||
@@ -67,7 +67,7 @@ fn main() -> anyhow::Result<()> {
|
||||
writeln!(
|
||||
&mut manifest_paths_file,
|
||||
r#"pub const {}: &str = {:?};"#,
|
||||
dataset.to_case(Case::UpperSnake),
|
||||
dataset.to_case(Case::ScreamingSnake),
|
||||
out_file.display(),
|
||||
)?;
|
||||
|
||||
|
||||
@@ -11,8 +11,8 @@ license.workspace = true
|
||||
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
|
||||
|
||||
[dependencies]
|
||||
time = { version = "0.3.41", features = ["parsing"] }
|
||||
time = { version = "0.3.37", features = ["parsing"] }
|
||||
|
||||
[build-dependencies]
|
||||
anyhow = "1.0.98"
|
||||
vergen-git2 = "1.0.7"
|
||||
anyhow = "1.0.95"
|
||||
vergen-git2 = "1.0.2"
|
||||
|
||||
@@ -11,21 +11,21 @@ readme.workspace = true
|
||||
license.workspace = true
|
||||
|
||||
[dependencies]
|
||||
anyhow = "1.0.98"
|
||||
flate2 = "1.1.2"
|
||||
http = "1.3.1"
|
||||
anyhow = "1.0.95"
|
||||
flate2 = "1.0.35"
|
||||
http = "1.2.0"
|
||||
meilisearch-types = { path = "../meilisearch-types" }
|
||||
once_cell = "1.21.3"
|
||||
once_cell = "1.20.2"
|
||||
regex = "1.11.1"
|
||||
roaring = { version = "0.10.12", features = ["serde"] }
|
||||
serde = { version = "1.0.219", features = ["derive"] }
|
||||
serde_json = { version = "1.0.140", features = ["preserve_order"] }
|
||||
tar = "0.4.44"
|
||||
tempfile = "3.20.0"
|
||||
thiserror = "2.0.12"
|
||||
time = { version = "0.3.41", features = ["serde-well-known", "formatting", "parsing", "macros"] }
|
||||
roaring = { version = "0.10.10", features = ["serde"] }
|
||||
serde = { version = "1.0.217", features = ["derive"] }
|
||||
serde_json = { version = "1.0.135", features = ["preserve_order"] }
|
||||
tar = "0.4.43"
|
||||
tempfile = "3.15.0"
|
||||
thiserror = "2.0.9"
|
||||
time = { version = "0.3.37", features = ["serde-well-known", "formatting", "parsing", "macros"] }
|
||||
tracing = "0.1.41"
|
||||
uuid = { version = "1.17.0", features = ["serde", "v4"] }
|
||||
uuid = { version = "1.11.0", features = ["serde", "v4"] }
|
||||
|
||||
[dev-dependencies]
|
||||
big_s = "1.0.2"
|
||||
|
||||
@@ -10,10 +10,8 @@ dump
|
||||
├── instance-uid.uuid
|
||||
├── keys.jsonl
|
||||
├── metadata.json
|
||||
├── tasks
|
||||
│ ├── update_files
|
||||
│ │ └── [task_id].jsonl
|
||||
│ └── queue.jsonl
|
||||
└── batches
|
||||
└── tasks
|
||||
├── update_files
|
||||
│ └── [task_id].jsonl
|
||||
└── queue.jsonl
|
||||
```
|
||||
```
|
||||
@@ -1,17 +1,12 @@
|
||||
#![allow(clippy::type_complexity)]
|
||||
#![allow(clippy::wrong_self_convention)]
|
||||
|
||||
use std::collections::BTreeMap;
|
||||
|
||||
use meilisearch_types::batches::BatchId;
|
||||
use meilisearch_types::byte_unit::Byte;
|
||||
use meilisearch_types::error::ResponseError;
|
||||
use meilisearch_types::keys::Key;
|
||||
use meilisearch_types::milli::update::IndexDocumentsMethod;
|
||||
use meilisearch_types::settings::Unchecked;
|
||||
use meilisearch_types::tasks::{
|
||||
Details, ExportIndexSettings, IndexSwap, KindWithContent, Status, Task, TaskId, TaskNetwork,
|
||||
};
|
||||
use meilisearch_types::tasks::{Details, IndexSwap, KindWithContent, Status, Task, TaskId};
|
||||
use meilisearch_types::InstanceUid;
|
||||
use roaring::RoaringBitmap;
|
||||
use serde::{Deserialize, Serialize};
|
||||
@@ -94,8 +89,6 @@ pub struct TaskDump {
|
||||
default
|
||||
)]
|
||||
pub finished_at: Option<OffsetDateTime>,
|
||||
#[serde(default, skip_serializing_if = "Option::is_none")]
|
||||
pub network: Option<TaskNetwork>,
|
||||
}
|
||||
|
||||
// A `Kind` specific version made for the dump. If modified you may break the dump.
|
||||
@@ -131,7 +124,6 @@ pub enum KindDump {
|
||||
},
|
||||
IndexUpdate {
|
||||
primary_key: Option<String>,
|
||||
uid: Option<String>,
|
||||
},
|
||||
IndexSwap {
|
||||
swaps: Vec<IndexSwap>,
|
||||
@@ -149,18 +141,9 @@ pub enum KindDump {
|
||||
instance_uid: Option<InstanceUid>,
|
||||
},
|
||||
SnapshotCreation,
|
||||
Export {
|
||||
url: String,
|
||||
api_key: Option<String>,
|
||||
payload_size: Option<Byte>,
|
||||
indexes: BTreeMap<String, ExportIndexSettings>,
|
||||
},
|
||||
UpgradeDatabase {
|
||||
from: (u32, u32, u32),
|
||||
},
|
||||
IndexCompaction {
|
||||
index_uid: String,
|
||||
},
|
||||
}
|
||||
|
||||
impl From<Task> for TaskDump {
|
||||
@@ -177,7 +160,6 @@ impl From<Task> for TaskDump {
|
||||
enqueued_at: task.enqueued_at,
|
||||
started_at: task.started_at,
|
||||
finished_at: task.finished_at,
|
||||
network: task.network,
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -217,8 +199,8 @@ impl From<KindWithContent> for KindDump {
|
||||
KindWithContent::IndexCreation { primary_key, .. } => {
|
||||
KindDump::IndexCreation { primary_key }
|
||||
}
|
||||
KindWithContent::IndexUpdate { primary_key, new_index_uid: uid, .. } => {
|
||||
KindDump::IndexUpdate { primary_key, uid }
|
||||
KindWithContent::IndexUpdate { primary_key, .. } => {
|
||||
KindDump::IndexUpdate { primary_key }
|
||||
}
|
||||
KindWithContent::IndexSwap { swaps } => KindDump::IndexSwap { swaps },
|
||||
KindWithContent::TaskCancelation { query, tasks } => {
|
||||
@@ -231,21 +213,9 @@ impl From<KindWithContent> for KindDump {
|
||||
KindDump::DumpCreation { keys, instance_uid }
|
||||
}
|
||||
KindWithContent::SnapshotCreation => KindDump::SnapshotCreation,
|
||||
KindWithContent::Export { url, api_key, payload_size, indexes } => KindDump::Export {
|
||||
url,
|
||||
api_key,
|
||||
payload_size,
|
||||
indexes: indexes
|
||||
.into_iter()
|
||||
.map(|(pattern, settings)| (pattern.to_string(), settings))
|
||||
.collect(),
|
||||
},
|
||||
KindWithContent::UpgradeDatabase { from: version } => {
|
||||
KindDump::UpgradeDatabase { from: version }
|
||||
}
|
||||
KindWithContent::IndexCompaction { index_uid } => {
|
||||
KindDump::IndexCompaction { index_uid }
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -258,17 +228,14 @@ pub(crate) mod test {
|
||||
|
||||
use big_s::S;
|
||||
use maplit::{btreemap, btreeset};
|
||||
use meilisearch_types::batches::{Batch, BatchEnqueuedAt, BatchStats};
|
||||
use meilisearch_types::enterprise_edition::network::{Network, Remote};
|
||||
use meilisearch_types::facet_values_sort::FacetValuesSort;
|
||||
use meilisearch_types::features::RuntimeTogglableFeatures;
|
||||
use meilisearch_types::index_uid_pattern::IndexUidPattern;
|
||||
use meilisearch_types::keys::{Action, Key};
|
||||
use meilisearch_types::milli;
|
||||
use meilisearch_types::milli::update::Setting;
|
||||
use meilisearch_types::milli::{self, FilterableAttributesRule};
|
||||
use meilisearch_types::settings::{Checked, FacetingSettings, Settings};
|
||||
use meilisearch_types::task_view::DetailsView;
|
||||
use meilisearch_types::tasks::{BatchStopReason, Details, Kind, Status};
|
||||
use meilisearch_types::tasks::{Details, Status};
|
||||
use serde_json::{json, Map, Value};
|
||||
use time::macros::datetime;
|
||||
use uuid::Uuid;
|
||||
@@ -310,10 +277,7 @@ pub(crate) mod test {
|
||||
let settings = Settings {
|
||||
displayed_attributes: Setting::Set(vec![S("race"), S("name")]).into(),
|
||||
searchable_attributes: Setting::Set(vec![S("name"), S("race")]).into(),
|
||||
filterable_attributes: Setting::Set(vec![
|
||||
FilterableAttributesRule::Field(S("race")),
|
||||
FilterableAttributesRule::Field(S("age")),
|
||||
]),
|
||||
filterable_attributes: Setting::Set(btreeset! { S("race"), S("age") }),
|
||||
sortable_attributes: Setting::Set(btreeset! { S("age") }),
|
||||
ranking_rules: Setting::NotSet,
|
||||
stop_words: Setting::NotSet,
|
||||
@@ -336,42 +300,11 @@ pub(crate) mod test {
|
||||
localized_attributes: Setting::NotSet,
|
||||
facet_search: Setting::NotSet,
|
||||
prefix_search: Setting::NotSet,
|
||||
chat: Setting::NotSet,
|
||||
vector_store: Setting::NotSet,
|
||||
_kind: std::marker::PhantomData,
|
||||
};
|
||||
settings.check()
|
||||
}
|
||||
|
||||
pub fn create_test_batches() -> Vec<Batch> {
|
||||
vec![Batch {
|
||||
uid: 0,
|
||||
details: DetailsView {
|
||||
received_documents: Some(12),
|
||||
indexed_documents: Some(Some(10)),
|
||||
..DetailsView::default()
|
||||
},
|
||||
progress: None,
|
||||
stats: BatchStats {
|
||||
total_nb_tasks: 1,
|
||||
status: maplit::btreemap! { Status::Succeeded => 1 },
|
||||
types: maplit::btreemap! { Kind::DocumentAdditionOrUpdate => 1 },
|
||||
index_uids: maplit::btreemap! { "doggo".to_string() => 1 },
|
||||
progress_trace: Default::default(),
|
||||
write_channel_congestion: None,
|
||||
internal_database_sizes: Default::default(),
|
||||
},
|
||||
embedder_stats: Default::default(),
|
||||
enqueued_at: Some(BatchEnqueuedAt {
|
||||
earliest: datetime!(2022-11-11 0:00 UTC),
|
||||
oldest: datetime!(2022-11-11 0:00 UTC),
|
||||
}),
|
||||
started_at: datetime!(2022-11-20 0:00 UTC),
|
||||
finished_at: Some(datetime!(2022-11-21 0:00 UTC)),
|
||||
stop_reason: BatchStopReason::Unspecified.to_string(),
|
||||
}]
|
||||
}
|
||||
|
||||
pub fn create_test_tasks() -> Vec<(TaskDump, Option<Vec<Document>>)> {
|
||||
vec![
|
||||
(
|
||||
@@ -395,7 +328,6 @@ pub(crate) mod test {
|
||||
enqueued_at: datetime!(2022-11-11 0:00 UTC),
|
||||
started_at: Some(datetime!(2022-11-20 0:00 UTC)),
|
||||
finished_at: Some(datetime!(2022-11-21 0:00 UTC)),
|
||||
network: None,
|
||||
},
|
||||
None,
|
||||
),
|
||||
@@ -420,7 +352,6 @@ pub(crate) mod test {
|
||||
enqueued_at: datetime!(2022-11-11 0:00 UTC),
|
||||
started_at: None,
|
||||
finished_at: None,
|
||||
network: None,
|
||||
},
|
||||
Some(vec![
|
||||
json!({ "id": 4, "race": "leonberg" }).as_object().unwrap().clone(),
|
||||
@@ -440,7 +371,6 @@ pub(crate) mod test {
|
||||
enqueued_at: datetime!(2022-11-15 0:00 UTC),
|
||||
started_at: None,
|
||||
finished_at: None,
|
||||
network: None,
|
||||
},
|
||||
None,
|
||||
),
|
||||
@@ -497,15 +427,6 @@ pub(crate) mod test {
|
||||
index.flush().unwrap();
|
||||
index.settings(&settings).unwrap();
|
||||
|
||||
// ========== pushing the batch queue
|
||||
let batches = create_test_batches();
|
||||
|
||||
let mut batch_queue = dump.create_batches_queue().unwrap();
|
||||
for batch in &batches {
|
||||
batch_queue.push_batch(batch).unwrap();
|
||||
}
|
||||
batch_queue.flush().unwrap();
|
||||
|
||||
// ========== pushing the task queue
|
||||
let tasks = create_test_tasks();
|
||||
|
||||
@@ -534,10 +455,6 @@ pub(crate) mod test {
|
||||
|
||||
dump.create_experimental_features(features).unwrap();
|
||||
|
||||
// ========== network
|
||||
let network = create_test_network();
|
||||
dump.create_network(network).unwrap();
|
||||
|
||||
// create the dump
|
||||
let mut file = tempfile::tempfile().unwrap();
|
||||
dump.persist_to(&mut file).unwrap();
|
||||
@@ -550,14 +467,6 @@ pub(crate) mod test {
|
||||
RuntimeTogglableFeatures::default()
|
||||
}
|
||||
|
||||
fn create_test_network() -> Network {
|
||||
Network {
|
||||
local: Some("myself".to_string()),
|
||||
remotes: maplit::btreemap! {"other".to_string() => Remote { url: "http://test".to_string(), search_api_key: Some("apiKey".to_string()), write_api_key: Some("docApiKey".to_string()) }},
|
||||
sharding: false,
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_creating_and_read_dump() {
|
||||
let mut file = create_test_dump();
|
||||
@@ -606,9 +515,5 @@ pub(crate) mod test {
|
||||
// ==== checking the features
|
||||
let expected = create_test_features();
|
||||
assert_eq!(dump.features().unwrap().unwrap(), expected);
|
||||
|
||||
// ==== checking the network
|
||||
let expected = create_test_network();
|
||||
assert_eq!(&expected, dump.network().unwrap().unwrap());
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,4 +1,3 @@
|
||||
use std::fs::File;
|
||||
use std::str::FromStr;
|
||||
|
||||
use super::v2_to_v3::CompatV2ToV3;
|
||||
@@ -95,10 +94,6 @@ impl CompatIndexV1ToV2 {
|
||||
self.from.documents().map(|it| Box::new(it) as Box<dyn Iterator<Item = _>>)
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
self.from.documents_file()
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<v2::settings::Settings<v2::settings::Checked>> {
|
||||
Ok(v2::settings::Settings::<v2::settings::Unchecked>::from(self.from.settings()?).check())
|
||||
}
|
||||
|
||||
@@ -1,4 +1,3 @@
|
||||
use std::fs::File;
|
||||
use std::str::FromStr;
|
||||
|
||||
use time::OffsetDateTime;
|
||||
@@ -97,7 +96,6 @@ impl CompatV2ToV3 {
|
||||
}
|
||||
}
|
||||
|
||||
#[allow(clippy::large_enum_variant)]
|
||||
pub enum CompatIndexV2ToV3 {
|
||||
V2(v2::V2IndexReader),
|
||||
Compat(Box<CompatIndexV1ToV2>),
|
||||
@@ -124,13 +122,6 @@ impl CompatIndexV2ToV3 {
|
||||
}
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
match self {
|
||||
CompatIndexV2ToV3::V2(v2) => v2.documents_file(),
|
||||
CompatIndexV2ToV3::Compat(compat) => compat.documents_file(),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<v3::Settings<v3::Checked>> {
|
||||
let settings = match self {
|
||||
CompatIndexV2ToV3::V2(from) => from.settings()?,
|
||||
|
||||
@@ -1,5 +1,3 @@
|
||||
use std::fs::File;
|
||||
|
||||
use super::v2_to_v3::{CompatIndexV2ToV3, CompatV2ToV3};
|
||||
use super::v4_to_v5::CompatV4ToV5;
|
||||
use crate::reader::{v3, v4, UpdateFile};
|
||||
@@ -254,13 +252,6 @@ impl CompatIndexV3ToV4 {
|
||||
}
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
match self {
|
||||
CompatIndexV3ToV4::V3(v3) => v3.documents_file(),
|
||||
CompatIndexV3ToV4::Compat(compat) => compat.documents_file(),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<v4::Settings<v4::Checked>> {
|
||||
Ok(match self {
|
||||
CompatIndexV3ToV4::V3(v3) => {
|
||||
|
||||
@@ -1,5 +1,3 @@
|
||||
use std::fs::File;
|
||||
|
||||
use super::v3_to_v4::{CompatIndexV3ToV4, CompatV3ToV4};
|
||||
use super::v5_to_v6::CompatV5ToV6;
|
||||
use crate::reader::{v4, v5, Document};
|
||||
@@ -243,13 +241,6 @@ impl CompatIndexV4ToV5 {
|
||||
}
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
match self {
|
||||
CompatIndexV4ToV5::V4(v4) => v4.documents_file(),
|
||||
CompatIndexV4ToV5::Compat(compat) => compat.documents_file(),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<v5::Settings<v5::Checked>> {
|
||||
match self {
|
||||
CompatIndexV4ToV5::V4(v4) => Ok(v5::Settings::from(v4.settings()?).check()),
|
||||
|
||||
@@ -1,5 +1,3 @@
|
||||
use std::fs::File;
|
||||
use std::num::NonZeroUsize;
|
||||
use std::str::FromStr;
|
||||
|
||||
use super::v4_to_v5::{CompatIndexV4ToV5, CompatV4ToV5};
|
||||
@@ -85,7 +83,7 @@ impl CompatV5ToV6 {
|
||||
v6::Kind::IndexCreation { primary_key }
|
||||
}
|
||||
v5::tasks::TaskContent::IndexUpdate { primary_key, .. } => {
|
||||
v6::Kind::IndexUpdate { primary_key, uid: None }
|
||||
v6::Kind::IndexUpdate { primary_key }
|
||||
}
|
||||
v5::tasks::TaskContent::IndexDeletion { .. } => v6::Kind::IndexDeletion,
|
||||
v5::tasks::TaskContent::DocumentAddition {
|
||||
@@ -140,11 +138,9 @@ impl CompatV5ToV6 {
|
||||
v5::Details::Settings { settings } => {
|
||||
v6::Details::SettingsUpdate { settings: Box::new(settings.into()) }
|
||||
}
|
||||
v5::Details::IndexInfo { primary_key } => v6::Details::IndexInfo {
|
||||
primary_key,
|
||||
new_index_uid: None,
|
||||
old_index_uid: None,
|
||||
},
|
||||
v5::Details::IndexInfo { primary_key } => {
|
||||
v6::Details::IndexInfo { primary_key }
|
||||
}
|
||||
v5::Details::DocumentDeletion {
|
||||
received_document_ids,
|
||||
deleted_documents,
|
||||
@@ -163,7 +159,6 @@ impl CompatV5ToV6 {
|
||||
enqueued_at: task_view.enqueued_at,
|
||||
started_at: task_view.started_at,
|
||||
finished_at: task_view.finished_at,
|
||||
network: None,
|
||||
};
|
||||
|
||||
(task, content_file)
|
||||
@@ -201,14 +196,6 @@ impl CompatV5ToV6 {
|
||||
pub fn features(&self) -> Result<Option<v6::RuntimeTogglableFeatures>> {
|
||||
Ok(None)
|
||||
}
|
||||
|
||||
pub fn network(&self) -> Result<Option<&v6::Network>> {
|
||||
Ok(None)
|
||||
}
|
||||
|
||||
pub fn webhooks(&self) -> Option<&v6::Webhooks> {
|
||||
None
|
||||
}
|
||||
}
|
||||
|
||||
pub enum CompatIndexV5ToV6 {
|
||||
@@ -251,13 +238,6 @@ impl CompatIndexV5ToV6 {
|
||||
}
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
match self {
|
||||
CompatIndexV5ToV6::V5(v5) => v5.documents_file(),
|
||||
CompatIndexV5ToV6::Compat(compat) => compat.documents_file(),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<v6::Settings<v6::Checked>> {
|
||||
match self {
|
||||
CompatIndexV5ToV6::V5(v5) => Ok(v6::Settings::from(v5.settings()?).check()),
|
||||
@@ -338,16 +318,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
|
||||
v6::Settings {
|
||||
displayed_attributes: v6::Setting::from(settings.displayed_attributes).into(),
|
||||
searchable_attributes: v6::Setting::from(settings.searchable_attributes).into(),
|
||||
filterable_attributes: match settings.filterable_attributes {
|
||||
v5::settings::Setting::Set(filterable_attributes) => v6::Setting::Set(
|
||||
filterable_attributes
|
||||
.into_iter()
|
||||
.map(v6::FilterableAttributesRule::Field)
|
||||
.collect(),
|
||||
),
|
||||
v5::settings::Setting::Reset => v6::Setting::Reset,
|
||||
v5::settings::Setting::NotSet => v6::Setting::NotSet,
|
||||
},
|
||||
filterable_attributes: settings.filterable_attributes.into(),
|
||||
sortable_attributes: settings.sortable_attributes.into(),
|
||||
ranking_rules: {
|
||||
match settings.ranking_rules {
|
||||
@@ -389,7 +360,6 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
|
||||
},
|
||||
disable_on_words: typo.disable_on_words.into(),
|
||||
disable_on_attributes: typo.disable_on_attributes.into(),
|
||||
disable_on_numbers: v6::Setting::NotSet,
|
||||
}),
|
||||
v5::Setting::Reset => v6::Setting::Reset,
|
||||
v5::Setting::NotSet => v6::Setting::NotSet,
|
||||
@@ -404,13 +374,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
|
||||
},
|
||||
pagination: match settings.pagination {
|
||||
v5::Setting::Set(pagination) => v6::Setting::Set(v6::PaginationSettings {
|
||||
max_total_hits: match pagination.max_total_hits {
|
||||
v5::Setting::Set(max_total_hits) => v6::Setting::Set(
|
||||
max_total_hits.try_into().unwrap_or(NonZeroUsize::new(1).unwrap()),
|
||||
),
|
||||
v5::Setting::Reset => v6::Setting::Reset,
|
||||
v5::Setting::NotSet => v6::Setting::NotSet,
|
||||
},
|
||||
max_total_hits: pagination.max_total_hits.into(),
|
||||
}),
|
||||
v5::Setting::Reset => v6::Setting::Reset,
|
||||
v5::Setting::NotSet => v6::Setting::NotSet,
|
||||
@@ -420,8 +384,6 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
|
||||
search_cutoff_ms: v6::Setting::NotSet,
|
||||
facet_search: v6::Setting::NotSet,
|
||||
prefix_search: v6::Setting::NotSet,
|
||||
chat: v6::Setting::NotSet,
|
||||
vector_store: v6::Setting::NotSet,
|
||||
_kind: std::marker::PhantomData,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -23,7 +23,6 @@ mod v6;
|
||||
pub type Document = serde_json::Map<String, serde_json::Value>;
|
||||
pub type UpdateFile = dyn Iterator<Item = Result<Document>>;
|
||||
|
||||
#[allow(clippy::large_enum_variant)]
|
||||
pub enum DumpReader {
|
||||
Current(V6Reader),
|
||||
Compat(CompatV5ToV6),
|
||||
@@ -102,13 +101,6 @@ impl DumpReader {
|
||||
}
|
||||
}
|
||||
|
||||
pub fn batches(&mut self) -> Result<Box<dyn Iterator<Item = Result<v6::Batch>> + '_>> {
|
||||
match self {
|
||||
DumpReader::Current(current) => Ok(current.batches()),
|
||||
DumpReader::Compat(_compat) => Ok(Box::new(std::iter::empty())),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn keys(&mut self) -> Result<Box<dyn Iterator<Item = Result<v6::Key>> + '_>> {
|
||||
match self {
|
||||
DumpReader::Current(current) => Ok(current.keys()),
|
||||
@@ -116,35 +108,12 @@ impl DumpReader {
|
||||
}
|
||||
}
|
||||
|
||||
pub fn chat_completions_settings(
|
||||
&mut self,
|
||||
) -> Result<Box<dyn Iterator<Item = Result<(String, v6::ChatCompletionSettings)>> + '_>> {
|
||||
match self {
|
||||
DumpReader::Current(current) => current.chat_completions_settings(),
|
||||
DumpReader::Compat(_compat) => Ok(Box::new(std::iter::empty())),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn features(&self) -> Result<Option<v6::RuntimeTogglableFeatures>> {
|
||||
match self {
|
||||
DumpReader::Current(current) => Ok(current.features()),
|
||||
DumpReader::Compat(compat) => compat.features(),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn network(&self) -> Result<Option<&v6::Network>> {
|
||||
match self {
|
||||
DumpReader::Current(current) => Ok(current.network()),
|
||||
DumpReader::Compat(compat) => compat.network(),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn webhooks(&self) -> Option<&v6::Webhooks> {
|
||||
match self {
|
||||
DumpReader::Current(current) => current.webhooks(),
|
||||
DumpReader::Compat(compat) => compat.webhooks(),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl From<V6Reader> for DumpReader {
|
||||
@@ -199,14 +168,6 @@ impl DumpIndexReader {
|
||||
}
|
||||
}
|
||||
|
||||
/// A reference to a file in the NDJSON format containing all the documents of the index
|
||||
pub fn documents_file(&self) -> &File {
|
||||
match self {
|
||||
DumpIndexReader::Current(v6) => v6.documents_file(),
|
||||
DumpIndexReader::Compat(compat) => compat.documents_file(),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<v6::Settings<v6::Checked>> {
|
||||
match self {
|
||||
DumpIndexReader::Current(v6) => v6.settings(),
|
||||
@@ -258,10 +219,6 @@ pub(crate) mod test {
|
||||
insta::assert_snapshot!(dump.date().unwrap(), @"2024-05-16 15:51:34.151044 +00:00:00");
|
||||
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
|
||||
|
||||
// batches didn't exists at the time
|
||||
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
|
||||
|
||||
// tasks
|
||||
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
|
||||
@@ -371,8 +328,6 @@ pub(crate) mod test {
|
||||
}
|
||||
|
||||
assert_eq!(dump.features().unwrap().unwrap(), RuntimeTogglableFeatures::default());
|
||||
assert_eq!(dump.network().unwrap(), None);
|
||||
assert_eq!(dump.webhooks(), None);
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -384,10 +339,6 @@ pub(crate) mod test {
|
||||
insta::assert_snapshot!(dump.date().unwrap(), @"2023-07-06 7:10:27.21958 +00:00:00");
|
||||
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
|
||||
|
||||
// batches didn't exists at the time
|
||||
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
|
||||
|
||||
// tasks
|
||||
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
|
||||
@@ -422,64 +373,6 @@ pub(crate) mod test {
|
||||
assert_eq!(dump.features().unwrap().unwrap(), RuntimeTogglableFeatures::default());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn import_dump_v6_network() {
|
||||
let dump = File::open("tests/assets/v6-with-network.dump").unwrap();
|
||||
let dump = DumpReader::open(dump).unwrap();
|
||||
|
||||
// top level infos
|
||||
insta::assert_snapshot!(dump.date().unwrap(), @"2025-01-29 15:45:32.738676 +00:00:00");
|
||||
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
|
||||
|
||||
// network
|
||||
|
||||
let network = dump.network().unwrap().unwrap();
|
||||
insta::assert_snapshot!(network.local.as_ref().unwrap(), @"ms-0");
|
||||
insta::assert_snapshot!(network.remotes.get("ms-0").as_ref().unwrap().url, @"http://localhost:7700");
|
||||
insta::assert_snapshot!(network.remotes.get("ms-0").as_ref().unwrap().search_api_key.is_none(), @"true");
|
||||
insta::assert_snapshot!(network.remotes.get("ms-1").as_ref().unwrap().url, @"http://localhost:7701");
|
||||
insta::assert_snapshot!(network.remotes.get("ms-1").as_ref().unwrap().search_api_key.is_none(), @"true");
|
||||
insta::assert_snapshot!(network.remotes.get("ms-2").as_ref().unwrap().url, @"http://ms-5679.example.meilisearch.io");
|
||||
insta::assert_snapshot!(network.remotes.get("ms-2").as_ref().unwrap().search_api_key.as_ref().unwrap(), @"foo");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn import_dump_v6_webhooks() {
|
||||
let dump = File::open("tests/assets/v6-with-webhooks.dump").unwrap();
|
||||
let dump = DumpReader::open(dump).unwrap();
|
||||
|
||||
// top level infos
|
||||
insta::assert_snapshot!(dump.date().unwrap(), @"2025-07-31 9:21:30.479544 +00:00:00");
|
||||
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @r"
|
||||
Some(
|
||||
cb887dcc-34b3-48d1-addd-9815ae721a81,
|
||||
)
|
||||
");
|
||||
|
||||
// webhooks
|
||||
let webhooks = dump.webhooks().unwrap();
|
||||
insta::assert_json_snapshot!(webhooks, @r#"
|
||||
{
|
||||
"webhooks": {
|
||||
"627ea538-733d-4545-8d2d-03526eb381ce": {
|
||||
"url": "https://example.com/authorization-less",
|
||||
"headers": {}
|
||||
},
|
||||
"771b0a28-ef28-4082-b984-536f82958c65": {
|
||||
"url": "https://example.com/hook",
|
||||
"headers": {
|
||||
"authorization": "TOKEN"
|
||||
}
|
||||
},
|
||||
"f3583083-f8a7-4cbf-a5e7-fb3f1e28a7e9": {
|
||||
"url": "https://third.com",
|
||||
"headers": {}
|
||||
}
|
||||
}
|
||||
}
|
||||
"#);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn import_dump_v5() {
|
||||
let dump = File::open("tests/assets/v5.dump").unwrap();
|
||||
@@ -489,10 +382,6 @@ pub(crate) mod test {
|
||||
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
|
||||
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
|
||||
|
||||
// batches didn't exists at the time
|
||||
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
|
||||
|
||||
// tasks
|
||||
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
|
||||
@@ -573,10 +462,6 @@ pub(crate) mod test {
|
||||
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
|
||||
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
|
||||
|
||||
// batches didn't exists at the time
|
||||
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
|
||||
|
||||
// tasks
|
||||
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
|
||||
@@ -654,10 +539,6 @@ pub(crate) mod test {
|
||||
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
|
||||
assert_eq!(dump.instance_uid().unwrap(), None);
|
||||
|
||||
// batches didn't exists at the time
|
||||
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
|
||||
|
||||
// tasks
|
||||
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
|
||||
@@ -751,10 +632,6 @@ pub(crate) mod test {
|
||||
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
|
||||
assert_eq!(dump.instance_uid().unwrap(), None);
|
||||
|
||||
// batches didn't exists at the time
|
||||
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
|
||||
|
||||
// tasks
|
||||
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
|
||||
@@ -848,10 +725,6 @@ pub(crate) mod test {
|
||||
insta::assert_snapshot!(dump.date().unwrap(), @"2023-01-30 16:26:09.247261 +00:00:00");
|
||||
assert_eq!(dump.instance_uid().unwrap(), None);
|
||||
|
||||
// batches didn't exists at the time
|
||||
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
|
||||
|
||||
// tasks
|
||||
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
|
||||
@@ -928,10 +801,6 @@ pub(crate) mod test {
|
||||
assert_eq!(dump.date(), None);
|
||||
assert_eq!(dump.instance_uid().unwrap(), None);
|
||||
|
||||
// batches didn't exists at the time
|
||||
let batches = dump.batches().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
meili_snap::snapshot!(meili_snap::json_string!(batches), @"[]");
|
||||
|
||||
// tasks
|
||||
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
|
||||
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
---
|
||||
source: crates/dump/src/reader/mod.rs
|
||||
source: dump/src/reader/mod.rs
|
||||
expression: vector_index.settings().unwrap()
|
||||
---
|
||||
{
|
||||
@@ -49,7 +49,6 @@ expression: vector_index.settings().unwrap()
|
||||
"source": "huggingFace",
|
||||
"model": "BAAI/bge-base-en-v1.5",
|
||||
"revision": "617ca489d9e86b49b8167676d8220688b99db36e",
|
||||
"pooling": "forceMean",
|
||||
"documentTemplate": "{% for field in fields %} {{ field.name }}: {{ field.value }}\n{% endfor %}"
|
||||
}
|
||||
},
|
||||
|
||||
@@ -72,10 +72,6 @@ impl V1IndexReader {
|
||||
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
self.documents.get_ref()
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<self::settings::Settings> {
|
||||
Ok(serde_json::from_reader(&mut self.settings)?)
|
||||
}
|
||||
|
||||
@@ -203,10 +203,6 @@ impl V2IndexReader {
|
||||
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
self.documents.get_ref()
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<Settings<Checked>> {
|
||||
Ok(self.settings.clone())
|
||||
}
|
||||
|
||||
@@ -215,10 +215,6 @@ impl V3IndexReader {
|
||||
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
self.documents.get_ref()
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<Settings<Checked>> {
|
||||
Ok(self.settings.clone())
|
||||
}
|
||||
|
||||
@@ -108,7 +108,7 @@ where
|
||||
/// not supported on untagged enums.
|
||||
struct StarOrVisitor<T>(PhantomData<T>);
|
||||
|
||||
impl<T, FE> Visitor<'_> for StarOrVisitor<T>
|
||||
impl<'de, T, FE> Visitor<'de> for StarOrVisitor<T>
|
||||
where
|
||||
T: FromStr<Err = FE>,
|
||||
FE: Display,
|
||||
|
||||
@@ -210,10 +210,6 @@ impl V4IndexReader {
|
||||
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
self.documents.get_ref()
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<Settings<Checked>> {
|
||||
Ok(self.settings.clone())
|
||||
}
|
||||
|
||||
@@ -99,7 +99,7 @@ impl Task {
|
||||
/// Return true when a task is finished.
|
||||
/// A task is finished when its last state is either `Succeeded` or `Failed`.
|
||||
pub fn is_finished(&self) -> bool {
|
||||
self.events.last().is_some_and(|event| {
|
||||
self.events.last().map_or(false, |event| {
|
||||
matches!(event, TaskEvent::Succeded { .. } | TaskEvent::Failed { .. })
|
||||
})
|
||||
}
|
||||
|
||||
@@ -108,7 +108,7 @@ where
|
||||
/// not supported on untagged enums.
|
||||
struct StarOrVisitor<T>(PhantomData<T>);
|
||||
|
||||
impl<T, FE> Visitor<'_> for StarOrVisitor<T>
|
||||
impl<'de, T, FE> Visitor<'de> for StarOrVisitor<T>
|
||||
where
|
||||
T: FromStr<Err = FE>,
|
||||
FE: Display,
|
||||
|
||||
@@ -247,10 +247,6 @@ impl V5IndexReader {
|
||||
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
self.documents.get_ref()
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<Settings<Checked>> {
|
||||
Ok(self.settings.clone())
|
||||
}
|
||||
|
||||
@@ -114,7 +114,7 @@ impl Task {
|
||||
/// Return true when a task is finished.
|
||||
/// A task is finished when its last state is either `Succeeded` or `Failed`.
|
||||
pub fn is_finished(&self) -> bool {
|
||||
self.events.last().is_some_and(|event| {
|
||||
self.events.last().map_or(false, |event| {
|
||||
matches!(event, TaskEvent::Succeeded { .. } | TaskEvent::Failed { .. })
|
||||
})
|
||||
}
|
||||
|
||||
@@ -1,10 +1,8 @@
|
||||
use std::ffi::OsStr;
|
||||
use std::fs::{self, File};
|
||||
use std::io::{BufRead, BufReader, ErrorKind};
|
||||
use std::path::Path;
|
||||
|
||||
pub use meilisearch_types::milli;
|
||||
use meilisearch_types::milli::vector::embedder::hf::OverridePooling;
|
||||
use tempfile::TempDir;
|
||||
use time::OffsetDateTime;
|
||||
use tracing::debug;
|
||||
@@ -20,12 +18,8 @@ pub type Checked = meilisearch_types::settings::Checked;
|
||||
pub type Unchecked = meilisearch_types::settings::Unchecked;
|
||||
|
||||
pub type Task = crate::TaskDump;
|
||||
pub type Batch = meilisearch_types::batches::Batch;
|
||||
pub type Key = meilisearch_types::keys::Key;
|
||||
pub type ChatCompletionSettings = meilisearch_types::features::ChatCompletionSettings;
|
||||
pub type RuntimeTogglableFeatures = meilisearch_types::features::RuntimeTogglableFeatures;
|
||||
pub type Network = meilisearch_types::enterprise_edition::network::Network;
|
||||
pub type Webhooks = meilisearch_types::webhooks::WebhooksDumpView;
|
||||
|
||||
// ===== Other types to clarify the code of the compat module
|
||||
// everything related to the tasks
|
||||
@@ -49,18 +43,13 @@ pub type ResponseError = meilisearch_types::error::ResponseError;
|
||||
pub type Code = meilisearch_types::error::Code;
|
||||
pub type RankingRuleView = meilisearch_types::settings::RankingRuleView;
|
||||
|
||||
pub type FilterableAttributesRule = meilisearch_types::milli::FilterableAttributesRule;
|
||||
|
||||
pub struct V6Reader {
|
||||
dump: TempDir,
|
||||
instance_uid: Option<Uuid>,
|
||||
metadata: Metadata,
|
||||
tasks: BufReader<File>,
|
||||
batches: Option<BufReader<File>>,
|
||||
keys: BufReader<File>,
|
||||
features: Option<RuntimeTogglableFeatures>,
|
||||
network: Option<Network>,
|
||||
webhooks: Option<Webhooks>,
|
||||
}
|
||||
|
||||
impl V6Reader {
|
||||
@@ -88,46 +77,14 @@ impl V6Reader {
|
||||
} else {
|
||||
None
|
||||
};
|
||||
let batches = match File::open(dump.path().join("batches").join("queue.jsonl")) {
|
||||
Ok(file) => Some(BufReader::new(file)),
|
||||
// The batch file was only introduced during the v1.13, anything prior to that won't have batches
|
||||
Err(err) if err.kind() == ErrorKind::NotFound => None,
|
||||
Err(e) => return Err(e.into()),
|
||||
};
|
||||
|
||||
let network = match fs::read(dump.path().join("network.json")) {
|
||||
Ok(network_file) => Some(serde_json::from_reader(&*network_file)?),
|
||||
Err(error) => match error.kind() {
|
||||
// Allows the file to be missing, this will only result in all experimental features disabled.
|
||||
ErrorKind::NotFound => {
|
||||
debug!("`network.json` not found in dump");
|
||||
None
|
||||
}
|
||||
_ => return Err(error.into()),
|
||||
},
|
||||
};
|
||||
|
||||
let webhooks = match fs::read(dump.path().join("webhooks.json")) {
|
||||
Ok(webhooks_file) => Some(serde_json::from_reader(&*webhooks_file)?),
|
||||
Err(error) => match error.kind() {
|
||||
ErrorKind::NotFound => {
|
||||
debug!("`webhooks.json` not found in dump");
|
||||
None
|
||||
}
|
||||
_ => return Err(error.into()),
|
||||
},
|
||||
};
|
||||
|
||||
Ok(V6Reader {
|
||||
metadata: serde_json::from_reader(&*meta_file)?,
|
||||
instance_uid,
|
||||
tasks: BufReader::new(File::open(dump.path().join("tasks").join("queue.jsonl"))?),
|
||||
batches,
|
||||
keys: BufReader::new(File::open(dump.path().join("keys.jsonl"))?),
|
||||
features,
|
||||
network,
|
||||
dump,
|
||||
webhooks,
|
||||
})
|
||||
}
|
||||
|
||||
@@ -167,7 +124,7 @@ impl V6Reader {
|
||||
&mut self,
|
||||
) -> Box<dyn Iterator<Item = Result<(Task, Option<Box<super::UpdateFile>>)>> + '_> {
|
||||
Box::new((&mut self.tasks).lines().map(|line| -> Result<_> {
|
||||
let task: Task = serde_json::from_str(&line?)?;
|
||||
let task: Task = serde_json::from_str(&line?).unwrap();
|
||||
|
||||
let update_file_path = self
|
||||
.dump
|
||||
@@ -179,7 +136,8 @@ impl V6Reader {
|
||||
if update_file_path.exists() {
|
||||
Ok((
|
||||
task,
|
||||
Some(Box::new(UpdateFile::new(&update_file_path)?) as Box<super::UpdateFile>),
|
||||
Some(Box::new(UpdateFile::new(&update_file_path).unwrap())
|
||||
as Box<super::UpdateFile>),
|
||||
))
|
||||
} else {
|
||||
Ok((task, None))
|
||||
@@ -187,61 +145,15 @@ impl V6Reader {
|
||||
}))
|
||||
}
|
||||
|
||||
pub fn batches(&mut self) -> Box<dyn Iterator<Item = Result<Batch>> + '_> {
|
||||
match self.batches.as_mut() {
|
||||
Some(batches) => Box::new((batches).lines().map(|line| -> Result<_> {
|
||||
let batch = serde_json::from_str(&line?)?;
|
||||
Ok(batch)
|
||||
})),
|
||||
None => Box::new(std::iter::empty()) as Box<dyn Iterator<Item = Result<Batch>> + '_>,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn keys(&mut self) -> Box<dyn Iterator<Item = Result<Key>> + '_> {
|
||||
Box::new(
|
||||
(&mut self.keys).lines().map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }),
|
||||
)
|
||||
}
|
||||
|
||||
pub fn chat_completions_settings(
|
||||
&mut self,
|
||||
) -> Result<Box<dyn Iterator<Item = Result<(String, ChatCompletionSettings)>> + '_>> {
|
||||
let entries = match fs::read_dir(self.dump.path().join("chat-completions-settings")) {
|
||||
Ok(entries) => entries,
|
||||
Err(e) if e.kind() == ErrorKind::NotFound => return Ok(Box::new(std::iter::empty())),
|
||||
Err(e) => return Err(e.into()),
|
||||
};
|
||||
Ok(Box::new(
|
||||
entries
|
||||
.map(|entry| -> Result<Option<_>> {
|
||||
let entry = entry?;
|
||||
let file_name = entry.file_name();
|
||||
let path = Path::new(&file_name);
|
||||
if entry.file_type()?.is_file() && path.extension() == Some(OsStr::new("json"))
|
||||
{
|
||||
let name = path.file_stem().unwrap().to_str().unwrap().to_string();
|
||||
let file = File::open(entry.path())?;
|
||||
let settings = serde_json::from_reader(file)?;
|
||||
Ok(Some((name, settings)))
|
||||
} else {
|
||||
Ok(None)
|
||||
}
|
||||
})
|
||||
.filter_map(|entry| entry.transpose()),
|
||||
))
|
||||
}
|
||||
|
||||
pub fn features(&self) -> Option<RuntimeTogglableFeatures> {
|
||||
self.features
|
||||
}
|
||||
|
||||
pub fn network(&self) -> Option<&Network> {
|
||||
self.network.as_ref()
|
||||
}
|
||||
|
||||
pub fn webhooks(&self) -> Option<&Webhooks> {
|
||||
self.webhooks.as_ref()
|
||||
}
|
||||
}
|
||||
|
||||
pub struct UpdateFile {
|
||||
@@ -297,34 +209,8 @@ impl V6IndexReader {
|
||||
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
|
||||
}
|
||||
|
||||
pub fn documents_file(&self) -> &File {
|
||||
self.documents.get_ref()
|
||||
}
|
||||
|
||||
pub fn settings(&mut self) -> Result<Settings<Checked>> {
|
||||
let mut settings: Settings<Unchecked> = serde_json::from_reader(&mut self.settings)?;
|
||||
patch_embedders(&mut settings);
|
||||
let settings: Settings<Unchecked> = serde_json::from_reader(&mut self.settings)?;
|
||||
Ok(settings.check())
|
||||
}
|
||||
}
|
||||
|
||||
fn patch_embedders(settings: &mut Settings<Unchecked>) {
|
||||
if let Setting::Set(embedders) = &mut settings.embedders {
|
||||
for settings in embedders.values_mut() {
|
||||
let Setting::Set(settings) = &mut settings.inner else {
|
||||
continue;
|
||||
};
|
||||
if settings.source != Setting::Set(milli::vector::settings::EmbedderSource::HuggingFace)
|
||||
{
|
||||
continue;
|
||||
}
|
||||
settings.pooling = match settings.pooling {
|
||||
Setting::Set(pooling) => Setting::Set(pooling),
|
||||
// if the pooling for a hugging face embedder is not set, force it to `forceMean`
|
||||
// for backward compatibility with v1.13
|
||||
// dumps created in v1.14 and up will have the setting set for hugging face embedders
|
||||
Setting::Reset | Setting::NotSet => Setting::Set(OverridePooling::ForceMean),
|
||||
};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -4,12 +4,9 @@ use std::path::PathBuf;
|
||||
|
||||
use flate2::write::GzEncoder;
|
||||
use flate2::Compression;
|
||||
use meilisearch_types::batches::Batch;
|
||||
use meilisearch_types::enterprise_edition::network::Network;
|
||||
use meilisearch_types::features::{ChatCompletionSettings, RuntimeTogglableFeatures};
|
||||
use meilisearch_types::features::RuntimeTogglableFeatures;
|
||||
use meilisearch_types::keys::Key;
|
||||
use meilisearch_types::settings::{Checked, Settings};
|
||||
use meilisearch_types::webhooks::WebhooksDumpView;
|
||||
use serde_json::{Map, Value};
|
||||
use tempfile::TempDir;
|
||||
use time::OffsetDateTime;
|
||||
@@ -53,18 +50,10 @@ impl DumpWriter {
|
||||
KeyWriter::new(self.dir.path().to_path_buf())
|
||||
}
|
||||
|
||||
pub fn create_chat_completions_settings(&self) -> Result<ChatCompletionsSettingsWriter> {
|
||||
ChatCompletionsSettingsWriter::new(self.dir.path().join("chat-completions-settings"))
|
||||
}
|
||||
|
||||
pub fn create_tasks_queue(&self) -> Result<TaskWriter> {
|
||||
TaskWriter::new(self.dir.path().join("tasks"))
|
||||
}
|
||||
|
||||
pub fn create_batches_queue(&self) -> Result<BatchWriter> {
|
||||
BatchWriter::new(self.dir.path().join("batches"))
|
||||
}
|
||||
|
||||
pub fn create_experimental_features(&self, features: RuntimeTogglableFeatures) -> Result<()> {
|
||||
Ok(std::fs::write(
|
||||
self.dir.path().join("experimental-features.json"),
|
||||
@@ -72,17 +61,6 @@ impl DumpWriter {
|
||||
)?)
|
||||
}
|
||||
|
||||
pub fn create_network(&self, network: Network) -> Result<()> {
|
||||
Ok(std::fs::write(self.dir.path().join("network.json"), serde_json::to_string(&network)?)?)
|
||||
}
|
||||
|
||||
pub fn create_webhooks(&self, webhooks: WebhooksDumpView) -> Result<()> {
|
||||
Ok(std::fs::write(
|
||||
self.dir.path().join("webhooks.json"),
|
||||
serde_json::to_string(&webhooks)?,
|
||||
)?)
|
||||
}
|
||||
|
||||
pub fn persist_to(self, mut writer: impl Write) -> Result<()> {
|
||||
let gz_encoder = GzEncoder::new(&mut writer, Compression::default());
|
||||
let mut tar_encoder = tar::Builder::new(gz_encoder);
|
||||
@@ -106,7 +84,7 @@ impl KeyWriter {
|
||||
}
|
||||
|
||||
pub fn push_key(&mut self, key: &Key) -> Result<()> {
|
||||
serde_json::to_writer(&mut self.keys, &key)?;
|
||||
self.keys.write_all(&serde_json::to_vec(key)?)?;
|
||||
self.keys.write_all(b"\n")?;
|
||||
Ok(())
|
||||
}
|
||||
@@ -117,24 +95,6 @@ impl KeyWriter {
|
||||
}
|
||||
}
|
||||
|
||||
pub struct ChatCompletionsSettingsWriter {
|
||||
path: PathBuf,
|
||||
}
|
||||
|
||||
impl ChatCompletionsSettingsWriter {
|
||||
pub(crate) fn new(path: PathBuf) -> Result<Self> {
|
||||
std::fs::create_dir(&path)?;
|
||||
Ok(ChatCompletionsSettingsWriter { path })
|
||||
}
|
||||
|
||||
pub fn push_settings(&mut self, name: &str, settings: &ChatCompletionSettings) -> Result<()> {
|
||||
let mut settings_file = File::create(self.path.join(name).with_extension("json"))?;
|
||||
serde_json::to_writer(&mut settings_file, &settings)?;
|
||||
settings_file.flush()?;
|
||||
Ok(())
|
||||
}
|
||||
}
|
||||
|
||||
pub struct TaskWriter {
|
||||
queue: BufWriter<File>,
|
||||
update_files: PathBuf,
|
||||
@@ -154,7 +114,7 @@ impl TaskWriter {
|
||||
/// Pushes tasks in the dump.
|
||||
/// If the tasks has an associated `update_file` it'll use the `task_id` as its name.
|
||||
pub fn push_task(&mut self, task: &TaskDump) -> Result<UpdateFile> {
|
||||
serde_json::to_writer(&mut self.queue, &task)?;
|
||||
self.queue.write_all(&serde_json::to_vec(task)?)?;
|
||||
self.queue.write_all(b"\n")?;
|
||||
|
||||
Ok(UpdateFile::new(self.update_files.join(format!("{}.jsonl", task.uid))))
|
||||
@@ -166,30 +126,6 @@ impl TaskWriter {
|
||||
}
|
||||
}
|
||||
|
||||
pub struct BatchWriter {
|
||||
queue: BufWriter<File>,
|
||||
}
|
||||
|
||||
impl BatchWriter {
|
||||
pub(crate) fn new(path: PathBuf) -> Result<Self> {
|
||||
std::fs::create_dir(&path)?;
|
||||
let queue = File::create(path.join("queue.jsonl"))?;
|
||||
Ok(BatchWriter { queue: BufWriter::new(queue) })
|
||||
}
|
||||
|
||||
/// Pushes batches in the dump.
|
||||
pub fn push_batch(&mut self, batch: &Batch) -> Result<()> {
|
||||
serde_json::to_writer(&mut self.queue, &batch)?;
|
||||
self.queue.write_all(b"\n")?;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
pub fn flush(mut self) -> Result<()> {
|
||||
self.queue.flush()?;
|
||||
Ok(())
|
||||
}
|
||||
}
|
||||
|
||||
pub struct UpdateFile {
|
||||
path: PathBuf,
|
||||
writer: Option<BufWriter<File>>,
|
||||
@@ -201,8 +137,8 @@ impl UpdateFile {
|
||||
}
|
||||
|
||||
pub fn push_document(&mut self, document: &Document) -> Result<()> {
|
||||
if let Some(mut writer) = self.writer.as_mut() {
|
||||
serde_json::to_writer(&mut writer, &document)?;
|
||||
if let Some(writer) = self.writer.as_mut() {
|
||||
writer.write_all(&serde_json::to_vec(document)?)?;
|
||||
writer.write_all(b"\n")?;
|
||||
} else {
|
||||
let file = File::create(&self.path).unwrap();
|
||||
@@ -269,8 +205,8 @@ pub(crate) mod test {
|
||||
use super::*;
|
||||
use crate::reader::Document;
|
||||
use crate::test::{
|
||||
create_test_api_keys, create_test_batches, create_test_documents, create_test_dump,
|
||||
create_test_instance_uid, create_test_settings, create_test_tasks,
|
||||
create_test_api_keys, create_test_documents, create_test_dump, create_test_instance_uid,
|
||||
create_test_settings, create_test_tasks,
|
||||
};
|
||||
|
||||
fn create_directory_hierarchy(dir: &Path) -> String {
|
||||
@@ -345,10 +281,8 @@ pub(crate) mod test {
|
||||
let dump_path = dump.path();
|
||||
|
||||
// ==== checking global file hierarchy (we want to be sure there isn't too many files or too few)
|
||||
insta::assert_snapshot!(create_directory_hierarchy(dump_path), @r"
|
||||
insta::assert_snapshot!(create_directory_hierarchy(dump_path), @r###"
|
||||
.
|
||||
├---- batches/
|
||||
│ └---- queue.jsonl
|
||||
├---- indexes/
|
||||
│ └---- doggos/
|
||||
│ │ ├---- documents.jsonl
|
||||
@@ -361,9 +295,8 @@ pub(crate) mod test {
|
||||
├---- experimental-features.json
|
||||
├---- instance_uid.uuid
|
||||
├---- keys.jsonl
|
||||
├---- metadata.json
|
||||
└---- network.json
|
||||
");
|
||||
└---- metadata.json
|
||||
"###);
|
||||
|
||||
// ==== checking the top level infos
|
||||
let metadata = fs::read_to_string(dump_path.join("metadata.json")).unwrap();
|
||||
@@ -416,16 +349,6 @@ pub(crate) mod test {
|
||||
}
|
||||
}
|
||||
|
||||
// ==== checking the batch queue
|
||||
let batches_queue = fs::read_to_string(dump_path.join("batches/queue.jsonl")).unwrap();
|
||||
for (batch, expected) in batches_queue.lines().zip(create_test_batches()) {
|
||||
let mut batch = serde_json::from_str::<Batch>(batch).unwrap();
|
||||
if batch.details.settings == Some(Box::new(Settings::<Unchecked>::default())) {
|
||||
batch.details.settings = None;
|
||||
}
|
||||
assert_eq!(batch, expected, "{batch:#?}{expected:#?}");
|
||||
}
|
||||
|
||||
// ==== checking the keys
|
||||
let keys = fs::read_to_string(dump_path.join("keys.jsonl")).unwrap();
|
||||
for (key, expected) in keys.lines().zip(create_test_api_keys()) {
|
||||
|
||||
Binary file not shown.
Binary file not shown.
@@ -11,7 +11,7 @@ edition.workspace = true
|
||||
license.workspace = true
|
||||
|
||||
[dependencies]
|
||||
tempfile = "3.20.0"
|
||||
thiserror = "2.0.12"
|
||||
tempfile = "3.15.0"
|
||||
thiserror = "2.0.9"
|
||||
tracing = "0.1.41"
|
||||
uuid = { version = "1.17.0", features = ["serde", "v4"] }
|
||||
uuid = { version = "1.11.0", features = ["serde", "v4"] }
|
||||
|
||||
@@ -148,10 +148,11 @@ impl File {
|
||||
Ok(Self { path: PathBuf::new(), file: None })
|
||||
}
|
||||
|
||||
pub fn persist(self) -> Result<Option<StdFile>> {
|
||||
let Some(file) = self.file else { return Ok(None) };
|
||||
|
||||
Ok(Some(file.persist(&self.path)?))
|
||||
pub fn persist(self) -> Result<()> {
|
||||
if let Some(file) = self.file {
|
||||
file.persist(&self.path)?;
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -14,8 +14,7 @@ license.workspace = true
|
||||
[dependencies]
|
||||
nom = "7.1.3"
|
||||
nom_locate = "4.2.0"
|
||||
unescaper = "0.1.6"
|
||||
levenshtein_automata = { version = "0.2.1", features = ["fst_automaton"] }
|
||||
unescaper = "0.1.5"
|
||||
|
||||
[dev-dependencies]
|
||||
# fixed version due to format breakages in v1.40
|
||||
|
||||
@@ -7,14 +7,12 @@
|
||||
|
||||
use nom::branch::alt;
|
||||
use nom::bytes::complete::tag;
|
||||
use nom::character::complete::{char, multispace0, multispace1};
|
||||
use nom::combinator::{cut, map, value};
|
||||
use nom::sequence::{preceded, terminated, tuple};
|
||||
use nom::character::complete::multispace1;
|
||||
use nom::combinator::cut;
|
||||
use nom::sequence::{terminated, tuple};
|
||||
use Condition::*;
|
||||
|
||||
use crate::error::IResultExt;
|
||||
use crate::value::{parse_vector_value, parse_vector_value_cut};
|
||||
use crate::{parse_value, Error, ErrorKind, FilterCondition, IResult, Span, Token, VectorFilter};
|
||||
use crate::{parse_value, FilterCondition, IResult, Span, Token};
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||
pub enum Condition<'a> {
|
||||
@@ -32,25 +30,6 @@ pub enum Condition<'a> {
|
||||
StartsWith { keyword: Token<'a>, word: Token<'a> },
|
||||
}
|
||||
|
||||
impl Condition<'_> {
|
||||
pub fn operator(&self) -> &str {
|
||||
match self {
|
||||
Condition::GreaterThan(_) => ">",
|
||||
Condition::GreaterThanOrEqual(_) => ">=",
|
||||
Condition::Equal(_) => "=",
|
||||
Condition::NotEqual(_) => "!=",
|
||||
Condition::Null => "IS NULL",
|
||||
Condition::Empty => "IS EMPTY",
|
||||
Condition::Exists => "EXISTS",
|
||||
Condition::LowerThan(_) => "<",
|
||||
Condition::LowerThanOrEqual(_) => "<=",
|
||||
Condition::Between { .. } => "TO",
|
||||
Condition::Contains { .. } => "CONTAINS",
|
||||
Condition::StartsWith { .. } => "STARTS WITH",
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// condition = value ("==" | ">" ...) value
|
||||
pub fn parse_condition(input: Span) -> IResult<FilterCondition> {
|
||||
let operator = alt((tag("<="), tag(">="), tag("!="), tag("<"), tag(">"), tag("=")));
|
||||
@@ -115,83 +94,6 @@ pub fn parse_not_exists(input: Span) -> IResult<FilterCondition> {
|
||||
Ok((input, FilterCondition::Not(Box::new(FilterCondition::Condition { fid: key, op: Exists }))))
|
||||
}
|
||||
|
||||
fn parse_vectors(input: Span) -> IResult<(Token, Option<Token>, VectorFilter)> {
|
||||
let (input, _) = multispace0(input)?;
|
||||
let (input, fid) = tag("_vectors")(input)?;
|
||||
|
||||
if let Ok((input, _)) = multispace1::<_, crate::Error>(input) {
|
||||
return Ok((input, (Token::from(fid), None, VectorFilter::None)));
|
||||
}
|
||||
|
||||
let (input, _) = char('.')(input)?;
|
||||
|
||||
// From this point, we are certain this is a vector filter, so our errors must be final.
|
||||
// We could use nom's `cut` but it's better to be explicit about the errors
|
||||
|
||||
if let Ok((_, space)) = tag::<_, _, ()>(" ")(input) {
|
||||
return Err(crate::Error::failure_from_kind(space, ErrorKind::VectorFilterMissingEmbedder));
|
||||
}
|
||||
|
||||
let (input, embedder_name) =
|
||||
parse_vector_value_cut(input, ErrorKind::VectorFilterInvalidEmbedder)?;
|
||||
|
||||
let (input, filter) = alt((
|
||||
map(
|
||||
preceded(tag(".fragments"), |input| {
|
||||
let (input, _) = tag(".")(input).map_cut(ErrorKind::VectorFilterMissingFragment)?;
|
||||
parse_vector_value_cut(input, ErrorKind::VectorFilterInvalidFragment)
|
||||
}),
|
||||
VectorFilter::Fragment,
|
||||
),
|
||||
value(VectorFilter::UserProvided, tag(".userProvided")),
|
||||
value(VectorFilter::DocumentTemplate, tag(".documentTemplate")),
|
||||
value(VectorFilter::Regenerate, tag(".regenerate")),
|
||||
value(VectorFilter::None, nom::combinator::success("")),
|
||||
))(input)?;
|
||||
|
||||
if let Ok((input, point)) = tag::<_, _, ()>(".")(input) {
|
||||
let opt_value = parse_vector_value(input).ok().map(|(_, v)| v);
|
||||
let value =
|
||||
opt_value.as_ref().map(|v| v.value().to_owned()).unwrap_or_else(|| point.to_string());
|
||||
let context = opt_value.map(|v| v.original_span()).unwrap_or(point);
|
||||
let previous_kind = match filter {
|
||||
VectorFilter::Fragment(_) => Some("fragments"),
|
||||
VectorFilter::DocumentTemplate => Some("documentTemplate"),
|
||||
VectorFilter::UserProvided => Some("userProvided"),
|
||||
VectorFilter::Regenerate => Some("regenerate"),
|
||||
VectorFilter::None => None,
|
||||
};
|
||||
return Err(Error::failure_from_kind(
|
||||
context,
|
||||
ErrorKind::VectorFilterUnknownSuffix(previous_kind, value),
|
||||
));
|
||||
}
|
||||
|
||||
let (input, _) = multispace1(input).map_cut(ErrorKind::VectorFilterLeftover)?;
|
||||
|
||||
Ok((input, (Token::from(fid), Some(embedder_name), filter)))
|
||||
}
|
||||
|
||||
/// vectors_exists = vectors ("EXISTS" | ("NOT" WS+ "EXISTS"))
|
||||
pub fn parse_vectors_exists(input: Span) -> IResult<FilterCondition> {
|
||||
let (input, (fid, embedder, filter)) = parse_vectors(input)?;
|
||||
|
||||
// Try parsing "EXISTS" first
|
||||
if let Ok((input, _)) = tag::<_, _, ()>("EXISTS")(input) {
|
||||
return Ok((input, FilterCondition::VectorExists { fid, embedder, filter }));
|
||||
}
|
||||
|
||||
// Try parsing "NOT EXISTS"
|
||||
if let Ok((input, _)) = tuple::<_, _, (), _>((tag("NOT"), multispace1, tag("EXISTS")))(input) {
|
||||
return Ok((
|
||||
input,
|
||||
FilterCondition::Not(Box::new(FilterCondition::VectorExists { fid, embedder, filter })),
|
||||
));
|
||||
}
|
||||
|
||||
Err(crate::Error::failure_from_kind(input, ErrorKind::VectorFilterOperation))
|
||||
}
|
||||
|
||||
/// contains = value "CONTAINS" value
|
||||
pub fn parse_contains(input: Span) -> IResult<FilterCondition> {
|
||||
let (input, (fid, contains, value)) =
|
||||
|
||||
@@ -35,30 +35,13 @@ impl<E> NomErrorExt<E> for nom::Err<E> {
|
||||
pub fn cut_with_err<'a, O>(
|
||||
mut parser: impl FnMut(Span<'a>) -> IResult<'a, O>,
|
||||
mut with: impl FnMut(Error<'a>) -> Error<'a>,
|
||||
) -> impl FnMut(Span<'a>) -> IResult<'a, O> {
|
||||
) -> impl FnMut(Span<'a>) -> IResult<O> {
|
||||
move |input| match parser.parse(input) {
|
||||
Err(nom::Err::Error(e)) => Err(nom::Err::Failure(with(e))),
|
||||
rest => rest,
|
||||
}
|
||||
}
|
||||
|
||||
pub trait IResultExt<'a> {
|
||||
fn map_cut(self, kind: ErrorKind<'a>) -> Self;
|
||||
}
|
||||
|
||||
impl<'a, T> IResultExt<'a> for IResult<'a, T> {
|
||||
fn map_cut(self, kind: ErrorKind<'a>) -> Self {
|
||||
self.map_err(move |e: nom::Err<Error<'a>>| {
|
||||
let input = match e {
|
||||
nom::Err::Incomplete(_) => return e,
|
||||
nom::Err::Error(e) => *e.context(),
|
||||
nom::Err::Failure(e) => *e.context(),
|
||||
};
|
||||
Error::failure_from_kind(input, kind)
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct Error<'a> {
|
||||
context: Span<'a>,
|
||||
@@ -75,21 +58,9 @@ pub enum ExpectedValueKind {
|
||||
pub enum ErrorKind<'a> {
|
||||
ReservedGeo(&'a str),
|
||||
GeoRadius,
|
||||
GeoRadiusArgumentCount(usize),
|
||||
GeoBoundingBox,
|
||||
GeoPolygon,
|
||||
GeoPolygonNotEnoughPoints(usize),
|
||||
GeoCoordinatesNotPair(usize),
|
||||
MisusedGeoRadius,
|
||||
MisusedGeoBoundingBox,
|
||||
VectorFilterLeftover,
|
||||
VectorFilterInvalidQuotes,
|
||||
VectorFilterMissingEmbedder,
|
||||
VectorFilterInvalidEmbedder,
|
||||
VectorFilterMissingFragment,
|
||||
VectorFilterInvalidFragment,
|
||||
VectorFilterUnknownSuffix(Option<&'static str>, String),
|
||||
VectorFilterOperation,
|
||||
InvalidPrimary,
|
||||
InvalidEscapedNumber,
|
||||
ExpectedEof,
|
||||
@@ -120,10 +91,6 @@ impl<'a> Error<'a> {
|
||||
Self { context, kind }
|
||||
}
|
||||
|
||||
pub fn failure_from_kind(context: Span<'a>, kind: ErrorKind<'a>) -> nom::Err<Self> {
|
||||
nom::Err::Failure(Self::new_from_kind(context, kind))
|
||||
}
|
||||
|
||||
pub fn new_from_external(context: Span<'a>, error: impl std::error::Error) -> Self {
|
||||
Self::new_from_kind(context, ErrorKind::External(error.to_string()))
|
||||
}
|
||||
@@ -154,27 +121,13 @@ impl<'a> ParseError<Span<'a>> for Error<'a> {
|
||||
}
|
||||
}
|
||||
|
||||
impl Display for Error<'_> {
|
||||
impl<'a> Display for Error<'a> {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
let input = self.context.fragment();
|
||||
// When printing our error message we want to escape all `\n` to be sure we keep our format with the
|
||||
// first line being the diagnostic and the second line being the incriminated filter.
|
||||
let escaped_input = input.escape_debug();
|
||||
|
||||
fn key_suggestion<'a>(key: &str, keys: &[&'a str]) -> Option<&'a str> {
|
||||
let typos =
|
||||
levenshtein_automata::LevenshteinAutomatonBuilder::new(2, true).build_dfa(key);
|
||||
for key in keys.iter() {
|
||||
match typos.eval(key) {
|
||||
levenshtein_automata::Distance::Exact(_) => {
|
||||
return Some(key);
|
||||
}
|
||||
levenshtein_automata::Distance::AtLeast(_) => continue,
|
||||
}
|
||||
}
|
||||
None
|
||||
}
|
||||
|
||||
match &self.kind {
|
||||
ErrorKind::ExpectedValue(_) if input.trim().is_empty() => {
|
||||
writeln!(f, "Was expecting a value but instead got nothing.")?
|
||||
@@ -193,7 +146,7 @@ impl Display for Error<'_> {
|
||||
}
|
||||
ErrorKind::InvalidPrimary => {
|
||||
let text = if input.trim().is_empty() { "but instead got nothing.".to_string() } else { format!("at `{}`.", escaped_input) };
|
||||
writeln!(f, "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` {text}")?
|
||||
writeln!(f, "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` {}", text)?
|
||||
}
|
||||
ErrorKind::InvalidEscapedNumber => {
|
||||
writeln!(f, "Found an invalid escaped sequence number: `{}`.", escaped_input)?
|
||||
@@ -202,23 +155,11 @@ impl Display for Error<'_> {
|
||||
writeln!(f, "Found unexpected characters at the end of the filter: `{}`. You probably forgot an `OR` or an `AND` rule.", escaped_input)?
|
||||
}
|
||||
ErrorKind::GeoRadius => {
|
||||
writeln!(f, "The `_geoRadius` filter must be in the form: `_geoRadius(latitude, longitude, radius, optionalResolution)`.")?
|
||||
}
|
||||
ErrorKind::GeoRadiusArgumentCount(count) => {
|
||||
writeln!(f, "Was expecting 3 or 4 arguments for `_geoRadius`, but instead found {count}.")?
|
||||
writeln!(f, "The `_geoRadius` filter expects three arguments: `_geoRadius(latitude, longitude, radius)`.")?
|
||||
}
|
||||
ErrorKind::GeoBoundingBox => {
|
||||
writeln!(f, "The `_geoBoundingBox` filter expects two pairs of arguments: `_geoBoundingBox([latitude, longitude], [latitude, longitude])`.")?
|
||||
}
|
||||
ErrorKind::GeoPolygon => {
|
||||
writeln!(f, "The `_geoPolygon` filter doesn't match the expected format: `_geoPolygon([latitude, longitude], [latitude, longitude])`.")?
|
||||
}
|
||||
ErrorKind::GeoPolygonNotEnoughPoints(n) => {
|
||||
writeln!(f, "The `_geoPolygon` filter expects at least 3 points but only {n} were specified")?;
|
||||
}
|
||||
ErrorKind::GeoCoordinatesNotPair(number) => {
|
||||
writeln!(f, "Was expecting 2 coordinates but instead found {number}.")?
|
||||
}
|
||||
ErrorKind::ReservedGeo(name) => {
|
||||
writeln!(f, "`{}` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` coordinates.", name.escape_debug())?
|
||||
}
|
||||
@@ -228,44 +169,6 @@ impl Display for Error<'_> {
|
||||
ErrorKind::MisusedGeoBoundingBox => {
|
||||
writeln!(f, "The `_geoBoundingBox` filter is an operation and can't be used as a value.")?
|
||||
}
|
||||
ErrorKind::VectorFilterLeftover => {
|
||||
writeln!(f, "The vector filter has leftover tokens.")?
|
||||
}
|
||||
ErrorKind::VectorFilterUnknownSuffix(_, value) if value.as_str() == "." => {
|
||||
writeln!(f, "Was expecting one of `.fragments`, `.userProvided`, `.documentTemplate`, `.regenerate` or nothing, but instead found a point without a valid value.")?;
|
||||
}
|
||||
ErrorKind::VectorFilterUnknownSuffix(None, value) if ["fragments", "userProvided", "documentTemplate", "regenerate"].contains(&value.as_str()) => {
|
||||
// This will happen with "_vectors.rest.\"userProvided\"" for instance
|
||||
writeln!(f, "Was expecting this part to be unquoted.")?
|
||||
}
|
||||
ErrorKind::VectorFilterUnknownSuffix(None, value) => {
|
||||
if let Some(suggestion) = key_suggestion(value, &["fragments", "userProvided", "documentTemplate", "regenerate"]) {
|
||||
writeln!(f, "Was expecting one of `fragments`, `userProvided`, `documentTemplate`, `regenerate` or nothing, but instead found `{value}`. Did you mean `{suggestion}`?")?;
|
||||
} else {
|
||||
writeln!(f, "Was expecting one of `fragments`, `userProvided`, `documentTemplate`, `regenerate` or nothing, but instead found `{value}`.")?;
|
||||
}
|
||||
}
|
||||
ErrorKind::VectorFilterUnknownSuffix(Some(previous_filter_kind), value) => {
|
||||
writeln!(f, "Vector filter can only accept one of `fragments`, `userProvided`, `documentTemplate` or `regenerate`, but found both `{previous_filter_kind}` and `{value}`.")?
|
||||
},
|
||||
ErrorKind::VectorFilterInvalidFragment => {
|
||||
writeln!(f, "The vector filter's fragment name is invalid.")?
|
||||
}
|
||||
ErrorKind::VectorFilterMissingFragment => {
|
||||
writeln!(f, "The vector filter is missing a fragment name.")?
|
||||
}
|
||||
ErrorKind::VectorFilterMissingEmbedder => {
|
||||
writeln!(f, "Was expecting embedder name but found nothing.")?
|
||||
}
|
||||
ErrorKind::VectorFilterInvalidEmbedder => {
|
||||
writeln!(f, "The vector filter's embedder name is invalid.")?
|
||||
}
|
||||
ErrorKind::VectorFilterOperation => {
|
||||
writeln!(f, "Was expecting an operation like `EXISTS` or `NOT EXISTS` after the vector filter.")?
|
||||
}
|
||||
ErrorKind::VectorFilterInvalidQuotes => {
|
||||
writeln!(f, "The quotes in one of the values are inconsistent.")?
|
||||
}
|
||||
ErrorKind::ReservedKeyword(word) => {
|
||||
writeln!(f, "`{word}` is a reserved keyword and thus cannot be used as a field name unless it is put inside quotes. Use \"{word}\" or \'{word}\' instead.")?
|
||||
}
|
||||
|
||||
@@ -19,7 +19,6 @@
|
||||
//! word = (alphanumeric | _ | - | .)+
|
||||
//! geoRadius = "_geoRadius(" WS* float WS* "," WS* float WS* "," float WS* ")"
|
||||
//! geoBoundingBox = "_geoBoundingBox([" WS * float WS* "," WS* float WS* "], [" WS* float WS* "," WS* float WS* "]")
|
||||
//! geoPolygon = "_geoPolygon([[" WS* float WS* "," WS* float WS* "],+])"
|
||||
//! ```
|
||||
//!
|
||||
//! Other BNF grammar used to handle some specific errors:
|
||||
@@ -66,9 +65,6 @@ use nom_locate::LocatedSpan;
|
||||
pub(crate) use value::parse_value;
|
||||
use value::word_exact;
|
||||
|
||||
use crate::condition::parse_vectors_exists;
|
||||
use crate::error::IResultExt;
|
||||
|
||||
pub type Span<'a> = LocatedSpan<&'a str, &'a str>;
|
||||
|
||||
type IResult<'a, Ret> = nom::IResult<Span<'a>, Ret, Error<'a>>;
|
||||
@@ -84,7 +80,7 @@ pub struct Token<'a> {
|
||||
value: Option<String>,
|
||||
}
|
||||
|
||||
impl PartialEq for Token<'_> {
|
||||
impl<'a> PartialEq for Token<'a> {
|
||||
fn eq(&self, other: &Self) -> bool {
|
||||
self.span.fragment() == other.span.fragment()
|
||||
}
|
||||
@@ -117,7 +113,7 @@ impl<'a> Token<'a> {
|
||||
self.span
|
||||
}
|
||||
|
||||
pub fn parse_finite_float(&self) -> Result<f64, Error<'a>> {
|
||||
pub fn parse_finite_float(&self) -> Result<f64, Error> {
|
||||
let value: f64 = self.value().parse().map_err(|e| self.as_external_error(e))?;
|
||||
if value.is_finite() {
|
||||
Ok(value)
|
||||
@@ -140,15 +136,6 @@ impl<'a> From<&'a str> for Token<'a> {
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||
pub enum VectorFilter<'a> {
|
||||
Fragment(Token<'a>),
|
||||
DocumentTemplate,
|
||||
UserProvided,
|
||||
Regenerate,
|
||||
None,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||
pub enum FilterCondition<'a> {
|
||||
Not(Box<Self>),
|
||||
@@ -156,10 +143,8 @@ pub enum FilterCondition<'a> {
|
||||
In { fid: Token<'a>, els: Vec<Token<'a>> },
|
||||
Or(Vec<Self>),
|
||||
And(Vec<Self>),
|
||||
VectorExists { fid: Token<'a>, embedder: Option<Token<'a>>, filter: VectorFilter<'a> },
|
||||
GeoLowerThan { point: [Token<'a>; 2], radius: Token<'a>, resolution: Option<Token<'a>> },
|
||||
GeoLowerThan { point: [Token<'a>; 2], radius: Token<'a> },
|
||||
GeoBoundingBox { top_right_point: [Token<'a>; 2], bottom_left_point: [Token<'a>; 2] },
|
||||
GeoPolygon { points: Vec<[Token<'a>; 2]> },
|
||||
}
|
||||
|
||||
pub enum TraversedElement<'a> {
|
||||
@@ -168,7 +153,7 @@ pub enum TraversedElement<'a> {
|
||||
}
|
||||
|
||||
impl<'a> FilterCondition<'a> {
|
||||
pub fn use_contains_operator(&self) -> Option<&Token<'a>> {
|
||||
pub fn use_contains_operator(&self) -> Option<&Token> {
|
||||
match self {
|
||||
FilterCondition::Condition { fid: _, op } => match op {
|
||||
Condition::GreaterThan(_)
|
||||
@@ -180,38 +165,21 @@ impl<'a> FilterCondition<'a> {
|
||||
| Condition::Exists
|
||||
| Condition::LowerThan(_)
|
||||
| Condition::LowerThanOrEqual(_)
|
||||
| Condition::Between { .. }
|
||||
| Condition::StartsWith { .. } => None,
|
||||
Condition::Contains { keyword, word: _ } => Some(keyword),
|
||||
| Condition::Between { .. } => None,
|
||||
Condition::Contains { keyword, word: _ }
|
||||
| Condition::StartsWith { keyword, word: _ } => Some(keyword),
|
||||
},
|
||||
FilterCondition::Not(this) => this.use_contains_operator(),
|
||||
FilterCondition::Or(seq) | FilterCondition::And(seq) => {
|
||||
seq.iter().find_map(|filter| filter.use_contains_operator())
|
||||
}
|
||||
FilterCondition::VectorExists { .. }
|
||||
| FilterCondition::GeoLowerThan { .. }
|
||||
| FilterCondition::GeoBoundingBox { .. }
|
||||
| FilterCondition::GeoPolygon { .. }
|
||||
| FilterCondition::In { .. } => None,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn use_vector_filter(&self) -> Option<&Token<'a>> {
|
||||
match self {
|
||||
FilterCondition::Condition { .. } => None,
|
||||
FilterCondition::Not(this) => this.use_vector_filter(),
|
||||
FilterCondition::Or(seq) | FilterCondition::And(seq) => {
|
||||
seq.iter().find_map(|filter| filter.use_vector_filter())
|
||||
}
|
||||
FilterCondition::GeoLowerThan { .. }
|
||||
| FilterCondition::GeoBoundingBox { .. }
|
||||
| FilterCondition::GeoPolygon { .. }
|
||||
| FilterCondition::In { .. } => None,
|
||||
FilterCondition::VectorExists { fid, .. } => Some(fid),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn fids(&self, depth: usize) -> Box<dyn Iterator<Item = &Token<'a>> + '_> {
|
||||
pub fn fids(&self, depth: usize) -> Box<dyn Iterator<Item = &Token> + '_> {
|
||||
if depth == 0 {
|
||||
return Box::new(std::iter::empty());
|
||||
}
|
||||
@@ -232,7 +200,7 @@ impl<'a> FilterCondition<'a> {
|
||||
}
|
||||
|
||||
/// Returns the first token found at the specified depth, `None` if no token at this depth.
|
||||
pub fn token_at_depth(&self, depth: usize) -> Option<&Token<'a>> {
|
||||
pub fn token_at_depth(&self, depth: usize) -> Option<&Token> {
|
||||
match self {
|
||||
FilterCondition::Condition { fid, .. } if depth == 0 => Some(fid),
|
||||
FilterCondition::Or(subfilters) => {
|
||||
@@ -258,7 +226,7 @@ impl<'a> FilterCondition<'a> {
|
||||
}
|
||||
}
|
||||
|
||||
pub fn parse(input: &'a str) -> Result<Option<Self>, Error<'a>> {
|
||||
pub fn parse(input: &'a str) -> Result<Option<Self>, Error> {
|
||||
if input.trim().is_empty() {
|
||||
return Ok(None);
|
||||
}
|
||||
@@ -295,7 +263,10 @@ fn parse_in_body(input: Span) -> IResult<Vec<Token>> {
|
||||
let (input, _) = ws(word_exact("IN"))(input)?;
|
||||
|
||||
// everything after `IN` can be a failure
|
||||
let (input, _) = tag("[")(input).map_cut(ErrorKind::InOpeningBracket)?;
|
||||
let (input, _) =
|
||||
cut_with_err(tag("["), |_| Error::new_from_kind(input, ErrorKind::InOpeningBracket))(
|
||||
input,
|
||||
)?;
|
||||
|
||||
let (input, content) = cut(parse_value_list)(input)?;
|
||||
|
||||
@@ -400,27 +371,23 @@ fn parse_not(input: Span, depth: usize) -> IResult<FilterCondition> {
|
||||
/// If we parse `_geoRadius` we MUST parse the rest of the expression.
|
||||
fn parse_geo_radius(input: Span) -> IResult<FilterCondition> {
|
||||
// we want to allow space BEFORE the _geoRadius but not after
|
||||
|
||||
let (input, _) = tuple((multispace0, word_exact("_geoRadius")))(input)?;
|
||||
|
||||
// if we were able to parse `_geoRadius` and can't parse the rest of the input we return a failure
|
||||
|
||||
let parsed =
|
||||
delimited(char('('), separated_list1(tag(","), ws(recognize_float)), char(')'))(input)
|
||||
.map_cut(ErrorKind::GeoRadius);
|
||||
let parsed = preceded(
|
||||
tuple((multispace0, word_exact("_geoRadius"))),
|
||||
// if we were able to parse `_geoRadius` and can't parse the rest of the input we return a failure
|
||||
cut(delimited(char('('), separated_list1(tag(","), ws(recognize_float)), char(')'))),
|
||||
)(input)
|
||||
.map_err(|e| e.map(|_| Error::new_from_kind(input, ErrorKind::GeoRadius)));
|
||||
|
||||
let (input, args) = parsed?;
|
||||
|
||||
if !(3..=4).contains(&args.len()) {
|
||||
return Err(Error::failure_from_kind(input, ErrorKind::GeoRadiusArgumentCount(args.len())));
|
||||
if args.len() != 3 {
|
||||
return Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::GeoRadius)));
|
||||
}
|
||||
|
||||
let res = FilterCondition::GeoLowerThan {
|
||||
point: [args[0].into(), args[1].into()],
|
||||
radius: args[2].into(),
|
||||
resolution: args.get(3).cloned().map(Token::from),
|
||||
};
|
||||
|
||||
Ok((input, res))
|
||||
}
|
||||
|
||||
@@ -428,31 +395,24 @@ fn parse_geo_radius(input: Span) -> IResult<FilterCondition> {
|
||||
/// If we parse `_geoBoundingBox` we MUST parse the rest of the expression.
|
||||
fn parse_geo_bounding_box(input: Span) -> IResult<FilterCondition> {
|
||||
// we want to allow space BEFORE the _geoBoundingBox but not after
|
||||
|
||||
let (input, _) = tuple((multispace0, word_exact("_geoBoundingBox")))(input)?;
|
||||
|
||||
// if we were able to parse `_geoBoundingBox` and can't parse the rest of the input we return a failure
|
||||
|
||||
let (input, args) = delimited(
|
||||
char('('),
|
||||
separated_list1(
|
||||
tag(","),
|
||||
ws(delimited(char('['), separated_list1(tag(","), ws(recognize_float)), char(']'))),
|
||||
),
|
||||
char(')'),
|
||||
let parsed = preceded(
|
||||
tuple((multispace0, word_exact("_geoBoundingBox"))),
|
||||
// if we were able to parse `_geoBoundingBox` and can't parse the rest of the input we return a failure
|
||||
cut(delimited(
|
||||
char('('),
|
||||
separated_list1(
|
||||
tag(","),
|
||||
ws(delimited(char('['), separated_list1(tag(","), ws(recognize_float)), char(']'))),
|
||||
),
|
||||
char(')'),
|
||||
)),
|
||||
)(input)
|
||||
.map_cut(ErrorKind::GeoBoundingBox)?;
|
||||
.map_err(|e| e.map(|_| Error::new_from_kind(input, ErrorKind::GeoBoundingBox)));
|
||||
|
||||
if args.len() != 2 {
|
||||
return Err(Error::failure_from_kind(input, ErrorKind::GeoBoundingBox));
|
||||
}
|
||||
let (input, args) = parsed?;
|
||||
|
||||
if let Some(offending) = args.iter().find(|a| a.len() != 2) {
|
||||
let context = offending.first().unwrap_or(&input);
|
||||
return Err(Error::failure_from_kind(
|
||||
*context,
|
||||
ErrorKind::GeoCoordinatesNotPair(offending.len()),
|
||||
));
|
||||
if args.len() != 2 || args[0].len() != 2 || args[1].len() != 2 {
|
||||
return Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::GeoBoundingBox)));
|
||||
}
|
||||
|
||||
let res = FilterCondition::GeoBoundingBox {
|
||||
@@ -462,47 +422,6 @@ fn parse_geo_bounding_box(input: Span) -> IResult<FilterCondition> {
|
||||
Ok((input, res))
|
||||
}
|
||||
|
||||
/// geoPolygon = "_geoPolygon([[" WS* float WS* "," WS* float WS* "],+])"
|
||||
/// If we parse `_geoPolygon` we MUST parse the rest of the expression.
|
||||
fn parse_geo_polygon(input: Span) -> IResult<FilterCondition> {
|
||||
// we want to allow space BEFORE the _geoPolygon but not after
|
||||
|
||||
let (input, _) = tuple((multispace0, word_exact("_geoPolygon")))(input)?;
|
||||
|
||||
// if we were able to parse `_geoPolygon` and can't parse the rest of the input we return a failure
|
||||
|
||||
let (input, args): (_, Vec<Vec<LocatedSpan<_, _>>>) = delimited(
|
||||
char('('),
|
||||
separated_list1(
|
||||
tag(","),
|
||||
ws(delimited(char('['), separated_list1(tag(","), ws(recognize_float)), char(']'))),
|
||||
),
|
||||
preceded(opt(ws(char(','))), char(')')), // Tolerate trailing comma
|
||||
)(input)
|
||||
.map_cut(ErrorKind::GeoPolygon)?;
|
||||
|
||||
if args.len() < 3 {
|
||||
let context = args.last().and_then(|a| a.last()).unwrap_or(&input);
|
||||
return Err(Error::failure_from_kind(
|
||||
*context,
|
||||
ErrorKind::GeoPolygonNotEnoughPoints(args.len()),
|
||||
));
|
||||
}
|
||||
|
||||
if let Some(offending) = args.iter().find(|a| a.len() != 2) {
|
||||
let context = offending.first().unwrap_or(&input);
|
||||
return Err(Error::failure_from_kind(
|
||||
*context,
|
||||
ErrorKind::GeoCoordinatesNotPair(offending.len()),
|
||||
));
|
||||
}
|
||||
|
||||
let res = FilterCondition::GeoPolygon {
|
||||
points: args.into_iter().map(|a| [a[0].into(), a[1].into()]).collect(),
|
||||
};
|
||||
Ok((input, res))
|
||||
}
|
||||
|
||||
/// geoPoint = WS* "_geoPoint(float WS* "," WS* float WS* "," WS* float)
|
||||
fn parse_geo_point(input: Span) -> IResult<FilterCondition> {
|
||||
// we want to forbid space BEFORE the _geoPoint but not after
|
||||
@@ -514,7 +433,7 @@ fn parse_geo_point(input: Span) -> IResult<FilterCondition> {
|
||||
))(input)
|
||||
.map_err(|e| e.map(|_| Error::new_from_kind(input, ErrorKind::ReservedGeo("_geoPoint"))))?;
|
||||
// if we succeeded we still return a `Failure` because geoPoints are not allowed
|
||||
Err(Error::failure_from_kind(input, ErrorKind::ReservedGeo("_geoPoint")))
|
||||
Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::ReservedGeo("_geoPoint"))))
|
||||
}
|
||||
|
||||
/// geoPoint = WS* "_geoDistance(float WS* "," WS* float WS* "," WS* float)
|
||||
@@ -528,7 +447,7 @@ fn parse_geo_distance(input: Span) -> IResult<FilterCondition> {
|
||||
))(input)
|
||||
.map_err(|e| e.map(|_| Error::new_from_kind(input, ErrorKind::ReservedGeo("_geoDistance"))))?;
|
||||
// if we succeeded we still return a `Failure` because `geoDistance` filters are not allowed
|
||||
Err(Error::failure_from_kind(input, ErrorKind::ReservedGeo("_geoDistance")))
|
||||
Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::ReservedGeo("_geoDistance"))))
|
||||
}
|
||||
|
||||
/// geo = WS* "_geo(float WS* "," WS* float WS* "," WS* float)
|
||||
@@ -542,7 +461,7 @@ fn parse_geo(input: Span) -> IResult<FilterCondition> {
|
||||
))(input)
|
||||
.map_err(|e| e.map(|_| Error::new_from_kind(input, ErrorKind::ReservedGeo("_geo"))))?;
|
||||
// if we succeeded we still return a `Failure` because `_geo` filter is not allowed
|
||||
Err(Error::failure_from_kind(input, ErrorKind::ReservedGeo("_geo")))
|
||||
Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::ReservedGeo("_geo"))))
|
||||
}
|
||||
|
||||
fn parse_error_reserved_keyword(input: Span) -> IResult<FilterCondition> {
|
||||
@@ -572,8 +491,8 @@ fn parse_primary(input: Span, depth: usize) -> IResult<FilterCondition> {
|
||||
Error::new_from_kind(input, ErrorKind::MissingClosingDelimiter(c.char()))
|
||||
}),
|
||||
),
|
||||
// Made a random block of functions because we reached the maximum number of elements per alt
|
||||
alt((parse_geo_radius, parse_geo_bounding_box, parse_geo_polygon)),
|
||||
parse_geo_radius,
|
||||
parse_geo_bounding_box,
|
||||
parse_in,
|
||||
parse_not_in,
|
||||
parse_condition,
|
||||
@@ -581,7 +500,8 @@ fn parse_primary(input: Span, depth: usize) -> IResult<FilterCondition> {
|
||||
parse_is_not_null,
|
||||
parse_is_empty,
|
||||
parse_is_not_empty,
|
||||
alt((parse_vectors_exists, parse_exists, parse_not_exists)),
|
||||
parse_exists,
|
||||
parse_not_exists,
|
||||
parse_to,
|
||||
parse_contains,
|
||||
parse_not_contains,
|
||||
@@ -607,7 +527,7 @@ pub fn parse_filter(input: Span) -> IResult<FilterCondition> {
|
||||
terminated(|input| parse_expression(input, 0), eof)(input)
|
||||
}
|
||||
|
||||
impl std::fmt::Display for FilterCondition<'_> {
|
||||
impl<'a> std::fmt::Display for FilterCondition<'a> {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
match self {
|
||||
FilterCondition::Not(filter) => {
|
||||
@@ -637,28 +557,9 @@ impl std::fmt::Display for FilterCondition<'_> {
|
||||
}
|
||||
write!(f, "]")
|
||||
}
|
||||
FilterCondition::VectorExists { fid: _, embedder, filter: inner } => {
|
||||
write!(f, "_vectors")?;
|
||||
if let Some(embedder) = embedder {
|
||||
write!(f, ".{:?}", embedder.value())?;
|
||||
}
|
||||
match inner {
|
||||
VectorFilter::Fragment(fragment) => {
|
||||
write!(f, ".fragments.{:?}", fragment.value())?
|
||||
}
|
||||
VectorFilter::DocumentTemplate => write!(f, ".documentTemplate")?,
|
||||
VectorFilter::UserProvided => write!(f, ".userProvided")?,
|
||||
VectorFilter::Regenerate => write!(f, ".regenerate")?,
|
||||
VectorFilter::None => (),
|
||||
}
|
||||
write!(f, " EXISTS")
|
||||
}
|
||||
FilterCondition::GeoLowerThan { point, radius, resolution: None } => {
|
||||
FilterCondition::GeoLowerThan { point, radius } => {
|
||||
write!(f, "_geoRadius({}, {}, {})", point[0], point[1], radius)
|
||||
}
|
||||
FilterCondition::GeoLowerThan { point, radius, resolution: Some(resolution) } => {
|
||||
write!(f, "_geoRadius({}, {}, {}, {})", point[0], point[1], radius, resolution)
|
||||
}
|
||||
FilterCondition::GeoBoundingBox {
|
||||
top_right_point: top_left_point,
|
||||
bottom_left_point: bottom_right_point,
|
||||
@@ -672,18 +573,10 @@ impl std::fmt::Display for FilterCondition<'_> {
|
||||
bottom_right_point[1]
|
||||
)
|
||||
}
|
||||
FilterCondition::GeoPolygon { points } => {
|
||||
write!(f, "_geoPolygon([")?;
|
||||
for point in points {
|
||||
write!(f, "[{}, {}], ", point[0], point[1])?;
|
||||
}
|
||||
write!(f, "])")
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl std::fmt::Display for Condition<'_> {
|
||||
impl<'a> std::fmt::Display for Condition<'a> {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
match self {
|
||||
Condition::GreaterThan(token) => write!(f, "> {token}"),
|
||||
@@ -701,8 +594,7 @@ impl std::fmt::Display for Condition<'_> {
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl std::fmt::Display for Token<'_> {
|
||||
impl<'a> std::fmt::Display for Token<'a> {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
write!(f, "{{{}}}", self.value())
|
||||
}
|
||||
@@ -717,7 +609,7 @@ pub mod tests {
|
||||
/// Create a raw [Token]. You must specify the string that appear BEFORE your element followed by your element
|
||||
pub fn rtok<'a>(before: &'a str, value: &'a str) -> Token<'a> {
|
||||
// if the string is empty we still need to return 1 for the line number
|
||||
let lines = if before.is_empty() { 1 } else { before.lines().count() };
|
||||
let lines = before.is_empty().then_some(1).unwrap_or_else(|| before.lines().count());
|
||||
let offset = before.chars().count();
|
||||
// the extra field is not checked in the tests so we can set it to nothing
|
||||
unsafe { Span::new_from_raw_offset(offset, lines as u32, value, "") }.into()
|
||||
@@ -736,9 +628,6 @@ pub mod tests {
|
||||
insta::assert_snapshot!(p(r"title = 'foo\\\\\\\\'"), @r#"{title} = {foo\\\\}"#);
|
||||
// but it also works with other sequences
|
||||
insta::assert_snapshot!(p(r#"title = 'foo\x20\n\t\"\'"'"#), @"{title} = {foo \n\t\"\'\"}");
|
||||
|
||||
insta::assert_snapshot!(p(r#"_vectors." valid.name ".fragments."also.. valid! " EXISTS"#), @r#"_vectors." valid.name ".fragments."also.. valid! " EXISTS"#);
|
||||
insta::assert_snapshot!(p("_vectors.\"\n\t\r\\\"\" EXISTS"), @r#"_vectors."\n\t\r\"" EXISTS"#);
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -801,18 +690,6 @@ pub mod tests {
|
||||
insta::assert_snapshot!(p("NOT subscribers IS NOT EMPTY"), @"{subscribers} IS EMPTY");
|
||||
insta::assert_snapshot!(p("subscribers IS NOT EMPTY"), @"NOT ({subscribers} IS EMPTY)");
|
||||
|
||||
// Test _vectors EXISTS + _vectors NOT EXITS
|
||||
insta::assert_snapshot!(p("_vectors EXISTS"), @"_vectors EXISTS");
|
||||
insta::assert_snapshot!(p("_vectors.embedderName EXISTS"), @r#"_vectors."embedderName" EXISTS"#);
|
||||
insta::assert_snapshot!(p("_vectors.embedderName.documentTemplate EXISTS"), @r#"_vectors."embedderName".documentTemplate EXISTS"#);
|
||||
insta::assert_snapshot!(p("_vectors.embedderName.regenerate EXISTS"), @r#"_vectors."embedderName".regenerate EXISTS"#);
|
||||
insta::assert_snapshot!(p("_vectors.embedderName.regenerate EXISTS"), @r#"_vectors."embedderName".regenerate EXISTS"#);
|
||||
insta::assert_snapshot!(p("_vectors.embedderName.fragments.fragmentName EXISTS"), @r#"_vectors."embedderName".fragments."fragmentName" EXISTS"#);
|
||||
insta::assert_snapshot!(p(" _vectors.embedderName.fragments.fragmentName EXISTS"), @r#"_vectors."embedderName".fragments."fragmentName" EXISTS"#);
|
||||
insta::assert_snapshot!(p("NOT _vectors EXISTS"), @"NOT (_vectors EXISTS)");
|
||||
insta::assert_snapshot!(p(" NOT _vectors EXISTS"), @"NOT (_vectors EXISTS)");
|
||||
insta::assert_snapshot!(p(" _vectors NOT EXISTS"), @"NOT (_vectors EXISTS)");
|
||||
|
||||
// Test EXISTS + NOT EXITS
|
||||
insta::assert_snapshot!(p("subscribers EXISTS"), @"{subscribers} EXISTS");
|
||||
insta::assert_snapshot!(p("NOT subscribers EXISTS"), @"NOT ({subscribers} EXISTS)");
|
||||
@@ -842,17 +719,12 @@ pub mod tests {
|
||||
insta::assert_snapshot!(p("_geoRadius(12, 13, 14)"), @"_geoRadius({12}, {13}, {14})");
|
||||
insta::assert_snapshot!(p("NOT _geoRadius(12, 13, 14)"), @"NOT (_geoRadius({12}, {13}, {14}))");
|
||||
insta::assert_snapshot!(p("_geoRadius(12,13,14)"), @"_geoRadius({12}, {13}, {14})");
|
||||
insta::assert_snapshot!(p("_geoRadius(12,13,14,1000)"), @"_geoRadius({12}, {13}, {14}, {1000})");
|
||||
|
||||
// Test geo bounding box
|
||||
insta::assert_snapshot!(p("_geoBoundingBox([12, 13], [14, 15])"), @"_geoBoundingBox([{12}, {13}], [{14}, {15}])");
|
||||
insta::assert_snapshot!(p("NOT _geoBoundingBox([12, 13], [14, 15])"), @"NOT (_geoBoundingBox([{12}, {13}], [{14}, {15}]))");
|
||||
insta::assert_snapshot!(p("_geoBoundingBox([12,13],[14,15])"), @"_geoBoundingBox([{12}, {13}], [{14}, {15}])");
|
||||
|
||||
// Test geo polygon
|
||||
insta::assert_snapshot!(p("_geoPolygon([12, 13], [14, 15], [16, 17])"), @"_geoPolygon([[{12}, {13}], [{14}, {15}], [{16}, {17}], ])");
|
||||
insta::assert_snapshot!(p("_geoPolygon([12, 13], [14, 15], [-1.2,2939.2], [1,1])"), @"_geoPolygon([[{12}, {13}], [{14}, {15}], [{-1.2}, {2939.2}], [{1}, {1}], ])");
|
||||
|
||||
// Test OR + AND
|
||||
insta::assert_snapshot!(p("channel = ponce AND 'dog race' != 'bernese mountain'"), @"AND[{channel} = {ponce}, {dog race} != {bernese mountain}, ]");
|
||||
insta::assert_snapshot!(p("channel = ponce OR 'dog race' != 'bernese mountain'"), @"OR[{channel} = {ponce}, {dog race} != {bernese mountain}, ]");
|
||||
@@ -909,80 +781,50 @@ pub mod tests {
|
||||
11:12 channel = 🐻 AND followers < 100
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("'OR'"), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `\'OR\'`.
|
||||
insta::assert_snapshot!(p("'OR'"), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `\'OR\'`.
|
||||
1:5 'OR'
|
||||
");
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("OR"), @r###"
|
||||
Was expecting a value but instead got `OR`, which is a reserved keyword. To use `OR` as a field name or a value, surround it by quotes.
|
||||
1:3 OR
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("channel Ponce"), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `channel Ponce`.
|
||||
insta::assert_snapshot!(p("channel Ponce"), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `channel Ponce`.
|
||||
1:14 channel Ponce
|
||||
");
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("channel = Ponce OR"), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` but instead got nothing.
|
||||
insta::assert_snapshot!(p("channel = Ponce OR"), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` but instead got nothing.
|
||||
19:19 channel = Ponce OR
|
||||
");
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("_geoRadius"), @r"
|
||||
The `_geoRadius` filter must be in the form: `_geoRadius(latitude, longitude, radius, optionalResolution)`.
|
||||
11:11 _geoRadius
|
||||
");
|
||||
insta::assert_snapshot!(p("_geoRadius"), @r###"
|
||||
The `_geoRadius` filter expects three arguments: `_geoRadius(latitude, longitude, radius)`.
|
||||
1:11 _geoRadius
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("_geoRadius = 12"), @r"
|
||||
The `_geoRadius` filter must be in the form: `_geoRadius(latitude, longitude, radius, optionalResolution)`.
|
||||
11:16 _geoRadius = 12
|
||||
");
|
||||
insta::assert_snapshot!(p("_geoRadius = 12"), @r###"
|
||||
The `_geoRadius` filter expects three arguments: `_geoRadius(latitude, longitude, radius)`.
|
||||
1:16 _geoRadius = 12
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("_geoBoundingBox"), @r"
|
||||
insta::assert_snapshot!(p("_geoBoundingBox"), @r###"
|
||||
The `_geoBoundingBox` filter expects two pairs of arguments: `_geoBoundingBox([latitude, longitude], [latitude, longitude])`.
|
||||
16:16 _geoBoundingBox
|
||||
");
|
||||
1:16 _geoBoundingBox
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("_geoBoundingBox = 12"), @r"
|
||||
insta::assert_snapshot!(p("_geoBoundingBox = 12"), @r###"
|
||||
The `_geoBoundingBox` filter expects two pairs of arguments: `_geoBoundingBox([latitude, longitude], [latitude, longitude])`.
|
||||
16:21 _geoBoundingBox = 12
|
||||
");
|
||||
1:21 _geoBoundingBox = 12
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("_geoBoundingBox(1.0, 1.0)"), @r"
|
||||
insta::assert_snapshot!(p("_geoBoundingBox(1.0, 1.0)"), @r###"
|
||||
The `_geoBoundingBox` filter expects two pairs of arguments: `_geoBoundingBox([latitude, longitude], [latitude, longitude])`.
|
||||
17:26 _geoBoundingBox(1.0, 1.0)
|
||||
");
|
||||
|
||||
insta::assert_snapshot!(p("_geoPolygon([1,2,3])"), @r"
|
||||
The `_geoPolygon` filter expects at least 3 points but only 1 were specified
|
||||
18:19 _geoPolygon([1,2,3])
|
||||
");
|
||||
|
||||
insta::assert_snapshot!(p("_geoPolygon(1,2,3)"), @r"
|
||||
The `_geoPolygon` filter doesn't match the expected format: `_geoPolygon([latitude, longitude], [latitude, longitude])`.
|
||||
13:19 _geoPolygon(1,2,3)
|
||||
");
|
||||
|
||||
insta::assert_snapshot!(p("_geoPolygon([1,2],[1,2],[1,2,3])"), @r"
|
||||
Was expecting 2 coordinates but instead found 3.
|
||||
26:27 _geoPolygon([1,2],[1,2],[1,2,3])
|
||||
");
|
||||
|
||||
insta::assert_snapshot!(p("_geoPolygon([1,2],[1,2,3])"), @r"
|
||||
The `_geoPolygon` filter expects at least 3 points but only 2 were specified
|
||||
24:25 _geoPolygon([1,2],[1,2,3])
|
||||
");
|
||||
|
||||
insta::assert_snapshot!(p("_geoPolygon(1)"), @r"
|
||||
The `_geoPolygon` filter doesn't match the expected format: `_geoPolygon([latitude, longitude], [latitude, longitude])`.
|
||||
13:15 _geoPolygon(1)
|
||||
");
|
||||
|
||||
insta::assert_snapshot!(p("_geoPolygon([1,2)"), @r"
|
||||
The `_geoPolygon` filter doesn't match the expected format: `_geoPolygon([latitude, longitude], [latitude, longitude])`.
|
||||
17:18 _geoPolygon([1,2)
|
||||
");
|
||||
1:26 _geoBoundingBox(1.0, 1.0)
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("_geoPoint(12, 13, 14)"), @r###"
|
||||
`_geoPoint` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` coordinates.
|
||||
@@ -1039,15 +881,15 @@ pub mod tests {
|
||||
34:35 channel = mv OR followers >= 1000)
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("colour NOT EXIST"), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `colour NOT EXIST`.
|
||||
insta::assert_snapshot!(p("colour NOT EXIST"), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `colour NOT EXIST`.
|
||||
1:17 colour NOT EXIST
|
||||
");
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("subscribers 100 TO1000"), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `subscribers 100 TO1000`.
|
||||
insta::assert_snapshot!(p("subscribers 100 TO1000"), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `subscribers 100 TO1000`.
|
||||
1:23 subscribers 100 TO1000
|
||||
");
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p("channel = ponce ORdog != 'bernese mountain'"), @r###"
|
||||
Found unexpected characters at the end of the filter: `ORdog != \'bernese mountain\'`. You probably forgot an `OR` or an `AND` rule.
|
||||
@@ -1102,108 +944,43 @@ pub mod tests {
|
||||
"###
|
||||
);
|
||||
|
||||
insta::assert_snapshot!(p(r#"_vectors _vectors EXISTS"#), @r"
|
||||
Was expecting an operation like `EXISTS` or `NOT EXISTS` after the vector filter.
|
||||
10:25 _vectors _vectors EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors. embedderName EXISTS"#), @r"
|
||||
Was expecting embedder name but found nothing.
|
||||
10:11 _vectors. embedderName EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors .embedderName EXISTS"#), @r"
|
||||
Was expecting an operation like `EXISTS` or `NOT EXISTS` after the vector filter.
|
||||
10:30 _vectors .embedderName EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName. EXISTS"#), @r"
|
||||
Was expecting one of `.fragments`, `.userProvided`, `.documentTemplate`, `.regenerate` or nothing, but instead found a point without a valid value.
|
||||
22:23 _vectors.embedderName. EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors."embedderName EXISTS"#), @r#"
|
||||
The quotes in one of the values are inconsistent.
|
||||
10:30 _vectors."embedderName EXISTS
|
||||
"#);
|
||||
insta::assert_snapshot!(p(r#"_vectors."embedderNam"e EXISTS"#), @r#"
|
||||
The vector filter has leftover tokens.
|
||||
23:31 _vectors."embedderNam"e EXISTS
|
||||
"#);
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName.documentTemplate. EXISTS"#), @r"
|
||||
Was expecting one of `.fragments`, `.userProvided`, `.documentTemplate`, `.regenerate` or nothing, but instead found a point without a valid value.
|
||||
39:40 _vectors.embedderName.documentTemplate. EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName.fragments EXISTS"#), @r"
|
||||
The vector filter is missing a fragment name.
|
||||
32:39 _vectors.embedderName.fragments EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName.fragments. EXISTS"#), @r"
|
||||
The vector filter's fragment name is invalid.
|
||||
33:40 _vectors.embedderName.fragments. EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName.fragments.test test EXISTS"#), @r"
|
||||
Was expecting an operation like `EXISTS` or `NOT EXISTS` after the vector filter.
|
||||
38:49 _vectors.embedderName.fragments.test test EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName.fragments. test EXISTS"#), @r"
|
||||
The vector filter's fragment name is invalid.
|
||||
33:45 _vectors.embedderName.fragments. test EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName .fragments. test EXISTS"#), @r"
|
||||
Was expecting an operation like `EXISTS` or `NOT EXISTS` after the vector filter.
|
||||
23:46 _vectors.embedderName .fragments. test EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName .fragments.test EXISTS"#), @r"
|
||||
Was expecting an operation like `EXISTS` or `NOT EXISTS` after the vector filter.
|
||||
23:45 _vectors.embedderName .fragments.test EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName.fargments.test EXISTS"#), @r"
|
||||
Was expecting one of `fragments`, `userProvided`, `documentTemplate`, `regenerate` or nothing, but instead found `fargments`. Did you mean `fragments`?
|
||||
23:32 _vectors.embedderName.fargments.test EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName."userProvided" EXISTS"#), @r#"
|
||||
Was expecting this part to be unquoted.
|
||||
24:36 _vectors.embedderName."userProvided" EXISTS
|
||||
"#);
|
||||
insta::assert_snapshot!(p(r#"_vectors.embedderName.userProvided.fragments.test EXISTS"#), @r"
|
||||
Vector filter can only accept one of `fragments`, `userProvided`, `documentTemplate` or `regenerate`, but found both `userProvided` and `fragments`.
|
||||
36:45 _vectors.embedderName.userProvided.fragments.test EXISTS
|
||||
");
|
||||
|
||||
insta::assert_snapshot!(p(r#"NOT OR EXISTS AND EXISTS NOT EXISTS"#), @r###"
|
||||
Was expecting a value but instead got `OR`, which is a reserved keyword. To use `OR` as a field name or a value, surround it by quotes.
|
||||
5:7 NOT OR EXISTS AND EXISTS NOT EXISTS
|
||||
"###);
|
||||
|
||||
insta::assert_snapshot!(p(r#"value NULL"#), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `value NULL`.
|
||||
insta::assert_snapshot!(p(r#"value NULL"#), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value NULL`.
|
||||
1:11 value NULL
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"value NOT NULL"#), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `value NOT NULL`.
|
||||
"###);
|
||||
insta::assert_snapshot!(p(r#"value NOT NULL"#), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value NOT NULL`.
|
||||
1:15 value NOT NULL
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"value EMPTY"#), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `value EMPTY`.
|
||||
"###);
|
||||
insta::assert_snapshot!(p(r#"value EMPTY"#), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value EMPTY`.
|
||||
1:12 value EMPTY
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"value NOT EMPTY"#), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `value NOT EMPTY`.
|
||||
"###);
|
||||
insta::assert_snapshot!(p(r#"value NOT EMPTY"#), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value NOT EMPTY`.
|
||||
1:16 value NOT EMPTY
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"value IS"#), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `value IS`.
|
||||
"###);
|
||||
insta::assert_snapshot!(p(r#"value IS"#), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value IS`.
|
||||
1:9 value IS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"value IS NOT"#), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `value IS NOT`.
|
||||
"###);
|
||||
insta::assert_snapshot!(p(r#"value IS NOT"#), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value IS NOT`.
|
||||
1:13 value IS NOT
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"value IS EXISTS"#), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `value IS EXISTS`.
|
||||
"###);
|
||||
insta::assert_snapshot!(p(r#"value IS EXISTS"#), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value IS EXISTS`.
|
||||
1:16 value IS EXISTS
|
||||
");
|
||||
insta::assert_snapshot!(p(r#"value IS NOT EXISTS"#), @r"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, `_geoBoundingBox` or `_geoPolygon` at `value IS NOT EXISTS`.
|
||||
"###);
|
||||
insta::assert_snapshot!(p(r#"value IS NOT EXISTS"#), @r###"
|
||||
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value IS NOT EXISTS`.
|
||||
1:20 value IS NOT EXISTS
|
||||
");
|
||||
"###);
|
||||
}
|
||||
|
||||
#[test]
|
||||
|
||||
@@ -52,7 +52,7 @@ fn quoted_by(quote: char, input: Span) -> IResult<Token> {
|
||||
}
|
||||
|
||||
// word = (alphanumeric | _ | - | .)+ except for reserved keywords
|
||||
pub fn word_not_keyword<'a>(input: Span<'a>) -> IResult<'a, Token<'a>> {
|
||||
pub fn word_not_keyword<'a>(input: Span<'a>) -> IResult<Token<'a>> {
|
||||
let (input, word): (_, Token<'a>) =
|
||||
take_while1(is_value_component)(input).map(|(s, t)| (s, t.into()))?;
|
||||
if is_keyword(word.value()) {
|
||||
@@ -80,51 +80,6 @@ pub fn word_exact<'a, 'b: 'a>(tag: &'b str) -> impl Fn(Span<'a>) -> IResult<'a,
|
||||
}
|
||||
}
|
||||
|
||||
/// vector_value = ( non_dot_word | singleQuoted | doubleQuoted)
|
||||
pub fn parse_vector_value(input: Span) -> IResult<Token> {
|
||||
pub fn non_dot_word(input: Span) -> IResult<Token> {
|
||||
let (input, word) = take_while1(|c| is_value_component(c) && c != '.')(input)?;
|
||||
Ok((input, word.into()))
|
||||
}
|
||||
|
||||
let (input, value) = alt((
|
||||
delimited(char('\''), cut(|input| quoted_by('\'', input)), cut(char('\''))),
|
||||
delimited(char('"'), cut(|input| quoted_by('"', input)), cut(char('"'))),
|
||||
non_dot_word,
|
||||
))(input)?;
|
||||
|
||||
match unescaper::unescape(value.value()) {
|
||||
Ok(content) => {
|
||||
if content.len() != value.value().len() {
|
||||
Ok((input, Token::new(value.original_span(), Some(content))))
|
||||
} else {
|
||||
Ok((input, value))
|
||||
}
|
||||
}
|
||||
Err(unescaper::Error::IncompleteStr(_)) => Err(nom::Err::Incomplete(nom::Needed::Unknown)),
|
||||
Err(unescaper::Error::ParseIntError { .. }) => Err(nom::Err::Error(Error::new_from_kind(
|
||||
value.original_span(),
|
||||
ErrorKind::InvalidEscapedNumber,
|
||||
))),
|
||||
Err(unescaper::Error::InvalidChar { .. }) => Err(nom::Err::Error(Error::new_from_kind(
|
||||
value.original_span(),
|
||||
ErrorKind::MalformedValue,
|
||||
))),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn parse_vector_value_cut<'a>(input: Span<'a>, kind: ErrorKind<'a>) -> IResult<'a, Token<'a>> {
|
||||
parse_vector_value(input).map_err(|e| match e {
|
||||
nom::Err::Failure(e) => match e.kind() {
|
||||
ErrorKind::Char(c) if *c == '"' || *c == '\'' => {
|
||||
crate::Error::failure_from_kind(input, ErrorKind::VectorFilterInvalidQuotes)
|
||||
}
|
||||
_ => crate::Error::failure_from_kind(input, kind),
|
||||
},
|
||||
_ => crate::Error::failure_from_kind(input, kind),
|
||||
})
|
||||
}
|
||||
|
||||
/// value = WS* ( word | singleQuoted | doubleQuoted) WS+
|
||||
pub fn parse_value(input: Span) -> IResult<Token> {
|
||||
// to get better diagnostic message we are going to strip the left whitespaces from the input right now
|
||||
@@ -144,21 +99,31 @@ pub fn parse_value(input: Span) -> IResult<Token> {
|
||||
}
|
||||
|
||||
match parse_geo_radius(input) {
|
||||
Ok(_) => return Err(Error::failure_from_kind(input, ErrorKind::MisusedGeoRadius)),
|
||||
Ok(_) => {
|
||||
return Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::MisusedGeoRadius)))
|
||||
}
|
||||
// if we encountered a failure it means the user badly wrote a _geoRadius filter.
|
||||
// But instead of showing them how to fix his syntax we are going to tell them they should not use this filter as a value.
|
||||
Err(e) if e.is_failure() => {
|
||||
return Err(Error::failure_from_kind(input, ErrorKind::MisusedGeoRadius))
|
||||
return Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::MisusedGeoRadius)))
|
||||
}
|
||||
_ => (),
|
||||
}
|
||||
|
||||
match parse_geo_bounding_box(input) {
|
||||
Ok(_) => return Err(Error::failure_from_kind(input, ErrorKind::MisusedGeoBoundingBox)),
|
||||
Ok(_) => {
|
||||
return Err(nom::Err::Failure(Error::new_from_kind(
|
||||
input,
|
||||
ErrorKind::MisusedGeoBoundingBox,
|
||||
)))
|
||||
}
|
||||
// if we encountered a failure it means the user badly wrote a _geoBoundingBox filter.
|
||||
// But instead of showing them how to fix his syntax we are going to tell them they should not use this filter as a value.
|
||||
Err(e) if e.is_failure() => {
|
||||
return Err(Error::failure_from_kind(input, ErrorKind::MisusedGeoBoundingBox))
|
||||
return Err(nom::Err::Failure(Error::new_from_kind(
|
||||
input,
|
||||
ErrorKind::MisusedGeoBoundingBox,
|
||||
)))
|
||||
}
|
||||
_ => (),
|
||||
}
|
||||
|
||||
@@ -16,7 +16,7 @@ license.workspace = true
|
||||
serde_json = "1.0"
|
||||
|
||||
[dev-dependencies]
|
||||
criterion = { version = "0.6.0", features = ["html_reports"] }
|
||||
criterion = { version = "0.5.1", features = ["html_reports"] }
|
||||
|
||||
[[bench]]
|
||||
name = "benchmarks"
|
||||
|
||||
@@ -12,11 +12,11 @@ license.workspace = true
|
||||
|
||||
[dependencies]
|
||||
arbitrary = { version = "1.4.1", features = ["derive"] }
|
||||
bumpalo = "3.18.1"
|
||||
clap = { version = "4.5.40", features = ["derive"] }
|
||||
either = "1.15.0"
|
||||
bumpalo = "3.16.0"
|
||||
clap = { version = "4.5.24", features = ["derive"] }
|
||||
either = "1.13.0"
|
||||
fastrand = "2.3.0"
|
||||
milli = { path = "../milli" }
|
||||
serde = { version = "1.0.219", features = ["derive"] }
|
||||
serde_json = { version = "1.0.140", features = ["preserve_order"] }
|
||||
tempfile = "3.20.0"
|
||||
serde = { version = "1.0.217", features = ["derive"] }
|
||||
serde_json = { version = "1.0.135", features = ["preserve_order"] }
|
||||
tempfile = "3.15.0"
|
||||
|
||||
@@ -13,7 +13,7 @@ use milli::heed::EnvOpenOptions;
|
||||
use milli::progress::Progress;
|
||||
use milli::update::new::indexer;
|
||||
use milli::update::IndexerConfig;
|
||||
use milli::vector::RuntimeEmbedders;
|
||||
use milli::vector::EmbeddingConfigs;
|
||||
use milli::Index;
|
||||
use serde_json::Value;
|
||||
use tempfile::TempDir;
|
||||
@@ -57,8 +57,7 @@ fn main() {
|
||||
let opt = opt.clone();
|
||||
|
||||
let handle = std::thread::spawn(move || {
|
||||
let options = EnvOpenOptions::new();
|
||||
let mut options = options.read_txn_without_tls();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(1024 * 1024 * 1024 * 1024);
|
||||
let tempdir = match opt.path {
|
||||
Some(path) => TempDir::new_in(path).unwrap(),
|
||||
@@ -89,7 +88,7 @@ fn main() {
|
||||
let mut new_fields_ids_map = db_fields_ids_map.clone();
|
||||
|
||||
let indexer_alloc = Bump::new();
|
||||
let embedders = RuntimeEmbedders::default();
|
||||
let embedders = EmbeddingConfigs::default();
|
||||
let mut indexer = indexer::DocumentOperation::new();
|
||||
|
||||
let mut operations = Vec::new();
|
||||
@@ -129,7 +128,6 @@ fn main() {
|
||||
&mut new_fields_ids_map,
|
||||
&|| false,
|
||||
Progress::default(),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
@@ -145,7 +143,6 @@ fn main() {
|
||||
embedders,
|
||||
&|| false,
|
||||
&Progress::default(),
|
||||
&Default::default(),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
|
||||
@@ -11,31 +11,29 @@ edition.workspace = true
|
||||
license.workspace = true
|
||||
|
||||
[dependencies]
|
||||
anyhow = "1.0.98"
|
||||
anyhow = "1.0.95"
|
||||
bincode = "1.3.3"
|
||||
byte-unit = "5.1.6"
|
||||
bumpalo = "3.18.1"
|
||||
bumpalo = "3.16.0"
|
||||
bumparaw-collections = "0.1.4"
|
||||
convert_case = "0.8.0"
|
||||
convert_case = "0.6.0"
|
||||
csv = "1.3.1"
|
||||
derive_builder = "0.20.2"
|
||||
dump = { path = "../dump" }
|
||||
enum-iterator = "2.1.0"
|
||||
file-store = { path = "../file-store" }
|
||||
flate2 = "1.1.2"
|
||||
indexmap = "2.9.0"
|
||||
flate2 = "1.0.35"
|
||||
meilisearch-auth = { path = "../meilisearch-auth" }
|
||||
meilisearch-types = { path = "../meilisearch-types" }
|
||||
memmap2 = "0.9.7"
|
||||
memmap2 = "0.9.5"
|
||||
page_size = "0.6.0"
|
||||
rayon = "1.10.0"
|
||||
roaring = { version = "0.10.12", features = ["serde"] }
|
||||
serde = { version = "1.0.219", features = ["derive"] }
|
||||
serde_json = { version = "1.0.140", features = ["preserve_order"] }
|
||||
roaring = { version = "0.10.10", features = ["serde"] }
|
||||
serde = { version = "1.0.217", features = ["derive"] }
|
||||
serde_json = { version = "1.0.135", features = ["preserve_order"] }
|
||||
synchronoise = "1.0.1"
|
||||
tempfile = "3.20.0"
|
||||
thiserror = "2.0.12"
|
||||
time = { version = "0.3.41", features = [
|
||||
tempfile = "3.15.0"
|
||||
thiserror = "2.0.9"
|
||||
time = { version = "0.3.37", features = [
|
||||
"serde-well-known",
|
||||
"formatting",
|
||||
"parsing",
|
||||
@@ -43,12 +41,12 @@ time = { version = "0.3.41", features = [
|
||||
] }
|
||||
tracing = "0.1.41"
|
||||
ureq = "2.12.1"
|
||||
uuid = { version = "1.17.0", features = ["serde", "v4"] }
|
||||
backoff = "0.4.0"
|
||||
uuid = { version = "1.11.0", features = ["serde", "v4"] }
|
||||
|
||||
[dev-dependencies]
|
||||
arroy = "0.5.0"
|
||||
big_s = "1.0.2"
|
||||
crossbeam-channel = "0.5.15"
|
||||
crossbeam-channel = "0.5.14"
|
||||
# fixed version due to format breakages in v1.40
|
||||
insta = { version = "=1.39.0", features = ["json", "redactions"] }
|
||||
maplit = "1.0.2"
|
||||
|
||||
@@ -1,12 +1,8 @@
|
||||
#![allow(clippy::result_large_err)]
|
||||
|
||||
use std::collections::HashMap;
|
||||
use std::io;
|
||||
|
||||
use dump::{KindDump, TaskDump, UpdateFile};
|
||||
use meilisearch_types::batches::{Batch, BatchId};
|
||||
use meilisearch_types::heed::RwTxn;
|
||||
use meilisearch_types::index_uid_pattern::IndexUidPattern;
|
||||
use meilisearch_types::milli;
|
||||
use meilisearch_types::tasks::{Kind, KindWithContent, Status, Task};
|
||||
use roaring::RoaringBitmap;
|
||||
@@ -18,15 +14,9 @@ pub struct Dump<'a> {
|
||||
index_scheduler: &'a IndexScheduler,
|
||||
wtxn: RwTxn<'a>,
|
||||
|
||||
batch_to_task_mapping: HashMap<BatchId, RoaringBitmap>,
|
||||
|
||||
indexes: HashMap<String, RoaringBitmap>,
|
||||
statuses: HashMap<Status, RoaringBitmap>,
|
||||
kinds: HashMap<Kind, RoaringBitmap>,
|
||||
|
||||
batch_indexes: HashMap<String, RoaringBitmap>,
|
||||
batch_statuses: HashMap<Status, RoaringBitmap>,
|
||||
batch_kinds: HashMap<Kind, RoaringBitmap>,
|
||||
}
|
||||
|
||||
impl<'a> Dump<'a> {
|
||||
@@ -37,72 +27,12 @@ impl<'a> Dump<'a> {
|
||||
Ok(Dump {
|
||||
index_scheduler,
|
||||
wtxn,
|
||||
batch_to_task_mapping: HashMap::new(),
|
||||
indexes: HashMap::new(),
|
||||
statuses: HashMap::new(),
|
||||
kinds: HashMap::new(),
|
||||
batch_indexes: HashMap::new(),
|
||||
batch_statuses: HashMap::new(),
|
||||
batch_kinds: HashMap::new(),
|
||||
})
|
||||
}
|
||||
|
||||
/// Register a new batch coming from a dump in the scheduler.
|
||||
/// By taking a mutable ref we're pretty sure no one will ever import a dump while actix is running.
|
||||
pub fn register_dumped_batch(&mut self, batch: Batch) -> Result<()> {
|
||||
self.index_scheduler.queue.batches.all_batches.put(&mut self.wtxn, &batch.uid, &batch)?;
|
||||
if let Some(enqueued_at) = batch.enqueued_at {
|
||||
utils::insert_task_datetime(
|
||||
&mut self.wtxn,
|
||||
self.index_scheduler.queue.batches.enqueued_at,
|
||||
enqueued_at.earliest,
|
||||
batch.uid,
|
||||
)?;
|
||||
utils::insert_task_datetime(
|
||||
&mut self.wtxn,
|
||||
self.index_scheduler.queue.batches.enqueued_at,
|
||||
enqueued_at.oldest,
|
||||
batch.uid,
|
||||
)?;
|
||||
}
|
||||
utils::insert_task_datetime(
|
||||
&mut self.wtxn,
|
||||
self.index_scheduler.queue.batches.started_at,
|
||||
batch.started_at,
|
||||
batch.uid,
|
||||
)?;
|
||||
if let Some(finished_at) = batch.finished_at {
|
||||
utils::insert_task_datetime(
|
||||
&mut self.wtxn,
|
||||
self.index_scheduler.queue.batches.finished_at,
|
||||
finished_at,
|
||||
batch.uid,
|
||||
)?;
|
||||
}
|
||||
|
||||
for index in batch.stats.index_uids.keys() {
|
||||
match self.batch_indexes.get_mut(index) {
|
||||
Some(bitmap) => {
|
||||
bitmap.insert(batch.uid);
|
||||
}
|
||||
None => {
|
||||
let mut bitmap = RoaringBitmap::new();
|
||||
bitmap.insert(batch.uid);
|
||||
self.batch_indexes.insert(index.to_string(), bitmap);
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
for status in batch.stats.status.keys() {
|
||||
self.batch_statuses.entry(*status).or_default().insert(batch.uid);
|
||||
}
|
||||
for kind in batch.stats.types.keys() {
|
||||
self.batch_kinds.entry(*kind).or_default().insert(batch.uid);
|
||||
}
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Register a new task coming from a dump in the scheduler.
|
||||
/// By taking a mutable ref we're pretty sure no one will ever import a dump while actix is running.
|
||||
pub fn register_dumped_task(
|
||||
@@ -149,7 +79,6 @@ impl<'a> Dump<'a> {
|
||||
canceled_by: task.canceled_by,
|
||||
details: task.details,
|
||||
status: task.status,
|
||||
network: task.network,
|
||||
kind: match task.kind {
|
||||
KindDump::DocumentImport {
|
||||
primary_key,
|
||||
@@ -200,10 +129,9 @@ impl<'a> Dump<'a> {
|
||||
index_uid: task.index_uid.ok_or(Error::CorruptedDump)?,
|
||||
primary_key,
|
||||
},
|
||||
KindDump::IndexUpdate { primary_key, uid } => KindWithContent::IndexUpdate {
|
||||
KindDump::IndexUpdate { primary_key } => KindWithContent::IndexUpdate {
|
||||
index_uid: task.index_uid.ok_or(Error::CorruptedDump)?,
|
||||
primary_key,
|
||||
new_index_uid: uid,
|
||||
},
|
||||
KindDump::IndexSwap { swaps } => KindWithContent::IndexSwap { swaps },
|
||||
KindDump::TaskCancelation { query, tasks } => {
|
||||
@@ -216,34 +144,11 @@ impl<'a> Dump<'a> {
|
||||
KindWithContent::DumpCreation { keys, instance_uid }
|
||||
}
|
||||
KindDump::SnapshotCreation => KindWithContent::SnapshotCreation,
|
||||
KindDump::Export { url, api_key, payload_size, indexes } => {
|
||||
KindWithContent::Export {
|
||||
url,
|
||||
api_key,
|
||||
payload_size,
|
||||
indexes: indexes
|
||||
.into_iter()
|
||||
.map(|(pattern, settings)| {
|
||||
Ok((
|
||||
IndexUidPattern::try_from(pattern)
|
||||
.map_err(|_| Error::CorruptedDump)?,
|
||||
settings,
|
||||
))
|
||||
})
|
||||
.collect::<Result<_, Error>>()?,
|
||||
}
|
||||
}
|
||||
KindDump::UpgradeDatabase { from } => KindWithContent::UpgradeDatabase { from },
|
||||
KindDump::IndexCompaction { index_uid } => {
|
||||
KindWithContent::IndexCompaction { index_uid }
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
self.index_scheduler.queue.tasks.all_tasks.put(&mut self.wtxn, &task.uid, &task)?;
|
||||
if let Some(batch_id) = task.batch_uid {
|
||||
self.batch_to_task_mapping.entry(batch_id).or_default().insert(task.uid);
|
||||
}
|
||||
|
||||
for index in task.indexes() {
|
||||
match self.indexes.get_mut(index) {
|
||||
@@ -293,14 +198,6 @@ impl<'a> Dump<'a> {
|
||||
|
||||
/// Commit all the changes and exit the importing dump state
|
||||
pub fn finish(mut self) -> Result<()> {
|
||||
for (batch_id, task_ids) in self.batch_to_task_mapping {
|
||||
self.index_scheduler.queue.batch_to_tasks_mapping.put(
|
||||
&mut self.wtxn,
|
||||
&batch_id,
|
||||
&task_ids,
|
||||
)?;
|
||||
}
|
||||
|
||||
for (index, bitmap) in self.indexes {
|
||||
self.index_scheduler.queue.tasks.index_tasks.put(&mut self.wtxn, &index, &bitmap)?;
|
||||
}
|
||||
@@ -311,16 +208,6 @@ impl<'a> Dump<'a> {
|
||||
self.index_scheduler.queue.tasks.put_kind(&mut self.wtxn, kind, &bitmap)?;
|
||||
}
|
||||
|
||||
for (index, bitmap) in self.batch_indexes {
|
||||
self.index_scheduler.queue.batches.index_tasks.put(&mut self.wtxn, &index, &bitmap)?;
|
||||
}
|
||||
for (status, bitmap) in self.batch_statuses {
|
||||
self.index_scheduler.queue.batches.put_status(&mut self.wtxn, status, &bitmap)?;
|
||||
}
|
||||
for (kind, bitmap) in self.batch_kinds {
|
||||
self.index_scheduler.queue.batches.put_kind(&mut self.wtxn, kind, &bitmap)?;
|
||||
}
|
||||
|
||||
self.wtxn.commit()?;
|
||||
self.index_scheduler.scheduler.wake_up.signal();
|
||||
|
||||
|
||||
@@ -2,7 +2,6 @@ use std::fmt::Display;
|
||||
|
||||
use meilisearch_types::batches::BatchId;
|
||||
use meilisearch_types::error::{Code, ErrorCode};
|
||||
use meilisearch_types::milli::index::RollbackOutcome;
|
||||
use meilisearch_types::tasks::{Kind, Status};
|
||||
use meilisearch_types::{heed, milli};
|
||||
use thiserror::Error;
|
||||
@@ -67,8 +66,6 @@ pub enum Error {
|
||||
SwapDuplicateIndexesFound(Vec<String>),
|
||||
#[error("Index `{0}` not found.")]
|
||||
SwapIndexNotFound(String),
|
||||
#[error("Cannot rename `{0}` to `{1}` as the index already exists. Hint: You can remove `{1}` first and then do your remove.")]
|
||||
SwapIndexFoundDuringRename(String, String),
|
||||
#[error("Meilisearch cannot receive write operations because the limit of the task database has been reached. Please delete tasks to continue performing write operations.")]
|
||||
NoSpaceLeftInTaskQueue,
|
||||
#[error(
|
||||
@@ -76,10 +73,6 @@ pub enum Error {
|
||||
.0.iter().map(|s| format!("`{}`", s)).collect::<Vec<_>>().join(", ")
|
||||
)]
|
||||
SwapIndexesNotFound(Vec<String>),
|
||||
#[error("The following indexes are being renamed but cannot because their new name conflicts with an already existing index: {}. Renaming doesn't overwrite the other index name.",
|
||||
.0.iter().map(|s| format!("`{}`", s)).collect::<Vec<_>>().join(", ")
|
||||
)]
|
||||
SwapIndexesFoundDuringRename(Vec<String>),
|
||||
#[error("Corrupted dump.")]
|
||||
CorruptedDump,
|
||||
#[error(
|
||||
@@ -116,8 +109,6 @@ pub enum Error {
|
||||
InvalidIndexUid { index_uid: String },
|
||||
#[error("Task `{0}` not found.")]
|
||||
TaskNotFound(TaskId),
|
||||
#[error("Task `{0}` does not contain any documents. Only `documentAdditionOrUpdate` tasks with the statuses `enqueued` or `processing` contain documents")]
|
||||
TaskFileNotFound(TaskId),
|
||||
#[error("Batch `{0}` not found.")]
|
||||
BatchNotFound(BatchId),
|
||||
#[error("Query parameters to filter the tasks to delete are missing. Available query parameters are: `uids`, `indexUids`, `statuses`, `types`, `canceledBy`, `beforeEnqueuedAt`, `afterEnqueuedAt`, `beforeStartedAt`, `afterStartedAt`, `beforeFinishedAt`, `afterFinishedAt`.")]
|
||||
@@ -136,8 +127,8 @@ pub enum Error {
|
||||
_ => format!("{error}")
|
||||
})]
|
||||
Milli { error: milli::Error, index_uid: Option<String> },
|
||||
#[error("An unexpected crash occurred when processing the task: {0}")]
|
||||
ProcessBatchPanicked(String),
|
||||
#[error("An unexpected crash occurred when processing the task.")]
|
||||
ProcessBatchPanicked,
|
||||
#[error(transparent)]
|
||||
FileStore(#[from] file_store::Error),
|
||||
#[error(transparent)]
|
||||
@@ -158,27 +149,7 @@ pub enum Error {
|
||||
#[error(transparent)]
|
||||
DatabaseUpgrade(Box<Self>),
|
||||
#[error(transparent)]
|
||||
Export(Box<Self>),
|
||||
#[error("Failed to export documents to remote server {code} ({type}): {message} <{link}>")]
|
||||
FromRemoteWhenExporting { message: String, code: String, r#type: String, link: String },
|
||||
#[error("Failed to rollback for index `{index}`: {rollback_outcome} ")]
|
||||
RollbackFailed { index: String, rollback_outcome: RollbackOutcome },
|
||||
#[error(transparent)]
|
||||
UnrecoverableError(Box<Self>),
|
||||
#[error("The index scheduler is in version v{}.{}.{}, but Meilisearch is in version v{}.{}.{}.\n - hint: start the correct version of Meilisearch, or consider updating your database. See also <https://www.meilisearch.com/docs/learn/update_and_migration/updating>",
|
||||
index_scheduler_version.0, index_scheduler_version.1, index_scheduler_version.2,
|
||||
package_version.0, package_version.1, package_version.2)]
|
||||
IndexSchedulerVersionMismatch {
|
||||
index_scheduler_version: (u32, u32, u32),
|
||||
package_version: (u32, u32, u32),
|
||||
},
|
||||
#[error("Index `{index}` is in version v{}.{}.{}, but Meilisearch is in version v{}.{}.{}.\n - note: this is an internal error, please consider filing a bug report: <https://github.com/meilisearch/meilisearch/issues/new?template=bug_report.md>",
|
||||
index_version.0, index_version.1, index_version.2, package_version.0, package_version.1, package_version.2)]
|
||||
IndexVersionMismatch {
|
||||
index: String,
|
||||
index_version: (u32, u32, u32),
|
||||
package_version: (u32, u32, u32),
|
||||
},
|
||||
#[error(transparent)]
|
||||
HeedTransaction(heed::Error),
|
||||
|
||||
@@ -209,8 +180,6 @@ impl Error {
|
||||
| Error::SwapIndexNotFound(_)
|
||||
| Error::NoSpaceLeftInTaskQueue
|
||||
| Error::SwapIndexesNotFound(_)
|
||||
| Error::SwapIndexFoundDuringRename(_, _)
|
||||
| Error::SwapIndexesFoundDuringRename(_)
|
||||
| Error::CorruptedDump
|
||||
| Error::InvalidTaskDate { .. }
|
||||
| Error::InvalidTaskUid { .. }
|
||||
@@ -220,29 +189,23 @@ impl Error {
|
||||
| Error::InvalidTaskCanceledBy { .. }
|
||||
| Error::InvalidIndexUid { .. }
|
||||
| Error::TaskNotFound(_)
|
||||
| Error::TaskFileNotFound(_)
|
||||
| Error::BatchNotFound(_)
|
||||
| Error::TaskDeletionWithEmptyQuery
|
||||
| Error::TaskCancelationWithEmptyQuery
|
||||
| Error::FromRemoteWhenExporting { .. }
|
||||
| Error::AbortedTask
|
||||
| Error::Dump(_)
|
||||
| Error::Heed(_)
|
||||
| Error::Milli { .. }
|
||||
| Error::ProcessBatchPanicked(_)
|
||||
| Error::ProcessBatchPanicked
|
||||
| Error::FileStore(_)
|
||||
| Error::IoError(_)
|
||||
| Error::Persist(_)
|
||||
| Error::FeatureNotEnabled(_)
|
||||
| Error::Export(_)
|
||||
| Error::Anyhow(_) => true,
|
||||
Error::CreateBatch(_)
|
||||
| Error::CorruptedTaskQueue
|
||||
| Error::DatabaseUpgrade(_)
|
||||
| Error::UnrecoverableError(_)
|
||||
| Error::IndexSchedulerVersionMismatch { .. }
|
||||
| Error::IndexVersionMismatch { .. }
|
||||
| Error::RollbackFailed { .. }
|
||||
| Error::HeedTransaction(_) => false,
|
||||
#[cfg(test)]
|
||||
Error::PlannedFailure => false,
|
||||
@@ -279,8 +242,6 @@ impl ErrorCode for Error {
|
||||
Error::SwapDuplicateIndexFound(_) => Code::InvalidSwapDuplicateIndexFound,
|
||||
Error::SwapIndexNotFound(_) => Code::IndexNotFound,
|
||||
Error::SwapIndexesNotFound(_) => Code::IndexNotFound,
|
||||
Error::SwapIndexFoundDuringRename(_, _) => Code::IndexAlreadyExists,
|
||||
Error::SwapIndexesFoundDuringRename(_) => Code::IndexAlreadyExists,
|
||||
Error::InvalidTaskDate { field, .. } => (*field).into(),
|
||||
Error::InvalidTaskUid { .. } => Code::InvalidTaskUids,
|
||||
Error::InvalidBatchUid { .. } => Code::InvalidBatchUids,
|
||||
@@ -289,7 +250,6 @@ impl ErrorCode for Error {
|
||||
Error::InvalidTaskCanceledBy { .. } => Code::InvalidTaskCanceledBy,
|
||||
Error::InvalidIndexUid { .. } => Code::InvalidIndexUid,
|
||||
Error::TaskNotFound(_) => Code::TaskNotFound,
|
||||
Error::TaskFileNotFound(_) => Code::TaskFileNotFound,
|
||||
Error::BatchNotFound(_) => Code::BatchNotFound,
|
||||
Error::TaskDeletionWithEmptyQuery => Code::MissingTaskFilters,
|
||||
Error::TaskCancelationWithEmptyQuery => Code::MissingTaskFilters,
|
||||
@@ -297,8 +257,7 @@ impl ErrorCode for Error {
|
||||
Error::NoSpaceLeftInTaskQueue => Code::NoSpaceLeftOnDevice,
|
||||
Error::Dump(e) => e.error_code(),
|
||||
Error::Milli { error, .. } => error.error_code(),
|
||||
Error::ProcessBatchPanicked(_) => Code::Internal,
|
||||
Error::FromRemoteWhenExporting { .. } => Code::Internal,
|
||||
Error::ProcessBatchPanicked => Code::Internal,
|
||||
Error::Heed(e) => e.error_code(),
|
||||
Error::HeedTransaction(e) => e.error_code(),
|
||||
Error::FileStore(e) => e.error_code(),
|
||||
@@ -311,11 +270,7 @@ impl ErrorCode for Error {
|
||||
Error::CorruptedTaskQueue => Code::Internal,
|
||||
Error::CorruptedDump => Code::Internal,
|
||||
Error::DatabaseUpgrade(_) => Code::Internal,
|
||||
Error::Export(_) => Code::Internal,
|
||||
Error::RollbackFailed { .. } => Code::Internal,
|
||||
Error::UnrecoverableError(_) => Code::Internal,
|
||||
Error::IndexSchedulerVersionMismatch { .. } => Code::Internal,
|
||||
Error::IndexVersionMismatch { .. } => Code::Internal,
|
||||
Error::CreateBatch(_) => Code::Internal,
|
||||
|
||||
// This one should never be seen by the end user
|
||||
|
||||
@@ -1,9 +1,8 @@
|
||||
use std::sync::{Arc, RwLock};
|
||||
|
||||
use meilisearch_types::enterprise_edition::network::Network;
|
||||
use meilisearch_types::features::{InstanceTogglableFeatures, RuntimeTogglableFeatures};
|
||||
use meilisearch_types::heed::types::{SerdeJson, Str};
|
||||
use meilisearch_types::heed::{Database, Env, RwTxn, WithoutTls};
|
||||
use meilisearch_types::heed::{Database, Env, RwTxn};
|
||||
|
||||
use crate::error::FeatureNotEnabledError;
|
||||
use crate::Result;
|
||||
@@ -15,16 +14,10 @@ mod db_name {
|
||||
pub const EXPERIMENTAL_FEATURES: &str = "experimental-features";
|
||||
}
|
||||
|
||||
mod db_keys {
|
||||
pub const EXPERIMENTAL_FEATURES: &str = "experimental-features";
|
||||
pub const NETWORK: &str = "network";
|
||||
}
|
||||
|
||||
#[derive(Clone)]
|
||||
pub(crate) struct FeatureData {
|
||||
persisted: Database<Str, SerdeJson<RuntimeTogglableFeatures>>,
|
||||
runtime: Arc<RwLock<RuntimeTogglableFeatures>>,
|
||||
network: Arc<RwLock<Network>>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
@@ -86,91 +79,13 @@ impl RoFeatures {
|
||||
Ok(())
|
||||
} else {
|
||||
Err(FeatureNotEnabledError {
|
||||
disabled_action: "Using `CONTAINS` in a filter",
|
||||
disabled_action: "Using `CONTAINS` or `STARTS WITH` in a filter",
|
||||
feature: "contains filter",
|
||||
issue_link: "https://github.com/orgs/meilisearch/discussions/763",
|
||||
}
|
||||
.into())
|
||||
}
|
||||
}
|
||||
|
||||
pub fn check_network(&self, disabled_action: &'static str) -> Result<()> {
|
||||
if self.runtime.network {
|
||||
Ok(())
|
||||
} else {
|
||||
Err(FeatureNotEnabledError {
|
||||
disabled_action,
|
||||
feature: "network",
|
||||
issue_link: "https://github.com/orgs/meilisearch/discussions/805",
|
||||
}
|
||||
.into())
|
||||
}
|
||||
}
|
||||
|
||||
pub fn check_get_task_documents_route(&self) -> Result<()> {
|
||||
if self.runtime.get_task_documents_route {
|
||||
Ok(())
|
||||
} else {
|
||||
Err(FeatureNotEnabledError {
|
||||
disabled_action: "Getting the documents of an enqueued task",
|
||||
feature: "get task documents route",
|
||||
issue_link: "https://github.com/orgs/meilisearch/discussions/808",
|
||||
}
|
||||
.into())
|
||||
}
|
||||
}
|
||||
|
||||
pub fn check_composite_embedders(&self, disabled_action: &'static str) -> Result<()> {
|
||||
if self.runtime.composite_embedders {
|
||||
Ok(())
|
||||
} else {
|
||||
Err(FeatureNotEnabledError {
|
||||
disabled_action,
|
||||
feature: "composite embedders",
|
||||
issue_link: "https://github.com/orgs/meilisearch/discussions/816",
|
||||
}
|
||||
.into())
|
||||
}
|
||||
}
|
||||
|
||||
pub fn check_chat_completions(&self, disabled_action: &'static str) -> Result<()> {
|
||||
if self.runtime.chat_completions {
|
||||
Ok(())
|
||||
} else {
|
||||
Err(FeatureNotEnabledError {
|
||||
disabled_action,
|
||||
feature: "chat completions",
|
||||
issue_link: "https://github.com/orgs/meilisearch/discussions/835",
|
||||
}
|
||||
.into())
|
||||
}
|
||||
}
|
||||
|
||||
pub fn check_multimodal(&self, disabled_action: &'static str) -> Result<()> {
|
||||
if self.runtime.multimodal {
|
||||
Ok(())
|
||||
} else {
|
||||
Err(FeatureNotEnabledError {
|
||||
disabled_action,
|
||||
feature: "multimodal",
|
||||
issue_link: "https://github.com/orgs/meilisearch/discussions/846",
|
||||
}
|
||||
.into())
|
||||
}
|
||||
}
|
||||
|
||||
pub fn check_vector_store_setting(&self, disabled_action: &'static str) -> Result<()> {
|
||||
if self.runtime.vector_store_setting {
|
||||
Ok(())
|
||||
} else {
|
||||
Err(FeatureNotEnabledError {
|
||||
disabled_action,
|
||||
feature: "vector_store_setting",
|
||||
issue_link: "https://github.com/orgs/meilisearch/discussions/860",
|
||||
}
|
||||
.into())
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl FeatureData {
|
||||
@@ -179,7 +94,7 @@ impl FeatureData {
|
||||
}
|
||||
|
||||
pub fn new(
|
||||
env: &Env<WithoutTls>,
|
||||
env: &Env,
|
||||
wtxn: &mut RwTxn,
|
||||
instance_features: InstanceTogglableFeatures,
|
||||
) -> Result<Self> {
|
||||
@@ -187,7 +102,7 @@ impl FeatureData {
|
||||
env.create_database(wtxn, Some(db_name::EXPERIMENTAL_FEATURES))?;
|
||||
|
||||
let persisted_features: RuntimeTogglableFeatures =
|
||||
runtime_features_db.get(wtxn, db_keys::EXPERIMENTAL_FEATURES)?.unwrap_or_default();
|
||||
runtime_features_db.get(wtxn, db_name::EXPERIMENTAL_FEATURES)?.unwrap_or_default();
|
||||
let InstanceTogglableFeatures { metrics, logs_route, contains_filter } = instance_features;
|
||||
let runtime = Arc::new(RwLock::new(RuntimeTogglableFeatures {
|
||||
metrics: metrics || persisted_features.metrics,
|
||||
@@ -196,15 +111,7 @@ impl FeatureData {
|
||||
..persisted_features
|
||||
}));
|
||||
|
||||
// Once this is stabilized, network should be stored along with webhooks in index-scheduler's persisted database
|
||||
let network_db = runtime_features_db.remap_data_type::<SerdeJson<Network>>();
|
||||
let network: Network = network_db.get(wtxn, db_keys::NETWORK)?.unwrap_or_default();
|
||||
|
||||
Ok(Self {
|
||||
persisted: runtime_features_db,
|
||||
runtime,
|
||||
network: Arc::new(RwLock::new(network)),
|
||||
})
|
||||
Ok(Self { persisted: runtime_features_db, runtime })
|
||||
}
|
||||
|
||||
pub fn put_runtime_features(
|
||||
@@ -212,7 +119,7 @@ impl FeatureData {
|
||||
mut wtxn: RwTxn,
|
||||
features: RuntimeTogglableFeatures,
|
||||
) -> Result<()> {
|
||||
self.persisted.put(&mut wtxn, db_keys::EXPERIMENTAL_FEATURES, &features)?;
|
||||
self.persisted.put(&mut wtxn, db_name::EXPERIMENTAL_FEATURES, &features)?;
|
||||
wtxn.commit()?;
|
||||
|
||||
// safe to unwrap, the lock will only fail if:
|
||||
@@ -233,21 +140,4 @@ impl FeatureData {
|
||||
pub fn features(&self) -> RoFeatures {
|
||||
RoFeatures::new(self)
|
||||
}
|
||||
|
||||
pub fn put_network(&self, mut wtxn: RwTxn, new_network: Network) -> Result<()> {
|
||||
self.persisted.remap_data_type::<SerdeJson<Network>>().put(
|
||||
&mut wtxn,
|
||||
db_keys::NETWORK,
|
||||
&new_network,
|
||||
)?;
|
||||
wtxn.commit()?;
|
||||
|
||||
let mut network = self.network.write().unwrap();
|
||||
*network = new_network;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
pub fn network(&self) -> Network {
|
||||
Network::clone(&*self.network.read().unwrap())
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,7 +1,5 @@
|
||||
use std::collections::BTreeMap;
|
||||
use std::env::VarError;
|
||||
use std::path::Path;
|
||||
use std::str::FromStr;
|
||||
use std::time::Duration;
|
||||
|
||||
use meilisearch_types::heed::{EnvClosingEvent, EnvFlags, EnvOpenOptions};
|
||||
@@ -220,9 +218,11 @@ impl IndexMap {
|
||||
uuid: &Uuid,
|
||||
enable_mdb_writemap: bool,
|
||||
map_size_growth: usize,
|
||||
) -> Option<EnvClosingEvent> {
|
||||
let index = self.available.remove(uuid)?;
|
||||
Some(self.close(*uuid, index, enable_mdb_writemap, map_size_growth))
|
||||
) {
|
||||
let Some(index) = self.available.remove(uuid) else {
|
||||
return;
|
||||
};
|
||||
self.close(*uuid, index, enable_mdb_writemap, map_size_growth);
|
||||
}
|
||||
|
||||
fn close(
|
||||
@@ -231,21 +231,14 @@ impl IndexMap {
|
||||
index: Index,
|
||||
enable_mdb_writemap: bool,
|
||||
map_size_growth: usize,
|
||||
) -> EnvClosingEvent {
|
||||
) {
|
||||
let map_size = index.map_size() + map_size_growth;
|
||||
let closing_event = index.prepare_for_closing();
|
||||
let generation = self.next_generation();
|
||||
self.unavailable.insert(
|
||||
uuid,
|
||||
Some(ClosingIndex {
|
||||
uuid,
|
||||
closing_event: closing_event.clone(),
|
||||
enable_mdb_writemap,
|
||||
map_size,
|
||||
generation,
|
||||
}),
|
||||
Some(ClosingIndex { uuid, closing_event, enable_mdb_writemap, map_size, generation }),
|
||||
);
|
||||
closing_event
|
||||
}
|
||||
|
||||
/// Attempts to delete and index.
|
||||
@@ -309,21 +302,9 @@ fn create_or_open_index(
|
||||
map_size: usize,
|
||||
creation: bool,
|
||||
) -> Result<Index> {
|
||||
let options = EnvOpenOptions::new();
|
||||
let mut options = options.read_txn_without_tls();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(clamp_to_page_size(map_size));
|
||||
|
||||
// You can find more details about this experimental
|
||||
// environment variable on the following GitHub discussion:
|
||||
// <https://github.com/orgs/meilisearch/discussions/806>
|
||||
let max_readers = match std::env::var("MEILI_EXPERIMENTAL_INDEX_MAX_READERS") {
|
||||
Ok(value) => u32::from_str(&value).unwrap(),
|
||||
Err(VarError::NotPresent) => 1024,
|
||||
Err(VarError::NotUnicode(value)) => panic!(
|
||||
"Invalid unicode for the `MEILI_EXPERIMENTAL_INDEX_MAX_READERS` env var: {value:?}"
|
||||
),
|
||||
};
|
||||
options.max_readers(max_readers);
|
||||
options.max_readers(1024);
|
||||
if enable_mdb_writemap {
|
||||
unsafe { options.flags(EnvFlags::WRITE_MAP) };
|
||||
}
|
||||
@@ -339,7 +320,7 @@ fn create_or_open_index(
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
|
||||
use meilisearch_types::heed::{Env, WithoutTls};
|
||||
use meilisearch_types::heed::Env;
|
||||
use meilisearch_types::Index;
|
||||
use uuid::Uuid;
|
||||
|
||||
@@ -349,7 +330,7 @@ mod tests {
|
||||
use crate::IndexScheduler;
|
||||
|
||||
impl IndexMapper {
|
||||
fn test() -> (Self, Env<WithoutTls>, IndexSchedulerHandle) {
|
||||
fn test() -> (Self, Env, IndexSchedulerHandle) {
|
||||
let (index_scheduler, handle) = IndexScheduler::test(true, vec![]);
|
||||
(index_scheduler.index_mapper, index_scheduler.env, handle)
|
||||
}
|
||||
|
||||
@@ -4,10 +4,8 @@ use std::time::Duration;
|
||||
use std::{fs, thread};
|
||||
|
||||
use meilisearch_types::heed::types::{SerdeJson, Str};
|
||||
use meilisearch_types::heed::{Database, Env, EnvClosingEvent, RoTxn, RwTxn, WithoutTls};
|
||||
use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
|
||||
use meilisearch_types::milli;
|
||||
use meilisearch_types::milli::database_stats::DatabaseStats;
|
||||
use meilisearch_types::milli::index::RollbackOutcome;
|
||||
use meilisearch_types::milli::update::IndexerConfig;
|
||||
use meilisearch_types::milli::{FieldDistribution, Index};
|
||||
use serde::{Deserialize, Serialize};
|
||||
@@ -71,7 +69,7 @@ pub struct IndexMapper {
|
||||
/// Path to the folder where the LMDB environments of each index are.
|
||||
base_path: PathBuf,
|
||||
/// The map size an index is opened with on the first time.
|
||||
pub(crate) index_base_map_size: usize,
|
||||
index_base_map_size: usize,
|
||||
/// The quantity by which the map size of an index is incremented upon reopening, in bytes.
|
||||
index_growth_amount: usize,
|
||||
/// Whether we open a meilisearch index with the MDB_WRITEMAP option or not.
|
||||
@@ -100,25 +98,14 @@ pub enum IndexStatus {
|
||||
/// The statistics that can be computed from an `Index` object.
|
||||
#[derive(Serialize, Deserialize, Debug)]
|
||||
pub struct IndexStats {
|
||||
/// Stats of the documents database.
|
||||
#[serde(default)]
|
||||
pub documents_database_stats: DatabaseStats,
|
||||
|
||||
#[serde(default, skip_serializing)]
|
||||
pub number_of_documents: Option<u64>,
|
||||
|
||||
/// Number of documents in the index.
|
||||
pub number_of_documents: u64,
|
||||
/// Size taken up by the index' DB, in bytes.
|
||||
///
|
||||
/// This includes the size taken by both the used and free pages of the DB, and as the free pages
|
||||
/// are not returned to the disk after a deletion, this number is typically larger than
|
||||
/// `used_database_size` that only includes the size of the used pages.
|
||||
pub database_size: u64,
|
||||
/// Number of embeddings in the index.
|
||||
/// Option: retrocompatible with the stats of the pre-v1.13.0 versions of meilisearch
|
||||
pub number_of_embeddings: Option<u64>,
|
||||
/// Number of embedded documents in the index.
|
||||
/// Option: retrocompatible with the stats of the pre-v1.13.0 versions of meilisearch
|
||||
pub number_of_embedded_documents: Option<u64>,
|
||||
/// Size taken by the used pages of the index' DB, in bytes.
|
||||
///
|
||||
/// As the DB backend does not return to the disk the pages that are not currently used by the DB,
|
||||
@@ -143,12 +130,8 @@ impl IndexStats {
|
||||
///
|
||||
/// - rtxn: a RO transaction for the index, obtained from `Index::read_txn()`.
|
||||
pub fn new(index: &Index, rtxn: &RoTxn) -> milli::Result<Self> {
|
||||
let vector_store_stats = index.vector_store_stats(rtxn)?;
|
||||
Ok(IndexStats {
|
||||
number_of_embeddings: Some(vector_store_stats.number_of_embeddings),
|
||||
number_of_embedded_documents: Some(vector_store_stats.documents.len()),
|
||||
documents_database_stats: index.documents_stats(rtxn)?.unwrap_or_default(),
|
||||
number_of_documents: None,
|
||||
number_of_documents: index.number_of_documents(rtxn)?,
|
||||
database_size: index.on_disk_size()?,
|
||||
used_database_size: index.used_size()?,
|
||||
primary_key: index.primary_key(rtxn)?.map(|s| s.to_string()),
|
||||
@@ -165,7 +148,7 @@ impl IndexMapper {
|
||||
}
|
||||
|
||||
pub fn new(
|
||||
env: &Env<WithoutTls>,
|
||||
env: &Env,
|
||||
wtxn: &mut RwTxn,
|
||||
options: &IndexSchedulerOptions,
|
||||
budget: IndexBudget,
|
||||
@@ -341,24 +324,6 @@ impl IndexMapper {
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Closes the specified index.
|
||||
///
|
||||
/// This operation involves closing the underlying environment and so can take a long time to complete.
|
||||
///
|
||||
/// # Panics
|
||||
///
|
||||
/// - If the Index corresponding to the passed name is concurrently being deleted/resized or cannot be found in the
|
||||
/// in memory hash map.
|
||||
pub fn close_index(&self, rtxn: &RoTxn, name: &str) -> Result<Option<EnvClosingEvent>> {
|
||||
let uuid = self
|
||||
.index_mapping
|
||||
.get(rtxn, name)?
|
||||
.ok_or_else(|| Error::IndexNotFound(name.to_string()))?;
|
||||
|
||||
// We remove the index from the in-memory index map.
|
||||
Ok(self.index_map.write().unwrap().close_for_resize(&uuid, self.enable_mdb_writemap, 0))
|
||||
}
|
||||
|
||||
/// Return an index, may open it if it wasn't already opened.
|
||||
pub fn index(&self, rtxn: &RoTxn, name: &str) -> Result<Index> {
|
||||
if let Some((current_name, current_index)) =
|
||||
@@ -450,51 +415,6 @@ impl IndexMapper {
|
||||
Ok(index)
|
||||
}
|
||||
|
||||
pub fn rollback_index(
|
||||
&self,
|
||||
rtxn: &RoTxn,
|
||||
name: &str,
|
||||
to: (u32, u32, u32),
|
||||
) -> Result<RollbackOutcome> {
|
||||
// remove any currently updating index to make sure that we aren't keeping a reference to the index somewhere
|
||||
drop(self.currently_updating_index.write().unwrap().take());
|
||||
|
||||
let uuid = self
|
||||
.index_mapping
|
||||
.get(rtxn, name)?
|
||||
.ok_or_else(|| Error::IndexNotFound(name.to_string()))?;
|
||||
|
||||
// take the lock to make sure noone is messing with the indexes while we rollback
|
||||
// this will block any search or other operation, but we are rollbacking so this is probably acceptable.
|
||||
let mut index_map = self.index_map.write().unwrap();
|
||||
|
||||
'close_index: loop {
|
||||
match index_map.get(&uuid) {
|
||||
Available(_) => {
|
||||
index_map.close_for_resize(&uuid, self.enable_mdb_writemap, 0);
|
||||
// index should now be `Closing`; try again
|
||||
continue;
|
||||
}
|
||||
// index already closed
|
||||
Missing => break 'close_index,
|
||||
// closing requested by this thread or another one; wait for closing to complete, then exit
|
||||
Closing(closing_index) => {
|
||||
if closing_index.wait_timeout(Duration::from_secs(100)).is_none() {
|
||||
// release the lock so it doesn't get poisoned
|
||||
drop(index_map);
|
||||
panic!("cannot close index")
|
||||
}
|
||||
break;
|
||||
}
|
||||
BeingDeleted => return Err(Error::IndexNotFound(name.to_string())),
|
||||
};
|
||||
}
|
||||
|
||||
let index_path = self.base_path.join(uuid.to_string());
|
||||
Index::rollback(milli::heed::EnvOpenOptions::new().read_txn_without_tls(), index_path, to)
|
||||
.map_err(|err| crate::Error::from_milli(err, Some(name.to_string())))
|
||||
}
|
||||
|
||||
/// Attempts `f` for each index that exists in the index mapper.
|
||||
///
|
||||
/// It is preferable to use this function rather than a loop that opens all indexes, as a way to avoid having all indexes opened,
|
||||
@@ -544,20 +464,6 @@ impl IndexMapper {
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Rename an index.
|
||||
pub fn rename(&self, wtxn: &mut RwTxn, current: &str, new: &str) -> Result<()> {
|
||||
let uuid = self
|
||||
.index_mapping
|
||||
.get(wtxn, current)?
|
||||
.ok_or_else(|| Error::IndexNotFound(current.to_string()))?;
|
||||
if self.index_mapping.get(wtxn, new)?.is_some() {
|
||||
return Err(Error::IndexAlreadyExists(new.to_string()));
|
||||
}
|
||||
self.index_mapping.delete(wtxn, current)?;
|
||||
self.index_mapping.put(wtxn, new, &uuid)?;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// The stats of an index.
|
||||
///
|
||||
/// If available in the cache, they are directly returned.
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
use std::collections::BTreeSet;
|
||||
use std::fmt::Write;
|
||||
|
||||
use meilisearch_types::batches::{Batch, BatchEnqueuedAt, BatchStats};
|
||||
use meilisearch_types::batches::Batch;
|
||||
use meilisearch_types::heed::types::{SerdeBincode, SerdeJson, Str};
|
||||
use meilisearch_types::heed::{Database, RoTxn};
|
||||
use meilisearch_types::milli::{CboRoaringBitmapCodec, RoaringBitmapCodec, BEU32};
|
||||
@@ -20,22 +20,20 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
|
||||
|
||||
let IndexScheduler {
|
||||
cleanup_enabled: _,
|
||||
experimental_no_edition_2024_for_dumps: _,
|
||||
processing_tasks,
|
||||
env,
|
||||
version,
|
||||
queue,
|
||||
scheduler,
|
||||
persisted,
|
||||
|
||||
index_mapper,
|
||||
features: _,
|
||||
webhooks: _,
|
||||
webhook_url: _,
|
||||
webhook_authorization_header: _,
|
||||
test_breakpoint_sdr: _,
|
||||
planned_failures: _,
|
||||
run_loop_iteration: _,
|
||||
embedders: _,
|
||||
chat_settings: _,
|
||||
} = scheduler;
|
||||
|
||||
let rtxn = env.read_txn().unwrap();
|
||||
@@ -43,8 +41,11 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
|
||||
let mut snap = String::new();
|
||||
|
||||
let indx_sched_version = version.get_version(&rtxn).unwrap();
|
||||
let latest_version =
|
||||
(versioning::VERSION_MAJOR, versioning::VERSION_MINOR, versioning::VERSION_PATCH);
|
||||
let latest_version = (
|
||||
versioning::VERSION_MAJOR.parse().unwrap(),
|
||||
versioning::VERSION_MINOR.parse().unwrap(),
|
||||
versioning::VERSION_PATCH.parse().unwrap(),
|
||||
);
|
||||
if indx_sched_version != Some(latest_version) {
|
||||
snap.push_str(&format!("index scheduler running on version {indx_sched_version:?}\n"));
|
||||
}
|
||||
@@ -62,13 +63,6 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
|
||||
}
|
||||
snap.push_str("\n----------------------------------------------------------------------\n");
|
||||
|
||||
let persisted_db_snapshot = snapshot_persisted_db(&rtxn, persisted);
|
||||
if !persisted_db_snapshot.is_empty() {
|
||||
snap.push_str("### Persisted:\n");
|
||||
snap.push_str(&persisted_db_snapshot);
|
||||
snap.push_str("----------------------------------------------------------------------\n");
|
||||
}
|
||||
|
||||
snap.push_str("### All Tasks:\n");
|
||||
snap.push_str(&snapshot_all_tasks(&rtxn, queue.tasks.all_tasks));
|
||||
snap.push_str("----------------------------------------------------------------------\n");
|
||||
@@ -207,16 +201,6 @@ pub fn snapshot_date_db(rtxn: &RoTxn, db: Database<BEI128, CboRoaringBitmapCodec
|
||||
snap
|
||||
}
|
||||
|
||||
pub fn snapshot_persisted_db(rtxn: &RoTxn, db: &Database<Str, Str>) -> String {
|
||||
let mut snap = String::new();
|
||||
let iter = db.iter(rtxn).unwrap();
|
||||
for next in iter {
|
||||
let (key, value) = next.unwrap();
|
||||
snap.push_str(&format!("{key}: {value}\n"));
|
||||
}
|
||||
snap
|
||||
}
|
||||
|
||||
pub fn snapshot_task(task: &Task) -> String {
|
||||
let mut snap = String::new();
|
||||
let Task {
|
||||
@@ -230,7 +214,6 @@ pub fn snapshot_task(task: &Task) -> String {
|
||||
details,
|
||||
status,
|
||||
kind,
|
||||
network,
|
||||
} = task;
|
||||
snap.push('{');
|
||||
snap.push_str(&format!("uid: {uid}, "));
|
||||
@@ -248,9 +231,6 @@ pub fn snapshot_task(task: &Task) -> String {
|
||||
snap.push_str(&format!("details: {}, ", &snapshot_details(details)));
|
||||
}
|
||||
snap.push_str(&format!("kind: {kind:?}"));
|
||||
if let Some(network) = network {
|
||||
snap.push_str(&format!("network: {network:?}, "))
|
||||
}
|
||||
|
||||
snap.push('}');
|
||||
snap
|
||||
@@ -278,8 +258,8 @@ fn snapshot_details(d: &Details) -> String {
|
||||
Details::SettingsUpdate { settings } => {
|
||||
format!("{{ settings: {settings:?} }}")
|
||||
}
|
||||
Details::IndexInfo { primary_key, new_index_uid, old_index_uid } => {
|
||||
format!("{{ primary_key: {primary_key:?}, old_new_uid: {old_index_uid:?}, new_index_uid: {new_index_uid:?} }}")
|
||||
Details::IndexInfo { primary_key } => {
|
||||
format!("{{ primary_key: {primary_key:?} }}")
|
||||
}
|
||||
Details::DocumentDeletion {
|
||||
provided_ids: received_document_ids,
|
||||
@@ -311,15 +291,9 @@ fn snapshot_details(d: &Details) -> String {
|
||||
Details::IndexSwap { swaps } => {
|
||||
format!("{{ swaps: {swaps:?} }}")
|
||||
}
|
||||
Details::Export { url, api_key, payload_size, indexes } => {
|
||||
format!("{{ url: {url:?}, api_key: {api_key:?}, payload_size: {payload_size:?}, indexes: {indexes:?} }}")
|
||||
}
|
||||
Details::UpgradeDatabase { from, to } => {
|
||||
format!("{{ from: {from:?}, to: {to:?} }}")
|
||||
}
|
||||
Details::IndexCompaction { index_uid, pre_compaction_size, post_compaction_size } => {
|
||||
format!("{{ index_uid: {index_uid:?}, pre_compaction_size: {pre_compaction_size:?}, post_compaction_size: {post_compaction_size:?} }}")
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -335,7 +309,6 @@ pub fn snapshot_status(
|
||||
}
|
||||
snap
|
||||
}
|
||||
|
||||
pub fn snapshot_kind(rtxn: &RoTxn, db: Database<SerdeBincode<Kind>, RoaringBitmapCodec>) -> String {
|
||||
let mut snap = String::new();
|
||||
let iter = db.iter(rtxn).unwrap();
|
||||
@@ -356,7 +329,6 @@ pub fn snapshot_index_tasks(rtxn: &RoTxn, db: Database<Str, RoaringBitmapCodec>)
|
||||
}
|
||||
snap
|
||||
}
|
||||
|
||||
pub fn snapshot_canceled_by(rtxn: &RoTxn, db: Database<BEU32, RoaringBitmapCodec>) -> String {
|
||||
let mut snap = String::new();
|
||||
let iter = db.iter(rtxn).unwrap();
|
||||
@@ -369,41 +341,14 @@ pub fn snapshot_canceled_by(rtxn: &RoTxn, db: Database<BEU32, RoaringBitmapCodec
|
||||
|
||||
pub fn snapshot_batch(batch: &Batch) -> String {
|
||||
let mut snap = String::new();
|
||||
let Batch {
|
||||
uid,
|
||||
details,
|
||||
stats,
|
||||
embedder_stats,
|
||||
started_at,
|
||||
finished_at,
|
||||
progress: _,
|
||||
enqueued_at,
|
||||
stop_reason,
|
||||
} = batch;
|
||||
let stats = BatchStats {
|
||||
progress_trace: Default::default(),
|
||||
internal_database_sizes: Default::default(),
|
||||
write_channel_congestion: None,
|
||||
..stats.clone()
|
||||
};
|
||||
let Batch { uid, details, stats, started_at, finished_at, progress: _ } = batch;
|
||||
if let Some(finished_at) = finished_at {
|
||||
assert!(finished_at > started_at);
|
||||
}
|
||||
let BatchEnqueuedAt { earliest, oldest } = enqueued_at.unwrap();
|
||||
assert!(*started_at > earliest);
|
||||
assert!(earliest >= oldest);
|
||||
|
||||
snap.push('{');
|
||||
snap.push_str(&format!("uid: {uid}, "));
|
||||
snap.push_str(&format!("details: {}, ", serde_json::to_string(details).unwrap()));
|
||||
snap.push_str(&format!("stats: {}, ", serde_json::to_string(&stats).unwrap()));
|
||||
if !embedder_stats.skip_serializing() {
|
||||
snap.push_str(&format!(
|
||||
"embedder stats: {}, ",
|
||||
serde_json::to_string(&embedder_stats).unwrap()
|
||||
));
|
||||
}
|
||||
snap.push_str(&format!("stop reason: {}, ", serde_json::to_string(&stop_reason).unwrap()));
|
||||
snap.push_str(&format!("stats: {}, ", serde_json::to_string(stats).unwrap()));
|
||||
snap.push('}');
|
||||
snap
|
||||
}
|
||||
@@ -416,8 +361,7 @@ pub fn snapshot_index_mapper(rtxn: &RoTxn, mapper: &IndexMapper) -> String {
|
||||
let stats = mapper.stats_of(rtxn, &name).unwrap();
|
||||
s.push_str(&format!(
|
||||
"{name}: {{ number_of_documents: {}, field_distribution: {:?} }}\n",
|
||||
stats.documents_database_stats.number_of_entries(),
|
||||
stats.field_distribution
|
||||
stats.number_of_documents, stats.field_distribution
|
||||
));
|
||||
}
|
||||
|
||||
|
||||
@@ -1,6 +1,3 @@
|
||||
// The main Error type is large and boxing the large variant make the pattern matching fails
|
||||
#![allow(clippy::result_large_err)]
|
||||
|
||||
/*!
|
||||
This crate defines the index scheduler, which is responsible for:
|
||||
1. Keeping references to meilisearch's indexes and mapping them to their
|
||||
@@ -36,7 +33,7 @@ mod test_utils;
|
||||
pub mod upgrade;
|
||||
mod utils;
|
||||
pub mod uuid_codec;
|
||||
pub mod versioning;
|
||||
mod versioning;
|
||||
|
||||
pub type Result<T, E = Error> = std::result::Result<T, E>;
|
||||
pub type TaskId = u32;
|
||||
@@ -54,31 +51,22 @@ pub use features::RoFeatures;
|
||||
use flate2::bufread::GzEncoder;
|
||||
use flate2::Compression;
|
||||
use meilisearch_types::batches::Batch;
|
||||
use meilisearch_types::enterprise_edition::network::Network;
|
||||
use meilisearch_types::features::{
|
||||
ChatCompletionSettings, InstanceTogglableFeatures, RuntimeTogglableFeatures,
|
||||
};
|
||||
use meilisearch_types::features::{InstanceTogglableFeatures, RuntimeTogglableFeatures};
|
||||
use meilisearch_types::heed::byteorder::BE;
|
||||
use meilisearch_types::heed::types::{DecodeIgnore, SerdeJson, Str, I128};
|
||||
use meilisearch_types::heed::{self, Database, Env, RoTxn, WithoutTls};
|
||||
use meilisearch_types::heed::types::I128;
|
||||
use meilisearch_types::heed::{self, Env, RoTxn};
|
||||
use meilisearch_types::milli::index::IndexEmbeddingConfig;
|
||||
use meilisearch_types::milli::update::IndexerConfig;
|
||||
use meilisearch_types::milli::vector::json_template::JsonTemplate;
|
||||
use meilisearch_types::milli::vector::{
|
||||
Embedder, EmbedderOptions, RuntimeEmbedder, RuntimeEmbedders, RuntimeFragment,
|
||||
};
|
||||
use meilisearch_types::milli::vector::{Embedder, EmbedderOptions, EmbeddingConfigs};
|
||||
use meilisearch_types::milli::{self, Index};
|
||||
use meilisearch_types::task_view::TaskView;
|
||||
use meilisearch_types::tasks::{KindWithContent, Task, TaskNetwork};
|
||||
use meilisearch_types::webhooks::{Webhook, WebhooksDumpView, WebhooksView};
|
||||
use milli::vector::db::IndexEmbeddingConfig;
|
||||
use meilisearch_types::tasks::{KindWithContent, Task};
|
||||
use processing::ProcessingTasks;
|
||||
pub use queue::Query;
|
||||
use queue::Queue;
|
||||
use roaring::RoaringBitmap;
|
||||
use scheduler::Scheduler;
|
||||
use serde::{Deserialize, Serialize};
|
||||
use time::OffsetDateTime;
|
||||
use uuid::Uuid;
|
||||
use versioning::Versioning;
|
||||
|
||||
use crate::index_mapper::IndexMapper;
|
||||
@@ -86,17 +74,6 @@ use crate::utils::clamp_to_page_size;
|
||||
|
||||
pub(crate) type BEI128 = I128<BE>;
|
||||
|
||||
const TASK_SCHEDULER_SIZE_THRESHOLD_PERCENT_INT: u64 = 40;
|
||||
|
||||
mod db_name {
|
||||
pub const CHAT_SETTINGS: &str = "chat-settings";
|
||||
pub const PERSISTED: &str = "persisted";
|
||||
}
|
||||
|
||||
mod db_keys {
|
||||
pub const WEBHOOKS: &str = "webhooks";
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct IndexSchedulerOptions {
|
||||
/// The path to the version file of Meilisearch.
|
||||
@@ -113,10 +90,10 @@ pub struct IndexSchedulerOptions {
|
||||
pub snapshots_path: PathBuf,
|
||||
/// The path to the folder containing the dumps.
|
||||
pub dumps_path: PathBuf,
|
||||
/// The webhook url that was set by the CLI.
|
||||
pub cli_webhook_url: Option<String>,
|
||||
/// The Authorization header to send to the webhook URL that was set by the CLI.
|
||||
pub cli_webhook_authorization: Option<String>,
|
||||
/// The URL on which we must send the tasks statuses
|
||||
pub webhook_url: Option<String>,
|
||||
/// The value we will send into the Authorization HTTP header on the webhook URL
|
||||
pub webhook_authorization_header: Option<String>,
|
||||
/// The maximum size, in bytes, of the task index.
|
||||
pub task_db_size: usize,
|
||||
/// The size, in bytes, with which a meilisearch index is opened the first time of each meilisearch index.
|
||||
@@ -148,19 +125,13 @@ pub struct IndexSchedulerOptions {
|
||||
pub instance_features: InstanceTogglableFeatures,
|
||||
/// The experimental features enabled for this instance.
|
||||
pub auto_upgrade: bool,
|
||||
/// The maximal number of entries in the search query cache of an embedder.
|
||||
///
|
||||
/// 0 disables the cache.
|
||||
pub embedding_cache_cap: usize,
|
||||
/// Snapshot compaction status.
|
||||
pub experimental_no_snapshot_compaction: bool,
|
||||
}
|
||||
|
||||
/// Structure which holds meilisearch's indexes and schedules the tasks
|
||||
/// to be performed on them.
|
||||
pub struct IndexScheduler {
|
||||
/// The LMDB environment which the DBs are associated with.
|
||||
pub(crate) env: Env<WithoutTls>,
|
||||
pub(crate) env: Env,
|
||||
|
||||
/// The list of tasks currently processing
|
||||
pub(crate) processing_tasks: Arc<RwLock<ProcessingTasks>>,
|
||||
@@ -174,29 +145,17 @@ pub struct IndexScheduler {
|
||||
/// In charge of fetching and setting the status of experimental features.
|
||||
features: features::FeatureData,
|
||||
|
||||
/// Stores the custom chat prompts and other settings of the indexes.
|
||||
pub(crate) chat_settings: Database<Str, SerdeJson<ChatCompletionSettings>>,
|
||||
|
||||
/// Everything related to the processing of the tasks
|
||||
pub scheduler: scheduler::Scheduler,
|
||||
|
||||
/// Whether we should automatically cleanup the task queue or not.
|
||||
pub(crate) cleanup_enabled: bool,
|
||||
|
||||
/// Whether we should use the old document indexer or the new one.
|
||||
pub(crate) experimental_no_edition_2024_for_dumps: bool,
|
||||
/// The webhook url we should send tasks to after processing every batches.
|
||||
pub(crate) webhook_url: Option<String>,
|
||||
/// The Authorization header to send to the webhook URL.
|
||||
pub(crate) webhook_authorization_header: Option<String>,
|
||||
|
||||
/// A database to store single-keyed data that is persisted across restarts.
|
||||
persisted: Database<Str, Str>,
|
||||
|
||||
/// Webhook, loaded and stored in the `persisted` database
|
||||
webhooks: Arc<Webhooks>,
|
||||
|
||||
/// A map to retrieve the runtime representation of an embedder depending on its configuration.
|
||||
///
|
||||
/// This map may return the same embedder object for two different indexes or embedder settings,
|
||||
/// but it will only do this if the embedder configuration options are the same, leading
|
||||
/// to the same embeddings for the same input text.
|
||||
embedders: Arc<RwLock<HashMap<EmbedderOptions, Arc<Embedder>>>>,
|
||||
|
||||
// ================= test
|
||||
@@ -229,10 +188,8 @@ impl IndexScheduler {
|
||||
|
||||
index_mapper: self.index_mapper.clone(),
|
||||
cleanup_enabled: self.cleanup_enabled,
|
||||
experimental_no_edition_2024_for_dumps: self.experimental_no_edition_2024_for_dumps,
|
||||
persisted: self.persisted,
|
||||
|
||||
webhooks: self.webhooks.clone(),
|
||||
webhook_url: self.webhook_url.clone(),
|
||||
webhook_authorization_header: self.webhook_authorization_header.clone(),
|
||||
embedders: self.embedders.clone(),
|
||||
#[cfg(test)]
|
||||
test_breakpoint_sdr: self.test_breakpoint_sdr.clone(),
|
||||
@@ -241,24 +198,17 @@ impl IndexScheduler {
|
||||
#[cfg(test)]
|
||||
run_loop_iteration: self.run_loop_iteration.clone(),
|
||||
features: self.features.clone(),
|
||||
chat_settings: self.chat_settings,
|
||||
}
|
||||
}
|
||||
|
||||
pub(crate) const fn nb_db() -> u32 {
|
||||
Versioning::nb_db()
|
||||
+ Queue::nb_db()
|
||||
+ IndexMapper::nb_db()
|
||||
+ features::FeatureData::nb_db()
|
||||
+ 1 // chat-prompts
|
||||
+ 1 // persisted
|
||||
Versioning::nb_db() + Queue::nb_db() + IndexMapper::nb_db() + features::FeatureData::nb_db()
|
||||
}
|
||||
|
||||
/// Create an index scheduler and start its run loop.
|
||||
#[allow(private_interfaces)] // because test_utils is private
|
||||
pub fn new(
|
||||
options: IndexSchedulerOptions,
|
||||
auth_env: Env<WithoutTls>,
|
||||
from_db_version: (u32, u32, u32),
|
||||
#[cfg(test)] test_breakpoint_sdr: crossbeam_channel::Sender<(test_utils::Breakpoint, bool)>,
|
||||
#[cfg(test)] planned_failures: Vec<(usize, test_utils::FailureLocation)>,
|
||||
@@ -290,9 +240,7 @@ impl IndexScheduler {
|
||||
};
|
||||
|
||||
let env = unsafe {
|
||||
let env_options = heed::EnvOpenOptions::new();
|
||||
let mut env_options = env_options.read_txn_without_tls();
|
||||
env_options
|
||||
heed::EnvOpenOptions::new()
|
||||
.max_dbs(Self::nb_db())
|
||||
.map_size(budget.task_db_size)
|
||||
.open(&options.tasks_path)
|
||||
@@ -302,18 +250,9 @@ impl IndexScheduler {
|
||||
let version = versioning::Versioning::new(&env, from_db_version)?;
|
||||
|
||||
let mut wtxn = env.write_txn()?;
|
||||
|
||||
let features = features::FeatureData::new(&env, &mut wtxn, options.instance_features)?;
|
||||
let queue = Queue::new(&env, &mut wtxn, &options)?;
|
||||
let index_mapper = IndexMapper::new(&env, &mut wtxn, &options, budget)?;
|
||||
let chat_settings = env.create_database(&mut wtxn, Some(db_name::CHAT_SETTINGS))?;
|
||||
|
||||
let persisted = env.create_database(&mut wtxn, Some(db_name::PERSISTED))?;
|
||||
let webhooks_db = persisted.remap_data_type::<SerdeJson<Webhooks>>();
|
||||
let mut webhooks = webhooks_db.get(&wtxn, db_keys::WEBHOOKS)?.unwrap_or_default();
|
||||
webhooks
|
||||
.with_cli(options.cli_webhook_url.clone(), options.cli_webhook_authorization.clone());
|
||||
|
||||
wtxn.commit()?;
|
||||
|
||||
// allow unreachable_code to get rids of the warning in the case of a test build.
|
||||
@@ -321,16 +260,13 @@ impl IndexScheduler {
|
||||
processing_tasks: Arc::new(RwLock::new(ProcessingTasks::new())),
|
||||
version,
|
||||
queue,
|
||||
scheduler: Scheduler::new(&options, auth_env),
|
||||
scheduler: Scheduler::new(&options),
|
||||
|
||||
index_mapper,
|
||||
env,
|
||||
cleanup_enabled: options.cleanup_enabled,
|
||||
experimental_no_edition_2024_for_dumps: options
|
||||
.indexer_config
|
||||
.experimental_no_edition_2024_for_dumps,
|
||||
persisted,
|
||||
webhooks: Arc::new(webhooks),
|
||||
webhook_url: options.webhook_url,
|
||||
webhook_authorization_header: options.webhook_authorization_header,
|
||||
embedders: Default::default(),
|
||||
|
||||
#[cfg(test)]
|
||||
@@ -340,17 +276,12 @@ impl IndexScheduler {
|
||||
#[cfg(test)]
|
||||
run_loop_iteration: Arc::new(RwLock::new(0)),
|
||||
features,
|
||||
chat_settings,
|
||||
};
|
||||
|
||||
this.run();
|
||||
Ok(this)
|
||||
}
|
||||
|
||||
fn read_txn(&self) -> Result<RoTxn<'_, WithoutTls>> {
|
||||
self.env.read_txn().map_err(|e| e.into())
|
||||
}
|
||||
|
||||
/// Return `Ok(())` if the index scheduler is able to access one of its database.
|
||||
pub fn health(&self) -> Result<()> {
|
||||
let rtxn = self.env.read_txn()?;
|
||||
@@ -427,16 +358,15 @@ impl IndexScheduler {
|
||||
}
|
||||
}
|
||||
|
||||
pub fn read_txn(&self) -> Result<RoTxn> {
|
||||
self.env.read_txn().map_err(|e| e.into())
|
||||
}
|
||||
|
||||
/// Start the run loop for the given index scheduler.
|
||||
///
|
||||
/// This function will execute in a different thread and must be called
|
||||
/// only once per index scheduler.
|
||||
fn run(&self) {
|
||||
// If the number of batched tasks is 0, we don't need to run the scheduler at all.
|
||||
// It will never be able to process any tasks.
|
||||
if self.scheduler.max_number_of_batched_tasks == 0 {
|
||||
return;
|
||||
}
|
||||
let run = self.private_clone();
|
||||
std::thread::Builder::new()
|
||||
.name(String::from("scheduler"))
|
||||
@@ -454,9 +384,9 @@ impl IndexScheduler {
|
||||
Ok(Ok(TickOutcome::StopProcessingForever)) => break,
|
||||
Ok(Err(e)) => {
|
||||
tracing::error!("{e}");
|
||||
// Wait when an irrecoverable error occurs.
|
||||
// Wait one second when an irrecoverable error occurs.
|
||||
if !e.is_recoverable() {
|
||||
std::thread::sleep(Duration::from_secs(10));
|
||||
std::thread::sleep(Duration::from_secs(1));
|
||||
}
|
||||
}
|
||||
Err(_panic) => {
|
||||
@@ -483,17 +413,6 @@ impl IndexScheduler {
|
||||
Ok(self.env.non_free_pages_size()?)
|
||||
}
|
||||
|
||||
/// Return the maximum possible database size
|
||||
pub fn max_size(&self) -> Result<u64> {
|
||||
Ok(self.env.info().map_size as u64)
|
||||
}
|
||||
|
||||
/// Return the max size of task allowed until the task queue stop receiving.
|
||||
pub fn remaining_size_until_task_queue_stop(&self) -> Result<u64> {
|
||||
Ok((self.env.info().map_size as u64 * TASK_SCHEDULER_SIZE_THRESHOLD_PERCENT_INT / 100)
|
||||
.saturating_sub(self.used_size()?))
|
||||
}
|
||||
|
||||
/// Return the index corresponding to the name.
|
||||
///
|
||||
/// * If the index wasn't opened before, the index will be opened.
|
||||
@@ -508,14 +427,12 @@ impl IndexScheduler {
|
||||
/// If you need to fetch information from or perform an action on all indexes,
|
||||
/// see the `try_for_each_index` function.
|
||||
pub fn index(&self, name: &str) -> Result<Index> {
|
||||
let rtxn = self.env.read_txn()?;
|
||||
self.index_mapper.index(&rtxn, name)
|
||||
self.index_mapper.index(&self.env.read_txn()?, name)
|
||||
}
|
||||
|
||||
/// Return the boolean referring if index exists.
|
||||
pub fn index_exists(&self, name: &str) -> Result<bool> {
|
||||
let rtxn = self.env.read_txn()?;
|
||||
self.index_mapper.index_exists(&rtxn, name)
|
||||
self.index_mapper.index_exists(&self.env.read_txn()?, name)
|
||||
}
|
||||
|
||||
/// Return the name of all indexes without opening them.
|
||||
@@ -544,7 +461,7 @@ impl IndexScheduler {
|
||||
|
||||
/// Returns the total number of indexes available for the specified filter.
|
||||
/// And a `Vec` of the index_uid + its stats
|
||||
pub fn paginated_indexes_stats(
|
||||
pub fn get_paginated_indexes_stats(
|
||||
&self,
|
||||
filters: &meilisearch_auth::AuthFilter,
|
||||
from: usize,
|
||||
@@ -585,31 +502,12 @@ impl IndexScheduler {
|
||||
ret.map(|ret| (total, ret))
|
||||
}
|
||||
|
||||
/// Returns the total number of chat workspaces available ~~for the specified filter~~.
|
||||
/// And a `Vec` of the workspace_uids
|
||||
pub fn paginated_chat_workspace_uids(
|
||||
&self,
|
||||
from: usize,
|
||||
limit: usize,
|
||||
) -> Result<(usize, Vec<String>)> {
|
||||
let rtxn = self.read_txn()?;
|
||||
let total = self.chat_settings.len(&rtxn)?;
|
||||
let mut iter = self.chat_settings.iter(&rtxn)?.skip(from);
|
||||
iter.by_ref()
|
||||
.take(limit)
|
||||
.map(|ret| ret.map_err(Error::from))
|
||||
.map(|ret| ret.map(|(uid, _)| uid.to_string()))
|
||||
.collect::<Result<Vec<_>, Error>>()
|
||||
.map(|ret| (total as usize, ret))
|
||||
}
|
||||
|
||||
/// The returned structure contains:
|
||||
/// 1. The name of the property being observed can be `statuses`, `types`, or `indexes`.
|
||||
/// 2. The name of the specific data related to the property can be `enqueued` for the `statuses`, `settingsUpdate` for the `types`, or the name of the index for the `indexes`, for example.
|
||||
/// 3. The number of times the properties appeared.
|
||||
pub fn get_stats(&self) -> Result<BTreeMap<String, BTreeMap<String, u64>>> {
|
||||
let rtxn = self.read_txn()?;
|
||||
self.queue.get_stats(&rtxn, &self.processing_tasks.read().unwrap())
|
||||
self.queue.get_stats(&self.read_txn()?, &self.processing_tasks.read().unwrap())
|
||||
}
|
||||
|
||||
// Return true if there is at least one task that is processing.
|
||||
@@ -627,11 +525,6 @@ impl IndexScheduler {
|
||||
Ok(nbr_index_processing_tasks > 0)
|
||||
}
|
||||
|
||||
/// Whether the index should use the old document indexer.
|
||||
pub fn no_edition_2024_for_dumps(&self) -> bool {
|
||||
self.experimental_no_edition_2024_for_dumps
|
||||
}
|
||||
|
||||
/// Return the tasks matching the query from the user's point of view along
|
||||
/// with the total number of tasks matching the query, ignoring from and limit.
|
||||
///
|
||||
@@ -670,16 +563,6 @@ impl IndexScheduler {
|
||||
self.queue.get_task_ids_from_authorized_indexes(&rtxn, query, filters, &processing)
|
||||
}
|
||||
|
||||
pub fn set_task_network(&self, task_id: TaskId, network: TaskNetwork) -> Result<()> {
|
||||
let mut wtxn = self.env.write_txn()?;
|
||||
let mut task =
|
||||
self.queue.tasks.get_task(&wtxn, task_id)?.ok_or(Error::TaskNotFound(task_id))?;
|
||||
task.network = Some(network);
|
||||
self.queue.tasks.all_tasks.put(&mut wtxn, &task_id, &task)?;
|
||||
wtxn.commit()?;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Return the batches matching the query from the user's point of view along
|
||||
/// with the total number of batches matching the query, ignoring from and limit.
|
||||
///
|
||||
@@ -727,10 +610,9 @@ impl IndexScheduler {
|
||||
task_id: Option<TaskId>,
|
||||
dry_run: bool,
|
||||
) -> Result<Task> {
|
||||
// if the task doesn't delete or cancel anything and 40% of the task queue is full, we must refuse to enqueue the incoming task
|
||||
if !matches!(&kind, KindWithContent::TaskDeletion { tasks, .. } | KindWithContent::TaskCancelation { tasks, .. } if !tasks.is_empty())
|
||||
&& (self.env.non_free_pages_size()? * 100) / self.env.info().map_size as u64
|
||||
> TASK_SCHEDULER_SIZE_THRESHOLD_PERCENT_INT
|
||||
// if the task doesn't delete anything and 50% of the task queue is full, we must refuse to enqueue the incomming task
|
||||
if !matches!(&kind, KindWithContent::TaskDeletion { tasks, .. } if !tasks.is_empty())
|
||||
&& (self.env.non_free_pages_size()? * 100) / self.env.info().map_size as u64 > 40
|
||||
{
|
||||
return Err(Error::NoSpaceLeftInTaskQueue);
|
||||
}
|
||||
@@ -760,7 +642,7 @@ impl IndexScheduler {
|
||||
|
||||
/// Register a new task coming from a dump in the scheduler.
|
||||
/// By taking a mutable ref we're pretty sure no one will ever import a dump while actix is running.
|
||||
pub fn register_dumped_task(&mut self) -> Result<Dump<'_>> {
|
||||
pub fn register_dumped_task(&mut self) -> Result<Dump> {
|
||||
Dump::new(self)
|
||||
}
|
||||
|
||||
@@ -788,90 +670,86 @@ impl IndexScheduler {
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Once the tasks changes have been committed we must send all the tasks that were updated to our webhooks
|
||||
fn notify_webhooks(&self, updated: RoaringBitmap) {
|
||||
struct TaskReader<'a, 'b> {
|
||||
rtxn: &'a RoTxn<'a>,
|
||||
index_scheduler: &'a IndexScheduler,
|
||||
tasks: &'b mut roaring::bitmap::Iter<'b>,
|
||||
buffer: Vec<u8>,
|
||||
written: usize,
|
||||
}
|
||||
/// Once the tasks changes have been committed we must send all the tasks that were updated to our webhook if there is one.
|
||||
fn notify_webhook(&self, updated: &RoaringBitmap) -> Result<()> {
|
||||
if let Some(ref url) = self.webhook_url {
|
||||
struct TaskReader<'a, 'b> {
|
||||
rtxn: &'a RoTxn<'a>,
|
||||
index_scheduler: &'a IndexScheduler,
|
||||
tasks: &'b mut roaring::bitmap::Iter<'b>,
|
||||
buffer: Vec<u8>,
|
||||
written: usize,
|
||||
}
|
||||
|
||||
impl Read for TaskReader<'_, '_> {
|
||||
fn read(&mut self, mut buf: &mut [u8]) -> std::io::Result<usize> {
|
||||
if self.buffer.is_empty() {
|
||||
match self.tasks.next() {
|
||||
None => return Ok(0),
|
||||
Some(task_id) => {
|
||||
let task = self
|
||||
.index_scheduler
|
||||
.queue
|
||||
.tasks
|
||||
.get_task(self.rtxn, task_id)
|
||||
.map_err(io::Error::other)?
|
||||
.ok_or_else(|| io::Error::other(Error::CorruptedTaskQueue))?;
|
||||
impl<'a, 'b> Read for TaskReader<'a, 'b> {
|
||||
fn read(&mut self, mut buf: &mut [u8]) -> std::io::Result<usize> {
|
||||
if self.buffer.is_empty() {
|
||||
match self.tasks.next() {
|
||||
None => return Ok(0),
|
||||
Some(task_id) => {
|
||||
let task = self
|
||||
.index_scheduler
|
||||
.queue
|
||||
.tasks
|
||||
.get_task(self.rtxn, task_id)
|
||||
.map_err(|err| io::Error::new(io::ErrorKind::Other, err))?
|
||||
.ok_or_else(|| {
|
||||
io::Error::new(
|
||||
io::ErrorKind::Other,
|
||||
Error::CorruptedTaskQueue,
|
||||
)
|
||||
})?;
|
||||
|
||||
serde_json::to_writer(&mut self.buffer, &TaskView::from_task(&task))?;
|
||||
self.buffer.push(b'\n');
|
||||
serde_json::to_writer(
|
||||
&mut self.buffer,
|
||||
&TaskView::from_task(&task),
|
||||
)?;
|
||||
self.buffer.push(b'\n');
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
let mut to_write = &self.buffer[self.written..];
|
||||
let wrote = io::copy(&mut to_write, &mut buf)?;
|
||||
self.written += wrote as usize;
|
||||
|
||||
// we wrote everything and must refresh our buffer on the next call
|
||||
if self.written == self.buffer.len() {
|
||||
self.written = 0;
|
||||
self.buffer.clear();
|
||||
}
|
||||
|
||||
Ok(wrote as usize)
|
||||
}
|
||||
}
|
||||
|
||||
let mut to_write = &self.buffer[self.written..];
|
||||
let wrote = io::copy(&mut to_write, &mut buf)?;
|
||||
self.written += wrote as usize;
|
||||
let rtxn = self.env.read_txn()?;
|
||||
|
||||
// we wrote everything and must refresh our buffer on the next call
|
||||
if self.written == self.buffer.len() {
|
||||
self.written = 0;
|
||||
self.buffer.clear();
|
||||
}
|
||||
let task_reader = TaskReader {
|
||||
rtxn: &rtxn,
|
||||
index_scheduler: self,
|
||||
tasks: &mut updated.into_iter(),
|
||||
buffer: Vec::with_capacity(50), // on average a task is around ~100 bytes
|
||||
written: 0,
|
||||
};
|
||||
|
||||
Ok(wrote as usize)
|
||||
// let reader = GzEncoder::new(BufReader::new(task_reader), Compression::default());
|
||||
let reader = GzEncoder::new(BufReader::new(task_reader), Compression::default());
|
||||
let request = ureq::post(url)
|
||||
.timeout(Duration::from_secs(30))
|
||||
.set("Content-Encoding", "gzip")
|
||||
.set("Content-Type", "application/x-ndjson");
|
||||
let request = match &self.webhook_authorization_header {
|
||||
Some(header) => request.set("Authorization", header),
|
||||
None => request,
|
||||
};
|
||||
|
||||
if let Err(e) = request.send(reader) {
|
||||
tracing::error!("While sending data to the webhook: {e}");
|
||||
}
|
||||
}
|
||||
|
||||
let webhooks = self.webhooks.get_all();
|
||||
if webhooks.is_empty() {
|
||||
return;
|
||||
}
|
||||
let this = self.private_clone();
|
||||
// We must take the RoTxn before entering the thread::spawn otherwise another batch may be
|
||||
// processed before we had the time to take our txn.
|
||||
let rtxn = match self.env.clone().static_read_txn() {
|
||||
Ok(rtxn) => rtxn,
|
||||
Err(e) => {
|
||||
tracing::error!("Couldn't get an rtxn to notify the webhook: {e}");
|
||||
return;
|
||||
}
|
||||
};
|
||||
|
||||
std::thread::spawn(move || {
|
||||
for (uuid, Webhook { url, headers }) in webhooks.iter() {
|
||||
let task_reader = TaskReader {
|
||||
rtxn: &rtxn,
|
||||
index_scheduler: &this,
|
||||
tasks: &mut updated.iter(),
|
||||
buffer: Vec::with_capacity(page_size::get()),
|
||||
written: 0,
|
||||
};
|
||||
|
||||
let reader = GzEncoder::new(BufReader::new(task_reader), Compression::default());
|
||||
|
||||
let mut request = ureq::post(url)
|
||||
.timeout(Duration::from_secs(30))
|
||||
.set("Content-Encoding", "gzip")
|
||||
.set("Content-Type", "application/x-ndjson");
|
||||
for (header_name, header_value) in headers.iter() {
|
||||
request = request.set(header_name, header_value);
|
||||
}
|
||||
|
||||
if let Err(e) = request.send(reader) {
|
||||
tracing::error!("While sending data to the webhook {uuid}: {e}");
|
||||
}
|
||||
}
|
||||
});
|
||||
Ok(())
|
||||
}
|
||||
|
||||
pub fn index_stats(&self, index_uid: &str) -> Result<IndexStats> {
|
||||
@@ -892,85 +770,40 @@ impl IndexScheduler {
|
||||
Ok(())
|
||||
}
|
||||
|
||||
pub fn put_network(&self, network: Network) -> Result<()> {
|
||||
let wtxn = self.env.write_txn().map_err(Error::HeedTransaction)?;
|
||||
self.features.put_network(wtxn, network)?;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
pub fn network(&self) -> Network {
|
||||
self.features.network()
|
||||
}
|
||||
|
||||
pub fn update_runtime_webhooks(&self, runtime: RuntimeWebhooks) -> Result<()> {
|
||||
let webhooks = Webhooks::from_runtime(runtime);
|
||||
let mut wtxn = self.env.write_txn()?;
|
||||
let webhooks_db = self.persisted.remap_data_type::<SerdeJson<Webhooks>>();
|
||||
webhooks_db.put(&mut wtxn, db_keys::WEBHOOKS, &webhooks)?;
|
||||
wtxn.commit()?;
|
||||
self.webhooks.update_runtime(webhooks.into_runtime());
|
||||
Ok(())
|
||||
}
|
||||
|
||||
pub fn webhooks_dump_view(&self) -> WebhooksDumpView {
|
||||
// We must not dump the cli api key
|
||||
WebhooksDumpView { webhooks: self.webhooks.get_runtime() }
|
||||
}
|
||||
|
||||
pub fn webhooks_view(&self) -> WebhooksView {
|
||||
WebhooksView { webhooks: self.webhooks.get_all() }
|
||||
}
|
||||
|
||||
pub fn retrieve_runtime_webhooks(&self) -> RuntimeWebhooks {
|
||||
self.webhooks.get_runtime()
|
||||
}
|
||||
|
||||
// TODO: consider using a type alias or a struct embedder/template
|
||||
pub fn embedders(
|
||||
&self,
|
||||
index_uid: String,
|
||||
embedding_configs: Vec<IndexEmbeddingConfig>,
|
||||
) -> Result<RuntimeEmbedders> {
|
||||
) -> Result<EmbeddingConfigs> {
|
||||
let res: Result<_> = embedding_configs
|
||||
.into_iter()
|
||||
.map(
|
||||
|IndexEmbeddingConfig {
|
||||
name,
|
||||
config: milli::vector::EmbeddingConfig { embedder_options, prompt, quantized },
|
||||
fragments,
|
||||
}|
|
||||
-> Result<(String, Arc<RuntimeEmbedder>)> {
|
||||
let document_template = prompt
|
||||
.try_into()
|
||||
.map_err(meilisearch_types::milli::Error::from)
|
||||
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
|
||||
|
||||
let fragments = fragments
|
||||
.into_inner()
|
||||
.into_iter()
|
||||
.map(|fragment| {
|
||||
let value = embedder_options.fragment(&fragment.name).unwrap();
|
||||
let template = JsonTemplate::new(value.clone()).unwrap();
|
||||
RuntimeFragment { name: fragment.name, id: fragment.id, template }
|
||||
})
|
||||
.collect();
|
||||
..
|
||||
}| {
|
||||
let prompt = Arc::new(
|
||||
prompt
|
||||
.try_into()
|
||||
.map_err(meilisearch_types::milli::Error::from)
|
||||
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?,
|
||||
);
|
||||
// optimistically return existing embedder
|
||||
{
|
||||
let embedders = self.embedders.read().unwrap();
|
||||
if let Some(embedder) = embedders.get(&embedder_options) {
|
||||
let runtime = Arc::new(RuntimeEmbedder::new(
|
||||
embedder.clone(),
|
||||
document_template,
|
||||
fragments,
|
||||
quantized.unwrap_or_default(),
|
||||
return Ok((
|
||||
name,
|
||||
(embedder.clone(), prompt, quantized.unwrap_or_default()),
|
||||
));
|
||||
|
||||
return Ok((name, runtime));
|
||||
}
|
||||
}
|
||||
|
||||
// add missing embedder
|
||||
let embedder = Arc::new(
|
||||
Embedder::new(embedder_options.clone(), self.scheduler.embedding_cache_cap)
|
||||
Embedder::new(embedder_options.clone())
|
||||
.map_err(meilisearch_types::milli::vector::Error::from)
|
||||
.map_err(|err| {
|
||||
Error::from_milli(err.into(), Some(index_uid.clone()))
|
||||
@@ -980,44 +813,11 @@ impl IndexScheduler {
|
||||
let mut embedders = self.embedders.write().unwrap();
|
||||
embedders.insert(embedder_options, embedder.clone());
|
||||
}
|
||||
|
||||
let runtime = Arc::new(RuntimeEmbedder::new(
|
||||
embedder.clone(),
|
||||
document_template,
|
||||
fragments,
|
||||
quantized.unwrap_or_default(),
|
||||
));
|
||||
|
||||
Ok((name, runtime))
|
||||
Ok((name, (embedder, prompt, quantized.unwrap_or_default())))
|
||||
},
|
||||
)
|
||||
.collect();
|
||||
res.map(RuntimeEmbedders::new)
|
||||
}
|
||||
|
||||
pub fn chat_settings(&self, uid: &str) -> Result<Option<ChatCompletionSettings>> {
|
||||
let rtxn = self.env.read_txn()?;
|
||||
self.chat_settings.get(&rtxn, uid).map_err(Into::into)
|
||||
}
|
||||
|
||||
/// Return true if chat workspace exists.
|
||||
pub fn chat_workspace_exists(&self, name: &str) -> Result<bool> {
|
||||
let rtxn = self.env.read_txn()?;
|
||||
Ok(self.chat_settings.remap_data_type::<DecodeIgnore>().get(&rtxn, name)?.is_some())
|
||||
}
|
||||
|
||||
pub fn put_chat_settings(&self, uid: &str, settings: &ChatCompletionSettings) -> Result<()> {
|
||||
let mut wtxn = self.env.write_txn()?;
|
||||
self.chat_settings.put(&mut wtxn, uid, settings)?;
|
||||
wtxn.commit()?;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
pub fn delete_chat_settings(&self, uid: &str) -> Result<bool> {
|
||||
let mut wtxn = self.env.write_txn()?;
|
||||
let deleted = self.chat_settings.delete(&mut wtxn, uid)?;
|
||||
wtxn.commit()?;
|
||||
Ok(deleted)
|
||||
res.map(EmbeddingConfigs::new)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1053,72 +853,3 @@ pub struct IndexStats {
|
||||
/// Internal stats computed from the index.
|
||||
pub inner_stats: index_mapper::IndexStats,
|
||||
}
|
||||
|
||||
/// These structure are not meant to be exposed to the end user, if needed, use the meilisearch-types::webhooks structure instead.
|
||||
/// /!\ Everytime you deserialize this structure you should fill the cli_webhook later on with the `with_cli` method. /!\
|
||||
#[derive(Debug, Serialize, Deserialize, Default)]
|
||||
#[serde(rename_all = "camelCase")]
|
||||
struct Webhooks {
|
||||
// The cli webhook should *never* be stored in a database.
|
||||
// It represent a state that only exists for this execution of meilisearch
|
||||
#[serde(skip)]
|
||||
pub cli: Option<CliWebhook>,
|
||||
|
||||
#[serde(default)]
|
||||
pub runtime: RwLock<RuntimeWebhooks>,
|
||||
}
|
||||
|
||||
type RuntimeWebhooks = BTreeMap<Uuid, Webhook>;
|
||||
|
||||
impl Webhooks {
|
||||
pub fn with_cli(&mut self, url: Option<String>, auth: Option<String>) {
|
||||
if let Some(url) = url {
|
||||
let webhook = CliWebhook { url, auth };
|
||||
self.cli = Some(webhook);
|
||||
}
|
||||
}
|
||||
|
||||
pub fn from_runtime(webhooks: RuntimeWebhooks) -> Self {
|
||||
Self { cli: None, runtime: RwLock::new(webhooks) }
|
||||
}
|
||||
|
||||
pub fn into_runtime(self) -> RuntimeWebhooks {
|
||||
// safe because we own self and it cannot be cloned
|
||||
self.runtime.into_inner().unwrap()
|
||||
}
|
||||
|
||||
pub fn update_runtime(&self, webhooks: RuntimeWebhooks) {
|
||||
*self.runtime.write().unwrap() = webhooks;
|
||||
}
|
||||
|
||||
/// Returns all the webhooks in an unified view. The cli webhook is represented with an uuid set to 0
|
||||
pub fn get_all(&self) -> BTreeMap<Uuid, Webhook> {
|
||||
self.cli
|
||||
.as_ref()
|
||||
.map(|wh| (Uuid::nil(), Webhook::from(wh)))
|
||||
.into_iter()
|
||||
.chain(self.runtime.read().unwrap().iter().map(|(uuid, wh)| (*uuid, wh.clone())))
|
||||
.collect()
|
||||
}
|
||||
|
||||
/// Returns all the runtime webhooks.
|
||||
pub fn get_runtime(&self) -> BTreeMap<Uuid, Webhook> {
|
||||
self.runtime.read().unwrap().iter().map(|(uuid, wh)| (*uuid, wh.clone())).collect()
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Serialize, Deserialize, Default, Clone, PartialEq)]
|
||||
struct CliWebhook {
|
||||
pub url: String,
|
||||
pub auth: Option<String>,
|
||||
}
|
||||
|
||||
impl From<&CliWebhook> for Webhook {
|
||||
fn from(webhook: &CliWebhook) -> Self {
|
||||
let mut headers = BTreeMap::new();
|
||||
if let Some(ref auth) = webhook.auth {
|
||||
headers.insert("Authorization".to_string(), auth.to_string());
|
||||
}
|
||||
Self { url: webhook.url.to_string(), headers }
|
||||
}
|
||||
}
|
||||
|
||||
@@ -64,17 +64,9 @@ make_enum_progress! {
|
||||
}
|
||||
}
|
||||
|
||||
make_enum_progress! {
|
||||
pub enum FinalizingIndexStep {
|
||||
Committing,
|
||||
ComputingStats,
|
||||
}
|
||||
}
|
||||
|
||||
make_enum_progress! {
|
||||
pub enum TaskCancelationProgress {
|
||||
RetrievingTasks,
|
||||
CancelingUpgrade,
|
||||
UpdatingTasks,
|
||||
}
|
||||
}
|
||||
@@ -103,12 +95,9 @@ make_enum_progress! {
|
||||
pub enum DumpCreationProgress {
|
||||
StartTheDumpCreation,
|
||||
DumpTheApiKeys,
|
||||
DumpTheChatCompletionSettings,
|
||||
DumpTheTasks,
|
||||
DumpTheBatches,
|
||||
DumpTheIndexes,
|
||||
DumpTheExperimentalFeatures,
|
||||
DumpTheWebhooks,
|
||||
CompressTheDump,
|
||||
}
|
||||
}
|
||||
@@ -138,16 +127,6 @@ make_enum_progress! {
|
||||
}
|
||||
}
|
||||
|
||||
make_enum_progress! {
|
||||
pub enum IndexCompaction {
|
||||
RetrieveTheIndex,
|
||||
CreateTemporaryFile,
|
||||
CopyAndCompactTheIndex,
|
||||
PersistTheCompactedIndex,
|
||||
CloseTheIndex,
|
||||
}
|
||||
}
|
||||
|
||||
make_enum_progress! {
|
||||
pub enum InnerSwappingTwoIndexes {
|
||||
RetrieveTheTasks,
|
||||
@@ -187,17 +166,8 @@ make_enum_progress! {
|
||||
}
|
||||
}
|
||||
|
||||
make_enum_progress! {
|
||||
pub enum Export {
|
||||
EnsuringCorrectnessOfTheTarget,
|
||||
ExportingTheSettings,
|
||||
ExportingTheDocuments,
|
||||
}
|
||||
}
|
||||
|
||||
make_atomic_progress!(Task alias AtomicTaskStep => "task" );
|
||||
make_atomic_progress!(Document alias AtomicDocumentStep => "document" );
|
||||
make_atomic_progress!(Index alias AtomicIndexStep => "index" );
|
||||
make_atomic_progress!(Batch alias AtomicBatchStep => "batch" );
|
||||
make_atomic_progress!(UpdateFile alias AtomicUpdateFileStep => "update file" );
|
||||
|
||||
|
||||
@@ -3,7 +3,7 @@ use std::ops::{Bound, RangeBounds};
|
||||
|
||||
use meilisearch_types::batches::{Batch, BatchId};
|
||||
use meilisearch_types::heed::types::{DecodeIgnore, SerdeBincode, SerdeJson, Str};
|
||||
use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn, WithoutTls};
|
||||
use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
|
||||
use meilisearch_types::milli::{CboRoaringBitmapCodec, RoaringBitmapCodec, BEU32};
|
||||
use meilisearch_types::tasks::{Kind, Status};
|
||||
use roaring::{MultiOps, RoaringBitmap};
|
||||
@@ -12,8 +12,8 @@ use time::OffsetDateTime;
|
||||
use super::{Query, Queue};
|
||||
use crate::processing::ProcessingTasks;
|
||||
use crate::utils::{
|
||||
insert_task_datetime, keep_ids_within_datetimes, map_bound,
|
||||
remove_n_tasks_datetime_earlier_than, remove_task_datetime, ProcessingBatch,
|
||||
insert_task_datetime, keep_ids_within_datetimes, map_bound, remove_task_datetime,
|
||||
ProcessingBatch,
|
||||
};
|
||||
use crate::{Error, Result, BEI128};
|
||||
|
||||
@@ -66,7 +66,7 @@ impl BatchQueue {
|
||||
NUMBER_OF_DATABASES
|
||||
}
|
||||
|
||||
pub(super) fn new(env: &Env<WithoutTls>, wtxn: &mut RwTxn) -> Result<Self> {
|
||||
pub(super) fn new(env: &Env, wtxn: &mut RwTxn) -> Result<Self> {
|
||||
Ok(Self {
|
||||
all_batches: env.create_database(wtxn, Some(db_name::ALL_BATCHES))?,
|
||||
status: env.create_database(wtxn, Some(db_name::BATCH_STATUS))?,
|
||||
@@ -179,11 +179,8 @@ impl BatchQueue {
|
||||
progress: None,
|
||||
details: batch.details,
|
||||
stats: batch.stats,
|
||||
embedder_stats: batch.embedder_stats.as_ref().into(),
|
||||
started_at: batch.started_at,
|
||||
finished_at: batch.finished_at,
|
||||
enqueued_at: batch.enqueued_at,
|
||||
stop_reason: batch.reason.to_string(),
|
||||
},
|
||||
)?;
|
||||
|
||||
@@ -237,25 +234,34 @@ impl BatchQueue {
|
||||
// What we know, though, is that the task date is from before the enqueued_at, and max two timestamps have been written
|
||||
// to the DB per batches.
|
||||
if let Some(ref old_batch) = old_batch {
|
||||
if let Some(enqueued_at) = old_batch.enqueued_at {
|
||||
remove_task_datetime(wtxn, self.enqueued_at, enqueued_at.earliest, old_batch.uid)?;
|
||||
remove_task_datetime(wtxn, self.enqueued_at, enqueued_at.oldest, old_batch.uid)?;
|
||||
} else {
|
||||
// If we don't have the enqueued at in the batch it means the database comes from the v1.12
|
||||
// and we still need to find the date by scrolling the database
|
||||
remove_n_tasks_datetime_earlier_than(
|
||||
wtxn,
|
||||
self.enqueued_at,
|
||||
old_batch.started_at,
|
||||
old_batch.stats.total_nb_tasks.clamp(1, 2) as usize,
|
||||
old_batch.uid,
|
||||
)?;
|
||||
let started_at = old_batch.started_at.unix_timestamp_nanos();
|
||||
|
||||
// We have either one or two enqueued at to remove
|
||||
let mut exit = old_batch.stats.total_nb_tasks.clamp(0, 2);
|
||||
let mut iterator = self.enqueued_at.rev_iter_mut(wtxn)?;
|
||||
while let Some(entry) = iterator.next() {
|
||||
let (key, mut value) = entry?;
|
||||
if key > started_at {
|
||||
continue;
|
||||
}
|
||||
if value.remove(old_batch.uid) {
|
||||
exit = exit.saturating_sub(1);
|
||||
// Safe because the key and value are owned
|
||||
unsafe {
|
||||
iterator.put_current(&key, &value)?;
|
||||
}
|
||||
if exit == 0 {
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
// A finished batch MUST contains at least one task and have an enqueued_at
|
||||
let enqueued_at = batch.enqueued_at.as_ref().unwrap();
|
||||
insert_task_datetime(wtxn, self.enqueued_at, enqueued_at.earliest, batch.uid)?;
|
||||
insert_task_datetime(wtxn, self.enqueued_at, enqueued_at.oldest, batch.uid)?;
|
||||
if let Some(enqueued_at) = batch.oldest_enqueued_at {
|
||||
insert_task_datetime(wtxn, self.enqueued_at, enqueued_at, batch.uid)?;
|
||||
}
|
||||
if let Some(enqueued_at) = batch.earliest_enqueued_at {
|
||||
insert_task_datetime(wtxn, self.enqueued_at, enqueued_at, batch.uid)?;
|
||||
}
|
||||
|
||||
// Update the started at and finished at
|
||||
if let Some(ref old_batch) = old_batch {
|
||||
@@ -275,27 +281,19 @@ impl BatchQueue {
|
||||
pub(crate) fn get_existing_batches(
|
||||
&self,
|
||||
rtxn: &RoTxn,
|
||||
batches: impl IntoIterator<Item = BatchId>,
|
||||
tasks: impl IntoIterator<Item = BatchId>,
|
||||
processing: &ProcessingTasks,
|
||||
) -> Result<Vec<Batch>> {
|
||||
batches
|
||||
tasks
|
||||
.into_iter()
|
||||
.map(|batch_id| {
|
||||
if Some(batch_id) == processing.batch.as_ref().map(|batch| batch.uid) {
|
||||
let mut batch = processing.batch.as_ref().unwrap().to_batch();
|
||||
batch.progress = processing.get_progress_view();
|
||||
// Add progress_trace from the current progress state
|
||||
if let Some(progress) = &processing.progress {
|
||||
batch.stats.progress_trace = progress
|
||||
.accumulated_durations()
|
||||
.into_iter()
|
||||
.map(|(k, v)| (k, v.into()))
|
||||
.collect();
|
||||
}
|
||||
Ok(batch)
|
||||
} else {
|
||||
self.get_batch(rtxn, batch_id)
|
||||
.and_then(|batch| batch.ok_or(Error::CorruptedTaskQueue))
|
||||
.and_then(|task| task.ok_or(Error::CorruptedTaskQueue))
|
||||
}
|
||||
})
|
||||
.collect::<Result<_>>()
|
||||
|
||||
@@ -102,46 +102,30 @@ fn query_batches_simple() {
|
||||
.unwrap();
|
||||
assert_eq!(batches.len(), 1);
|
||||
batches[0].started_at = OffsetDateTime::UNIX_EPOCH;
|
||||
assert!(batches[0].enqueued_at.is_some());
|
||||
batches[0].enqueued_at = None;
|
||||
|
||||
if !batches[0].stats.progress_trace.is_empty() {
|
||||
batches[0].stats.progress_trace.clear();
|
||||
batches[0]
|
||||
.stats
|
||||
.progress_trace
|
||||
.insert("processing tasks".to_string(), "deterministic_duration".into());
|
||||
}
|
||||
|
||||
// Insta cannot snapshot our batches because the batch stats contains an enum as key: https://github.com/mitsuhiko/insta/issues/689
|
||||
let batch = serde_json::to_string_pretty(&batches[0]).unwrap();
|
||||
snapshot!(batch, @r###"
|
||||
{
|
||||
"uid": 0,
|
||||
"details": {
|
||||
"primaryKey": "mouse"
|
||||
},
|
||||
"stats": {
|
||||
"totalNbTasks": 1,
|
||||
"status": {
|
||||
"processing": 1
|
||||
},
|
||||
"types": {
|
||||
"indexCreation": 1
|
||||
},
|
||||
"indexUids": {
|
||||
"catto": 1
|
||||
},
|
||||
"progressTrace": {
|
||||
"processing tasks": "deterministic_duration"
|
||||
snapshot!(batch, @r#"
|
||||
{
|
||||
"uid": 0,
|
||||
"details": {
|
||||
"primaryKey": "mouse"
|
||||
},
|
||||
"stats": {
|
||||
"totalNbTasks": 1,
|
||||
"status": {
|
||||
"processing": 1
|
||||
},
|
||||
"types": {
|
||||
"indexCreation": 1
|
||||
},
|
||||
"indexUids": {
|
||||
"catto": 1
|
||||
}
|
||||
},
|
||||
"startedAt": "1970-01-01T00:00:00Z",
|
||||
"finishedAt": null
|
||||
}
|
||||
},
|
||||
"startedAt": "1970-01-01T00:00:00Z",
|
||||
"finishedAt": null,
|
||||
"enqueuedAt": null,
|
||||
"stopReason": "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task."
|
||||
}
|
||||
"###);
|
||||
"#);
|
||||
|
||||
let query = Query { statuses: Some(vec![Status::Enqueued]), ..Default::default() };
|
||||
let (batches, _) = index_scheduler
|
||||
@@ -346,11 +330,11 @@ fn query_batches_special_rules() {
|
||||
let kind = index_creation_task("doggo", "sheep");
|
||||
let _task = index_scheduler.register(kind, None, false).unwrap();
|
||||
let kind = KindWithContent::IndexSwap {
|
||||
swaps: vec![IndexSwap { indexes: ("catto".to_owned(), "doggo".to_owned()), rename: false }],
|
||||
swaps: vec![IndexSwap { indexes: ("catto".to_owned(), "doggo".to_owned()) }],
|
||||
};
|
||||
let _task = index_scheduler.register(kind, None, false).unwrap();
|
||||
let kind = KindWithContent::IndexSwap {
|
||||
swaps: vec![IndexSwap { indexes: ("catto".to_owned(), "whalo".to_owned()), rename: false }],
|
||||
swaps: vec![IndexSwap { indexes: ("catto".to_owned(), "whalo".to_owned()) }],
|
||||
};
|
||||
let _task = index_scheduler.register(kind, None, false).unwrap();
|
||||
|
||||
@@ -454,7 +438,7 @@ fn query_batches_canceled_by() {
|
||||
let kind = index_creation_task("doggo", "sheep");
|
||||
let _ = index_scheduler.register(kind, None, false).unwrap();
|
||||
let kind = KindWithContent::IndexSwap {
|
||||
swaps: vec![IndexSwap { indexes: ("catto".to_owned(), "doggo".to_owned()), rename: false }],
|
||||
swaps: vec![IndexSwap { indexes: ("catto".to_owned(), "doggo".to_owned()) }],
|
||||
};
|
||||
let _task = index_scheduler.register(kind, None, false).unwrap();
|
||||
|
||||
|
||||
@@ -8,12 +8,11 @@ mod tasks_test;
|
||||
mod test;
|
||||
|
||||
use std::collections::BTreeMap;
|
||||
use std::fs::File as StdFile;
|
||||
use std::time::Duration;
|
||||
|
||||
use file_store::FileStore;
|
||||
use meilisearch_types::batches::BatchId;
|
||||
use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn, WithoutTls};
|
||||
use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
|
||||
use meilisearch_types::milli::{CboRoaringBitmapCodec, BEU32};
|
||||
use meilisearch_types::tasks::{Kind, KindWithContent, Status, Task};
|
||||
use roaring::RoaringBitmap;
|
||||
@@ -157,7 +156,7 @@ impl Queue {
|
||||
|
||||
/// Create an index scheduler and start its run loop.
|
||||
pub(crate) fn new(
|
||||
env: &Env<WithoutTls>,
|
||||
env: &Env,
|
||||
wtxn: &mut RwTxn,
|
||||
options: &IndexSchedulerOptions,
|
||||
) -> Result<Self> {
|
||||
@@ -217,11 +216,6 @@ impl Queue {
|
||||
}
|
||||
}
|
||||
|
||||
/// Open and returns the task's content File.
|
||||
pub fn update_file(&self, uuid: Uuid) -> file_store::Result<StdFile> {
|
||||
self.file_store.get_update(uuid)
|
||||
}
|
||||
|
||||
/// Delete a file from the index scheduler.
|
||||
///
|
||||
/// Counterpart to the [`create_update_file`](IndexScheduler::create_update_file) method.
|
||||
@@ -279,7 +273,6 @@ impl Queue {
|
||||
details: kind.default_details(),
|
||||
status: Status::Enqueued,
|
||||
kind: kind.clone(),
|
||||
network: None,
|
||||
};
|
||||
// For deletion and cancelation tasks, we want to make extra sure that they
|
||||
// don't attempt to delete/cancel tasks that are newer than themselves.
|
||||
@@ -293,6 +286,8 @@ impl Queue {
|
||||
return Ok(task);
|
||||
}
|
||||
|
||||
// Get rid of the mutability.
|
||||
let task = task;
|
||||
self.tasks.register(wtxn, &task)?;
|
||||
|
||||
Ok(task)
|
||||
@@ -310,8 +305,7 @@ impl Queue {
|
||||
| self.tasks.status.get(wtxn, &Status::Failed)?.unwrap_or_default()
|
||||
| self.tasks.status.get(wtxn, &Status::Canceled)?.unwrap_or_default();
|
||||
|
||||
let to_delete =
|
||||
RoaringBitmap::from_sorted_iter(finished.into_iter().take(100_000)).unwrap();
|
||||
let to_delete = RoaringBitmap::from_iter(finished.into_iter().rev().take(100_000));
|
||||
|
||||
// /!\ the len must be at least 2 or else we might enter an infinite loop where we only delete
|
||||
// the deletion tasks we enqueued ourselves.
|
||||
@@ -327,7 +321,7 @@ impl Queue {
|
||||
);
|
||||
|
||||
// it's safe to unwrap here because we checked the len above
|
||||
let newest_task_id = to_delete.iter().next_back().unwrap();
|
||||
let newest_task_id = to_delete.iter().last().unwrap();
|
||||
let last_task_to_delete =
|
||||
self.tasks.get_task(wtxn, newest_task_id)?.ok_or(Error::CorruptedTaskQueue)?;
|
||||
|
||||
|
||||
@@ -1,14 +1,15 @@
|
||||
---
|
||||
source: crates/index-scheduler/src/queue/batches_test.rs
|
||||
snapshot_kind: text
|
||||
---
|
||||
### Autobatching Enabled = true
|
||||
### Processing batch None:
|
||||
[]
|
||||
----------------------------------------------------------------------
|
||||
### All Tasks:
|
||||
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("mouse"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
|
||||
1 {uid: 1, batch_uid: 1, status: canceled, canceled_by: 3, details: { primary_key: Some("sheep"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("sheep") }}
|
||||
2 {uid: 2, batch_uid: 1, status: canceled, canceled_by: 3, details: { swaps: [IndexSwap { indexes: ("catto", "doggo"), rename: false }] }, kind: IndexSwap { swaps: [IndexSwap { indexes: ("catto", "doggo"), rename: false }] }}
|
||||
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
|
||||
1 {uid: 1, batch_uid: 1, status: canceled, canceled_by: 3, details: { primary_key: Some("sheep") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("sheep") }}
|
||||
2 {uid: 2, batch_uid: 1, status: canceled, canceled_by: 3, details: { swaps: [IndexSwap { indexes: ("catto", "doggo") }] }, kind: IndexSwap { swaps: [IndexSwap { indexes: ("catto", "doggo") }] }}
|
||||
3 {uid: 3, batch_uid: 1, status: succeeded, details: { matched_tasks: 3, canceled_tasks: Some(2), original_filter: "test_query" }, kind: TaskCancelation { query: "test_query", tasks: RoaringBitmap<[0, 1, 2]> }}
|
||||
----------------------------------------------------------------------
|
||||
### Status:
|
||||
@@ -48,8 +49,8 @@ catto: { number_of_documents: 0, field_distribution: {} }
|
||||
[timestamp] [1,2,3,]
|
||||
----------------------------------------------------------------------
|
||||
### All Batches:
|
||||
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
|
||||
1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"],"rename":false}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 3 of type `taskCancelation` that cannot be batched with any other task.", }
|
||||
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
|
||||
1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, }
|
||||
----------------------------------------------------------------------
|
||||
### Batch to tasks mapping:
|
||||
0 [0,]
|
||||
|
||||
@@ -1,14 +1,15 @@
|
||||
---
|
||||
source: crates/index-scheduler/src/queue/batches_test.rs
|
||||
snapshot_kind: text
|
||||
---
|
||||
### Autobatching Enabled = true
|
||||
### Processing batch None:
|
||||
[]
|
||||
----------------------------------------------------------------------
|
||||
### All Tasks:
|
||||
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("bone"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
|
||||
1 {uid: 1, batch_uid: 1, status: succeeded, details: { primary_key: Some("plankton"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("plankton") }}
|
||||
2 {uid: 2, batch_uid: 2, status: succeeded, details: { primary_key: Some("his_own_vomit"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "catto", primary_key: Some("his_own_vomit") }}
|
||||
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("bone") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
|
||||
1 {uid: 1, batch_uid: 1, status: succeeded, details: { primary_key: Some("plankton") }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("plankton") }}
|
||||
2 {uid: 2, batch_uid: 2, status: succeeded, details: { primary_key: Some("his_own_vomit") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("his_own_vomit") }}
|
||||
----------------------------------------------------------------------
|
||||
### Status:
|
||||
enqueued []
|
||||
@@ -47,9 +48,9 @@ whalo: { number_of_documents: 0, field_distribution: {} }
|
||||
[timestamp] [2,]
|
||||
----------------------------------------------------------------------
|
||||
### All Batches:
|
||||
0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
|
||||
1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
|
||||
2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
|
||||
0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
|
||||
1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, }
|
||||
2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
|
||||
----------------------------------------------------------------------
|
||||
### Batch to tasks mapping:
|
||||
0 [0,]
|
||||
|
||||
@@ -1,12 +1,13 @@
|
||||
---
|
||||
source: crates/index-scheduler/src/queue/batches_test.rs
|
||||
snapshot_kind: text
|
||||
---
|
||||
### Autobatching Enabled = true
|
||||
### Processing batch None:
|
||||
[]
|
||||
----------------------------------------------------------------------
|
||||
### All Tasks:
|
||||
0 {uid: 0, status: enqueued, details: { primary_key: Some("bone"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
|
||||
0 {uid: 0, status: enqueued, details: { primary_key: Some("bone") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
|
||||
----------------------------------------------------------------------
|
||||
### Status:
|
||||
enqueued [0,]
|
||||
|
||||
@@ -1,13 +1,14 @@
|
||||
---
|
||||
source: crates/index-scheduler/src/queue/batches_test.rs
|
||||
snapshot_kind: text
|
||||
---
|
||||
### Autobatching Enabled = true
|
||||
### Processing batch None:
|
||||
[]
|
||||
----------------------------------------------------------------------
|
||||
### All Tasks:
|
||||
0 {uid: 0, status: enqueued, details: { primary_key: Some("bone"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
|
||||
1 {uid: 1, status: enqueued, details: { primary_key: Some("plankton"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("plankton") }}
|
||||
0 {uid: 0, status: enqueued, details: { primary_key: Some("bone") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
|
||||
1 {uid: 1, status: enqueued, details: { primary_key: Some("plankton") }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("plankton") }}
|
||||
----------------------------------------------------------------------
|
||||
### Status:
|
||||
enqueued [0,1,]
|
||||
|
||||
@@ -1,14 +1,15 @@
|
||||
---
|
||||
source: crates/index-scheduler/src/queue/batches_test.rs
|
||||
snapshot_kind: text
|
||||
---
|
||||
### Autobatching Enabled = true
|
||||
### Processing batch None:
|
||||
[]
|
||||
----------------------------------------------------------------------
|
||||
### All Tasks:
|
||||
0 {uid: 0, status: enqueued, details: { primary_key: Some("bone"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
|
||||
1 {uid: 1, status: enqueued, details: { primary_key: Some("plankton"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("plankton") }}
|
||||
2 {uid: 2, status: enqueued, details: { primary_key: Some("his_own_vomit"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "catto", primary_key: Some("his_own_vomit") }}
|
||||
0 {uid: 0, status: enqueued, details: { primary_key: Some("bone") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
|
||||
1 {uid: 1, status: enqueued, details: { primary_key: Some("plankton") }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("plankton") }}
|
||||
2 {uid: 2, status: enqueued, details: { primary_key: Some("his_own_vomit") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("his_own_vomit") }}
|
||||
----------------------------------------------------------------------
|
||||
### Status:
|
||||
enqueued [0,1,2,]
|
||||
|
||||
@@ -1,15 +1,16 @@
|
||||
---
|
||||
source: crates/index-scheduler/src/queue/batches_test.rs
|
||||
snapshot_kind: text
|
||||
---
|
||||
### Autobatching Enabled = true
|
||||
### Processing batch Some(1):
|
||||
[1,]
|
||||
{uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
|
||||
{uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
|
||||
----------------------------------------------------------------------
|
||||
### All Tasks:
|
||||
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("mouse"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
|
||||
1 {uid: 1, status: enqueued, details: { primary_key: Some("sheep"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("sheep") }}
|
||||
2 {uid: 2, status: enqueued, details: { primary_key: Some("fish"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("fish") }}
|
||||
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
|
||||
1 {uid: 1, status: enqueued, details: { primary_key: Some("sheep") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("sheep") }}
|
||||
2 {uid: 2, status: enqueued, details: { primary_key: Some("fish") }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("fish") }}
|
||||
----------------------------------------------------------------------
|
||||
### Status:
|
||||
enqueued [1,2,]
|
||||
@@ -42,7 +43,7 @@ catto: { number_of_documents: 0, field_distribution: {} }
|
||||
[timestamp] [0,]
|
||||
----------------------------------------------------------------------
|
||||
### All Batches:
|
||||
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
|
||||
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
|
||||
----------------------------------------------------------------------
|
||||
### Batch to tasks mapping:
|
||||
0 [0,]
|
||||
|
||||
@@ -1,14 +1,15 @@
|
||||
---
|
||||
source: crates/index-scheduler/src/queue/batches_test.rs
|
||||
snapshot_kind: text
|
||||
---
|
||||
### Autobatching Enabled = true
|
||||
### Processing batch None:
|
||||
[]
|
||||
----------------------------------------------------------------------
|
||||
### All Tasks:
|
||||
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("mouse"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
|
||||
1 {uid: 1, batch_uid: 1, status: succeeded, details: { primary_key: Some("sheep"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("sheep") }}
|
||||
2 {uid: 2, batch_uid: 2, status: failed, error: ResponseError { code: 200, message: "Planned failure for tests.", error_code: "internal", error_type: "internal", error_link: "https://docs.meilisearch.com/errors#internal" }, details: { primary_key: Some("fish"), old_new_uid: None, new_index_uid: None }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("fish") }}
|
||||
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
|
||||
1 {uid: 1, batch_uid: 1, status: succeeded, details: { primary_key: Some("sheep") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("sheep") }}
|
||||
2 {uid: 2, batch_uid: 2, status: failed, error: ResponseError { code: 200, message: "Planned failure for tests.", error_code: "internal", error_type: "internal", error_link: "https://docs.meilisearch.com/errors#internal" }, details: { primary_key: Some("fish") }, kind: IndexCreation { index_uid: "whalo", primary_key: Some("fish") }}
|
||||
----------------------------------------------------------------------
|
||||
### Status:
|
||||
enqueued []
|
||||
@@ -47,9 +48,9 @@ doggo: { number_of_documents: 0, field_distribution: {} }
|
||||
[timestamp] [2,]
|
||||
----------------------------------------------------------------------
|
||||
### All Batches:
|
||||
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
|
||||
1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
|
||||
2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
|
||||
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
|
||||
1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
|
||||
2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, }
|
||||
----------------------------------------------------------------------
|
||||
### Batch to tasks mapping:
|
||||
0 [0,]
|
||||
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user