Compare commits

..

29 Commits

Author SHA1 Message Date
Tamo
c0ea1a2c5a change the version of arroy 2025-02-27 15:51:23 +01:00
meili-bors[bot]
296ca1d58f Merge #5376
5376: Support dumpless upgrade for all v1.13 patches r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #5373 

## What does this PR do?
- Ensure the v1.13 versions are known to the index scheduler's upgrade code
- Forbid opening a database of v1.13.x from v1.13.y (see the version-check sketch below)
- Keep the old stat format to make sure the number of documents is available in the stats during the dumpless upgrade
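
A minimal sketch of that compatibility rule, with assumed names and structure rather than the actual index-scheduler code: a database written by another v1.13 patch is refused unless a dumpless upgrade is requested, in which case any known v1.13.x version can be migrated to the binary version.

```rust
// Hypothetical version guard; not the actual Meilisearch API.
fn check_version(
    db: (u32, u32, u32),
    binary: (u32, u32, u32),
    dumpless_upgrade: bool,
) -> Result<(), String> {
    // Same version, or an explicit upgrade request covering all known
    // v1.13.x patches: the database can be opened.
    if db == binary || dumpless_upgrade {
        return Ok(());
    }
    Err(format!(
        "cannot open a v{}.{}.{} database with a v{}.{}.{} binary without a dumpless upgrade",
        db.0, db.1, db.2, binary.0, binary.1, binary.2
    ))
}
```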


Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-27 11:02:40 +00:00
Louis Dureuil
d43a1644b1 Keep old stat format to make sure the number of documents is available during dumpless upgrade 2025-02-27 11:57:57 +01:00
Louis Dureuil
8cbcb1476e Forbid opening a db of v1.13.x from v1.13.y 2025-02-27 11:43:58 +01:00
Louis Dureuil
79110bf7b1 Support dumpless upgrade for all v1.13 patches 2025-02-27 10:56:41 +01:00
Louis Dureuil
3d5575a3e9 Update snapshots following version bump 2025-02-27 10:56:25 +01:00
Kerollmops
8524b59e83 Update version for the next release (v1.13.2) in Cargo.toml 2025-02-27 09:40:33 +00:00
meili-bors[bot]
ceec68cf7a Merge #5360
5360: Fix the dumpless upgrade log r=Kerollmops a=Kerollmops

This PR fixes a dumpless upgrade log issue where the current and target versions were the same value, which resulted in invalid logs like: _upgrading from v1.12.8 to v1.12.8_.
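
As a hedged illustration of the fix (hypothetical names, not the actual Meilisearch code): the source version has to be read from the database metadata on disk rather than from the binary's own version constant, otherwise both sides of the message are identical.

```rust
// `db_version` must come from the persisted database metadata; reusing the
// binary's version constant for both sides produced logs such as
// "upgrading from v1.12.8 to v1.12.8".
fn log_dumpless_upgrade(db_version: (u32, u32, u32), bin_version: (u32, u32, u32)) {
    let (dma, dmi, dpa) = db_version;
    let (bma, bmi, bpa) = bin_version;
    println!("upgrading from v{dma}.{dmi}.{dpa} to v{bma}.{bmi}.{bpa}");
}
```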

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-26 11:24:10 +00:00
meili-bors[bot]
f296c325ad Merge #5325
5325: Documents database stats r=irevoire a=ManyTheFish

# Pull Request

## Related issue
Fixes #5319

## List

- Create a DatabaseStats struct
- Compute and store the documents database stats in the IndexStats
- Force dumpless upgrade to update the index stats
- When a document addition, modification, or deletion is made, we only recompute the database stats for the added/modified/deleted documents (see the sketch after this list)
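
A minimal sketch of that incremental idea (the `DatabaseStats` name matches the PR, but the fields and methods here are assumptions, not the actual milli types): stats are adjusted per document instead of re-scanning the whole documents database, and the average uses `checked_div` so an empty database cannot cause a division by zero, matching the "Use checked_div in average computation" and "Fix zero division" commits below.

```rust
// Illustrative only; not the actual milli `DatabaseStats` implementation.
#[derive(Debug, Default, Clone)]
struct DatabaseStats {
    number_of_entries: u64,
    total_value_size: u64,
}

impl DatabaseStats {
    fn add_document(&mut self, value_size: u64) {
        self.number_of_entries += 1;
        self.total_value_size += value_size;
    }

    fn remove_document(&mut self, value_size: u64) {
        self.number_of_entries = self.number_of_entries.saturating_sub(1);
        self.total_value_size = self.total_value_size.saturating_sub(value_size);
    }

    // `checked_div` returns None on an empty database instead of panicking.
    fn average_value_size(&self) -> Option<u64> {
        self.total_value_size.checked_div(self.number_of_entries)
    }
}
```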

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2025-02-26 10:03:45 +00:00
ManyTheFish
df2fcac36c Fix fmt 2025-02-26 10:35:03 +01:00
Many the fish
5035487208 Update crates/milli/src/index.rs
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-26 10:28:51 +01:00
Many the fish
8ccd090f40 Update crates/milli/src/index.rs
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-26 10:28:25 +01:00
Kerollmops
834995b293 Fix the dumpless upgrade log 2025-02-26 10:14:55 +01:00
meili-bors[bot]
52b8abadf4 Merge #5367
5367: Bump mini-dashboard to v0.2.17 r=curquiza a=Strift

# Pull Request

## Related issue
Fixes #5361 

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Strift <lau.cazanove@gmail.com>
2025-02-25 13:28:49 +00:00
Strift
21f7c6f5af Bump 2025-02-25 20:42:43 +08:00
meili-bors[bot]
e937ba90c2 Merge #5346
5346: Hotfix typo tolerance bug r=Kerollmops a=ManyTheFish

# Pull Request

## Related issue
Fixes #5240

## What does this PR do?
- Add a test reproducing the bug
- fix the bug by relying on the exact_word database

## Explanation

The new indexer introduced in v1.12 does not put the words from exact attributes into the word FST, whereas the old indexer did.
So two fixes were possible:
1) Add the words from the exact-words database to the FST, knowing that they should never be retrieved with a typo
2) Make the search check the exact-words database in addition to the word FST to know whether a word exists

This PR implements the second fix.
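
A hedged sketch of that second fix (illustrative types, not the actual milli API): when computing the zero-typo query, a word counts as existing if it is found either in the word FST or in the exact-words database.

```rust
use std::collections::BTreeSet;

// Stand-ins for the word FST and the exact_word database; the real code
// queries fst/LMDB structures rather than in-memory sets.
struct WordLookup {
    word_fst: BTreeSet<String>,
    exact_words: BTreeSet<String>,
}

impl WordLookup {
    // The fix: consult the exact-words database in addition to the FST, since
    // the new indexer no longer inserts exact-attribute words into the FST.
    fn word_exists(&self, word: &str) -> bool {
        self.word_fst.contains(word) || self.exact_words.contains(word)
    }
}
```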

## Impact of the bug

A word can't be retrieved if it only appears in attributes listed in the `typoTolerance.disableOnAttributes` setting.


Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-02-20 08:42:46 +00:00
meili-bors[bot]
8607a166d0 Merge #5353
5353: Update version for the next release (v1.13.1) in Cargo.toml r=Kerollmops a=meili-bot

⚠️ This PR is automatically generated. Check that the new version is the expected one and that Cargo.lock has been updated before merging.

Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-18 10:57:55 +00:00
Kerollmops
abfce7d9b8 Update the snapshots 2025-02-18 11:31:20 +01:00
Kerollmops
9e94b101ea Update version for the next release (v1.13.1) in Cargo.toml 2025-02-18 09:19:17 +00:00
ManyTheFish
57b26f8441 fix clippy 2025-02-17 16:41:34 +01:00
ManyTheFish
9505f15c85 Dumpless upgrade 2025-02-17 16:37:17 +01:00
ManyTheFish
285c72a960 Update Snapshots 2025-02-17 16:36:58 +01:00
ManyTheFish
9a33628331 Implement Incremental document database stats computing 2025-02-17 16:36:33 +01:00
ManyTheFish
1bd57a9a94 Use checked_div in average computation 2025-02-17 11:02:04 +01:00
ManyTheFish
be676f9977 Fix zero division 2025-02-17 11:01:06 +01:00
ManyTheFish
fa27327db5 fix clippy 2025-02-17 11:01:06 +01:00
ManyTheFish
cd4ba395e4 fix snapshots 2025-02-17 11:01:06 +01:00
ManyTheFish
22bdec7e74 Add document database stats 2025-02-17 11:01:06 +01:00
ManyTheFish
96ba62da36 Check the exact_word database when computing zero typo query 2025-02-13 14:02:53 +01:00
73 changed files with 1121 additions and 2188 deletions

View File

@@ -22,16 +22,6 @@ Related product discussion:
<!---If necessary, create a list with technical/product steps-->
### Reminders when modifying the API
- [ ] Update the openAPI file with utoipa:
- [ ] If a new module has been introduced, create a new structure deriving [the OpenAPI proc-macro](https://docs.rs/utoipa/latest/utoipa/derive.OpenApi.html) and nest it in the main [openAPI structure](https://github.com/meilisearch/meilisearch/blob/f2185438eed60fa32d25b15480c5ee064f6fba4a/crates/meilisearch/src/routes/mod.rs#L64-L78).
- [ ] If a new route has been introduced, add the [path decorator](https://docs.rs/utoipa/latest/utoipa/attr.path.html) to it and add the route at the top of the file in its openAPI structure.
- [ ] If a structure which is deserialized or serialized in the API has been introduced or modified, it must derive the [`schema`](https://docs.rs/utoipa/latest/utoipa/macro.schema.html) or the [`IntoParams`](https://docs.rs/utoipa/latest/utoipa/derive.IntoParams.html) proc-macro.
If it's a **new** structure you must also add it to the big list of structures [in the main `OpenApi` structure](https://github.com/meilisearch/meilisearch/blob/f2185438eed60fa32d25b15480c5ee064f6fba4a/crates/meilisearch/src/routes/mod.rs#L88).
- [ ] Once everything is done, start Meilisearch with the swagger flag: `cargo run --features swagger`, open `http://localhost:7700/scalar` on your browser, and ensure everything works as expected.
- For more info, refer to [this presentation](https://pitch.com/v/generating-the-openapi-file-jrn3nh).
### Reminders when modifying the Setting API
<!--- Special steps to remind when adding a new index setting -->

View File

@@ -9,22 +9,22 @@ jobs:
flaky:
runs-on: ubuntu-latest
container:
# Use ubuntu-22.04 to compile with glibc 2.35
image: ubuntu:22.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: dtolnay/rust-toolchain@1.81
- name: Install cargo-flaky
run: cargo install cargo-flaky
- name: Run cargo flaky in the dumps
run: cd crates/dump; cargo flaky -i 100 --release
- name: Run cargo flaky in the index-scheduler
run: cd crates/index-scheduler; cargo flaky -i 100 --release
- name: Run cargo flaky in the auth
run: cd crates/meilisearch-auth; cargo flaky -i 100 --release
- name: Run cargo flaky in meilisearch
run: cd crates/meilisearch; cargo flaky -i 100 --release
- uses: actions/checkout@v3
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: dtolnay/rust-toolchain@1.81
- name: Install cargo-flaky
run: cargo install cargo-flaky
- name: Run cargo flaky in the dumps
run: cd crates/dump; cargo flaky -i 100 --release
- name: Run cargo flaky in the index-scheduler
run: cd crates/index-scheduler; cargo flaky -i 100 --release
- name: Run cargo flaky in the auth
run: cd crates/meilisearch-auth; cargo flaky -i 100 --release
- name: Run cargo flaky in meilisearch
run: cd crates/meilisearch; cargo flaky -i 100 --release

View File

@@ -18,28 +18,28 @@ jobs:
runs-on: ubuntu-latest
needs: check-version
container:
# Use ubuntu-22.04 to compile with glibc 2.35
image: ubuntu:22.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: dtolnay/rust-toolchain@1.81
- name: Install cargo-deb
run: cargo install cargo-deb
- uses: actions/checkout@v3
- name: Build deb package
run: cargo deb -p meilisearch -o target/debian/meilisearch.deb
- name: Upload debian pkg to release
uses: svenstaro/upload-release-action@2.7.0
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/debian/meilisearch.deb
asset_name: meilisearch.deb
tag: ${{ github.ref }}
- name: Upload debian pkg to apt repository
run: curl -F package=@target/debian/meilisearch.deb https://${{ secrets.GEMFURY_PUSH_TOKEN }}@push.fury.io/meilisearch/
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: dtolnay/rust-toolchain@1.81
- name: Install cargo-deb
run: cargo install cargo-deb
- uses: actions/checkout@v3
- name: Build deb package
run: cargo deb -p meilisearch -o target/debian/meilisearch.deb
- name: Upload debian pkg to release
uses: svenstaro/upload-release-action@2.7.0
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/debian/meilisearch.deb
asset_name: meilisearch.deb
tag: ${{ github.ref }}
- name: Upload debian pkg to apt repository
run: curl -F package=@target/debian/meilisearch.deb https://${{ secrets.GEMFURY_PUSH_TOKEN }}@push.fury.io/meilisearch/
homebrew:
name: Bump Homebrew formula

View File

@@ -3,7 +3,7 @@ name: Publish binaries to GitHub release
on:
workflow_dispatch:
schedule:
- cron: "0 2 * * *" # Every day at 2:00am
- cron: '0 2 * * *' # Every day at 2:00am
release:
types: [published]
@@ -37,26 +37,26 @@ jobs:
runs-on: ubuntu-latest
needs: check-version
container:
# Use ubuntu-22.04 to compile with glibc 2.35
image: ubuntu:22.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: dtolnay/rust-toolchain@1.81
- name: Build
run: cargo build --release --locked
# No need to upload binaries for dry run (cron)
- name: Upload binaries to release
if: github.event_name == 'release'
uses: svenstaro/upload-release-action@2.7.0
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/release/meilisearch
asset_name: meilisearch-linux-amd64
tag: ${{ github.ref }}
- uses: actions/checkout@v3
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: dtolnay/rust-toolchain@1.81
- name: Build
run: cargo build --release --locked
# No need to upload binaries for dry run (cron)
- name: Upload binaries to release
if: github.event_name == 'release'
uses: svenstaro/upload-release-action@2.7.0
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/release/meilisearch
asset_name: meilisearch-linux-amd64
tag: ${{ github.ref }}
publish-macos-windows:
name: Publish binary for ${{ matrix.os }}
@@ -74,19 +74,19 @@ jobs:
artifact_name: meilisearch.exe
asset_name: meilisearch-windows-amd64.exe
steps:
- uses: actions/checkout@v3
- uses: dtolnay/rust-toolchain@1.81
- name: Build
run: cargo build --release --locked
# No need to upload binaries for dry run (cron)
- name: Upload binaries to release
if: github.event_name == 'release'
uses: svenstaro/upload-release-action@2.7.0
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/release/${{ matrix.artifact_name }}
asset_name: ${{ matrix.asset_name }}
tag: ${{ github.ref }}
- uses: actions/checkout@v3
- uses: dtolnay/rust-toolchain@1.81
- name: Build
run: cargo build --release --locked
# No need to upload binaries for dry run (cron)
- name: Upload binaries to release
if: github.event_name == 'release'
uses: svenstaro/upload-release-action@2.7.0
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/release/${{ matrix.artifact_name }}
asset_name: ${{ matrix.asset_name }}
tag: ${{ github.ref }}
publish-macos-apple-silicon:
name: Publish binary for macOS silicon
@@ -127,8 +127,8 @@ jobs:
env:
DEBIAN_FRONTEND: noninteractive
container:
# Use ubuntu-22.04 to compile with glibc 2.35
image: ubuntu:22.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
strategy:
matrix:
include:

View File

@@ -4,7 +4,7 @@ on:
workflow_dispatch:
schedule:
# Everyday at 5:00am
- cron: "0 5 * * *"
- cron: '0 5 * * *'
pull_request:
push:
# trying and staging branches are for Bors config
@@ -19,11 +19,11 @@ env:
jobs:
test-linux:
name: Tests on ubuntu-22.04
name: Tests on ubuntu-20.04
runs-on: ubuntu-latest
container:
# Use ubuntu-22.04 to compile with glibc 2.35
image: ubuntu:22.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies
@@ -72,8 +72,8 @@ jobs:
name: Tests almost all features
runs-on: ubuntu-latest
container:
# Use ubuntu-22.04 to compile with glibc 2.35
image: ubuntu:22.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
steps:
- uses: actions/checkout@v3
@@ -84,48 +84,16 @@ jobs:
- uses: dtolnay/rust-toolchain@1.81
- name: Run cargo build with almost all features
run: |
cargo build --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda,test-ollama)"
cargo build --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda)"
- name: Run cargo test with almost all features
run: |
cargo test --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda,test-ollama)"
ollama-ubuntu:
name: Test with Ollama
runs-on: ubuntu-latest
env:
MEILI_TEST_OLLAMA_SERVER: "http://localhost:11434"
steps:
- uses: actions/checkout@v1
- name: Install Ollama
run: |
curl -fsSL https://ollama.com/install.sh | sudo -E sh
- name: Start serving
run: |
# Run it in the background, there is no way to daemonise at the moment
ollama serve &
# A short pause is required before the HTTP port is opened
sleep 5
# This endpoint blocks until ready
time curl -i http://localhost:11434
- name: Pull nomic-embed-text & all-minilm
run: |
ollama pull nomic-embed-text
ollama pull all-minilm
- name: Run cargo test
uses: actions-rs/cargo@v1
with:
command: test
args: --locked --release --all --features test-ollama ollama
cargo test --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda)"
test-disabled-tokenization:
name: Test disabled tokenization
runs-on: ubuntu-latest
container:
image: ubuntu:22.04
image: ubuntu:20.04
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
steps:
- uses: actions/checkout@v3
@@ -149,8 +117,8 @@ jobs:
name: Run tests in debug
runs-on: ubuntu-latest
container:
# Use ubuntu-22.04 to compile with glibc 2.35
image: ubuntu:22.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies

53
Cargo.lock generated
View File

@@ -394,8 +394,7 @@ checksum = "96d30a06541fbafbc7f82ed10c06164cfbd2c401138f6addd8404629c4b16711"
[[package]]
name = "arroy"
version = "0.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dfc5f272f38fa063bbff0a7ab5219404e221493de005e2b4078c62d626ef567e"
source = "git+https://github.com/meilisearch/arroy/?tag=DO-NOT-DELETE-upgrade-v04-to-v05#053807bf38dc079f25b003f19fc30fbf3613f6e7"
dependencies = [
"bytemuck",
"byteorder",
@@ -414,7 +413,7 @@ dependencies = [
[[package]]
name = "arroy"
version = "0.5.0"
source = "git+https://github.com/meilisearch/arroy/?tag=DO-NOT-DELETE-upgrade-v04-to-v05#053807bf38dc079f25b003f19fc30fbf3613f6e7"
source = "git+https://github.com/meilisearch/arroy?rev=55fb0e8006f4f00ad3197b06bda348133f6ffffa#55fb0e8006f4f00ad3197b06bda348133f6ffffa"
dependencies = [
"bytemuck",
"byteorder",
@@ -427,7 +426,7 @@ dependencies = [
"rayon",
"roaring",
"tempfile",
"thiserror 1.0.69",
"thiserror 2.0.9",
]
[[package]]
@@ -503,7 +502,7 @@ source = "git+https://github.com/meilisearch/bbqueue#cbb87cc707b5af415ef203bdaf2
[[package]]
name = "benchmarks"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"anyhow",
"bumpalo",
@@ -694,7 +693,7 @@ dependencies = [
[[package]]
name = "build-info"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"anyhow",
"time",
@@ -1671,7 +1670,7 @@ dependencies = [
[[package]]
name = "dump"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"anyhow",
"big_s",
@@ -1873,7 +1872,7 @@ checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be"
[[package]]
name = "file-store"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"tempfile",
"thiserror 2.0.9",
@@ -1895,7 +1894,7 @@ dependencies = [
[[package]]
name = "filter-parser"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"insta",
"nom",
@@ -1915,7 +1914,7 @@ dependencies = [
[[package]]
name = "flatten-serde-json"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"criterion",
"serde_json",
@@ -2054,7 +2053,7 @@ dependencies = [
[[package]]
name = "fuzzers"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"arbitrary",
"bumpalo",
@@ -2743,10 +2742,10 @@ checksum = "206ca75c9c03ba3d4ace2460e57b189f39f43de612c2f85836e65c929701bb2d"
[[package]]
name = "index-scheduler"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"anyhow",
"arroy 0.5.0 (registry+https://github.com/rust-lang/crates.io-index)",
"arroy 0.5.0 (git+https://github.com/meilisearch/arroy?rev=55fb0e8006f4f00ad3197b06bda348133f6ffffa)",
"big_s",
"bincode",
"bumpalo",
@@ -2950,7 +2949,7 @@ dependencies = [
[[package]]
name = "json-depth-checker"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"criterion",
"serde_json",
@@ -3513,9 +3512,9 @@ checksum = "9374ef4228402d4b7e403e5838cb880d9ee663314b0a900d5a6aabf0c213552e"
[[package]]
name = "log"
version = "0.4.21"
version = "0.4.26"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "90ed8c1e510134f979dbc4f070f87d4313098b704861a105fe34231c70a3901c"
checksum = "30bde2b3dc3671ae49d8e2e9f044c7c005836e7a023ee57cffa25ab82764bb9e"
[[package]]
name = "lzma-rs"
@@ -3569,7 +3568,7 @@ checksum = "490cc448043f947bae3cbee9c203358d62dbee0db12107a74be5c30ccfd09771"
[[package]]
name = "meili-snap"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"insta",
"md5",
@@ -3578,7 +3577,7 @@ dependencies = [
[[package]]
name = "meilisearch"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"actix-cors",
"actix-http",
@@ -3670,7 +3669,7 @@ dependencies = [
[[package]]
name = "meilisearch-auth"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"base64 0.22.1",
"enum-iterator",
@@ -3689,7 +3688,7 @@ dependencies = [
[[package]]
name = "meilisearch-types"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"actix-web",
"anyhow",
@@ -3723,7 +3722,7 @@ dependencies = [
[[package]]
name = "meilitool"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"anyhow",
"arroy 0.5.0 (git+https://github.com/meilisearch/arroy/?tag=DO-NOT-DELETE-upgrade-v04-to-v05)",
@@ -3758,10 +3757,10 @@ dependencies = [
[[package]]
name = "milli"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"allocator-api2",
"arroy 0.5.0 (registry+https://github.com/rust-lang/crates.io-index)",
"arroy 0.5.0 (git+https://github.com/meilisearch/arroy?rev=55fb0e8006f4f00ad3197b06bda348133f6ffffa)",
"bbqueue",
"big_s",
"bimap",
@@ -4270,7 +4269,7 @@ checksum = "e3148f5046208a5d56bcfc03053e3ca6334e51da8dfb19b6cdc8b306fae3283e"
[[package]]
name = "permissive-json-pointer"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"big_s",
"serde_json",
@@ -5160,9 +5159,9 @@ dependencies = [
[[package]]
name = "serde_json"
version = "1.0.138"
version = "1.0.135"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d434192e7da787e94a6ea7e9670b26a036d0ca41e0b7efb2676dd32bae872949"
checksum = "2b0d7ba2887406110130a978386c4e1befb98c674b4fba677954e4db976630d9"
dependencies = [
"indexmap",
"itoa",
@@ -6847,7 +6846,7 @@ dependencies = [
[[package]]
name = "xtask"
version = "1.13.1"
version = "1.13.2"
dependencies = [
"anyhow",
"build-info",

View File

@@ -22,7 +22,7 @@ members = [
]
[workspace.package]
version = "1.13.1"
version = "1.13.2"
authors = [
"Quentin de Quelen <quentin@dequelen.me>",
"Clément Renault <clement@meilisearch.com>",

View File

@@ -1,5 +1,5 @@
status = [
'Tests on ubuntu-22.04',
'Tests on ubuntu-20.04',
'Tests on macos-13',
'Tests on windows-2022',
'Run Clippy',

View File

@@ -10,7 +10,7 @@ use milli::documents::PrimaryKey;
use milli::heed::{EnvOpenOptions, RwTxn};
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings};
use milli::update::{IndexDocumentsMethod, IndexerConfig, Settings};
use milli::vector::EmbeddingConfigs;
use milli::Index;
use rand::seq::SliceRandom;
@@ -138,9 +138,10 @@ fn indexing_songs_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -204,9 +205,10 @@ fn reindexing_songs_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -248,9 +250,10 @@ fn reindexing_songs_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -316,9 +319,10 @@ fn deleting_songs_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -392,9 +396,10 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS_1_2, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -436,9 +441,10 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS_3_4, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -476,9 +482,10 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS_4_4, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -542,10 +549,11 @@ fn indexing_songs_without_faceted_numbers(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -609,9 +617,10 @@ fn indexing_songs_without_faceted_fields(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_SONGS, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -675,9 +684,10 @@ fn indexing_wiki(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -740,9 +750,10 @@ fn reindexing_wiki(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -784,9 +795,10 @@ fn reindexing_wiki(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -851,9 +863,10 @@ fn deleting_wiki_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -926,10 +939,11 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents =
utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES_1_2, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -971,10 +985,11 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents =
utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES_3_4, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1012,10 +1027,11 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents =
utils::documents_from(datasets_paths::SMOL_WIKI_ARTICLES_4_4, "csv");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1079,9 +1095,10 @@ fn indexing_movies_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1144,9 +1161,10 @@ fn reindexing_movies_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1188,9 +1206,10 @@ fn reindexing_movies_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1255,9 +1274,10 @@ fn deleting_movies_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1367,9 +1387,10 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES_1_2, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1411,9 +1432,10 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES_3_4, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1451,9 +1473,10 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::MOVIES_4_4, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1540,9 +1563,10 @@ fn indexing_nested_movies_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::NESTED_MOVIES, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1630,9 +1654,10 @@ fn deleting_nested_movies_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::NESTED_MOVIES, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1712,9 +1737,10 @@ fn indexing_nested_movies_without_faceted_fields(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::NESTED_MOVIES, "json");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1778,9 +1804,10 @@ fn indexing_geo(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1843,9 +1870,10 @@ fn reindexing_geo(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1887,9 +1915,10 @@ fn reindexing_geo(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer
@@ -1954,9 +1983,10 @@ fn deleting_geo_in_batches_default(c: &mut Criterion) {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = utils::documents_from(datasets_paths::SMOL_ALL_COUNTRIES, "jsonl");
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer

View File

@@ -12,7 +12,7 @@ use memmap2::Mmap;
use milli::heed::EnvOpenOptions;
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings};
use milli::update::{IndexDocumentsMethod, IndexerConfig, Settings};
use milli::vector::EmbeddingConfigs;
use milli::{Criterion, Filter, Index, Object, TermsMatchingStrategy};
use serde_json::Value;
@@ -99,8 +99,8 @@ pub fn base_setup(conf: &Conf) -> Index {
let mut new_fields_ids_map = db_fields_ids_map.clone();
let documents = documents_from(conf.dataset, conf.dataset_format);
let mut indexer = indexer::DocumentOperation::new();
indexer.replace_documents(&documents).unwrap();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer

View File

@@ -321,8 +321,6 @@ pub(crate) mod test {
status: maplit::btreemap! { Status::Succeeded => 1 },
types: maplit::btreemap! { Kind::DocumentAdditionOrUpdate => 1 },
index_uids: maplit::btreemap! { "doggo".to_string() => 1 },
progress_trace: Default::default(),
write_channel_congestion: None,
},
enqueued_at: Some(BatchEnqueuedAt {
earliest: datetime!(2022-11-11 0:00 UTC),

View File

@@ -12,7 +12,7 @@ use milli::documents::mmap_from_objects;
use milli::heed::EnvOpenOptions;
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::IndexerConfig;
use milli::update::{IndexDocumentsMethod, IndexerConfig};
use milli::vector::EmbeddingConfigs;
use milli::Index;
use serde_json::Value;
@@ -89,7 +89,9 @@ fn main() {
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(
IndexDocumentsMethod::ReplaceDocuments,
);
let mut operations = Vec::new();
for op in batch.0 {
@@ -113,7 +115,7 @@ fn main() {
for op in &operations {
match op {
Either::Left(documents) => {
indexer.replace_documents(documents).unwrap()
indexer.add_documents(documents).unwrap()
}
Either::Right(ids) => indexer.delete_documents(ids),
}

View File

@@ -29,7 +29,7 @@ page_size = "0.6.0"
rayon = "1.10.0"
roaring = { version = "0.10.10", features = ["serde"] }
serde = { version = "1.0.217", features = ["derive"] }
serde_json = { version = "1.0.138", features = ["preserve_order"] }
serde_json = { version = "1.0.135", features = ["preserve_order"] }
synchronoise = "1.0.1"
tempfile = "3.15.0"
thiserror = "2.0.9"
@@ -44,7 +44,7 @@ ureq = "2.12.1"
uuid = { version = "1.11.0", features = ["serde", "v4"] }
[dev-dependencies]
arroy = "0.5.0"
arroy = { git = "https://github.com/meilisearch/arroy", rev = "55fb0e8006f4f00ad3197b06bda348133f6ffffa" }
big_s = "1.0.2"
crossbeam-channel = "0.5.14"
# fixed version due to format breakages in v1.40

View File

@@ -102,6 +102,10 @@ pub struct IndexStats {
/// Stats of the documents database.
#[serde(default)]
pub documents_database_stats: DatabaseStats,
#[serde(default, skip_serializing)]
pub number_of_documents: Option<u64>,
/// Size taken up by the index' DB, in bytes.
///
/// This includes the size taken by both the used and free pages of the DB, and as the free pages
@@ -143,6 +147,7 @@ impl IndexStats {
number_of_embeddings: Some(arroy_stats.number_of_embeddings),
number_of_embedded_documents: Some(arroy_stats.documents.len()),
documents_database_stats: index.documents_stats(rtxn)?.unwrap_or_default(),
number_of_documents: None,
database_size: index.on_disk_size()?,
used_database_size: index.used_size()?,
primary_key: index.primary_key(rtxn)?.map(|s| s.to_string()),

View File

@@ -1,7 +1,7 @@
use std::collections::BTreeSet;
use std::fmt::Write;
use meilisearch_types::batches::{Batch, BatchEnqueuedAt, BatchStats};
use meilisearch_types::batches::{Batch, BatchEnqueuedAt};
use meilisearch_types::heed::types::{SerdeBincode, SerdeJson, Str};
use meilisearch_types::heed::{Database, RoTxn};
use meilisearch_types::milli::{CboRoaringBitmapCodec, RoaringBitmapCodec, BEU32};
@@ -342,11 +342,6 @@ pub fn snapshot_canceled_by(rtxn: &RoTxn, db: Database<BEU32, RoaringBitmapCodec
pub fn snapshot_batch(batch: &Batch) -> String {
let mut snap = String::new();
let Batch { uid, details, stats, started_at, finished_at, progress: _, enqueued_at } = batch;
let stats = BatchStats {
progress_trace: Default::default(),
write_channel_congestion: None,
..stats.clone()
};
if let Some(finished_at) = finished_at {
assert!(finished_at > started_at);
}
@@ -357,7 +352,7 @@ pub fn snapshot_batch(batch: &Batch) -> String {
snap.push('{');
snap.push_str(&format!("uid: {uid}, "));
snap.push_str(&format!("details: {}, ", serde_json::to_string(details).unwrap()));
snap.push_str(&format!("stats: {}, ", serde_json::to_string(&stats).unwrap()));
snap.push_str(&format!("stats: {}, ", serde_json::to_string(stats).unwrap()));
snap.push('}');
snap
}

View File

@@ -5,9 +5,13 @@ tasks affecting a single index into a [batch](crate::batch::Batch).
The main function of the autobatcher is [`next_autobatch`].
*/
use meilisearch_types::tasks::TaskId;
use std::ops::ControlFlow::{self, Break, Continue};
use meilisearch_types::milli::update::IndexDocumentsMethod::{
self, ReplaceDocuments, UpdateDocuments,
};
use meilisearch_types::tasks::TaskId;
use crate::KindWithContent;
/// Succinctly describes a task's [`Kind`](meilisearch_types::tasks::Kind)
@@ -15,11 +19,19 @@ use crate::KindWithContent;
///
/// Only the non-prioritised tasks that can be grouped in a batch have a corresponding [`AutobatchKind`]
enum AutobatchKind {
DocumentImport { allow_index_creation: bool, primary_key: Option<String> },
DocumentImport {
method: IndexDocumentsMethod,
allow_index_creation: bool,
primary_key: Option<String>,
},
DocumentEdition,
DocumentDeletion { by_filter: bool },
DocumentDeletion {
by_filter: bool,
},
DocumentClear,
Settings { allow_index_creation: bool },
Settings {
allow_index_creation: bool,
},
IndexCreation,
IndexDeletion,
IndexUpdate,
@@ -48,8 +60,11 @@ impl From<KindWithContent> for AutobatchKind {
fn from(kind: KindWithContent) -> Self {
match kind {
KindWithContent::DocumentAdditionOrUpdate {
allow_index_creation, primary_key, ..
} => AutobatchKind::DocumentImport { allow_index_creation, primary_key },
method,
allow_index_creation,
primary_key,
..
} => AutobatchKind::DocumentImport { method, allow_index_creation, primary_key },
KindWithContent::DocumentEdition { .. } => AutobatchKind::DocumentEdition,
KindWithContent::DocumentDeletion { .. } => {
AutobatchKind::DocumentDeletion { by_filter: false }
@@ -84,6 +99,7 @@ pub enum BatchKind {
ids: Vec<TaskId>,
},
DocumentOperation {
method: IndexDocumentsMethod,
allow_index_creation: bool,
primary_key: Option<String>,
operation_ids: Vec<TaskId>,
@@ -156,11 +172,12 @@ impl BatchKind {
K::IndexUpdate => (Break(BatchKind::IndexUpdate { id: task_id }), false),
K::IndexSwap => (Break(BatchKind::IndexSwap { id: task_id }), false),
K::DocumentClear => (Continue(BatchKind::DocumentClear { ids: vec![task_id] }), false),
K::DocumentImport { allow_index_creation, primary_key: pk }
K::DocumentImport { method, allow_index_creation, primary_key: pk }
if primary_key.is_none() || pk.is_none() || primary_key == pk.as_deref() =>
{
(
Continue(BatchKind::DocumentOperation {
method,
allow_index_creation,
primary_key: pk,
operation_ids: vec![task_id],
@@ -169,8 +186,9 @@ impl BatchKind {
)
}
// if the primary key set in the task was different than ours we should stop and make this batch fail asap.
K::DocumentImport { allow_index_creation, primary_key } => (
K::DocumentImport { method, allow_index_creation, primary_key } => (
Break(BatchKind::DocumentOperation {
method,
allow_index_creation,
primary_key,
operation_ids: vec![task_id],
@@ -239,7 +257,7 @@ impl BatchKind {
(
BatchKind::DocumentClear { mut ids }
| BatchKind::DocumentDeletion { deletion_ids: mut ids, includes_by_filter: _ }
| BatchKind::DocumentOperation { allow_index_creation: _, primary_key: _, operation_ids: mut ids }
| BatchKind::DocumentOperation { method: _, allow_index_creation: _, primary_key: _, operation_ids: mut ids }
| BatchKind::Settings { allow_index_creation: _, settings_ids: mut ids },
K::IndexDeletion,
) => {
@@ -267,32 +285,46 @@ impl BatchKind {
K::DocumentImport { .. } | K::Settings { .. },
) => Break(this),
(
BatchKind::DocumentOperation { allow_index_creation: _, primary_key: _, mut operation_ids },
BatchKind::DocumentOperation { method: _, allow_index_creation: _, primary_key: _, mut operation_ids },
K::DocumentClear,
) => {
operation_ids.push(id);
Continue(BatchKind::DocumentClear { ids: operation_ids })
}
// we can autobatch different kind of document operations and mix replacements with updates
// we can autobatch the same kind of document additions / updates
(
BatchKind::DocumentOperation { allow_index_creation, primary_key: _, mut operation_ids },
K::DocumentImport { primary_key: pk, .. },
BatchKind::DocumentOperation { method: ReplaceDocuments, allow_index_creation, primary_key: _, mut operation_ids },
K::DocumentImport { method: ReplaceDocuments, primary_key: pk, .. },
) => {
operation_ids.push(id);
Continue(BatchKind::DocumentOperation {
method: ReplaceDocuments,
allow_index_creation,
operation_ids,
primary_key: pk,
})
}
(
BatchKind::DocumentOperation { allow_index_creation, primary_key, mut operation_ids },
BatchKind::DocumentOperation { method: UpdateDocuments, allow_index_creation, primary_key: _, mut operation_ids },
K::DocumentImport { method: UpdateDocuments, primary_key: pk, .. },
) => {
operation_ids.push(id);
Continue(BatchKind::DocumentOperation {
method: UpdateDocuments,
allow_index_creation,
primary_key: pk,
operation_ids,
})
}
(
BatchKind::DocumentOperation { method, allow_index_creation, primary_key, mut operation_ids },
K::DocumentDeletion { by_filter: false },
) => {
operation_ids.push(id);
Continue(BatchKind::DocumentOperation {
method,
allow_index_creation,
primary_key,
operation_ids,
@@ -305,6 +337,13 @@ impl BatchKind {
) => {
Break(this)
}
// but we can't autobatch documents if it's not the same kind
// this match branch MUST be AFTER the previous one
(
this @ BatchKind::DocumentOperation { .. },
K::DocumentImport { .. },
) => Break(this),
(
this @ BatchKind::DocumentOperation { .. },
K::Settings { .. },
@@ -322,11 +361,12 @@ impl BatchKind {
// we can autobatch the deletion and import if the index already exists
(
BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter: false },
K::DocumentImport { allow_index_creation, primary_key }
K::DocumentImport { method, allow_index_creation, primary_key }
) if index_already_exists => {
deletion_ids.push(id);
Continue(BatchKind::DocumentOperation {
method,
allow_index_creation,
primary_key,
operation_ids: deletion_ids,
@@ -335,11 +375,12 @@ impl BatchKind {
// we can autobatch the deletion and import if both can't create an index
(
BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter: false },
K::DocumentImport { allow_index_creation, primary_key }
K::DocumentImport { method, allow_index_creation, primary_key }
) if !allow_index_creation => {
deletion_ids.push(id);
Continue(BatchKind::DocumentOperation {
method,
allow_index_creation,
primary_key,
operation_ids: deletion_ids,

View File

@@ -92,29 +92,29 @@ fn idx_swap() -> KindWithContent {
fn autobatch_simple_operation_together() {
// we can autobatch one or multiple `ReplaceDocuments` together.
// if the index exists.
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp( ReplaceDocuments, false , None), doc_imp(ReplaceDocuments, false , None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp( ReplaceDocuments, false , None), doc_imp(ReplaceDocuments, false , None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
// if it doesn't exists.
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp( ReplaceDocuments, true , None), doc_imp(ReplaceDocuments, true , None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
// we can autobatch one or multiple `UpdateDocuments` together.
// if the index exists.
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
// if it doesn't exist.
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1, 2] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
// we can autobatch one or multiple DocumentDeletion together
debug_snapshot!(autobatch_from(true, None, [doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
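For readers skimming the snapshots above: each assertion checks an autobatched kind plus a trailing flag, read here as "executing this batch must create the index". A minimal, self-contained sketch of that shape, using made-up type names rather than the real index-scheduler types:
// Sketch only: names are illustrative, not the actual Meilisearch types.
#[allow(dead_code)]
#[derive(Debug)]
enum SketchBatchKind {
    DocumentOperation {
        method: &'static str, // "ReplaceDocuments" or "UpdateDocuments" in the new snapshots
        allow_index_creation: bool,
        primary_key: Option<&'static str>,
        operation_ids: Vec<u32>,
    },
    DocumentDeletion { deletion_ids: Vec<u32>, includes_by_filter: bool },
}
fn main() {
    // Mirrors `Some((DocumentOperation { .. }, true))` from the snapshots above.
    let autobatched: Option<(SketchBatchKind, bool)> = Some((
        SketchBatchKind::DocumentOperation {
            method: "ReplaceDocuments",
            allow_index_creation: true,
            primary_key: None,
            operation_ids: vec![0, 1, 2],
        },
        true,
    ));
    println!("{autobatched:?}");
}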
@@ -140,53 +140,53 @@ fn autobatch_simple_operation_together() {
debug_snapshot!(autobatch_from(false,None, [settings(false), settings(false), settings(false)]), @"Some((Settings { allow_index_creation: false, settings_ids: [0, 1, 2] }, false))");
// We can autobatch document addition with document deletion
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, None), doc_del()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, None), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, None), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, None), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, None), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
// And the other way around
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, true, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, true, Some("catto"))]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_imp(UpdateDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
// But we can't autobatch document addition with document deletion by filter
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
// And the other way around
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
@@ -203,27 +203,27 @@ fn autobatch_simple_operation_together() {
}
#[test]
fn simple_different_document_operations_autobatch_together() {
// addition and updates with deletion by filter can't batch together
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
fn simple_document_operation_dont_autobatch_with_other() {
// addition, updates and deletion by filter can't batch together
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_create()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_create()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_create()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_create()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_create()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_create()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_update()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_update()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_update()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_update()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_update()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_update()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_swap()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_swap()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
}
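The rename from simple_different_document_operations_autobatch_together to simple_document_operation_dont_autobatch_with_other matches what the new snapshots assert: an import only keeps batching with a following import that uses the same method. A hedged sketch of that rule, with assumed names rather than the real autobatcher code:
// Sketch only: `Method` and `can_batch_with` are illustrative names.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Method {
    ReplaceDocuments,
    UpdateDocuments,
}
fn can_batch_with(current: Method, next: Method) -> bool {
    // In the old snapshots both imports ended up in operation_ids: [0, 1];
    // in the new ones a method mismatch closes the batch after operation_ids: [0].
    current == next
}
fn main() {
    assert!(can_batch_with(Method::ReplaceDocuments, Method::ReplaceDocuments));
    assert!(!can_batch_with(Method::ReplaceDocuments, Method::UpdateDocuments));
    println!("ok");
}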
@@ -231,28 +231,28 @@ fn simple_different_document_operations_autobatch_together() {
#[test]
fn document_addition_doesnt_batch_with_settings() {
// simple case
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
// multiple settings and doc addition
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), settings(true), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), settings(true), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), settings(true), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None), settings(true), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
// addition and setting unordered
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_imp(UpdateDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_imp(UpdateDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
// Doesn't batch with other forbidden operations
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_del()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_create()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_create()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_update()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_update()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_swap()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_swap()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_create()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_create()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_update()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_update()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), idx_swap()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), idx_swap()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
}
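In the `settings(true)` snapshots above, the batch is cut at the settings task: only the leading import ids are kept (operation_ids: [0], or [0, 1] when two imports precede the settings task). A small sketch of that cut-off, using placeholder task labels rather than the real task types:
// Sketch only: task labels are placeholders for the real task kinds.
fn operation_ids_before_settings(tasks: &[&str]) -> Vec<usize> {
    // Keep consecutive leading document imports; the first non-import task
    // (settings, deletion, index op, ...) closes the batch.
    tasks
        .iter()
        .take_while(|task| **task == "doc_imp")
        .enumerate()
        .map(|(id, _)| id)
        .collect()
}
fn main() {
    assert_eq!(operation_ids_before_settings(&["doc_imp", "settings"]), vec![0]);
    assert_eq!(operation_ids_before_settings(&["doc_imp", "doc_imp", "settings", "settings"]), vec![0, 1]);
    println!("ok");
}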
#[test]
@@ -280,8 +280,8 @@ fn clear_and_additions_and_settings() {
debug_snapshot!(autobatch_from(true, None, [doc_clr(), settings(true)]), @"Some((DocumentClear { ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [settings(true), doc_clr(), settings(true)]), @"Some((ClearAndSettings { other: [1], allow_index_creation: true, settings_ids: [0, 2] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_clr()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_clr()]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true), doc_clr()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), settings(true), doc_clr()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
}
#[test]
@@ -333,17 +333,17 @@ fn anything_and_index_deletion() {
#[test]
fn allowed_and_disallowed_index_creation() {
// `DocumentImport` tasks that allow index creation can't be mixed with those that don't, except if the index already exists.
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), settings(true)]), @"Some((DocumentOperation { allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, true, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
// batch deletion and addition
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
@@ -356,40 +356,40 @@ fn allowed_and_disallowed_index_creation() {
fn autobatch_primary_key() {
// ==> If I have a pk
// With a single update
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
// With multiple updates
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other"))]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other"))]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0, 1] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, Some("id"), [doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("other")), doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
// ==> If I don't have a pk
// With a single update
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("id"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("other"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("other"), operation_ids: [0] }, true))"###);
// With multiple updates
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id"))]), @"Some((DocumentOperation { allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(ReplaceDocuments, true, Some("id"))]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("id")), doc_imp(ReplaceDocuments, true, None)]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("id"), operation_ids: [0] }, true))"###);
}

View File

@@ -54,8 +54,7 @@ pub(crate) enum Batch {
#[derive(Debug)]
pub(crate) enum DocumentOperation {
Replace(Uuid),
Update(Uuid),
Add(Uuid),
Delete(Vec<String>),
}
@@ -65,6 +64,7 @@ pub(crate) enum IndexOperation {
DocumentOperation {
index_uid: String,
primary_key: Option<String>,
method: IndexDocumentsMethod,
operations: Vec<DocumentOperation>,
tasks: Vec<Task>,
},
@@ -254,7 +254,7 @@ impl IndexScheduler {
_ => unreachable!(),
}
}
BatchKind::DocumentOperation { operation_ids, .. } => {
BatchKind::DocumentOperation { method, operation_ids, .. } => {
let tasks = self.queue.get_existing_tasks_for_processing_batch(
rtxn,
current_batch,
@@ -276,17 +276,9 @@ impl IndexScheduler {
for task in tasks.iter() {
match task.kind {
KindWithContent::DocumentAdditionOrUpdate {
content_file, method, ..
} => match method {
IndexDocumentsMethod::ReplaceDocuments => {
operations.push(DocumentOperation::Replace(content_file))
}
IndexDocumentsMethod::UpdateDocuments => {
operations.push(DocumentOperation::Update(content_file))
}
_ => unreachable!("Unknown document merging method"),
},
KindWithContent::DocumentAdditionOrUpdate { content_file, .. } => {
operations.push(DocumentOperation::Add(content_file));
}
KindWithContent::DocumentDeletion { ref documents_ids, .. } => {
operations.push(DocumentOperation::Delete(documents_ids.clone()));
}
@@ -298,6 +290,7 @@ impl IndexScheduler {
op: IndexOperation::DocumentOperation {
index_uid,
primary_key,
method,
operations,
tasks,
},
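
A minimal sketch of the shape on the right-hand side of the hunks above: the payload-level `Replace`/`Update` variants collapse into a single `Add` variant, and the merge method travels on the index operation instead. `Uuid` and `IndexDocumentsMethod` are stubbed here so the example stands alone; the real types live in the scheduler and milli crates, so treat this as an illustration rather than the actual API.

```rust
// Sketch only: stand-in types mirroring the resulting enum/struct layout.
type Uuid = u128; // stand-in for uuid::Uuid

#[derive(Debug, Clone, Copy)]
enum IndexDocumentsMethod {
    ReplaceDocuments,
    UpdateDocuments,
}

#[derive(Debug)]
enum DocumentOperation {
    Add(Uuid),           // any addition or update payload
    Delete(Vec<String>), // external document ids to delete
}

#[derive(Debug)]
struct DocumentOperationBatch {
    index_uid: String,
    primary_key: Option<String>,
    method: IndexDocumentsMethod, // how Add payloads are merged into the index
    operations: Vec<DocumentOperation>,
}

fn main() {
    let batch = DocumentOperationBatch {
        index_uid: "doggos".into(),
        primary_key: Some("id".into()),
        method: IndexDocumentsMethod::ReplaceDocuments,
        operations: vec![
            DocumentOperation::Add(0),
            DocumentOperation::Delete(vec!["42".into()]),
        ],
    };
    println!("{batch:?}");
}
```
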

View File

@@ -215,16 +215,14 @@ impl IndexScheduler {
let mut stop_scheduler_forever = false;
let mut wtxn = self.env.write_txn().map_err(Error::HeedTransaction)?;
let mut canceled = RoaringBitmap::new();
let mut congestion = None;
match res {
Ok((tasks, cong)) => {
Ok(tasks) => {
#[cfg(test)]
self.breakpoint(crate::test_utils::Breakpoint::ProcessBatchSucceeded);
let (task_progress, task_progress_obj) = AtomicTaskStep::new(tasks.len() as u32);
progress.update_progress(task_progress_obj);
congestion = cong;
let mut success = 0;
let mut failure = 0;
let mut canceled_by = None;
@@ -341,28 +339,6 @@ impl IndexScheduler {
// We must re-add the canceled tasks so they're part of the same batch.
ids |= canceled;
processing_batch.stats.progress_trace =
progress.accumulated_durations().into_iter().map(|(k, v)| (k, v.into())).collect();
processing_batch.stats.write_channel_congestion = congestion.map(|congestion| {
let mut congestion_info = serde_json::Map::new();
congestion_info.insert("attempts".into(), congestion.attempts.into());
congestion_info.insert("blocking_attempts".into(), congestion.blocking_attempts.into());
congestion_info.insert("blocking_ratio".into(), congestion.congestion_ratio().into());
congestion_info
});
if let Some(congestion) = congestion {
tracing::debug!(
"Channel congestion metrics - Attempts: {}, Blocked attempts: {} ({:.1}% congestion)",
congestion.attempts,
congestion.blocking_attempts,
congestion.congestion_ratio(),
);
}
tracing::debug!("call trace: {:?}", progress.accumulated_durations());
self.queue.write_batch(&mut wtxn, processing_batch, &ids)?;
#[cfg(test)]
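
One side of this compare also attaches write-channel congestion metrics to the batch stats, as in the hunk above. Below is a hedged, standalone sketch of that map-building step; `ChannelCongestion` and its ratio computation are assumptions made only so the example compiles, and the real type comes from milli.

```rust
// Sketch of folding congestion metrics into a JSON map, mirroring the
// `congestion_info` construction above. The ratio definition is assumed.
use serde_json::{Map, Value};

struct ChannelCongestion {
    attempts: u64,
    blocking_attempts: u64,
}

impl ChannelCongestion {
    // Assumed definition: share of attempts that had to block.
    fn congestion_ratio(&self) -> f64 {
        if self.attempts == 0 {
            0.0
        } else {
            self.blocking_attempts as f64 / self.attempts as f64
        }
    }
}

fn congestion_info(c: &ChannelCongestion) -> Map<String, Value> {
    let mut info = Map::new();
    info.insert("attempts".into(), c.attempts.into());
    info.insert("blocking_attempts".into(), c.blocking_attempts.into());
    info.insert("blocking_ratio".into(), c.congestion_ratio().into());
    info
}

fn main() {
    let congestion = ChannelCongestion { attempts: 10, blocking_attempts: 3 };
    println!("{}", Value::Object(congestion_info(&congestion)));
}
```
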

View File

@@ -5,7 +5,7 @@ use std::sync::atomic::Ordering;
use meilisearch_types::batches::{BatchEnqueuedAt, BatchId};
use meilisearch_types::heed::{RoTxn, RwTxn};
use meilisearch_types::milli::progress::{Progress, VariableNameStep};
use meilisearch_types::milli::{self, ChannelCongestion};
use meilisearch_types::milli::{self};
use meilisearch_types::tasks::{Details, IndexSwap, KindWithContent, Status, Task};
use milli::update::Settings as MilliSettings;
use roaring::RoaringBitmap;
@@ -35,7 +35,7 @@ impl IndexScheduler {
batch: Batch,
current_batch: &mut ProcessingBatch,
progress: Progress,
) -> Result<(Vec<Task>, Option<ChannelCongestion>)> {
) -> Result<Vec<Task>> {
#[cfg(test)]
{
self.maybe_fail(crate::test_utils::FailureLocation::InsideProcessBatch)?;
@@ -76,7 +76,7 @@ impl IndexScheduler {
canceled_tasks.push(task);
Ok((canceled_tasks, None))
Ok(canceled_tasks)
}
Batch::TaskDeletions(mut tasks) => {
// 1. Retrieve the tasks that matched the query at enqueue-time.
@@ -115,14 +115,10 @@ impl IndexScheduler {
_ => unreachable!(),
}
}
Ok((tasks, None))
}
Batch::SnapshotCreation(tasks) => {
self.process_snapshot(progress, tasks).map(|tasks| (tasks, None))
}
Batch::Dump(task) => {
self.process_dump_creation(progress, task).map(|tasks| (tasks, None))
Ok(tasks)
}
Batch::SnapshotCreation(tasks) => self.process_snapshot(progress, tasks),
Batch::Dump(task) => self.process_dump_creation(progress, task),
Batch::IndexOperation { op, must_create_index } => {
let index_uid = op.index_uid().to_string();
let index = if must_create_index {
@@ -139,8 +135,7 @@ impl IndexScheduler {
.set_currently_updating_index(Some((index_uid.clone(), index.clone())));
let mut index_wtxn = index.write_txn()?;
let (tasks, congestion) =
self.apply_index_operation(&mut index_wtxn, &index, op, progress)?;
let tasks = self.apply_index_operation(&mut index_wtxn, &index, op, progress)?;
{
let span = tracing::trace_span!(target: "indexing::scheduler", "commit");
@@ -171,7 +166,7 @@ impl IndexScheduler {
),
}
Ok((tasks, congestion))
Ok(tasks)
}
Batch::IndexCreation { index_uid, primary_key, task } => {
progress.update_progress(CreateIndexProgress::CreatingTheIndex);
@@ -239,7 +234,7 @@ impl IndexScheduler {
),
}
Ok((vec![task], None))
Ok(vec![task])
}
Batch::IndexDeletion { index_uid, index_has_been_created, mut tasks } => {
progress.update_progress(DeleteIndexProgress::DeletingTheIndex);
@@ -273,7 +268,7 @@ impl IndexScheduler {
};
}
Ok((tasks, None))
Ok(tasks)
}
Batch::IndexSwap { mut task } => {
progress.update_progress(SwappingTheIndexes::EnsuringCorrectnessOfTheSwap);
@@ -321,7 +316,7 @@ impl IndexScheduler {
}
wtxn.commit()?;
task.status = Status::Succeeded;
Ok((vec![task], None))
Ok(vec![task])
}
Batch::UpgradeDatabase { mut tasks } => {
let KindWithContent::UpgradeDatabase { from } = tasks.last().unwrap().kind else {
@@ -351,7 +346,7 @@ impl IndexScheduler {
task.error = None;
}
Ok((tasks, None))
Ok(tasks)
}
}
}

View File

@@ -5,7 +5,7 @@ use meilisearch_types::milli::documents::PrimaryKey;
use meilisearch_types::milli::progress::Progress;
use meilisearch_types::milli::update::new::indexer::{self, UpdateByFunction};
use meilisearch_types::milli::update::DocumentAdditionResult;
use meilisearch_types::milli::{self, ChannelCongestion, Filter, ThreadPoolNoAbortBuilder};
use meilisearch_types::milli::{self, Filter, ThreadPoolNoAbortBuilder};
use meilisearch_types::settings::apply_settings_to_builder;
use meilisearch_types::tasks::{Details, KindWithContent, Status, Task};
use meilisearch_types::Index;
@@ -33,8 +33,9 @@ impl IndexScheduler {
index: &'i Index,
operation: IndexOperation,
progress: Progress,
) -> Result<(Vec<Task>, Option<ChannelCongestion>)> {
) -> Result<Vec<Task>> {
let indexer_alloc = Bump::new();
let started_processing_at = std::time::Instant::now();
let must_stop_processing = self.scheduler.must_stop_processing.clone();
@@ -59,23 +60,25 @@ impl IndexScheduler {
};
}
Ok((tasks, None))
Ok(tasks)
}
IndexOperation::DocumentOperation { index_uid, primary_key, operations, mut tasks } => {
IndexOperation::DocumentOperation {
index_uid,
primary_key,
method,
operations,
mut tasks,
} => {
progress.update_progress(DocumentOperationProgress::RetrievingConfig);
// TODO: at some point, for better efficiency we might want to reuse the bumpalo for successive batches.
// this is made difficult by the fact we're doing private clones of the index scheduler and sending it
// to a fresh thread.
let mut content_files = Vec::new();
for operation in &operations {
match operation {
DocumentOperation::Replace(content_uuid)
| DocumentOperation::Update(content_uuid) => {
let content_file = self.queue.file_store.get_update(*content_uuid)?;
let mmap = unsafe { memmap2::Mmap::map(&content_file)? };
content_files.push(mmap);
}
_ => (),
if let DocumentOperation::Add(content_uuid) = operation {
let content_file = self.queue.file_store.get_update(*content_uuid)?;
let mmap = unsafe { memmap2::Mmap::map(&content_file)? };
content_files.push(mmap);
}
}
@@ -84,23 +87,17 @@ impl IndexScheduler {
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut content_files_iter = content_files.iter();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(method);
let embedders = index
.embedding_configs(index_wtxn)
.map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?;
let embedders = self.embedders(index_uid.clone(), embedders)?;
for operation in operations {
match operation {
DocumentOperation::Replace(_content_uuid) => {
DocumentOperation::Add(_content_uuid) => {
let mmap = content_files_iter.next().unwrap();
indexer
.replace_documents(mmap)
.map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?;
}
DocumentOperation::Update(_content_uuid) => {
let mmap = content_files_iter.next().unwrap();
indexer
.update_documents(mmap)
.add_documents(mmap)
.map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?;
}
DocumentOperation::Delete(document_ids) => {
@@ -172,24 +169,21 @@ impl IndexScheduler {
}
progress.update_progress(DocumentOperationProgress::Indexing);
let mut congestion = None;
if tasks.iter().any(|res| res.error.is_none()) {
congestion = Some(
indexer::index(
index_wtxn,
index,
pool,
indexer_config.grenad_parameters(),
&db_fields_ids_map,
new_fields_ids_map,
primary_key,
&document_changes,
embedders,
&|| must_stop_processing.get(),
&progress,
)
.map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?,
);
indexer::index(
index_wtxn,
index,
pool,
indexer_config.grenad_parameters(),
&db_fields_ids_map,
new_fields_ids_map,
primary_key,
&document_changes,
embedders,
&|| must_stop_processing.get(),
&progress,
)
.map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?;
let addition = DocumentAdditionResult {
indexed_documents: candidates_count,
@@ -201,7 +195,7 @@ impl IndexScheduler {
tracing::info!(indexing_result = ?addition, processed_in = ?started_processing_at.elapsed(), "document indexing done");
}
Ok((tasks, congestion))
Ok(tasks)
}
IndexOperation::DocumentEdition { index_uid, mut task } => {
progress.update_progress(DocumentEditionProgress::RetrievingConfig);
@@ -249,7 +243,7 @@ impl IndexScheduler {
edited_documents: Some(0),
});
return Ok((vec![task], None));
return Ok(vec![task]);
}
let rtxn = index.read_txn()?;
@@ -264,7 +258,6 @@ impl IndexScheduler {
let result_count = Ok((candidates.len(), candidates.len())) as Result<_>;
let mut congestion = None;
if task.error.is_none() {
let local_pool;
let indexer_config = self.index_mapper.indexer_config();
@@ -295,22 +288,20 @@ impl IndexScheduler {
let embedders = self.embedders(index_uid.clone(), embedders)?;
progress.update_progress(DocumentEditionProgress::Indexing);
congestion = Some(
indexer::index(
index_wtxn,
index,
pool,
indexer_config.grenad_parameters(),
&db_fields_ids_map,
new_fields_ids_map,
None, // cannot change primary key in DocumentEdition
&document_changes,
embedders,
&|| must_stop_processing.get(),
&progress,
)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?,
);
indexer::index(
index_wtxn,
index,
pool,
indexer_config.grenad_parameters(),
&db_fields_ids_map,
new_fields_ids_map,
None, // cannot change primary key in DocumentEdition
&document_changes,
embedders,
&|| must_stop_processing.get(),
&progress,
)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
let addition = DocumentAdditionResult {
indexed_documents: candidates_count,
@@ -346,7 +337,7 @@ impl IndexScheduler {
}
}
Ok((vec![task], congestion))
Ok(vec![task])
}
IndexOperation::DocumentDeletion { mut tasks, index_uid } => {
progress.update_progress(DocumentDeletionProgress::RetrievingConfig);
@@ -413,7 +404,7 @@ impl IndexScheduler {
}
if to_delete.is_empty() {
return Ok((tasks, None));
return Ok(tasks);
}
let rtxn = index.read_txn()?;
@@ -427,7 +418,6 @@ impl IndexScheduler {
PrimaryKey::new_or_insert(primary_key, &mut new_fields_ids_map)
.map_err(|err| Error::from_milli(err.into(), Some(index_uid.clone())))?;
let mut congestion = None;
if !tasks.iter().all(|res| res.error.is_some()) {
let local_pool;
let indexer_config = self.index_mapper.indexer_config();
@@ -453,22 +443,20 @@ impl IndexScheduler {
let embedders = self.embedders(index_uid.clone(), embedders)?;
progress.update_progress(DocumentDeletionProgress::Indexing);
congestion = Some(
indexer::index(
index_wtxn,
index,
pool,
indexer_config.grenad_parameters(),
&db_fields_ids_map,
new_fields_ids_map,
None, // document deletion never changes primary key
&document_changes,
embedders,
&|| must_stop_processing.get(),
&progress,
)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?,
);
indexer::index(
index_wtxn,
index,
pool,
indexer_config.grenad_parameters(),
&db_fields_ids_map,
new_fields_ids_map,
None, // document deletion never changes primary key
&document_changes,
embedders,
&|| must_stop_processing.get(),
&progress,
)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
let addition = DocumentAdditionResult {
indexed_documents: candidates_count,
@@ -480,7 +468,7 @@ impl IndexScheduler {
tracing::info!(indexing_result = ?addition, processed_in = ?started_processing_at.elapsed(), "document indexing done");
}
Ok((tasks, congestion))
Ok(tasks)
}
IndexOperation::Settings { index_uid, settings, mut tasks } => {
progress.update_progress(SettingsProgress::RetrievingAndMergingTheSettings);
@@ -505,7 +493,7 @@ impl IndexScheduler {
)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
Ok((tasks, None))
Ok(tasks)
}
IndexOperation::DocumentClearAndSetting {
index_uid,
@@ -513,7 +501,7 @@ impl IndexScheduler {
settings,
settings_tasks,
} => {
let (mut import_tasks, _congestion) = self.apply_index_operation(
let mut import_tasks = self.apply_index_operation(
index_wtxn,
index,
IndexOperation::DocumentClear {
@@ -523,7 +511,7 @@ impl IndexScheduler {
progress.clone(),
)?;
let (settings_tasks, _congestion) = self.apply_index_operation(
let settings_tasks = self.apply_index_operation(
index_wtxn,
index,
IndexOperation::Settings { index_uid, settings, tasks: settings_tasks },
@@ -532,7 +520,7 @@ impl IndexScheduler {
let mut tasks = settings_tasks;
tasks.append(&mut import_tasks);
Ok((tasks, None))
Ok(tasks)
}
}
}
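
The document-operation arm above mmaps every `Add` payload up front, builds the indexer with the batch-level method, and dispatches each operation in a simple loop. The following is a simplified, self-contained sketch of that dispatch; every type is a stand-in (the real indexer is milli's `indexer::DocumentOperation`), so it shows the shape, not the implementation.

```rust
// Sketch: one `Add` variant feeds add_documents, deletions feed
// delete_documents, and the merge method is given once at indexer creation.
#[derive(Clone, Copy, Debug)]
enum IndexDocumentsMethod {
    ReplaceDocuments,
    UpdateDocuments,
}

enum DocumentOperation {
    Add(Vec<u8>),        // stand-in for an mmapped content file
    Delete(Vec<String>), // external document ids
}

struct DocumentIndexer {
    method: IndexDocumentsMethod,
    added_payloads: usize,
    deleted_ids: usize,
}

impl DocumentIndexer {
    fn new(method: IndexDocumentsMethod) -> Self {
        Self { method, added_payloads: 0, deleted_ids: 0 }
    }
    fn add_documents(&mut self, _payload: &[u8]) {
        self.added_payloads += 1;
    }
    fn delete_documents(&mut self, ids: &[String]) {
        self.deleted_ids += ids.len();
    }
}

fn main() {
    let operations = vec![
        DocumentOperation::Add(br#"{"id": 1, "doggo": "bork"}"#.to_vec()),
        DocumentOperation::Delete(vec!["2".into()]),
    ];
    let mut indexer = DocumentIndexer::new(IndexDocumentsMethod::ReplaceDocuments);
    for operation in &operations {
        match operation {
            DocumentOperation::Add(payload) => indexer.add_documents(payload),
            DocumentOperation::Delete(ids) => indexer.delete_documents(ids),
        }
    }
    println!(
        "{:?}: {} payloads added, {} ids deleted",
        indexer.method, indexer.added_payloads, indexer.deleted_ids
    );
}
```
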

View File

@@ -1,5 +1,6 @@
---
source: crates/index-scheduler/src/scheduler/test_document_addition.rs
snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:
@@ -7,15 +8,15 @@ source: crates/index-scheduler/src/scheduler/test_document_addition.rs
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 1, allow_index_creation: true }}
1 {uid: 1, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000001, documents_count: 1, allow_index_creation: true }}
2 {uid: 2, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000002, documents_count: 1, allow_index_creation: true }}
3 {uid: 3, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000003, documents_count: 1, allow_index_creation: true }}
4 {uid: 4, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000004, documents_count: 1, allow_index_creation: true }}
5 {uid: 5, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000005, documents_count: 1, allow_index_creation: true }}
6 {uid: 6, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000006, documents_count: 1, allow_index_creation: true }}
7 {uid: 7, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000007, documents_count: 1, allow_index_creation: true }}
8 {uid: 8, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000008, documents_count: 1, allow_index_creation: true }}
9 {uid: 9, batch_uid: 0, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000009, documents_count: 1, allow_index_creation: true }}
1 {uid: 1, batch_uid: 1, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000001, documents_count: 1, allow_index_creation: true }}
2 {uid: 2, batch_uid: 2, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000002, documents_count: 1, allow_index_creation: true }}
3 {uid: 3, batch_uid: 3, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000003, documents_count: 1, allow_index_creation: true }}
4 {uid: 4, batch_uid: 4, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000004, documents_count: 1, allow_index_creation: true }}
5 {uid: 5, batch_uid: 5, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000005, documents_count: 1, allow_index_creation: true }}
6 {uid: 6, batch_uid: 6, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000006, documents_count: 1, allow_index_creation: true }}
7 {uid: 7, batch_uid: 7, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000007, documents_count: 1, allow_index_creation: true }}
8 {uid: 8, batch_uid: 8, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000008, documents_count: 1, allow_index_creation: true }}
9 {uid: 9, batch_uid: 9, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000009, documents_count: 1, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued []
@@ -47,35 +48,97 @@ doggos: { number_of_documents: 10, field_distribution: {"doggo": 10, "id": 10} }
[timestamp] [9,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,1,2,3,4,5,6,7,8,9,]
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
[timestamp] [3,]
[timestamp] [4,]
[timestamp] [5,]
[timestamp] [6,]
[timestamp] [7,]
[timestamp] [8,]
[timestamp] [9,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,1,2,3,4,5,6,7,8,9,]
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
[timestamp] [3,]
[timestamp] [4,]
[timestamp] [5,]
[timestamp] [6,]
[timestamp] [7,]
[timestamp] [8,]
[timestamp] [9,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"receivedDocuments":10,"indexedDocuments":10}, stats: {"totalNbTasks":10,"status":{"succeeded":10},"types":{"documentAdditionOrUpdate":10},"indexUids":{"doggos":10}}, }
0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
1 {uid: 1, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
2 {uid: 2, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
3 {uid: 3, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
4 {uid: 4, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
5 {uid: 5, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
6 {uid: 6, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
7 {uid: 7, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
8 {uid: 8, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
9 {uid: 9, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"doggos":1}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,1,2,3,4,5,6,7,8,9,]
0 [0,]
1 [1,]
2 [2,]
3 [3,]
4 [4,]
5 [5,]
6 [6,]
7 [7,]
8 [8,]
9 [9,]
----------------------------------------------------------------------
### Batches Status:
succeeded [0,]
succeeded [0,1,2,3,4,5,6,7,8,9,]
----------------------------------------------------------------------
### Batches Kind:
"documentAdditionOrUpdate" [0,]
"documentAdditionOrUpdate" [0,1,2,3,4,5,6,7,8,9,]
----------------------------------------------------------------------
### Batches Index Tasks:
doggos [0,]
doggos [0,1,2,3,4,5,6,7,8,9,]
----------------------------------------------------------------------
### Batches Enqueued At:
[timestamp] [0,]
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
[timestamp] [3,]
[timestamp] [4,]
[timestamp] [5,]
[timestamp] [6,]
[timestamp] [7,]
[timestamp] [8,]
[timestamp] [9,]
----------------------------------------------------------------------
### Batches Started At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
[timestamp] [3,]
[timestamp] [4,]
[timestamp] [5,]
[timestamp] [6,]
[timestamp] [7,]
[timestamp] [8,]
[timestamp] [9,]
----------------------------------------------------------------------
### Batches Finished At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
[timestamp] [3,]
[timestamp] [4,]
[timestamp] [5,]
[timestamp] [6,]
[timestamp] [7,]
[timestamp] [8,]
[timestamp] [9,]
----------------------------------------------------------------------
### File Store:

View File

@@ -6,7 +6,7 @@ source: crates/index-scheduler/src/scheduler/test_failure.rs
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, batch_uid: 0, status: succeeded, details: { from: (1, 12, 0), to: (1, 13, 1) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
0 {uid: 0, batch_uid: 0, status: succeeded, details: { from: (1, 12, 0), to: (1, 13, 2) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
1 {uid: 1, batch_uid: 1, status: succeeded, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
2 {uid: 2, batch_uid: 2, status: succeeded, details: { primary_key: Some("bone") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
3 {uid: 3, batch_uid: 3, status: failed, error: ResponseError { code: 200, message: "Index `doggo` already exists.", error_code: "index_already_exists", error_type: "invalid_request", error_link: "https://docs.meilisearch.com/errors#index_already_exists" }, details: { primary_key: Some("bone") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
@@ -57,7 +57,7 @@ girafo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [4,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"upgradeFrom":"v1.12.0","upgradeTo":"v1.13.1"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"upgradeDatabase":1},"indexUids":{}}, }
0 {uid: 0, details: {"upgradeFrom":"v1.12.0","upgradeTo":"v1.13.2"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"upgradeDatabase":1},"indexUids":{}}, }
1 {uid: 1, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, }
2 {uid: 2, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }
3 {uid: 3, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, }

View File

@@ -1,13 +1,12 @@
---
source: crates/index-scheduler/src/scheduler/test_failure.rs
snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { from: (1, 12, 0), to: (1, 13, 1) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
0 {uid: 0, status: enqueued, details: { from: (1, 12, 0), to: (1, 13, 2) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
----------------------------------------------------------------------
### Status:
enqueued [0,]

View File

@@ -1,13 +1,12 @@
---
source: crates/index-scheduler/src/scheduler/test_failure.rs
snapshot_kind: text
---
### Autobatching Enabled = true
### Processing batch None:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { from: (1, 12, 0), to: (1, 13, 1) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
0 {uid: 0, status: enqueued, details: { from: (1, 12, 0), to: (1, 13, 2) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
1 {uid: 1, status: enqueued, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
----------------------------------------------------------------------
### Status:

View File

@@ -6,7 +6,7 @@ source: crates/index-scheduler/src/scheduler/test_failure.rs
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, batch_uid: 0, status: failed, error: ResponseError { code: 200, message: "Planned failure for tests.", error_code: "internal", error_type: "internal", error_link: "https://docs.meilisearch.com/errors#internal" }, details: { from: (1, 12, 0), to: (1, 13, 1) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
0 {uid: 0, batch_uid: 0, status: failed, error: ResponseError { code: 200, message: "Planned failure for tests.", error_code: "internal", error_type: "internal", error_link: "https://docs.meilisearch.com/errors#internal" }, details: { from: (1, 12, 0), to: (1, 13, 2) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
1 {uid: 1, status: enqueued, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
----------------------------------------------------------------------
### Status:
@@ -37,7 +37,7 @@ catto [1,]
[timestamp] [0,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"upgradeFrom":"v1.12.0","upgradeTo":"v1.13.1"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"upgradeDatabase":1},"indexUids":{}}, }
0 {uid: 0, details: {"upgradeFrom":"v1.12.0","upgradeTo":"v1.13.2"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"upgradeDatabase":1},"indexUids":{}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -6,7 +6,7 @@ source: crates/index-scheduler/src/scheduler/test_failure.rs
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, batch_uid: 0, status: failed, error: ResponseError { code: 200, message: "Planned failure for tests.", error_code: "internal", error_type: "internal", error_link: "https://docs.meilisearch.com/errors#internal" }, details: { from: (1, 12, 0), to: (1, 13, 1) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
0 {uid: 0, batch_uid: 0, status: failed, error: ResponseError { code: 200, message: "Planned failure for tests.", error_code: "internal", error_type: "internal", error_link: "https://docs.meilisearch.com/errors#internal" }, details: { from: (1, 12, 0), to: (1, 13, 2) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
1 {uid: 1, status: enqueued, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
2 {uid: 2, status: enqueued, details: { primary_key: Some("bone") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
----------------------------------------------------------------------
@@ -40,7 +40,7 @@ doggo [2,]
[timestamp] [0,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"upgradeFrom":"v1.12.0","upgradeTo":"v1.13.1"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"upgradeDatabase":1},"indexUids":{}}, }
0 {uid: 0, details: {"upgradeFrom":"v1.12.0","upgradeTo":"v1.13.2"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"upgradeDatabase":1},"indexUids":{}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -6,7 +6,7 @@ source: crates/index-scheduler/src/scheduler/test_failure.rs
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, batch_uid: 0, status: succeeded, details: { from: (1, 12, 0), to: (1, 13, 1) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
0 {uid: 0, batch_uid: 0, status: succeeded, details: { from: (1, 12, 0), to: (1, 13, 2) }, kind: UpgradeDatabase { from: (1, 12, 0) }}
1 {uid: 1, status: enqueued, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
2 {uid: 2, status: enqueued, details: { primary_key: Some("bone") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
3 {uid: 3, status: enqueued, details: { primary_key: Some("bone") }, kind: IndexCreation { index_uid: "doggo", primary_key: Some("bone") }}
@@ -43,7 +43,7 @@ doggo [2,3,]
[timestamp] [0,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"upgradeFrom":"v1.12.0","upgradeTo":"v1.13.1"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"upgradeDatabase":1},"indexUids":{}}, }
0 {uid: 0, details: {"upgradeFrom":"v1.12.0","upgradeTo":"v1.13.2"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"upgradeDatabase":1},"indexUids":{}}, }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -298,8 +298,11 @@ fn test_mixed_document_addition() {
}
snapshot!(snapshot_index_scheduler(&index_scheduler), name: "after_registering_the_10_tasks");
// All tasks should've been batched and processed together since any indexing task (updates with replacements) can be batched together
handle.advance_n_successful_batches(1);
// Only half of the tasks should've been processed since we can't autobatch replace and update together.
handle.advance_n_successful_batches(5);
snapshot!(snapshot_index_scheduler(&index_scheduler), name: "five_tasks_processed");
handle.advance_n_successful_batches(5);
snapshot!(snapshot_index_scheduler(&index_scheduler), name: "all_tasks_processed");
// has everything been pushed successfully in milli?

View File

@@ -24,10 +24,11 @@ pub fn upgrade_index_scheduler(
let current_minor = to.1;
let current_patch = to.2;
let upgrade_functions: &[&dyn UpgradeIndexScheduler] = &[&V1_12_ToCurrent {}];
let upgrade_functions: &[&dyn UpgradeIndexScheduler] = &[&ToCurrentNoOp {}];
let start = match from {
(1, 12, _) => 0,
(1, 13, _) => 0,
(major, minor, patch) => {
if major > current_major
|| (major == current_major && minor > current_minor)
@@ -85,9 +86,9 @@ pub fn upgrade_index_scheduler(
}
#[allow(non_camel_case_types)]
struct V1_12_ToCurrent {}
struct ToCurrentNoOp {}
impl UpgradeIndexScheduler for V1_12_ToCurrent {
impl UpgradeIndexScheduler for ToCurrentNoOp {
fn upgrade(
&self,
_env: &Env,
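
The hunk above registers v1.13.x alongside v1.12.x as a known starting point and renames the single upgrade step to a no-op. Here is a minimal sketch of that version dispatch, with the error path reduced to a `String`; the real function returns the scheduler's own result type and handles versions it has no upgrade path for, so this is only an approximation of the visible match arms.

```rust
// Sketch: v1.12.x and v1.13.x databases both start at the first (no-op)
// upgrade step; versions newer than the binary are rejected.
fn upgrade_start_index(from: (u32, u32, u32), to: (u32, u32, u32)) -> Result<usize, String> {
    let (current_major, current_minor, _current_patch) = to;
    match from {
        (1, 12, _) => Ok(0),
        (1, 13, _) => Ok(0),
        (major, minor, patch) => {
            if major > current_major || (major == current_major && minor > current_minor) {
                Err(format!(
                    "cannot open a v{major}.{minor}.{patch} database with a v{current_major}.{current_minor} binary"
                ))
            } else {
                Err(format!("no dumpless upgrade path from v{major}.{minor}.{patch}"))
            }
        }
    }
}

fn main() {
    assert_eq!(upgrade_start_index((1, 13, 0), (1, 13, 2)), Ok(0));
    assert!(upgrade_start_index((1, 14, 0), (1, 13, 2)).is_err());
}
```
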

View File

@@ -14,7 +14,6 @@ license.workspace = true
actix-web = { version = "4.9.0", default-features = false }
anyhow = "1.0.95"
bumpalo = "3.16.0"
bumparaw-collections = "0.1.4"
convert_case = "0.6.0"
csv = "1.3.1"
deserr = { version = "0.6.3", features = ["actix-web"] }
@@ -25,11 +24,12 @@ flate2 = "1.0.35"
fst = "0.4.7"
memmap2 = "0.9.5"
milli = { path = "../milli" }
bumparaw-collections = "0.1.4"
roaring = { version = "0.10.10", features = ["serde"] }
rustc-hash = "2.1.0"
serde = { version = "1.0.217", features = ["derive"] }
serde-cs = "0.2.4"
serde_json = { version = "1.0.135", features = ["preserve_order"] }
serde_json = "1.0.135"
tar = "0.4.43"
tempfile = "3.15.0"
thiserror = "2.0.9"

View File

@@ -60,8 +60,4 @@ pub struct BatchStats {
pub status: BTreeMap<Status, u32>,
pub types: BTreeMap<Kind, u32>,
pub index_uids: BTreeMap<String, u32>,
#[serde(default, skip_serializing_if = "serde_json::Map::is_empty")]
pub progress_trace: serde_json::Map<String, serde_json::Value>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub write_channel_congestion: Option<serde_json::Map<String, serde_json::Value>>,
}
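
The two fields dropped from `BatchStats` above relied on serde's `default`/`skip_serializing_if` pair, so an empty progress trace or absent congestion info never reached the serialized batch. A self-contained sketch of that behaviour on a simplified stand-in struct, assuming `serde` (with derive) and `serde_json`:

```rust
// Sketch: empty/absent optional stats fields are skipped when serializing.
use serde::{Deserialize, Serialize};

#[derive(Debug, Default, Serialize, Deserialize)]
struct BatchStatsSketch {
    total_nb_tasks: u32,
    #[serde(default, skip_serializing_if = "serde_json::Map::is_empty")]
    progress_trace: serde_json::Map<String, serde_json::Value>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    write_channel_congestion: Option<serde_json::Map<String, serde_json::Value>>,
}

fn main() {
    let stats = BatchStatsSketch { total_nb_tasks: 1, ..Default::default() };
    // Both optional fields disappear from the JSON output entirely.
    assert_eq!(serde_json::to_string(&stats).unwrap(), r#"{"total_nb_tasks":1}"#);
}
```
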

View File

@@ -145,7 +145,6 @@ zip = { version = "2.2.2", optional = true }
[features]
default = ["meilisearch-types/all-tokenizations", "mini-dashboard"]
swagger = ["utoipa-scalar"]
test-ollama = []
mini-dashboard = [
"static-files",
"anyhow",
@@ -170,5 +169,5 @@ german = ["meilisearch-types/german"]
turkish = ["meilisearch-types/turkish"]
[package.metadata.mini-dashboard]
assets-url = "https://github.com/meilisearch/mini-dashboard/releases/download/v0.2.18/build.zip"
sha1 = "b408a30dcb6e20cddb0c153c23385bcac4c8e912"
assets-url = "https://github.com/meilisearch/mini-dashboard/releases/download/v0.2.17/build.zip"
sha1 = "29e92ce25f306208a9c86f013279c736bdc1e034"

View File

@@ -364,7 +364,7 @@ fn check_version(
let (bin_major, bin_minor, bin_patch) = binary_version;
let (db_major, db_minor, db_patch) = get_version(&opt.db_path)?;
if db_major != bin_major || db_minor != bin_minor || db_patch > bin_patch {
if db_major != bin_major || db_minor != bin_minor || db_patch != bin_patch {
if opt.experimental_dumpless_upgrade {
update_version_file_for_dumpless_upgrade(
opt,
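
The comparison change above is what forbids opening a v1.13.x database from a v1.13.y binary: any patch-level difference now either routes through the dumpless-upgrade path or aborts startup. A standalone sketch under simplified types (the real function takes `Opt` and returns the binary's error type, and the upgrade branch also rewrites the version file):

```rust
// Sketch: any version mismatch, including patch-only, requires either the
// dumpless-upgrade flag or a hard error.
fn check_version(
    binary_version: (u32, u32, u32),
    db_version: (u32, u32, u32),
    experimental_dumpless_upgrade: bool,
) -> Result<bool, String> {
    let (bin_major, bin_minor, bin_patch) = binary_version;
    let (db_major, db_minor, db_patch) = db_version;
    if db_major != bin_major || db_minor != bin_minor || db_patch != bin_patch {
        if experimental_dumpless_upgrade {
            Ok(true) // an upgrade task will be scheduled
        } else {
            Err(format!(
                "database v{db_major}.{db_minor}.{db_patch} is incompatible with binary v{bin_major}.{bin_minor}.{bin_patch}"
            ))
        }
    } else {
        Ok(false)
    }
}

fn main() {
    // A v1.13.1 database opened by a v1.13.2 binary is no longer accepted silently.
    assert_eq!(check_version((1, 13, 2), (1, 13, 1), true), Ok(true));
    assert!(check_version((1, 13, 2), (1, 13, 1), false).is_err());
}
```
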

View File

@@ -514,7 +514,10 @@ pub struct IndexStats {
impl From<index_scheduler::IndexStats> for IndexStats {
fn from(stats: index_scheduler::IndexStats) -> Self {
IndexStats {
number_of_documents: stats.inner_stats.documents_database_stats.number_of_entries(),
number_of_documents: stats
.inner_stats
.number_of_documents
.unwrap_or(stats.inner_stats.documents_database_stats.number_of_entries()),
raw_document_db_size: stats.inner_stats.documents_database_stats.total_value_size(),
avg_document_size: stats.inner_stats.documents_database_stats.average_value_size(),
is_indexing: stats.is_indexing,
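
The fallback above is what keeps the document count available while stats are being migrated: a legacy `number_of_documents` value, when present, wins over the freshly computed documents-database entry count. A small sketch with stubbed types:

```rust
// Sketch: prefer the legacy stat field, fall back to the database stats.
struct DatabaseStats {
    number_of_entries: u64,
}

struct InnerIndexStats {
    // Present in the legacy stat format, absent once stats are recomputed.
    number_of_documents: Option<u64>,
    documents_database_stats: DatabaseStats,
}

fn number_of_documents(inner: &InnerIndexStats) -> u64 {
    inner
        .number_of_documents
        .unwrap_or(inner.documents_database_stats.number_of_entries)
}

fn main() {
    let legacy = InnerIndexStats {
        number_of_documents: Some(42),
        documents_database_stats: DatabaseStats { number_of_entries: 0 },
    };
    let fresh = InnerIndexStats {
        number_of_documents: None,
        documents_database_stats: DatabaseStats { number_of_entries: 42 },
    };
    assert_eq!(number_of_documents(&legacy), 42);
    assert_eq!(number_of_documents(&fresh), 42);
}
```
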

View File

@@ -27,12 +27,6 @@ use crate::Opt;
/// It also generates a `configure` function that configures the routes for the settings.
macro_rules! make_setting_routes {
($({route: $route:literal, update_verb: $update_verb:ident, value_type: $type:ty, err_type: $err_ty:ty, attr: $attr:ident, camelcase_attr: $camelcase_attr:literal, analytics: $analytics:ident},)*) => {
const _: fn(&meilisearch_types::settings::Settings<meilisearch_types::settings::Unchecked>) = |s| {
// This pattern match will fail at compile time if any field in Settings is not listed in the macro
match *s {
meilisearch_types::settings::Settings { $($attr: _,)* _kind: _ } => {}
}
};
$(
make_setting_route!($route, $update_verb, $type, $err_ty, $attr, $camelcase_attr, $analytics);
)*
@@ -66,7 +60,7 @@ macro_rules! make_setting_routes {
#[macro_export]
macro_rules! make_setting_route {
($route:literal, $update_verb:ident, $type:ty, $err_type:ty, $attr:ident, $camelcase_attr:literal, $analytics:ident) => {
($route:literal, $update_verb:ident, $type:ty, $err_ty:ty, $attr:ident, $camelcase_attr:literal, $analytics:ident) => {
pub mod $attr {
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse, Resource};
@@ -186,7 +180,7 @@ macro_rules! make_setting_route {
Data<IndexScheduler>,
>,
index_uid: actix_web::web::Path<String>,
body: deserr::actix_web::AwebJson<Option<$type>, $err_type>,
body: deserr::actix_web::AwebJson<Option<$type>, $err_ty>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<Analytics>,
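
The `const _: fn(...)` block removed at the top of the macro is a compile-time exhaustiveness guard: destructuring `Settings` with every listed field (plus `_kind`) makes the build fail as soon as a new settings field is missing from the route list. Below is a standalone sketch of the same pattern on a stand-in struct, not the real `Settings` type.

```rust
// Sketch of the exhaustiveness-guard pattern: if a field is added to the
// struct without being listed in the match, this stops compiling.
#[allow(dead_code)]
struct Settings {
    displayed_attributes: Option<Vec<String>>,
    searchable_attributes: Option<Vec<String>>,
    _kind: std::marker::PhantomData<()>,
}

// The non-capturing closure coerces to a fn pointer; the wildcard patterns
// bind nothing, so this is purely a compile-time check.
const _: fn(&Settings) = |s| match *s {
    Settings { displayed_attributes: _, searchable_attributes: _, _kind: _ } => {}
};

fn main() {
    println!("compile-time exhaustiveness guard in place");
}
```
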

View File

@@ -275,15 +275,8 @@ async fn test_summarized_document_addition_or_update() {
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(0).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 0,
"progress": null,
@@ -301,30 +294,21 @@ async fn test_summarized_document_addition_or_update() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]",
"writeChannelCongestion": "[writeChannelCongestion]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
let (task, _status_code) =
index.add_documents(json!({ "id": 42, "content": "doggos & fluff" }), Some("id")).await;
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(1).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 1,
"progress": null,
@@ -342,15 +326,13 @@ async fn test_summarized_document_addition_or_update() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]",
"writeChannelCongestion": "[writeChannelCongestion]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
}
#[actix_web::test]
@@ -361,15 +343,8 @@ async fn test_summarized_delete_documents_by_batch() {
index.wait_task(task.uid()).await.failed();
let (batch, _) = index.get_batch(0).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 0,
"progress": null,
@@ -387,29 +362,21 @@ async fn test_summarized_delete_documents_by_batch() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
index.create(None).await;
let (task, _status_code) = index.delete_batch(vec![42]).await;
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(2).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 2,
"progress": null,
@@ -427,14 +394,13 @@ async fn test_summarized_delete_documents_by_batch() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
}
#[actix_web::test]
@@ -447,15 +413,8 @@ async fn test_summarized_delete_documents_by_filter() {
index.wait_task(task.uid()).await.failed();
let (batch, _) = index.get_batch(0).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 0,
"progress": null,
@@ -474,14 +433,13 @@ async fn test_summarized_delete_documents_by_filter() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
index.create(None).await;
let (task, _status_code) =
@@ -489,15 +447,8 @@ async fn test_summarized_delete_documents_by_filter() {
index.wait_task(task.uid()).await.failed();
let (batch, _) = index.get_batch(2).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 2,
"progress": null,
@@ -516,14 +467,13 @@ async fn test_summarized_delete_documents_by_filter() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
index.update_settings(json!({ "filterableAttributes": ["doggo"] })).await;
let (task, _status_code) =
@@ -531,14 +481,7 @@ async fn test_summarized_delete_documents_by_filter() {
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(4).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 4,
@@ -558,8 +501,7 @@ async fn test_summarized_delete_documents_by_filter() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
@@ -575,16 +517,7 @@ async fn test_summarized_delete_document_by_id() {
let (task, _status_code) = index.delete_document(1).await;
index.wait_task(task.uid()).await.failed();
let (batch, _) = index.get_batch(0).await;
assert_json_snapshot!(batch,
{
".uid" => "[uid]",
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
snapshot!(batch,
@r#"
{
"uid": "[uid]",
@@ -603,8 +536,7 @@ async fn test_summarized_delete_document_by_id() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
@@ -617,14 +549,7 @@ async fn test_summarized_delete_document_by_id() {
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(2).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 2,
@@ -643,8 +568,7 @@ async fn test_summarized_delete_document_by_id() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
@@ -673,15 +597,8 @@ async fn test_summarized_settings_update() {
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(0).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 0,
"progress": null,
@@ -708,14 +625,13 @@ async fn test_summarized_settings_update() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
}
#[actix_web::test]
@@ -726,15 +642,8 @@ async fn test_summarized_index_creation() {
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(0).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 0,
"progress": null,
@@ -749,28 +658,20 @@ async fn test_summarized_index_creation() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
let (task, _status_code) = index.create(Some("doggos")).await;
index.wait_task(task.uid()).await.failed();
let (batch, _) = index.get_batch(1).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 1,
"progress": null,
@@ -787,14 +688,13 @@ async fn test_summarized_index_creation() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
}
#[actix_web::test]
@@ -915,15 +815,8 @@ async fn test_summarized_index_update() {
index.wait_task(task.uid()).await.failed();
let (batch, _) = index.get_batch(0).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 0,
"progress": null,
@@ -938,28 +831,20 @@ async fn test_summarized_index_update() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
let (task, _status_code) = index.update(Some("bones")).await;
index.wait_task(task.uid()).await.failed();
let (batch, _) = index.get_batch(1).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 1,
"progress": null,
@@ -976,14 +861,13 @@ async fn test_summarized_index_update() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
// And run the same two tests once the index does exist.
index.create(None).await;
@@ -992,14 +876,7 @@ async fn test_summarized_index_update() {
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(3).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 3,
@@ -1015,8 +892,7 @@ async fn test_summarized_index_update() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
@@ -1028,15 +904,8 @@ async fn test_summarized_index_update() {
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(4).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 4,
"progress": null,
@@ -1053,14 +922,13 @@ async fn test_summarized_index_update() {
},
"indexUids": {
"test": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
}
#[actix_web::test]
@@ -1074,15 +942,8 @@ async fn test_summarized_index_swap() {
server.wait_task(task.uid()).await.failed();
let (batch, _) = server.get_batch(0).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 0,
"progress": null,
@@ -1104,14 +965,13 @@ async fn test_summarized_index_swap() {
"types": {
"indexSwap": 1
},
"indexUids": {},
"progressTrace": "[progressTrace]"
"indexUids": {}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
server.index("doggos").create(None).await;
let (task, _status_code) = server.index("cattos").create(None).await;
@@ -1123,15 +983,8 @@ async fn test_summarized_index_swap() {
server.wait_task(task.uid()).await.succeeded();
let (batch, _) = server.get_batch(1).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 1,
"progress": null,
@@ -1146,14 +999,13 @@ async fn test_summarized_index_swap() {
},
"indexUids": {
"doggos": 1
},
"progressTrace": "[progressTrace]"
}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
}
#[actix_web::test]
@@ -1167,15 +1019,8 @@ async fn test_summarized_batch_cancelation() {
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(1).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 1,
"progress": null,
@@ -1192,14 +1037,13 @@ async fn test_summarized_batch_cancelation() {
"types": {
"taskCancelation": 1
},
"indexUids": {},
"progressTrace": "[progressTrace]"
"indexUids": {}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
}
#[actix_web::test]
@@ -1213,15 +1057,8 @@ async fn test_summarized_batch_deletion() {
index.wait_task(task.uid()).await.succeeded();
let (batch, _) = index.get_batch(1).await;
assert_json_snapshot!(batch,
{
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 1,
"progress": null,
@@ -1238,14 +1075,13 @@ async fn test_summarized_batch_deletion() {
"types": {
"taskDeletion": 1
},
"indexUids": {},
"progressTrace": "[progressTrace]"
"indexUids": {}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
}
#[actix_web::test]
@@ -1255,16 +1091,8 @@ async fn test_summarized_dump_creation() {
server.wait_task(task.uid()).await;
let (batch, _) = server.get_batch(0).await;
assert_json_snapshot!(batch,
{
".details.dumpUid" => "[dumpUid]",
".duration" => "[duration]",
".enqueuedAt" => "[date]",
".startedAt" => "[date]",
".finishedAt" => "[date]",
".stats.progressTrace" => "[progressTrace]",
".stats.writeChannelCongestion" => "[writeChannelCongestion]"
},
@r###"
{ ".details.dumpUid" => "[dumpUid]", ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
@r#"
{
"uid": 0,
"progress": null,
@@ -1279,12 +1107,11 @@ async fn test_summarized_dump_creation() {
"types": {
"dumpCreation": 1
},
"indexUids": {},
"progressTrace": "[progressTrace]"
"indexUids": {}
},
"duration": "[duration]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
"#);
}
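// Note on the pattern used throughout these tests: the map passed to
// `assert_json_snapshot!` redacts volatile fields before comparison, e.g.
// `".duration" => "[duration]"` rewrites that JSON path to a stable placeholder so the
// inline `@r#"..."#` snapshot does not depend on timing or machine-specific values.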

View File

@@ -234,26 +234,6 @@ impl Server<Shared> {
(value, code)
}
pub async fn list_indexes(
&self,
offset: Option<usize>,
limit: Option<usize>,
) -> (Value, StatusCode) {
let (offset, limit) = (
offset.map(|offset| format!("offset={offset}")),
limit.map(|limit| format!("limit={limit}")),
);
let query_parameter = offset
.as_ref()
.zip(limit.as_ref())
.map(|(offset, limit)| format!("{offset}&{limit}"))
.or_else(|| offset.xor(limit));
if let Some(query_parameter) = query_parameter {
self.service.get(format!("/indexes?{query_parameter}")).await
} else {
self.service.get("/indexes").await
}
}
pub async fn update_raw_index_fail(
&self,
uid: impl AsRef<str>,

View File

@@ -2230,13 +2230,7 @@ async fn import_dump_v6_containing_batches_and_enqueued_tasks() {
let (tasks, _) = server.tasks().await;
snapshot!(json_string!(tasks, { ".results[1].startedAt" => "[date]", ".results[1].finishedAt" => "[date]", ".results[1].duration" => "[date]" }), name: "tasks");
let (batches, _) = server.batches().await;
snapshot!(json_string!(batches, {
".results[0].startedAt" => "[date]",
".results[0].finishedAt" => "[date]",
".results[0].duration" => "[date]",
".results[0].stats.progressTrace" => "[progressTrace]",
".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]",
}), name: "batches");
snapshot!(json_string!(batches, { ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].duration" => "[date]" }), name: "batches");
let (indexes, code) = server.list_indexes(None, None).await;
assert_eq!(code, 200, "{indexes}");

View File

@@ -1,5 +1,6 @@
---
source: crates/meilisearch/tests/dumps/mod.rs
snapshot_kind: text
---
{
"results": [
@@ -20,9 +21,7 @@ source: crates/meilisearch/tests/dumps/mod.rs
},
"indexUids": {
"kefir": 1
},
"progressTrace": "[progressTrace]",
"writeChannelCongestion": "[writeChannelCongestion]"
}
},
"duration": "[date]",
"startedAt": "[date]",

View File

@@ -1,22 +1,22 @@
use crate::json;
use meili_snap::{json_string, snapshot};
use serde_json::Value;
use serde_json::{json, Value};
use crate::common::{shared_does_not_exists_index, Server};
use crate::common::Server;
#[actix_rt::test]
async fn create_and_get_index() {
let server = Server::new_shared();
let index = server.unique_index();
let (response, code) = index.create(None).await;
let server = Server::new().await;
let index = server.index("test");
let (task, code) = index.create(None).await;
assert_eq!(code, 202);
index.wait_task(response.uid()).await.succeeded();
index.wait_task(task.uid()).await.succeeded();
let (response, code) = index.get().await;
assert_eq!(code, 200);
assert_eq!(response["uid"], "test");
assert!(response.get("createdAt").is_some());
assert!(response.get("updatedAt").is_some());
assert_eq!(response["createdAt"], response["updatedAt"]);
@@ -26,12 +26,13 @@ async fn create_and_get_index() {
#[actix_rt::test]
async fn error_get_unexisting_index() {
let index = shared_does_not_exists_index().await;
let server = Server::new().await;
let index = server.index("test");
let (response, code) = index.get().await;
let expected_response = json!({
"message": "Index `DOES_NOT_EXISTS` not found.",
"message": "Index `test` not found.",
"code": "index_not_found",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#index_not_found"
@@ -178,14 +179,14 @@ async fn get_and_paginate_indexes() {
#[actix_rt::test]
async fn get_invalid_index_uid() {
let server = Server::new_shared();
let (response, code) =
server.create_index_fail(json!({ "uid": "this is not a valid index name" })).await;
let server = Server::new().await;
let index = server.index("this is not a valid index name");
let (response, code) = index.get().await;
snapshot!(code, @"400 Bad Request");
snapshot!(json_string!(response), @r###"
{
"message": "Invalid value at `.uid`: `this is not a valid index name` is not a valid index uid. Index uid can be an integer or a string containing only alphanumeric characters, hyphens (-) and underscores (_), and can not be more than 512 bytes.",
"message": "`this is not a valid index name` is not a valid index uid. Index uid can be an integer or a string containing only alphanumeric characters, hyphens (-) and underscores (_), and can not be more than 512 bytes.",
"code": "invalid_index_uid",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#invalid_index_uid"

View File

@@ -43,7 +43,7 @@ async fn version_too_old() {
std::fs::write(db_path.join("VERSION"), "1.11.9999").unwrap();
let options = Opt { experimental_dumpless_upgrade: true, ..default_settings };
let err = Server::new_with_options(options).await.map(|_| ()).unwrap_err();
snapshot!(err, @"Database version 1.11.9999 is too old for the experimental dumpless upgrade feature. Please generate a dump using the v1.11.9999 and import it in the v1.13.1");
snapshot!(err, @"Database version 1.11.9999 is too old for the experimental dumpless upgrade feature. Please generate a dump using the v1.11.9999 and import it in the v1.13.2");
}
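// These upgrade tests write a `VERSION` file of the form `major.minor.patch`; on startup,
// with `experimental_dumpless_upgrade` enabled, the server compares it against its own
// version and refuses to open a database that is either too old to upgrade from or newer
// than the running binary, which is what the error snapshots assert.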
#[actix_rt::test]
@@ -58,7 +58,7 @@ async fn version_requires_downgrade() {
std::fs::write(db_path.join("VERSION"), format!("{major}.{minor}.{patch}")).unwrap();
let options = Opt { experimental_dumpless_upgrade: true, ..default_settings };
let err = Server::new_with_options(options).await.map(|_| ()).unwrap_err();
snapshot!(err, @"Database version 1.13.2 is higher than the Meilisearch version 1.13.1. Downgrade is not supported");
snapshot!(err, @"Database version 1.13.3 is higher than the Meilisearch version 1.13.2. Downgrade is not supported");
}
#[actix_rt::test]

View File

@@ -8,7 +8,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"progress": null,
"details": {
"upgradeFrom": "v1.12.0",
"upgradeTo": "v1.13.1"
"upgradeTo": "v1.13.2"
},
"stats": {
"totalNbTasks": 1,
@@ -18,8 +18,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"types": {
"upgradeDatabase": 1
},
"indexUids": {},
"progressTrace": "[progressTrace]"
"indexUids": {}
},
"duration": "[duration]",
"startedAt": "[date]",

View File

@@ -8,7 +8,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"progress": null,
"details": {
"upgradeFrom": "v1.12.0",
"upgradeTo": "v1.13.1"
"upgradeTo": "v1.13.2"
},
"stats": {
"totalNbTasks": 1,
@@ -18,8 +18,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"types": {
"upgradeDatabase": 1
},
"indexUids": {},
"progressTrace": "[progressTrace]"
"indexUids": {}
},
"duration": "[duration]",
"startedAt": "[date]",

View File

@@ -8,7 +8,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"progress": null,
"details": {
"upgradeFrom": "v1.12.0",
"upgradeTo": "v1.13.1"
"upgradeTo": "v1.13.2"
},
"stats": {
"totalNbTasks": 1,
@@ -18,8 +18,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"types": {
"upgradeDatabase": 1
},
"indexUids": {},
"progressTrace": "[progressTrace]"
"indexUids": {}
},
"duration": "[duration]",
"startedAt": "[date]",

View File

@@ -12,7 +12,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"canceledBy": null,
"details": {
"upgradeFrom": "v1.12.0",
"upgradeTo": "v1.13.1"
"upgradeTo": "v1.13.2"
},
"error": null,
"duration": "[duration]",

View File

@@ -12,7 +12,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"canceledBy": null,
"details": {
"upgradeFrom": "v1.12.0",
"upgradeTo": "v1.13.1"
"upgradeTo": "v1.13.2"
},
"error": null,
"duration": "[duration]",

View File

@@ -12,7 +12,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"canceledBy": null,
"details": {
"upgradeFrom": "v1.12.0",
"upgradeTo": "v1.13.1"
"upgradeTo": "v1.13.2"
},
"error": null,
"duration": "[duration]",

View File

@@ -8,7 +8,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"progress": null,
"details": {
"upgradeFrom": "v1.12.0",
"upgradeTo": "v1.13.1"
"upgradeTo": "v1.13.2"
},
"stats": {
"totalNbTasks": 1,
@@ -18,8 +18,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"types": {
"upgradeDatabase": 1
},
"indexUids": {},
"progressTrace": "[progressTrace]"
"indexUids": {}
},
"duration": "[duration]",
"startedAt": "[date]",

View File

@@ -12,7 +12,7 @@ source: crates/meilisearch/tests/upgrade/v1_12/v1_12_0.rs
"canceledBy": null,
"details": {
"upgradeFrom": "v1.12.0",
"upgradeTo": "v1.13.1"
"upgradeTo": "v1.13.2"
},
"error": null,
"duration": "[duration]",

View File

@@ -2,7 +2,6 @@
// It must test pretty much all the features of meilisearch because the other tests will only test
// the new features they introduced.
use insta::assert_json_snapshot;
use manifest_dir_macros::exist_relative_path;
use meili_snap::{json_string, snapshot};
use meilisearch::Opt;
@@ -127,14 +126,10 @@ async fn check_the_index_scheduler(server: &Server) {
"#);
// And their metadata are still right
let (stats, _) = server.stats().await;
assert_json_snapshot!(stats, {
".databaseSize" => "[bytes]",
".usedDatabaseSize" => "[bytes]"
},
@r###"
snapshot!(stats, @r###"
{
"databaseSize": "[bytes]",
"usedDatabaseSize": "[bytes]",
"databaseSize": 438272,
"usedDatabaseSize": 196608,
"lastUpdate": "2025-01-23T11:36:22.634859166Z",
"indexes": {
"kefir": {
@@ -166,7 +161,7 @@ async fn check_the_index_scheduler(server: &Server) {
let (tasks, _) = server.tasks_filter("limit=1000").await;
snapshot!(json_string!(tasks, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "the_whole_task_queue_once_everything_has_been_processed");
let (batches, _) = server.batches_filter("limit=1000").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "the_whole_batch_queue_once_everything_has_been_processed");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "the_whole_batch_queue_once_everything_has_been_processed");
// Tests all the tasks query parameters
let (tasks, _) = server.tasks_filter("uids=10").await;
@@ -193,36 +188,32 @@ async fn check_the_index_scheduler(server: &Server) {
// Tests all the batches query parameters
let (batches, _) = server.batches_filter("uids=10").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_uids_equal_10");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_uids_equal_10");
let (batches, _) = server.batches_filter("batchUids=10").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_batchUids_equal_10");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_batchUids_equal_10");
let (batches, _) = server.batches_filter("statuses=canceled").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_statuses_equal_canceled");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_statuses_equal_canceled");
// types has already been tested above to retrieve the upgrade database
let (batches, _) = server.batches_filter("canceledBy=19").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_canceledBy_equal_19");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_canceledBy_equal_19");
let (batches, _) = server.batches_filter("beforeEnqueuedAt=2025-01-16T16:47:41Z").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_beforeEnqueuedAt_equal_2025-01-16T16_47_41");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_beforeEnqueuedAt_equal_2025-01-16T16_47_41");
let (batches, _) = server.batches_filter("afterEnqueuedAt=2025-01-16T16:47:41Z").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_afterEnqueuedAt_equal_2025-01-16T16_47_41");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_afterEnqueuedAt_equal_2025-01-16T16_47_41");
let (batches, _) = server.batches_filter("beforeStartedAt=2025-01-16T16:47:41Z").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_beforeStartedAt_equal_2025-01-16T16_47_41");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_beforeStartedAt_equal_2025-01-16T16_47_41");
let (batches, _) = server.batches_filter("afterStartedAt=2025-01-16T16:47:41Z").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_afterStartedAt_equal_2025-01-16T16_47_41");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_afterStartedAt_equal_2025-01-16T16_47_41");
let (batches, _) = server.batches_filter("beforeFinishedAt=2025-01-16T16:47:41Z").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_beforeFinishedAt_equal_2025-01-16T16_47_41");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_beforeFinishedAt_equal_2025-01-16T16_47_41");
let (batches, _) = server.batches_filter("afterFinishedAt=2025-01-16T16:47:41Z").await;
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]", ".results[0].stats.progressTrace" => "[progressTrace]", ".results[0].stats.writeChannelCongestion" => "[writeChannelCongestion]" }), name: "batches_filter_afterFinishedAt_equal_2025-01-16T16_47_41");
snapshot!(json_string!(batches, { ".results[0].duration" => "[duration]", ".results[0].enqueuedAt" => "[date]", ".results[0].startedAt" => "[date]", ".results[0].finishedAt" => "[date]" }), name: "batches_filter_afterFinishedAt_equal_2025-01-16T16_47_41");
let (stats, _) = server.stats().await;
assert_json_snapshot!(stats, {
".databaseSize" => "[bytes]",
".usedDatabaseSize" => "[bytes]"
},
@r###"
snapshot!(stats, @r###"
{
"databaseSize": "[bytes]",
"usedDatabaseSize": "[bytes]",
"databaseSize": 438272,
"usedDatabaseSize": 196608,
"lastUpdate": "2025-01-23T11:36:22.634859166Z",
"indexes": {
"kefir": {

View File

@@ -1,6 +1,4 @@
mod binary_quantized;
#[cfg(feature = "test-ollama")]
mod ollama;
mod openai;
mod rest;
mod settings;

View File

@@ -1,733 +0,0 @@
//! Tests ollama embedders against the server at the location described by the `MEILI_TEST_OLLAMA_SERVER` environment variable.
use std::env::VarError;
use meili_snap::{json_string, snapshot};
use crate::common::{GetAllDocumentsOptions, Value};
use crate::json;
use crate::vector::get_server_vector;
pub enum Endpoint {
/// Deprecated, undocumented endpoint
Embeddings,
/// Current endpoint
Embed,
}
impl Endpoint {
fn suffix(&self) -> &'static str {
match self {
Endpoint::Embeddings => "/api/embeddings",
Endpoint::Embed => "/api/embed",
}
}
}
pub enum Model {
Nomic,
AllMinilm,
}
impl Model {
fn name(&self) -> &'static str {
match self {
Model::Nomic => "nomic-embed-text",
Model::AllMinilm => "all-minilm",
}
}
}
const DOGGO_TEMPLATE: &str = r#"{%- if doc.gender == "F" -%}Une chienne nommée {{doc.name}}, née en {{doc.birthyear}}
{%- else -%}
Un chien nommé {{doc.name}}, né en {{doc.birthyear}}
{%- endif %}, de race {{doc.breed}}."#;
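// Illustrative rendering (not part of the original test): with the documents added in
// `test_both_apis` below, this template produces strings such as
//   "Un chien nommé kefir, né en 2023, de race Patou."
//   "Une chienne nommée Vénus, née en 2003, de race Jack Russel Terrier."
// and that rendered text is what gets sent to the embedder for each document.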
fn create_ollama_config_with_template(
document_template: &str,
model: Model,
endpoint: Endpoint,
) -> Option<Value> {
let ollama_base_url = match std::env::var("MEILI_TEST_OLLAMA_SERVER") {
Ok(ollama_base_url) => ollama_base_url,
Err(VarError::NotPresent) => return None,
Err(VarError::NotUnicode(s)) => panic!(
"`MEILI_TEST_OLLAMA_SERVER` was not properly utf-8, `{:?}`",
s.as_encoded_bytes()
),
};
Some(json!({
"source": "ollama",
"url": format!("{ollama_base_url}{}", endpoint.suffix()),
"documentTemplate": document_template,
"documentTemplateMaxBytes": 8000000,
"model": model.name()
}))
}
#[actix_rt::test]
async fn test_both_apis() {
let Some(embed_settings) =
create_ollama_config_with_template(DOGGO_TEMPLATE, Model::AllMinilm, Endpoint::Embed)
else {
panic!("Missing `MEILI_TEST_OLLAMA_SERVER` environment variable, skipping `test_both_apis` test.");
};
let Some(embeddings_settings) =
create_ollama_config_with_template(DOGGO_TEMPLATE, Model::AllMinilm, Endpoint::Embeddings)
else {
return;
};
let Some(nomic_embed_settings) =
create_ollama_config_with_template(DOGGO_TEMPLATE, Model::Nomic, Endpoint::Embed)
else {
return;
};
let Some(nomic_embeddings_settings) =
create_ollama_config_with_template(DOGGO_TEMPLATE, Model::Nomic, Endpoint::Embeddings)
else {
return;
};
let server = get_server_vector().await;
let index = server.index("doggo");
let (response, code) = index
.update_settings(json!({
"embedders": {
"embed": embed_settings,
"embeddings": embeddings_settings,
"nomic_embed": nomic_embed_settings,
"nomic_embeddings": nomic_embeddings_settings,
},
}))
.await;
snapshot!(code, @"202 Accepted");
let task = server.wait_task(response.uid()).await;
snapshot!(task["status"], @r###""succeeded""###);
let documents = json!([
{"id": 0, "name": "kefir", "gender": "M", "birthyear": 2023, "breed": "Patou"},
{"id": 1, "name": "Intel", "gender": "M", "birthyear": 2011, "breed": "Beagle"},
{"id": 2, "name": "Vénus", "gender": "F", "birthyear": 2003, "breed": "Jack Russel Terrier"},
{"id": 3, "name": "Max", "gender": "M", "birthyear": 1995, "breed": "Labrador Retriever"},
]);
let (value, code) = index.add_documents(documents, None).await;
snapshot!(code, @"202 Accepted");
let task = index.wait_task(value.uid()).await;
snapshot!(task, @r###"
{
"uid": "[uid]",
"batchUid": "[batch_uid]",
"indexUid": "doggo",
"status": "succeeded",
"type": "documentAdditionOrUpdate",
"canceledBy": null,
"details": {
"receivedDocuments": 4,
"indexedDocuments": 4
},
"error": null,
"duration": "[duration]",
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
let (documents, _code) = index
.get_all_documents(GetAllDocumentsOptions { retrieve_vectors: true, ..Default::default() })
.await;
snapshot!(json_string!(documents, {".results.*._vectors.*.embeddings" => "[vector]"}), @r###"
{
"results": [
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou",
"_vectors": {
"embed": {
"embeddings": "[vector]",
"regenerate": true
},
"embeddings": {
"embeddings": "[vector]",
"regenerate": true
},
"nomic_embed": {
"embeddings": "[vector]",
"regenerate": true
},
"nomic_embeddings": {
"embeddings": "[vector]",
"regenerate": true
}
}
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle",
"_vectors": {
"embed": {
"embeddings": "[vector]",
"regenerate": true
},
"embeddings": {
"embeddings": "[vector]",
"regenerate": true
},
"nomic_embed": {
"embeddings": "[vector]",
"regenerate": true
},
"nomic_embeddings": {
"embeddings": "[vector]",
"regenerate": true
}
}
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier",
"_vectors": {
"embed": {
"embeddings": "[vector]",
"regenerate": true
},
"embeddings": {
"embeddings": "[vector]",
"regenerate": true
},
"nomic_embed": {
"embeddings": "[vector]",
"regenerate": true
},
"nomic_embeddings": {
"embeddings": "[vector]",
"regenerate": true
}
}
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever",
"_vectors": {
"embed": {
"embeddings": "[vector]",
"regenerate": true
},
"embeddings": {
"embeddings": "[vector]",
"regenerate": true
},
"nomic_embed": {
"embeddings": "[vector]",
"regenerate": true
},
"nomic_embeddings": {
"embeddings": "[vector]",
"regenerate": true
}
}
}
],
"offset": 0,
"limit": 20,
"total": 4
}
"###);
let (response, code) = index
.search_post(json!({
"q": "chien de chasse",
"hybrid": {"semanticRatio": 1.0, "embedder": "embed"},
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "chien de chasse",
"hybrid": {"semanticRatio": 1.0, "embedder": "embeddings"},
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "petit chien",
"hybrid": {"semanticRatio": 1.0, "embedder": "embed"}
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
},
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "petit chien",
"hybrid": {"semanticRatio": 1.0, "embedder": "embeddings"}
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
},
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "grand chien de berger des montagnes",
"hybrid": {"semanticRatio": 1.0, "embedder": "embed"}
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "grand chien de berger des montagnes",
"hybrid": {"semanticRatio": 1.0, "embedder": "embeddings"}
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "chien de chasse",
"hybrid": {"semanticRatio": 1.0, "embedder": "nomic_embed"},
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "chien de chasse",
"hybrid": {"semanticRatio": 1.0, "embedder": "nomic_embeddings"},
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "petit chien",
"hybrid": {"semanticRatio": 1.0, "embedder": "nomic_embed"}
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "petit chien",
"hybrid": {"semanticRatio": 1.0, "embedder": "nomic_embeddings"}
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "grand chien de berger des montagnes",
"hybrid": {"semanticRatio": 1.0, "embedder": "nomic_embed"}
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
}
]
"###);
let (response, code) = index
.search_post(json!({
"q": "grand chien de berger des montagnes",
"hybrid": {"semanticRatio": 1.0, "embedder": "nomic_embeddings"}
}))
.await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]), @r###"
[
{
"id": 0,
"name": "kefir",
"gender": "M",
"birthyear": 2023,
"breed": "Patou"
},
{
"id": 3,
"name": "Max",
"gender": "M",
"birthyear": 1995,
"breed": "Labrador Retriever"
},
{
"id": 2,
"name": "Vénus",
"gender": "F",
"birthyear": 2003,
"breed": "Jack Russel Terrier"
},
{
"id": 1,
"name": "Intel",
"gender": "M",
"birthyear": 2011,
"breed": "Beagle"
}
]
"###);
}

View File

@@ -85,7 +85,7 @@ rhai = { git = "https://github.com/rhaiscript/rhai", rev = "ef3df63121d27aacd838
"no_time",
"sync",
] }
arroy = "0.5.0"
arroy = { git = "https://github.com/meilisearch/arroy", rev = "55fb0e8006f4f00ad3197b06bda348133f6ffffa" }
rand = "0.8.5"
tracing = "0.1.41"
ureq = { version = "2.12.1", features = ["json"] }

View File

@@ -1905,15 +1905,9 @@ pub(crate) mod tests {
let embedders =
InnerIndexSettings::from_index(&self.inner, &rtxn, None)?.embedding_configs;
let mut indexer = indexer::DocumentOperation::new();
match self.index_documents_config.update_method {
IndexDocumentsMethod::ReplaceDocuments => {
indexer.replace_documents(&documents).unwrap()
}
IndexDocumentsMethod::UpdateDocuments => {
indexer.update_documents(&documents).unwrap()
}
}
let mut indexer =
indexer::DocumentOperation::new(self.index_documents_config.update_method);
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, operation_stats, primary_key) = indexer.into_changes(
@@ -2000,7 +1994,8 @@ pub(crate) mod tests {
let embedders =
InnerIndexSettings::from_index(&self.inner, &rtxn, None)?.embedding_configs;
let mut indexer = indexer::DocumentOperation::new();
let mut indexer =
indexer::DocumentOperation::new(self.index_documents_config.update_method);
let external_document_ids: Vec<_> =
external_document_ids.iter().map(AsRef::as_ref).collect();
indexer.delete_documents(external_document_ids.as_slice());
@@ -2077,13 +2072,13 @@ pub(crate) mod tests {
let mut new_fields_ids_map = db_fields_ids_map.clone();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let payload = documents!([
{ "id": 1, "name": "kevin" },
{ "id": 2, "name": "bob", "age": 20 },
{ "id": 2, "name": "bob", "age": 20 },
]);
indexer.replace_documents(&payload).unwrap();
indexer.add_documents(&payload).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer

View File

@@ -74,7 +74,6 @@ pub use self::search::{
FacetDistribution, Filter, FormatOptions, MatchBounds, MatcherBuilder, MatchingWords, OrderBy,
Search, SearchResult, SemanticSearch, TermsMatchingStrategy, DEFAULT_VALUES_PER_FACET,
};
pub use self::update::ChannelCongestion;
pub type Result<T> = std::result::Result<T, error::Error>;

View File

@@ -3,10 +3,7 @@ use std::borrow::Cow;
use std::marker::PhantomData;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};
use indexmap::IndexMap;
use itertools::Itertools;
use serde::Serialize;
use utoipa::ToSchema;
@@ -18,42 +15,28 @@ pub trait Step: 'static + Send + Sync {
#[derive(Clone, Default)]
pub struct Progress {
steps: Arc<RwLock<InnerProgress>>,
}
#[derive(Default)]
struct InnerProgress {
/// The hierarchy of steps.
steps: Vec<(TypeId, Box<dyn Step>, Instant)>,
/// The durations associated to each steps.
durations: Vec<(String, Duration)>,
steps: Arc<RwLock<Vec<(TypeId, Box<dyn Step>)>>>,
}
impl Progress {
pub fn update_progress<P: Step>(&self, sub_progress: P) {
let mut inner = self.steps.write().unwrap();
let InnerProgress { steps, durations } = &mut *inner;
let now = Instant::now();
let mut steps = self.steps.write().unwrap();
let step_type = TypeId::of::<P>();
if let Some(idx) = steps.iter().position(|(id, _, _)| *id == step_type) {
push_steps_durations(steps, durations, now, idx);
if let Some(idx) = steps.iter().position(|(id, _)| *id == step_type) {
steps.truncate(idx);
}
steps.push((step_type, Box::new(sub_progress), now));
steps.push((step_type, Box::new(sub_progress)));
}
// TODO: This code should be in meilisearch_types but cannot because milli can't depend on meilisearch_types
pub fn as_progress_view(&self) -> ProgressView {
let inner = self.steps.read().unwrap();
let InnerProgress { steps, .. } = &*inner;
let steps = self.steps.read().unwrap();
let mut percentage = 0.0;
let mut prev_factors = 1.0;
let mut step_view = Vec::with_capacity(steps.len());
for (_, step, _) in steps.iter() {
for (_, step) in steps.iter() {
prev_factors *= step.total() as f32;
percentage += step.current() as f32 / prev_factors;
@@ -66,29 +49,6 @@ impl Progress {
ProgressView { steps: step_view, percentage: percentage * 100.0 }
}
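// A minimal, self-contained sketch (illustrative only, not part of this crate) of the
// nested percentage computation performed by `as_progress_view` above: each step reports
// (current, total), outermost first, and contributes its completed fraction scaled by the
// product of the totals of the steps enclosing it.
fn nested_percentage(steps: &[(u32, u32)]) -> f32 {
    let mut percentage = 0.0;
    let mut prev_factors = 1.0;
    for (current, total) in steps {
        prev_factors *= *total as f32;
        percentage += *current as f32 / prev_factors;
    }
    percentage * 100.0
}
// e.g. nested_percentage(&[(2, 4), (1, 2)]) == 62.5: two of four outer steps are done
// (50%), plus half of the third outer step (12.5%).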
pub fn accumulated_durations(&self) -> IndexMap<String, String> {
let mut inner = self.steps.write().unwrap();
let InnerProgress { steps, durations, .. } = &mut *inner;
let now = Instant::now();
push_steps_durations(steps, durations, now, 0);
durations.drain(..).map(|(name, duration)| (name, format!("{duration:.2?}"))).collect()
}
}
/// Generate the names associated with the durations and push them.
fn push_steps_durations(
steps: &[(TypeId, Box<dyn Step>, Instant)],
durations: &mut Vec<(String, Duration)>,
now: Instant,
idx: usize,
) {
for (i, (_, _, started_at)) in steps.iter().skip(idx).enumerate().rev() {
let full_name = steps.iter().take(idx + i + 1).map(|(_, s, _)| s.name()).join(" > ");
durations.push((full_name, now.duration_since(*started_at)));
}
}
/// This trait lets you use the AtomicSubStep defined right below.
@@ -204,7 +164,7 @@ pub struct ProgressStepView {
/// Used when the name can change but it's still the same step.
/// To avoid conflicts on the `TypeId`, create a unique type every time you use this step:
/// ```text
/// enum UpgradeVersion {}
/// enum UpgradeVersion {}
///
/// progress.update_progress(VariableNameStep::<UpgradeVersion>::new(
/// "v1 to v2",

View File

@@ -7,7 +7,7 @@ use maplit::{btreemap, hashset};
use crate::progress::Progress;
use crate::update::new::indexer;
use crate::update::{IndexerConfig, Settings};
use crate::update::{IndexDocumentsMethod, IndexerConfig, Settings};
use crate::vector::EmbeddingConfigs;
use crate::{db_snap, Criterion, Index};
pub const CONTENT: &str = include_str!("../../../../tests/assets/test_set.ndjson");
@@ -55,7 +55,7 @@ pub fn setup_search_index_with_criteria(criteria: &[Criterion]) -> Index {
let mut new_fields_ids_map = db_fields_ids_map.clone();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let mut file = tempfile::tempfile().unwrap();
file.write_all(CONTENT.as_bytes()).unwrap();
@@ -63,7 +63,7 @@ pub fn setup_search_index_with_criteria(criteria: &[Criterion]) -> Index {
let payload = unsafe { memmap2::Mmap::map(&file).unwrap() };
// index documents
indexer.replace_documents(&payload).unwrap();
indexer.add_documents(&payload).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, operation_stats, primary_key) = indexer

View File

@@ -776,7 +776,7 @@ mod tests {
use crate::search::TermsMatchingStrategy;
use crate::update::new::indexer;
use crate::update::Setting;
use crate::{all_obkv_to_json, db_snap, Filter, Search, UserError};
use crate::{db_snap, Filter, Search, UserError};
#[test]
fn simple_document_replacement() {
@@ -1956,11 +1956,11 @@ mod tests {
let db_fields_ids_map = index.inner.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
indexer.replace_documents(&doc1).unwrap();
indexer.replace_documents(&doc2).unwrap();
indexer.replace_documents(&doc3).unwrap();
indexer.replace_documents(&doc4).unwrap();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
indexer.add_documents(&doc1).unwrap();
indexer.add_documents(&doc2).unwrap();
indexer.add_documents(&doc3).unwrap();
indexer.add_documents(&doc4).unwrap();
let indexer_alloc = Bump::new();
let (_document_changes, operation_stats, _primary_key) = indexer
@@ -1979,174 +1979,6 @@ mod tests {
assert_eq!(operation_stats.iter().filter(|ps| ps.error.is_some()).count(), 3);
}
#[test]
fn mixing_documents_replace_with_updates() {
let index = TempIndex::new_with_map_size(4096 * 100);
let doc1 = documents! {[{
"id": 1,
"title": "asdsad",
"description": "Wat wat wat, wat"
}]};
let doc2 = documents! {[{
"id": 1,
"title": "something",
}]};
let doc3 = documents! {[{
"id": 1,
"title": "another something",
}]};
let doc4 = documents! {[{
"id": 1,
"description": "This is it!",
}]};
let rtxn = index.inner.read_txn().unwrap();
let db_fields_ids_map = index.inner.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
indexer.replace_documents(&doc1).unwrap();
indexer.update_documents(&doc2).unwrap();
indexer.update_documents(&doc3).unwrap();
indexer.update_documents(&doc4).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, operation_stats, primary_key) = indexer
.into_changes(
&indexer_alloc,
&index.inner,
&rtxn,
None,
&mut new_fields_ids_map,
&|| false,
Progress::default(),
)
.unwrap();
assert_eq!(operation_stats.iter().filter(|ps| ps.error.is_none()).count(), 4);
let mut wtxn = index.write_txn().unwrap();
indexer::index(
&mut wtxn,
&index.inner,
&crate::ThreadPoolNoAbortBuilder::new().build().unwrap(),
index.indexer_config.grenad_parameters(),
&db_fields_ids_map,
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
&|| false,
&Progress::default(),
)
.unwrap();
wtxn.commit().unwrap();
let rtxn = index.read_txn().unwrap();
let obkv = index.document(&rtxn, 0).unwrap();
let fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let json_document = all_obkv_to_json(obkv, &fields_ids_map).unwrap();
let expected = serde_json::json!({
"id": 1,
"title": "another something",
"description": "This is it!",
});
let expected = expected.as_object().unwrap();
assert_eq!(&json_document, expected);
}
#[test]
fn mixing_documents_replace_with_updates_even_more() {
let index = TempIndex::new_with_map_size(4096 * 100);
let doc1 = documents! {[{
"id": 1,
"title": "asdsad",
"description": "Wat wat wat, wat"
}]};
let doc2 = documents! {[{
"id": 1,
"title": "something",
}]};
let doc3 = documents! {[{
"id": 1,
"title": "another something",
}]};
let doc4 = documents! {[{
"id": 1,
"title": "Woooof",
}]};
let doc5 = documents! {[{
"id": 1,
"description": "This is it!",
}]};
let rtxn = index.inner.read_txn().unwrap();
let db_fields_ids_map = index.inner.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let mut indexer = indexer::DocumentOperation::new();
indexer.replace_documents(&doc1).unwrap();
indexer.update_documents(&doc2).unwrap();
indexer.update_documents(&doc3).unwrap();
indexer.replace_documents(&doc4).unwrap();
indexer.update_documents(&doc5).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, operation_stats, primary_key) = indexer
.into_changes(
&indexer_alloc,
&index.inner,
&rtxn,
None,
&mut new_fields_ids_map,
&|| false,
Progress::default(),
)
.unwrap();
assert_eq!(operation_stats.iter().filter(|ps| ps.error.is_none()).count(), 5);
let mut wtxn = index.write_txn().unwrap();
indexer::index(
&mut wtxn,
&index.inner,
&crate::ThreadPoolNoAbortBuilder::new().build().unwrap(),
index.indexer_config.grenad_parameters(),
&db_fields_ids_map,
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
&|| false,
&Progress::default(),
)
.unwrap();
wtxn.commit().unwrap();
let rtxn = index.read_txn().unwrap();
let obkv = index.document(&rtxn, 0).unwrap();
let fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let json_document = all_obkv_to_json(obkv, &fields_ids_map).unwrap();
let expected = serde_json::json!({
"id": 1,
"title": "Woooof",
"description": "This is it!",
});
let expected = expected.as_object().unwrap();
assert_eq!(&json_document, expected);
}
#[test]
fn primary_key_must_not_contain_whitespace() {
let index = TempIndex::new();
@@ -2285,8 +2117,8 @@ mod tests {
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
indexer.replace_documents(&documents).unwrap();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
indexer.add_documents(&documents).unwrap();
indexer.delete_documents(&["2"]);
let (document_changes, _operation_stats, primary_key) = indexer
.into_changes(
@@ -2338,14 +2170,14 @@ mod tests {
{ "id": 2, "doggo": { "name": "bob", "age": 20 } },
{ "id": 3, "name": "jean", "age": 25 },
]);
let mut indexer = indexer::DocumentOperation::new();
indexer.update_documents(&documents).unwrap();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::UpdateDocuments);
indexer.add_documents(&documents).unwrap();
let documents = documents!([
{ "id": 2, "catto": "jorts" },
{ "id": 3, "legs": 4 },
]);
indexer.update_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
indexer.delete_documents(&["1", "2"]);
let indexer_alloc = Bump::new();
@@ -2400,8 +2232,8 @@ mod tests {
]);
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
indexer.update_documents(&documents).unwrap();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::UpdateDocuments);
indexer.add_documents(&documents).unwrap();
let (document_changes, _operation_stats, primary_key) = indexer
.into_changes(
@@ -2451,8 +2283,8 @@ mod tests {
]);
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
indexer.update_documents(&documents).unwrap();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::UpdateDocuments);
indexer.add_documents(&documents).unwrap();
indexer.delete_documents(&["1", "2"]);
let (document_changes, _operation_stats, primary_key) = indexer
@@ -2500,14 +2332,14 @@ mod tests {
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::UpdateDocuments);
indexer.delete_documents(&["1", "2"]);
let documents = documents!([
{ "id": 2, "doggo": { "name": "jean", "age": 20 } },
{ "id": 3, "name": "bob", "age": 25 },
]);
indexer.update_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let (document_changes, _operation_stats, primary_key) = indexer
.into_changes(
@@ -2555,7 +2387,7 @@ mod tests {
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::UpdateDocuments);
indexer.delete_documents(&["1", "2", "1", "2"]);
@@ -2564,7 +2396,7 @@ mod tests {
{ "id": 2, "doggo": { "name": "jean", "age": 20 } },
{ "id": 3, "name": "bob", "age": 25 },
]);
indexer.update_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
indexer.delete_documents(&["1", "2", "1", "2"]);
@@ -2613,12 +2445,12 @@ mod tests {
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::UpdateDocuments);
let documents = documents!([
{ "id": 1, "doggo": "kevin" },
]);
indexer.update_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let (document_changes, _operation_stats, primary_key) = indexer
.into_changes(
@@ -2662,7 +2494,7 @@ mod tests {
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
indexer.delete_documents(&["1"]);
@@ -2670,7 +2502,7 @@ mod tests {
{ "id": 1, "catto": "jorts" },
]);
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let (document_changes, _operation_stats, primary_key) = indexer
.into_changes(
@@ -2856,14 +2688,14 @@ mod tests {
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
// OP
let documents = documents!([
{ "id": 1, "doggo": "bernese" },
]);
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
// FINISHING
let (document_changes, _operation_stats, primary_key) = indexer
@@ -2916,14 +2748,14 @@ mod tests {
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
indexer.delete_documents(&["1"]);
let documents = documents!([
{ "id": 0, "catto": "jorts" },
]);
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let (document_changes, _operation_stats, primary_key) = indexer
.into_changes(
@@ -2974,12 +2806,12 @@ mod tests {
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let documents = documents!([
{ "id": 1, "catto": "jorts" },
]);
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let (document_changes, _operation_stats, primary_key) = indexer
.into_changes(

View File

@@ -5,7 +5,6 @@ pub use self::facet::bulk::FacetsUpdateBulk;
pub use self::facet::incremental::FacetsUpdateIncrementalInner;
pub use self::index_documents::*;
pub use self::indexer_config::IndexerConfig;
pub use self::new::ChannelCongestion;
pub use self::settings::{validate_embedding_settings, Setting, Settings};
pub use self::update_step::UpdateIndexingStep;
pub use self::word_prefix_docids::WordPrefixDocids;

View File

@@ -27,7 +27,7 @@ pub struct Update<'doc> {
docid: DocumentId,
external_document_id: &'doc str,
new: Versions<'doc>,
from_scratch: bool,
has_deletion: bool,
}
pub struct Insertion<'doc> {
@@ -109,9 +109,9 @@ impl<'doc> Update<'doc> {
docid: DocumentId,
external_document_id: &'doc str,
new: Versions<'doc>,
from_scratch: bool,
has_deletion: bool,
) -> Self {
Update { docid, new, external_document_id, from_scratch }
Update { docid, new, external_document_id, has_deletion }
}
pub fn docid(&self) -> DocumentId {
@@ -154,7 +154,7 @@ impl<'doc> Update<'doc> {
index: &'t Index,
mapper: &'t Mapper,
) -> Result<MergedDocument<'_, 'doc, 't, Mapper>> {
if self.from_scratch {
if self.has_deletion {
Ok(MergedDocument::without_db(DocumentFromVersions::new(&self.new)))
} else {
MergedDocument::with_db(
@@ -207,8 +207,8 @@ impl<'doc> Update<'doc> {
cached_current = Some(current);
}
if !self.from_scratch {
// no field deletion or update, so fields that don't appear in `updated` cannot have changed
if !self.has_deletion {
// no field deletion, so fields that don't appear in `updated` cannot have changed
return Ok(changed);
}
@@ -257,7 +257,7 @@ impl<'doc> Update<'doc> {
doc_alloc: &'doc Bump,
embedders: &'doc EmbeddingConfigs,
) -> Result<Option<MergedVectorDocument<'doc>>> {
if self.from_scratch {
if self.has_deletion {
MergedVectorDocument::without_db(
self.external_document_id,
&self.new,

View File

@@ -23,39 +23,25 @@ use crate::update::new::{Deletion, Insertion, Update};
use crate::update::{AvailableIds, IndexDocumentsMethod};
use crate::{DocumentId, Error, FieldsIdsMap, Index, InternalError, Result, UserError};
#[derive(Default)]
pub struct DocumentOperation<'pl> {
operations: Vec<Payload<'pl>>,
method: MergeMethod,
}
impl<'pl> DocumentOperation<'pl> {
pub fn new() -> Self {
Self { operations: Default::default() }
pub fn new(method: IndexDocumentsMethod) -> Self {
Self { operations: Default::default(), method: MergeMethod::from(method) }
}
/// Append a replacement of documents.
///
/// TODO please give me a type
/// The payload is expected to be in the NDJSON format
pub fn replace_documents(&mut self, payload: &'pl Mmap) -> Result<()> {
pub fn add_documents(&mut self, payload: &'pl Mmap) -> Result<()> {
#[cfg(unix)]
payload.advise(memmap2::Advice::Sequential)?;
self.operations.push(Payload::Replace(&payload[..]));
self.operations.push(Payload::Addition(&payload[..]));
Ok(())
}
/// Append an update of documents.
///
/// The payload is expected to be in the NDJSON format
pub fn update_documents(&mut self, payload: &'pl Mmap) -> Result<()> {
#[cfg(unix)]
payload.advise(memmap2::Advice::Sequential)?;
self.operations.push(Payload::Update(&payload[..]));
Ok(())
}
/// Append a deletion of documents IDs.
///
/// The list is a set of external documents IDs.
pub fn delete_documents(&mut self, to_delete: &'pl [&'pl str]) {
self.operations.push(Payload::Deletion(to_delete))
}
@@ -76,7 +62,7 @@ impl<'pl> DocumentOperation<'pl> {
MSP: Fn() -> bool,
{
progress.update_progress(IndexingStep::PreparingPayloads);
let Self { operations } = self;
let Self { operations, method } = self;
let documents_ids = index.documents_ids(rtxn)?;
let mut operations_stats = Vec::new();
@@ -96,7 +82,7 @@ impl<'pl> DocumentOperation<'pl> {
let mut bytes = 0;
let result = match operation {
Payload::Replace(payload) => extract_addition_payload_changes(
Payload::Addition(payload) => extract_addition_payload_changes(
indexer,
index,
rtxn,
@@ -106,20 +92,7 @@ impl<'pl> DocumentOperation<'pl> {
&mut available_docids,
&mut bytes,
&docids_version_offsets,
IndexDocumentsMethod::ReplaceDocuments,
payload,
),
Payload::Update(payload) => extract_addition_payload_changes(
indexer,
index,
rtxn,
primary_key_from_op,
&mut primary_key,
new_fields_ids_map,
&mut available_docids,
&mut bytes,
&docids_version_offsets,
IndexDocumentsMethod::UpdateDocuments,
method,
payload,
),
Payload::Deletion(to_delete) => extract_deletion_payload_changes(
@@ -127,6 +100,7 @@ impl<'pl> DocumentOperation<'pl> {
rtxn,
&mut available_docids,
&docids_version_offsets,
method,
to_delete,
),
};
@@ -152,15 +126,20 @@ impl<'pl> DocumentOperation<'pl> {
docids_version_offsets.drain().collect_in(indexer);
// Reorder the offsets to make sure we iterate on the file sequentially
// And finally sort them. This clearly speeds up reading the update files.
docids_version_offsets
.sort_unstable_by_key(|(_, po)| first_update_pointer(&po.operations).unwrap_or(0));
// And finally sort them
docids_version_offsets.sort_unstable_by_key(|(_, po)| method.sort_key(&po.operations));
let docids_version_offsets = docids_version_offsets.into_bump_slice();
Ok((DocumentOperationChanges { docids_version_offsets }, operations_stats, primary_key))
}
}
impl Default for DocumentOperation<'_> {
fn default() -> Self {
DocumentOperation::new(IndexDocumentsMethod::default())
}
}
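As a rough usage sketch (not part of the diff; it assumes the `DocumentOperation` and `IndexDocumentsMethod` types shown above and an NDJSON payload in `file`, as in the tests further down), the indexing method is now chosen once at construction time and every payload goes through the same `add_documents` call:
let mut indexer = DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
// `file` is assumed to hold an NDJSON payload.
let payload = unsafe { memmap2::Mmap::map(&file)? };
// One entry point for both replacements and updates; the merge behaviour is
// decided by the method passed to `new`.
indexer.add_documents(&payload)?;
// Deletions are still pushed as a list of external document ids.
indexer.delete_documents(&["external-id-1", "external-id-2"]);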
#[allow(clippy::too_many_arguments)]
fn extract_addition_payload_changes<'r, 'pl: 'r>(
indexer: &'pl Bump,
@@ -172,11 +151,9 @@ fn extract_addition_payload_changes<'r, 'pl: 'r>(
available_docids: &mut AvailableIds,
bytes: &mut u64,
main_docids_version_offsets: &hashbrown::HashMap<&'pl str, PayloadOperations<'pl>>,
method: IndexDocumentsMethod,
method: MergeMethod,
payload: &'pl [u8],
) -> Result<hashbrown::HashMap<&'pl str, PayloadOperations<'pl>>> {
use IndexDocumentsMethod::{ReplaceDocuments, UpdateDocuments};
let mut new_docids_version_offsets = hashbrown::HashMap::<&str, PayloadOperations<'pl>>::new();
let mut previous_offset = 0;
@@ -227,82 +204,48 @@ fn extract_addition_payload_changes<'r, 'pl: 'r>(
None => {
match index.external_documents_ids().get(rtxn, external_id) {
Ok(Some(docid)) => match new_docids_version_offsets.entry(external_id) {
Entry::Occupied(mut entry) => match method {
ReplaceDocuments => entry.get_mut().push_replacement(document_offset),
UpdateDocuments => entry.get_mut().push_update(document_offset),
},
Entry::Occupied(mut entry) => {
entry.get_mut().push_addition(document_offset)
}
Entry::Vacant(entry) => {
match method {
ReplaceDocuments => {
entry.insert(PayloadOperations::new_replacement(
docid,
false, // is new
document_offset,
));
}
UpdateDocuments => {
entry.insert(PayloadOperations::new_update(
docid,
false, // is new
document_offset,
));
}
}
entry.insert(PayloadOperations::new_addition(
method,
docid,
false, // is new
document_offset,
));
}
},
Ok(None) => match new_docids_version_offsets.entry(external_id) {
Entry::Occupied(mut entry) => match method {
ReplaceDocuments => entry.get_mut().push_replacement(document_offset),
UpdateDocuments => entry.get_mut().push_update(document_offset),
},
Entry::Occupied(mut entry) => {
entry.get_mut().push_addition(document_offset)
}
Entry::Vacant(entry) => {
let docid = match available_docids.next() {
Some(docid) => docid,
None => return Err(UserError::DocumentLimitReached.into()),
};
match method {
ReplaceDocuments => {
entry.insert(PayloadOperations::new_replacement(
docid,
true, // is new
document_offset,
));
}
UpdateDocuments => {
entry.insert(PayloadOperations::new_update(
docid,
true, // is new
document_offset,
));
}
}
entry.insert(PayloadOperations::new_addition(
method,
docid,
true, // is new
document_offset,
));
}
},
Err(e) => return Err(e.into()),
}
}
Some(payload_operations) => match new_docids_version_offsets.entry(external_id) {
Entry::Occupied(mut entry) => match method {
ReplaceDocuments => entry.get_mut().push_replacement(document_offset),
UpdateDocuments => entry.get_mut().push_update(document_offset),
},
Entry::Vacant(entry) => match method {
ReplaceDocuments => {
entry.insert(PayloadOperations::new_replacement(
payload_operations.docid,
payload_operations.is_new,
document_offset,
));
}
UpdateDocuments => {
entry.insert(PayloadOperations::new_update(
payload_operations.docid,
payload_operations.is_new,
document_offset,
));
}
},
Entry::Occupied(mut entry) => entry.get_mut().push_addition(document_offset),
Entry::Vacant(entry) => {
entry.insert(PayloadOperations::new_addition(
method,
payload_operations.docid,
payload_operations.is_new,
document_offset,
));
}
},
}
@@ -335,6 +278,7 @@ fn extract_deletion_payload_changes<'s, 'pl: 's>(
rtxn: &RoTxn,
available_docids: &mut AvailableIds,
main_docids_version_offsets: &hashbrown::HashMap<&'s str, PayloadOperations<'pl>>,
method: MergeMethod,
to_delete: &'pl [&'pl str],
) -> Result<hashbrown::HashMap<&'s str, PayloadOperations<'pl>>> {
let mut new_docids_version_offsets = hashbrown::HashMap::<&str, PayloadOperations<'pl>>::new();
@@ -348,7 +292,7 @@ fn extract_deletion_payload_changes<'s, 'pl: 's>(
Entry::Occupied(mut entry) => entry.get_mut().push_deletion(),
Entry::Vacant(entry) => {
entry.insert(PayloadOperations::new_deletion(
docid, false, // is new
method, docid, false, // is new
));
}
}
@@ -362,7 +306,7 @@ fn extract_deletion_payload_changes<'s, 'pl: 's>(
Entry::Occupied(mut entry) => entry.get_mut().push_deletion(),
Entry::Vacant(entry) => {
entry.insert(PayloadOperations::new_deletion(
docid, true, // is new
method, docid, true, // is new
));
}
}
@@ -374,6 +318,7 @@ fn extract_deletion_payload_changes<'s, 'pl: 's>(
Entry::Occupied(mut entry) => entry.get_mut().push_deletion(),
Entry::Vacant(entry) => {
entry.insert(PayloadOperations::new_deletion(
method,
payload_operations.docid,
payload_operations.is_new,
));
@@ -424,7 +369,13 @@ impl<'pl> DocumentChanges<'pl> for DocumentOperationChanges<'pl> {
'pl: 'doc,
{
let (external_doc, payload_operations) = item;
payload_operations.merge(external_doc, &context.doc_alloc)
payload_operations.merge_method.merge(
payload_operations.docid,
external_doc,
payload_operations.is_new,
&context.doc_alloc,
&payload_operations.operations[..],
)
}
fn len(&self) -> usize {
@@ -437,8 +388,7 @@ pub struct DocumentOperationChanges<'pl> {
}
pub enum Payload<'pl> {
Replace(&'pl [u8]),
Update(&'pl [u8]),
Addition(&'pl [u8]),
Deletion(&'pl [&'pl str]),
}
@@ -455,30 +405,31 @@ pub struct PayloadOperations<'pl> {
pub is_new: bool,
/// The operations to perform, in order, on this document.
pub operations: Vec<InnerDocOp<'pl>>,
/// The merge method we are using to merge payloads and documents.
merge_method: MergeMethod,
}
impl<'pl> PayloadOperations<'pl> {
fn new_replacement(docid: DocumentId, is_new: bool, offset: DocumentOffset<'pl>) -> Self {
Self { docid, is_new, operations: vec![InnerDocOp::Replace(offset)] }
fn new_deletion(merge_method: MergeMethod, docid: DocumentId, is_new: bool) -> Self {
Self { docid, is_new, operations: vec![InnerDocOp::Deletion], merge_method }
}
fn new_update(docid: DocumentId, is_new: bool, offset: DocumentOffset<'pl>) -> Self {
Self { docid, is_new, operations: vec![InnerDocOp::Update(offset)] }
}
fn new_deletion(docid: DocumentId, is_new: bool) -> Self {
Self { docid, is_new, operations: vec![InnerDocOp::Deletion] }
fn new_addition(
merge_method: MergeMethod,
docid: DocumentId,
is_new: bool,
offset: DocumentOffset<'pl>,
) -> Self {
Self { docid, is_new, operations: vec![InnerDocOp::Addition(offset)], merge_method }
}
}
impl<'pl> PayloadOperations<'pl> {
fn push_replacement(&mut self, offset: DocumentOffset<'pl>) {
self.operations.clear();
self.operations.push(InnerDocOp::Replace(offset))
}
fn push_update(&mut self, offset: DocumentOffset<'pl>) {
self.operations.push(InnerDocOp::Update(offset))
fn push_addition(&mut self, offset: DocumentOffset<'pl>) {
if self.merge_method.useless_previous_changes() {
self.operations.clear();
}
self.operations.push(InnerDocOp::Addition(offset))
}
fn push_deletion(&mut self) {
@@ -488,114 +439,16 @@ impl<'pl> PayloadOperations<'pl> {
fn append_operations(&mut self, mut operations: Vec<InnerDocOp<'pl>>) {
debug_assert!(!operations.is_empty());
if matches!(operations.first(), Some(InnerDocOp::Deletion | InnerDocOp::Replace(_))) {
if self.merge_method.useless_previous_changes() {
self.operations.clear();
}
self.operations.append(&mut operations);
}
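Illustrative sketch (not from the diff; `docid`, `first_offset` and `second_offset` are assumed to be in scope): with `ReplaceDocuments`, `useless_previous_changes()` is true, so each `push_addition` wipes what came before and only the last payload survives, while with `UpdateDocuments` the additions accumulate and are merged later:
let mut replace = PayloadOperations::new_addition(
    MergeMethod::from(IndexDocumentsMethod::ReplaceDocuments),
    docid, /* is_new */ true, first_offset,
);
replace.push_addition(second_offset);
assert_eq!(replace.operations.len(), 1); // the earlier addition was cleared

let mut update = PayloadOperations::new_addition(
    MergeMethod::from(IndexDocumentsMethod::UpdateDocuments),
    docid, /* is_new */ true, first_offset,
);
update.push_addition(second_offset);
assert_eq!(update.operations.len(), 2); // both versions are kept and merged later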
/// Returns only the most recent version of a document based on the updates from the payloads.
///
/// This function is only meant to be used when doing a replacement and not an update.
fn merge<'doc>(
&self,
external_doc: &'doc str,
doc_alloc: &'doc Bump,
) -> Result<Option<DocumentChange<'doc>>>
where
'pl: 'doc,
{
match self.operations.last() {
Some(InnerDocOp::Replace(DocumentOffset { content })) => {
let document = serde_json::from_slice(content).unwrap();
let document =
RawMap::from_raw_value_and_hasher(document, FxBuildHasher, doc_alloc)
.map_err(UserError::SerdeJson)?;
if self.is_new {
Ok(Some(DocumentChange::Insertion(Insertion::create(
self.docid,
external_doc,
Versions::single(document),
))))
} else {
Ok(Some(DocumentChange::Update(Update::create(
self.docid,
external_doc,
Versions::single(document),
true,
))))
}
}
Some(InnerDocOp::Update(_)) => {
// Search for the last operation that is a tombstone, which resets the document.
let last_tombstone = self
.operations
.iter()
.rposition(|op| matches!(op, InnerDocOp::Deletion | InnerDocOp::Replace(_)));
// Track when we must ignore previous document versions from the rtxn.
let from_scratch = last_tombstone.is_some();
// We ignore deletion and keep the replacement to create the appropriate versions.
let operations = match last_tombstone {
Some(i) => match self.operations[i] {
InnerDocOp::Deletion => &self.operations[i + 1..],
InnerDocOp::Replace(_) => &self.operations[i..],
InnerDocOp::Update(_) => unreachable!("Found a non-tombstone operation"),
},
None => &self.operations[..],
};
// We collect the versions to generate the appropriate document.
let versions = operations.iter().map(|operation| {
let DocumentOffset { content } = match operation {
InnerDocOp::Replace(offset) | InnerDocOp::Update(offset) => offset,
InnerDocOp::Deletion => unreachable!("Deletion in document operations"),
};
let document = serde_json::from_slice(content).unwrap();
let document =
RawMap::from_raw_value_and_hasher(document, FxBuildHasher, doc_alloc)
.map_err(UserError::SerdeJson)?;
Ok(document)
});
let Some(versions) = Versions::multiple(versions)? else { return Ok(None) };
if self.is_new {
Ok(Some(DocumentChange::Insertion(Insertion::create(
self.docid,
external_doc,
versions,
))))
} else {
Ok(Some(DocumentChange::Update(Update::create(
self.docid,
external_doc,
versions,
from_scratch,
))))
}
}
Some(InnerDocOp::Deletion) => {
return if self.is_new {
Ok(None)
} else {
let deletion = Deletion::create(self.docid, external_doc);
Ok(Some(DocumentChange::Deletion(deletion)))
};
}
None => unreachable!("We must not have an empty set of operations on a document"),
}
}
}
#[derive(Clone)]
pub enum InnerDocOp<'pl> {
Replace(DocumentOffset<'pl>),
Update(DocumentOffset<'pl>),
Addition(DocumentOffset<'pl>),
Deletion,
}
@@ -607,14 +460,231 @@ pub struct DocumentOffset<'pl> {
pub content: &'pl [u8],
}
/// Returns the pointer of the first change in a document's list of operations.
///
/// This is used to sort the documents in update-file content order
/// so the update file is read sequentially, which greatly speeds up indexing.
pub fn first_update_pointer(docops: &[InnerDocOp]) -> Option<usize> {
docops.iter().find_map(|ido: &_| match ido {
InnerDocOp::Replace(replace) => Some(replace.content.as_ptr() as usize),
InnerDocOp::Update(update) => Some(update.content.as_ptr() as usize),
InnerDocOp::Deletion => None,
})
trait MergeChanges {
/// Whether a new operation on a document makes the previous ones in the list useless.
fn useless_previous_changes(&self) -> bool;
/// Returns a key used to order the documents so their payloads are read sequentially from the update file.
fn sort_key(&self, docops: &[InnerDocOp]) -> usize;
fn merge<'doc>(
&self,
docid: DocumentId,
external_docid: &'doc str,
is_new: bool,
doc_alloc: &'doc Bump,
operations: &'doc [InnerDocOp],
) -> Result<Option<DocumentChange<'doc>>>;
}
#[derive(Debug, Clone, Copy)]
enum MergeMethod {
ForReplacement(MergeDocumentForReplacement),
ForUpdates(MergeDocumentForUpdates),
}
impl MergeChanges for MergeMethod {
fn useless_previous_changes(&self) -> bool {
match self {
MergeMethod::ForReplacement(merge) => merge.useless_previous_changes(),
MergeMethod::ForUpdates(merge) => merge.useless_previous_changes(),
}
}
fn sort_key(&self, docops: &[InnerDocOp]) -> usize {
match self {
MergeMethod::ForReplacement(merge) => merge.sort_key(docops),
MergeMethod::ForUpdates(merge) => merge.sort_key(docops),
}
}
fn merge<'doc>(
&self,
docid: DocumentId,
external_docid: &'doc str,
is_new: bool,
doc_alloc: &'doc Bump,
operations: &'doc [InnerDocOp],
) -> Result<Option<DocumentChange<'doc>>> {
match self {
MergeMethod::ForReplacement(merge) => {
merge.merge(docid, external_docid, is_new, doc_alloc, operations)
}
MergeMethod::ForUpdates(merge) => {
merge.merge(docid, external_docid, is_new, doc_alloc, operations)
}
}
}
}
impl From<IndexDocumentsMethod> for MergeMethod {
fn from(method: IndexDocumentsMethod) -> Self {
match method {
IndexDocumentsMethod::ReplaceDocuments => {
MergeMethod::ForReplacement(MergeDocumentForReplacement)
}
IndexDocumentsMethod::UpdateDocuments => {
MergeMethod::ForUpdates(MergeDocumentForUpdates)
}
}
}
}
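Quick sketch of the mapping (not part of the diff; it assumes the `MergeMethod` enum and `MergeChanges` trait defined above are in scope): the conversion is one-to-one, and the two strategies differ mainly in whether older versions of a document are kept around:
let replace: MergeMethod = IndexDocumentsMethod::ReplaceDocuments.into();
assert!(replace.useless_previous_changes()); // only the last payload matters
let update: MergeMethod = IndexDocumentsMethod::UpdateDocuments.into();
assert!(!update.useless_previous_changes()); // every payload is replayed and merged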
#[derive(Debug, Clone, Copy)]
struct MergeDocumentForReplacement;
impl MergeChanges for MergeDocumentForReplacement {
fn useless_previous_changes(&self) -> bool {
true
}
/// Reorders to read only the last change.
fn sort_key(&self, docops: &[InnerDocOp]) -> usize {
let f = |ido: &_| match ido {
InnerDocOp::Addition(add) => Some(add.content.as_ptr() as usize),
InnerDocOp::Deletion => None,
};
docops.iter().rev().find_map(f).unwrap_or(0)
}
/// Returns only the most recent version of a document based on the updates from the payloads.
///
/// This function is only meant to be used when doing a replacement and not an update.
fn merge<'doc>(
&self,
docid: DocumentId,
external_doc: &'doc str,
is_new: bool,
doc_alloc: &'doc Bump,
operations: &'doc [InnerDocOp],
) -> Result<Option<DocumentChange<'doc>>> {
match operations.last() {
Some(InnerDocOp::Addition(DocumentOffset { content })) => {
let document = serde_json::from_slice(content).unwrap();
let document =
RawMap::from_raw_value_and_hasher(document, FxBuildHasher, doc_alloc)
.map_err(UserError::SerdeJson)?;
if is_new {
Ok(Some(DocumentChange::Insertion(Insertion::create(
docid,
external_doc,
Versions::single(document),
))))
} else {
Ok(Some(DocumentChange::Update(Update::create(
docid,
external_doc,
Versions::single(document),
true,
))))
}
}
Some(InnerDocOp::Deletion) => {
return if is_new {
Ok(None)
} else {
let deletion = Deletion::create(docid, external_doc);
Ok(Some(DocumentChange::Deletion(deletion)))
};
}
None => unreachable!("We must not have empty set of operations on a document"),
}
}
}
#[derive(Debug, Clone, Copy)]
struct MergeDocumentForUpdates;
impl MergeChanges for MergeDocumentForUpdates {
fn useless_previous_changes(&self) -> bool {
false
}
/// Reorders so the earliest change is read first, keeping payload reads sequential.
fn sort_key(&self, docops: &[InnerDocOp]) -> usize {
let f = |ido: &_| match ido {
InnerDocOp::Addition(add) => Some(add.content.as_ptr() as usize),
InnerDocOp::Deletion => None,
};
docops.iter().find_map(f).unwrap_or(0)
}
/// Reads the previous version of a document from the database and the new versions
/// from the update payloads, then merges them into a single document change.
///
/// This function is only meant to be used when doing an update and not a replacement.
fn merge<'doc>(
&self,
docid: DocumentId,
external_docid: &'doc str,
is_new: bool,
doc_alloc: &'doc Bump,
operations: &'doc [InnerDocOp],
) -> Result<Option<DocumentChange<'doc>>> {
if operations.is_empty() {
unreachable!("We must not have empty set of operations on a document");
}
let last_deletion = operations.iter().rposition(|op| matches!(op, InnerDocOp::Deletion));
let operations = &operations[last_deletion.map_or(0, |i| i + 1)..];
let has_deletion = last_deletion.is_some();
if operations.is_empty() {
return if is_new {
Ok(None)
} else {
let deletion = Deletion::create(docid, external_docid);
Ok(Some(DocumentChange::Deletion(deletion)))
};
}
let versions = match operations {
[single] => {
let DocumentOffset { content } = match single {
InnerDocOp::Addition(offset) => offset,
InnerDocOp::Deletion => {
unreachable!("Deletion in document operations")
}
};
let document = serde_json::from_slice(content).unwrap();
let document =
RawMap::from_raw_value_and_hasher(document, FxBuildHasher, doc_alloc)
.map_err(UserError::SerdeJson)?;
Some(Versions::single(document))
}
operations => {
let versions = operations.iter().map(|operation| {
let DocumentOffset { content } = match operation {
InnerDocOp::Addition(offset) => offset,
InnerDocOp::Deletion => {
unreachable!("Deletion in document operations")
}
};
let document = serde_json::from_slice(content).unwrap();
let document =
RawMap::from_raw_value_and_hasher(document, FxBuildHasher, doc_alloc)
.map_err(UserError::SerdeJson)?;
Ok(document)
});
Versions::multiple(versions)?
}
};
let Some(versions) = versions else { return Ok(None) };
if is_new {
Ok(Some(DocumentChange::Insertion(Insertion::create(docid, external_docid, versions))))
} else {
Ok(Some(DocumentChange::Update(Update::create(
docid,
external_docid,
versions,
has_deletion,
))))
}
}
}
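As a sketch of why the two `sort_key` implementations differ (assumed buffer and offsets, not part of the diff): replacements only ever read the last addition of a document, so they sort by the pointer of that last addition, while updates replay every addition and sort by the first one so the payload file is read front to back:
let payload: &[u8] = br#"{"id":1}{"id":1,"title":"second"}"#;
let ops = [
    InnerDocOp::Addition(DocumentOffset { content: &payload[..8] }),
    InnerDocOp::Addition(DocumentOffset { content: &payload[8..] }),
];
// Replacement: the key is the pointer of the last addition.
assert_eq!(MergeDocumentForReplacement.sort_key(&ops), payload[8..].as_ptr() as usize);
// Update: the key is the pointer of the first addition.
assert_eq!(MergeDocumentForUpdates.sort_key(&ops), payload.as_ptr() as usize);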

View File

@@ -292,7 +292,7 @@ where
&indexing_context.must_stop_processing,
)?;
}
indexing_context.progress.update_progress(IndexingStep::WaitingForDatabaseWrites);
indexing_context.progress.update_progress(IndexingStep::WritingToDatabase);
finished_extraction.store(true, std::sync::atomic::Ordering::Relaxed);
Result::Ok((facet_field_ids_delta, index_embeddings))

View File

@@ -10,7 +10,6 @@ use hashbrown::HashMap;
use heed::RwTxn;
pub use partial_dump::PartialDump;
pub use update_by_function::UpdateByFunction;
pub use write::ChannelCongestion;
use write::{build_vectors, update_index, write_to_db};
use super::channel::*;
@@ -54,7 +53,7 @@ pub fn index<'pl, 'indexer, 'index, DC, MSP>(
embedders: EmbeddingConfigs,
must_stop_processing: &'indexer MSP,
progress: &'indexer Progress,
) -> Result<ChannelCongestion>
) -> Result<()>
where
DC: DocumentChanges<'pl>,
MSP: Fn() -> bool + Sync,
@@ -132,7 +131,7 @@ where
let mut document_ids = index.documents_ids(wtxn)?;
let mut modified_docids = roaring::RoaringBitmap::new();
let congestion = thread::scope(|s| -> Result<ChannelCongestion> {
thread::scope(|s| -> Result<()> {
let indexer_span = tracing::Span::current();
let embedders = &embedders;
let finished_extraction = &finished_extraction;
@@ -186,8 +185,7 @@ where
let mut arroy_writers = arroy_writers?;
let congestion =
write_to_db(writer_receiver, finished_extraction, index, wtxn, &arroy_writers)?;
write_to_db(writer_receiver, finished_extraction, index, wtxn, &arroy_writers)?;
indexing_context.progress.update_progress(IndexingStep::WaitingForExtractors);
@@ -215,7 +213,7 @@ where
indexing_context.progress.update_progress(IndexingStep::Finalizing);
Ok(congestion) as Result<_>
Ok(()) as Result<_>
})?;
// required to into_inner the new_fields_ids_map
@@ -233,5 +231,5 @@ where
modified_docids,
)?;
Ok(congestion)
Ok(())
}

View File

@@ -14,13 +14,13 @@ use crate::update::settings::InnerIndexSettings;
use crate::vector::{ArroyWrapper, Embedder, EmbeddingConfigs, Embeddings};
use crate::{Error, Index, InternalError, Result};
pub fn write_to_db(
pub(super) fn write_to_db(
mut writer_receiver: WriterBbqueueReceiver<'_>,
finished_extraction: &AtomicBool,
index: &Index,
wtxn: &mut RwTxn<'_>,
arroy_writers: &HashMap<u8, (&str, &Embedder, ArroyWrapper, usize)>,
) -> Result<ChannelCongestion> {
) -> Result<()> {
// Used by the ArroySetVector to copy the embedding into an
// aligned memory area, required by arroy to accept a new vector.
let mut aligned_embedding = Vec::new();
@@ -75,29 +75,21 @@ pub fn write_to_db(
write_from_bbqueue(&mut writer_receiver, index, wtxn, arroy_writers, &mut aligned_embedding)?;
Ok(ChannelCongestion {
attempts: writer_receiver.sent_messages_attempts(),
blocking_attempts: writer_receiver.blocking_sent_messages_attempts(),
})
}
let direct_attempts = writer_receiver.sent_messages_attempts();
let blocking_attempts = writer_receiver.blocking_sent_messages_attempts();
let congestion_pct = (blocking_attempts as f64 / direct_attempts as f64) * 100.0;
tracing::debug!(
"Channel congestion metrics - \
Attempts: {direct_attempts}, \
Blocked attempts: {blocking_attempts} \
({congestion_pct:.1}% congestion)"
);
/// Stats exposing the congestion of a channel.
#[derive(Debug, Copy, Clone)]
pub struct ChannelCongestion {
/// Number of attempts to send a message into the bbqueue buffer.
pub attempts: usize,
/// Number of blocking attempts which require a retry.
pub blocking_attempts: usize,
}
impl ChannelCongestion {
pub fn congestion_ratio(&self) -> f32 {
self.blocking_attempts as f32 / self.attempts as f32
}
Ok(())
}
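For reference (hypothetical numbers, not part of the diff), the congestion figure logged above is simply the share of send attempts into the bbqueue that had to block, expressed as a percentage; with the values below a reading of 2.5% would be reported:
let direct_attempts: usize = 1_000;  // all attempts to push a message into the bbqueue
let blocking_attempts: usize = 25;   // attempts that had to block and retry
let congestion_pct = (blocking_attempts as f64 / direct_attempts as f64) * 100.0;
assert!((congestion_pct - 2.5).abs() < 1e-9);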
#[tracing::instrument(level = "debug", skip_all, target = "indexing::vectors")]
pub fn build_vectors<MSP>(
pub(super) fn build_vectors<MSP>(
index: &Index,
wtxn: &mut RwTxn<'_>,
index_embeddings: Vec<IndexEmbeddingConfig>,

View File

@@ -1,5 +1,4 @@
pub use document_change::{Deletion, DocumentChange, Insertion, Update};
pub use indexer::ChannelCongestion;
pub use merger::{
merge_and_send_docids, merge_and_send_facet_docids, FacetDatabases, FacetFieldIdsDelta,
};

View File

@@ -14,7 +14,7 @@ pub enum IndexingStep {
ExtractingWordProximity,
ExtractingEmbeddings,
WritingGeoPoints,
WaitingForDatabaseWrites,
WritingToDatabase,
WaitingForExtractors,
WritingEmbeddingsToDatabase,
PostProcessingFacets,
@@ -32,7 +32,7 @@ impl Step for IndexingStep {
IndexingStep::ExtractingWordProximity => "extracting word proximity",
IndexingStep::ExtractingEmbeddings => "extracting embeddings",
IndexingStep::WritingGeoPoints => "writing geo points",
IndexingStep::WaitingForDatabaseWrites => "waiting for database writes",
IndexingStep::WritingToDatabase => "writing to database",
IndexingStep::WaitingForExtractors => "waiting for extractors",
IndexingStep::WritingEmbeddingsToDatabase => "writing embeddings to database",
IndexingStep::PostProcessingFacets => "post-processing facets",

View File

@@ -39,9 +39,8 @@ pub fn upgrade(
(1, 12, 0..=2) => 0,
(1, 12, 3..) => 1,
(1, 13, 0) => 2,
(1, 13, 1) => 3,
// We must handle the current version in the match because in case of a failure some index may have been upgraded but not other.
(1, 13, _) => return Ok(false),
(1, 13, _) => 3,
(major, minor, patch) => {
return Err(InternalError::CannotUpgradeToVersion(major, minor, patch).into())
}

View File

@@ -5,7 +5,7 @@ use maplit::hashset;
use milli::documents::mmap_from_objects;
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings};
use milli::update::{IndexDocumentsMethod, IndexerConfig, Settings};
use milli::vector::EmbeddingConfigs;
use milli::{FacetDistribution, Index, Object, OrderBy};
use serde_json::{from_value, json};
@@ -36,7 +36,7 @@ fn test_facet_distribution_with_no_facet_values() {
let mut new_fields_ids_map = db_fields_ids_map.clone();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let doc1: Object = from_value(
json!({ "id": 123, "title": "What a week, hu...", "genres": [], "tags": ["blue"] }),
@@ -47,7 +47,7 @@ fn test_facet_distribution_with_no_facet_values() {
let documents = mmap_from_objects(vec![doc1, doc2]);
// index documents
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer

View File

@@ -9,7 +9,7 @@ use heed::EnvOpenOptions;
use maplit::{btreemap, hashset};
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings};
use milli::update::{IndexDocumentsMethod, IndexerConfig, Settings};
use milli::vector::EmbeddingConfigs;
use milli::{AscDesc, Criterion, DocumentId, Index, Member, TermsMatchingStrategy};
use serde::{Deserialize, Deserializer};
@@ -72,7 +72,7 @@ pub fn setup_search_index_with_criteria(criteria: &[Criterion]) -> Index {
let mut new_fields_ids_map = db_fields_ids_map.clone();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let mut file = tempfile::tempfile().unwrap();
file.write_all(CONTENT.as_bytes()).unwrap();
@@ -80,7 +80,7 @@ pub fn setup_search_index_with_criteria(criteria: &[Criterion]) -> Index {
let payload = unsafe { memmap2::Mmap::map(&file).unwrap() };
// index documents
indexer.replace_documents(&payload).unwrap();
indexer.add_documents(&payload).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, operation_stats, primary_key) = indexer

View File

@@ -7,7 +7,7 @@ use itertools::Itertools;
use maplit::hashset;
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings};
use milli::update::{IndexDocumentsMethod, IndexerConfig, Settings};
use milli::vector::EmbeddingConfigs;
use milli::{AscDesc, Criterion, Index, Member, Search, SearchResult, TermsMatchingStrategy};
use rand::Rng;
@@ -288,7 +288,7 @@ fn criteria_ascdesc() {
let mut new_fields_ids_map = db_fields_ids_map.clone();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
let mut file = tempfile::tempfile().unwrap();
(0..ASC_DESC_CANDIDATES_THRESHOLD + 1).for_each(|_| {
@@ -318,7 +318,7 @@ fn criteria_ascdesc() {
file.sync_all().unwrap();
let payload = unsafe { memmap2::Mmap::map(&file).unwrap() };
indexer.replace_documents(&payload).unwrap();
indexer.add_documents(&payload).unwrap();
let (document_changes, _operation_stats, primary_key) = indexer
.into_changes(
&indexer_alloc,

View File

@@ -5,7 +5,7 @@ use heed::EnvOpenOptions;
use milli::documents::mmap_from_objects;
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings};
use milli::update::{IndexDocumentsMethod, IndexerConfig, Settings};
use milli::vector::EmbeddingConfigs;
use milli::{Criterion, Index, Object, Search, TermsMatchingStrategy};
use serde_json::from_value;
@@ -123,9 +123,9 @@ fn test_typo_disabled_on_word() {
let db_fields_ids_map = index.fields_ids_map(&rtxn).unwrap();
let mut new_fields_ids_map = db_fields_ids_map.clone();
let embedders = EmbeddingConfigs::default();
let mut indexer = indexer::DocumentOperation::new();
let mut indexer = indexer::DocumentOperation::new(IndexDocumentsMethod::ReplaceDocuments);
indexer.replace_documents(&documents).unwrap();
indexer.add_documents(&documents).unwrap();
let indexer_alloc = Bump::new();
let (document_changes, _operation_stats, primary_key) = indexer

View File

@@ -6,8 +6,8 @@ use xtask::bench::BenchDeriveArgs;
/// List features available in the workspace
#[derive(Parser, Debug)]
struct ListFeaturesDeriveArgs {
/// Feature to exclude from the list. Use a comma to separate multiple features.
#[arg(short, long, value_delimiter = ',')]
/// Feature to exclude from the list. Repeat the argument to exclude multiple features.
#[arg(short, long)]
exclude_feature: Vec<String>,
}
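A small sketch of the behaviour change (hypothetical feature names, not part of the diff): without `value_delimiter`, clap no longer splits on commas, so each excluded feature is passed as its own `-e`/`--exclude-feature` occurrence:
use clap::Parser;

let args = ListFeaturesDeriveArgs::parse_from([
    "list-features", "-e", "cuda", "--exclude-feature", "metal",
]);
assert_eq!(args.exclude_feature, vec!["cuda".to_string(), "metal".to_string()]);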