Compare commits

...

72 Commits

Author SHA1 Message Date
f82ab3cc50 Experiments with Quentin 2024-07-15 16:20:02 +02:00
b64b4ab6ca Merge #4762
4762: Add search benchmarks r=Kerollmops a=dureuill

# Pull Request

## What does this PR do?
- [x] Modifies `xtask bench` so that workloads support an optional `target` argument. `target` defaults to `indexing::=trace`
- [x] Refactor the spans in the search to offer finer profiling granularity
- [x] Add search workloads  
- [x] Updates documentation in `BENCHMARKS.md`


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-03 08:39:29 +00:00
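A minimal sketch of what a search workload with the new optional `target` field could look like. Only `run_count`, `extra_cli_args`, `target`, and `assets` are confirmed by the `BENCHMARKS.md` diff further down this page; the `name` and `commands` fields and all values here are illustrative assumptions:

```json
{
  "name": "search-movies",
  "run_count": 3,
  "extra_cli_args": [],
  "target": "search::=trace",
  "assets": {},
  "commands": []
}
```

Setting `target` to `search::=trace` switches profiling from the default indexing spans to the finer-grained search spans this PR introduces.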
427861b323 Update documentation in BENCHMARKS.md 2024-07-02 16:13:54 +02:00
d29cb75061 Add search workloads 2024-07-02 16:13:54 +02:00
128e6c7502 Search: spans with a finer granularity 2024-07-02 16:13:53 +02:00
3129f96603 xtask bench: Add support for overriding the profiling target 2024-07-02 16:12:50 +02:00
c701d89fdc Merge #4754
4754: bring back v1.9.0 changes to main r=irevoire a=ManyTheFish



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-07-02 13:30:50 +00:00
3d9befd64f fix warning 2024-07-02 15:30:16 +02:00
ee14d5196c fix the tests 2024-07-02 15:18:30 +02:00
d96372b9c4 Merge branch 'main' into tmp-release-v1.9.0 2024-07-02 14:48:50 +02:00
ea67816a21 Merge #4758
4758: Bump docker/build-push-action from 5 to 6 r=curquiza a=dependabot[bot]

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 5 to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/docker/build-push-action/releases">docker/build-push-action's releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<ul>
<li>Export build record and generate <a href="https://docs.docker.com/build/ci/github-actions/build-summary/">build summary</a> by <a href="https://github.com/crazy-max"><code>@crazy-max</code></a> in <a href="https://redirect.github.com/docker/build-push-action/pull/1120">docker/build-push-action#1120</a></li>
<li>Bump <code>@docker/actions-toolkit</code> from 0.24.0 to 0.26.0 in <a href="https://redirect.github.com/docker/build-push-action/pull/1132">docker/build-push-action#1132</a> <a href="https://redirect.github.com/docker/build-push-action/pull/1136">docker/build-push-action#1136</a> <a href="https://redirect.github.com/docker/build-push-action/pull/1138">docker/build-push-action#1138</a></li>
<li>Bump braces from 3.0.2 to 3.0.3 in <a href="https://redirect.github.com/docker/build-push-action/pull/1137">docker/build-push-action#1137</a></li>
</ul>
<blockquote>
<p>[!NOTE]
This major release adds support for generating <a href="https://docs.docker.com/build/ci/github-actions/build-summary/">Build summary</a> and exporting build record for your build. You can disable this feature by setting <a href="https://docs.docker.com/build/ci/github-actions/build-summary/#disable-job-summary"> <code>DOCKER_BUILD_NO_SUMMARY: true</code> environment variable in your workflow</a>.</p>
</blockquote>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v5.4.0...v6.0.0">https://github.com/docker/build-push-action/compare/v5.4.0...v6.0.0</a></p>
<h2>v5.4.0</h2>
<ul>
<li>Show builder information before building by <a href="https://github.com/crazy-max"><code>@crazy-max</code></a> in <a href="https://redirect.github.com/docker/build-push-action/pull/1128">docker/build-push-action#1128</a></li>
<li>Handle attestations correctly with provenance and sbom inputs by <a href="https://github.com/crazy-max"><code>@crazy-max</code></a> in <a href="https://redirect.github.com/docker/build-push-action/pull/1086">docker/build-push-action#1086</a></li>
<li>Bump <code>@docker/actions-toolkit</code> from 0.19.0 to 0.24.0 in <a href="https://redirect.github.com/docker/build-push-action/pull/1088">docker/build-push-action#1088</a> <a href="https://redirect.github.com/docker/build-push-action/pull/1105">docker/build-push-action#1105</a> <a href="https://redirect.github.com/docker/build-push-action/pull/1121">docker/build-push-action#1121</a> <a href="https://redirect.github.com/docker/build-push-action/pull/1127">docker/build-push-action#1127</a></li>
<li>Bump undici from 5.28.3 to 5.28.4 in <a href="https://redirect.github.com/docker/build-push-action/pull/1090">docker/build-push-action#1090</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v5.3.0...v5.4.0">https://github.com/docker/build-push-action/compare/v5.3.0...v5.4.0</a></p>
<h2>v5.3.0</h2>
<ul>
<li>Bump <code>@docker/actions-toolkit</code> from 0.18.0 to 0.19.0 in <a href="https://redirect.github.com/docker/build-push-action/pull/1080">docker/build-push-action#1080</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v5.2.0...v5.3.0">https://github.com/docker/build-push-action/compare/v5.2.0...v5.3.0</a></p>
<h2>v5.2.0</h2>
<ul>
<li>Disable quotes detection for <code>outputs</code> input by <a href="https://github.com/crazy-max"><code>@crazy-max</code></a> in <a href="https://redirect.github.com/docker/build-push-action/pull/1074">docker/build-push-action#1074</a></li>
<li>Warn about ignored inputs by <a href="https://github.com/favonia"><code>@favonia</code></a> in <a href="https://redirect.github.com/docker/build-push-action/pull/1019">docker/build-push-action#1019</a></li>
<li>Bump <code>@docker/actions-toolkit</code> from 0.14.0 to 0.18.0 in <a href="https://redirect.github.com/docker/build-push-action/pull/1070">docker/build-push-action#1070</a></li>
<li>Bump undici from 5.26.3 to 5.28.3 in <a href="https://redirect.github.com/docker/build-push-action/pull/1057">docker/build-push-action#1057</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v5.1.0...v5.2.0">https://github.com/docker/build-push-action/compare/v5.1.0...v5.2.0</a></p>
<h2>v5.1.0</h2>
<ul>
<li>Add <code>annotations</code> input by <a href="https://github.com/crazy-max"><code>@crazy-max</code></a> in <a href="https://redirect.github.com/docker/build-push-action/pull/992">docker/build-push-action#992</a></li>
<li>Add <code>secret-envs</code> input by <a href="https://github.com/elias-lundgren"><code>@elias-lundgren</code></a> in <a href="https://redirect.github.com/docker/build-push-action/pull/980">docker/build-push-action#980</a></li>
<li>Bump <code>@babel/traverse</code> from 7.17.3 to 7.23.2 in <a href="https://redirect.github.com/docker/build-push-action/pull/991">docker/build-push-action#991</a></li>
<li>Bump <code>@docker/actions-toolkit</code> from 0.13.0-rc.1 to 0.14.0 in <a href="https://redirect.github.com/docker/build-push-action/pull/990">docker/build-push-action#990</a> <a href="https://redirect.github.com/docker/build-push-action/pull/1006">docker/build-push-action#1006</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v5.0.0...v5.1.0">https://github.com/docker/build-push-action/compare/v5.0.0...v5.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="15560696de"><code>1556069</code></a> Merge pull request <a href="https://redirect.github.com/docker/build-push-action/issues/1158">#1158</a> from docker/dependabot/npm_and_yarn/docker/actions-t...</li>
<li><a href="57e1d34ac3"><code>57e1d34</code></a> chore: update generated content</li>
<li><a href="309982ebc9"><code>309982e</code></a> chore(deps): Bump <code>@docker/actions-toolkit</code> from 0.27.0 to 0.28.0</li>
<li><a href="9476c25b2a"><code>9476c25</code></a> Merge pull request <a href="https://redirect.github.com/docker/build-push-action/issues/1153">#1153</a> from crazy-max/export-retention</li>
<li><a href="97be5a4928"><code>97be5a4</code></a> chore: update generated content</li>
<li><a href="9cac6c8ea0"><code>9cac6c8</code></a> use default retention days for build export artifact</li>
<li><a href="31159d49c0"><code>31159d4</code></a> Merge pull request <a href="https://redirect.github.com/docker/build-push-action/issues/1149">#1149</a> from docker/dependabot/npm_and_yarn/docker/actions-t...</li>
<li><a href="07e1c3e148"><code>07e1c3e</code></a> chore: update generated content</li>
<li><a href="f7febd621d"><code>f7febd6</code></a> chore(deps): Bump <code>@docker/actions-toolkit</code> from 0.26.2 to 0.27.0</li>
<li><a href="f6010ea701"><code>f6010ea</code></a> Merge pull request <a href="https://redirect.github.com/docker/build-push-action/issues/1147">#1147</a> from docker/dependabot/npm_and_yarn/docker/actions-t...</li>
<li>Additional commits viewable in <a href="https://github.com/docker/build-push-action/compare/v5...v6">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=docker/build-push-action&package-manager=github_actions&previous-version=5&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-02 12:36:19 +00:00
c885fcebcc Bump docker/build-push-action from 5 to 6
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 5 to 6.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](https://github.com/docker/build-push-action/compare/v5...v6)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-02 12:28:28 +00:00
b6e1a1f2f5 Merge #4761
4761: Add vX Docker tag when publishing Docker image r=Kerollmops a=curquiza

Following this: https://github.com/meilisearch/meilisearch/discussions/4759

Co-authored-by: Clémentine <clementine@meilisearch.com>
2024-07-02 11:11:39 +00:00
277f4883f6 Add vX Docker tag when publishing Docker image 2024-07-02 12:11:44 +02:00
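For reference, the tag rule this PR adds (visible in the Docker workflow diff further down this page) is a single `docker/metadata-action` line that emits a moving major tag such as `v1` for stable releases:

```yaml
type=semver,pattern=v{{major}},enable=${{ steps.check-tag-format.outputs.stable == 'true' }}
```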
015d90a962 merge main 2024-07-01 11:50:36 +02:00
0df84bbba7 Merge #4746
4746: Fix hybrid search limit offset r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4745

## What does this PR do?
- Apply offset and limit to the keyword search results when they are returned early.
- Add a test that initially fails and then passes with the fix


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-27 12:47:08 +00:00
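A minimal sketch of the fix's idea, not Meilisearch's actual code: when the keyword results are returned early (no semantic re-ranking happens), pagination still has to be applied to them.

```rust
/// Hypothetical helper: paginate results that bypass the hybrid merge step.
fn paginate<T>(results: Vec<T>, offset: usize, limit: usize) -> Vec<T> {
    results.into_iter().skip(offset).take(limit).collect()
}
```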
e53de15b8e Fix behavior of limit and offset for hybrid search when keyword results are returned early
The test is fixed
2024-06-27 14:25:33 +02:00
8c4921b9dd Add failing test on limit+offset for hybrid search 2024-06-27 14:21:34 +02:00
809e742253 Merge #4731
4731: Fix the missing geo distance when one or both of the lat / lng are string r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4193

## What does this PR do?
- Properly extract the lat / lng when one or both of them are string
- Add a test 


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-27 07:33:22 +00:00
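The gist of the fix, as a hedged sketch rather than the actual milli code: a coordinate may arrive as a JSON number or as a string, and both forms must be parsed.

```rust
use serde_json::Value;

/// Hypothetical helper: accept a lat/lng that is either a number or a string.
fn extract_coordinate(value: &Value) -> Option<f64> {
    match value {
        Value::Number(n) => n.as_f64(),
        Value::String(s) => s.trim().parse::<f64>().ok(),
        _ => None,
    }
}
```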
decdfe03bc Merge #4724
4724: Improve tenant token error messages r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes  #4727

## What does this PR do?
- Introduce a bunch of new error messages around tenant tokens
- Ignore the error messages in most tests that were looping over multiple kinds of errors
- Introduce new tests that specifically test these error messages


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-27 06:47:40 +00:00
aae5c324d7 Merge #4703
4703: Update yaup r=ManyTheFish a=irevoire

There was a bug in `yaup` where serializing a structure containing an array would produce an incorrect query string.

Now, yaup is also in charge of sending the initial `?` before the query parameters.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-27 06:10:15 +00:00
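A sketch of the caller-side change described above (the struct and values are illustrative, and the exact array encoding is yaup's own choice):

```rust
#[derive(serde::Serialize)]
struct SearchQuery {
    q: String,
    #[serde(rename = "attributesToRetrieve")]
    attributes_to_retrieve: Vec<String>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let query = SearchQuery {
        q: "glass".into(),
        attributes_to_retrieve: vec!["title".into(), "overview".into()],
    };
    // With yaup 0.3, the serialized string is expected to already start with
    // the initial `?`, so callers no longer prepend it themselves.
    let params = yaup::to_string(&query)?;
    assert!(params.starts_with('?'));
    Ok(())
}
```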
a108d8f6f3 update yaup 2024-06-26 16:03:51 +02:00
34cf576339 Merge #4706
4706: specify the rust toolchain r=irevoire a=irevoire

The action we were using did not work with the `rust-toolchain.toml` file, and the repository is not maintained anymore.
While looking for a solution, I found out that [helix](https://github.com/helix-editor/rust-toolchain) solved the issue on their side by forking the repo and adding a few fixes. That's what I use currently, but I don't know whether it's a sustainable solution in the long term.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-26 12:56:18 +00:00
eb292a7a62 Fix the missing geo distance when one or both of the lat / lng are string 2024-06-26 14:50:15 +02:00
e28332a904 set the rust toolchain to the v1.75.0 2024-06-26 14:01:28 +02:00
a1dcde6b9a Update meilisearch/src/extractors/authentication/mod.rs
Co-authored-by: Many the fish <many@meilisearch.com>
2024-06-26 14:00:21 +02:00
544e98ca99 use the current version for clippy 2024-06-26 13:58:25 +02:00
1e4699b82c Merge #4716
4716: Fix bad http status and error message on wrong payload  r=irevoire a=Karribalu

# Pull Request

## Related issue
Fixes #4698

## What does this PR do?
- Fixes the bad HTTP status returned when a bad payload is sent with gzip Content-Encoding

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: karribalu <karri.balu123456@gmail.com>
2024-06-26 08:00:51 +00:00
2c09c324f7 Merge #4730
4730: fix a possibly flaky test r=irevoire a=irevoire

On slow CI, it was possible for a document addition _not_ to be processed in time and then get autobatched with an index deletion, which changed the tasks' summary details in the end.
Now, I wait for the task to finish, so the result will always be the same.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-26 07:32:51 +00:00
3d6b61d8d2 fix flakyness for real 2024-06-26 09:24:09 +02:00
1374b661d1 fix a possibly flaky test 2024-06-26 09:14:59 +02:00
7e3c306c54 Merge #4725
4725: Store primary key as String when Number exceeds i64 range r=irevoire a=JWSong

# Pull Request

## Related issue
Fixes #4696 

## What does this PR do?
- When a Number value exceeding the range of i64 is received as a primary key, it will be stored as a String.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: JWSong <thdwjddn123@gmail.com>
2024-06-26 07:06:04 +00:00
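A hedged illustration of the rule (the helper name is hypothetical, not the actual milli code): numbers that fit in i64 keep their numeric identity, anything larger falls back to its string representation.

```rust
use serde_json::Number;

/// Hypothetical helper: render a numeric primary key, falling back to the
/// string representation when the value exceeds the i64 range.
fn primary_key_repr(n: &Number) -> String {
    match n.as_i64() {
        Some(i) => i.to_string(), // fits in i64: keep the numeric form
        None => n.to_string(),    // out of i64 range: store it as a String
    }
}
```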
2608a596a0 Update error message and add tests for incomplete compressed document 2024-06-25 18:36:29 +01:00
e16edb2c35 use the helix action since the official one doesn't support the rust-toolchain file 2024-06-25 17:00:50 +02:00
5c758438fc Update the CI to take the rust-toolchain file into account 2024-06-25 16:59:23 +02:00
ab6cac2321 specify the rust toolchain 2024-06-25 16:59:23 +02:00
6fb36ed30e get rid of the redundant info in document_addition_with_huge_int_primary_key 2024-06-25 23:54:27 +09:00
dcdc83946f accept large number as string 2024-06-25 21:41:47 +09:00
3c4c46377b Merge #4665
4665: Add missing Korean support r=ManyTheFish a=junhochoi

Some configuration is missing `korean` features and add a test case in `milli/src/search/mod.rs`.

# Pull Request

## Related issue

#3443 #3882 

## What does this PR do?
- Improvement on enabling Korean support

Inspired by the earlier work (#3882), I tried to enable Korean features but found some missing configuration.
This PR adds those missing configs (mostly in Cargo.toml) and one test case.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Junho Choi <jh.choi@catenoid.net>
2024-06-25 11:51:21 +00:00
7da21bb601 introduce as many custom error messages as possible 2024-06-25 12:40:51 +02:00
13161fd7d0 Merge #4722
4722: Grow by 1TB instead of 1MB r=dureuill a=dureuill

When an index reaches 1TB, increase its size by 1TB rather than 1MB

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-25 10:17:58 +00:00
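The concrete change appears in the index-scheduler test diff further down this page; the growth step literally becomes:

```rust
// Decimal units, matching the surrounding test options ("we don't use MiB on purpose").
const INDEX_GROWTH_AMOUNT: usize = 1000 * 1000 * 1000 * 1000; // 1 TB, was 1000 * 1000 (1 MB)
```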
b81e2951a9 Merge #4723
4723: Fixes for Rust v1.79 r=ManyTheFish a=dureuill

cherry-picked from the `release-v1.9.0` branch

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-25 09:21:29 +00:00
d75e0098c7 Fixes for Rust v1.79 2024-06-25 11:16:06 +02:00
27496354e2 Grow by 1TB instead of 1MB 2024-06-25 09:01:11 +02:00
2e0ff56f3f Add missing Korean support
Some configurations were missing the `korean` feature; add
them and a test case in `milli/src/search/mod.rs`.
2024-06-25 12:45:21 +09:00
a74fb87d1e start introducing new error messages 2024-06-24 19:00:53 +02:00
558b66e535 make most tests work with variable error messages 2024-06-24 19:00:44 +02:00
cade18bd47 Update README.md (#4721) 2024-06-24 15:47:10 +02:00
2a38f5c757 Run Rustfmt 2024-06-21 00:14:26 +01:00
133d33d72c Merge remote-tracking branch 'origin/main' 2024-06-20 23:55:17 +01:00
fb683fe88b Fix bad http status and error message on wrong payload 2024-06-20 23:55:09 +01:00
534f696b29 Update the README to link more demos (#4711)
This Pull Request adds two new interesting demos to a brand-new list, which replaces the short _Try it_ text just below the Where2Watch showcase image, hoping people will notice them.
2024-06-20 09:53:06 +02:00
b347b66619 Revert "Add june 11th webinar banner" (#4705) 2024-06-18 18:45:50 +02:00
d1962b2b0f Merge #4691
4691: Add june 11th webinar banner r=curquiza a=Strift

# Pull Request

This PR adds a banner in the README to promote tomorrow's webinar event.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Strift <laurent@meilisearch.com>
2024-06-10 16:17:21 +00:00
8b450b84f8 Add june 11th webinar banner 2024-06-10 17:45:14 +02:00
93f5defedc Merge #4656
4656: Adding a new `searchableAttribute` no longer re-indexes all the attributes r=ManyTheFish a=Kerollmops

Fixes #4492.

## To Do
 - [x] Do not call the `InnerSettingsDiff::only_additional_fields` function too many times
 - [ ] Add tests

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-06-05 14:51:14 +00:00
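A minimal sketch of the `only_additional_fields` idea under stated assumptions (the real `InnerSettingsDiff` logic lives in milli and considers more than the searchable set):

```rust
use std::collections::HashSet;

/// Hypothetical reduction of the check: if the new searchable set only adds
/// fields, return just those additions; any removal, or an empty difference,
/// means the fast path does not apply (see commit ba9fadc8f1 above).
fn only_additional_fields(
    old: &HashSet<String>,
    new: &HashSet<String>,
) -> Option<HashSet<String>> {
    let added: HashSet<String> = new.difference(old).cloned().collect();
    if old.is_subset(new) && !added.is_empty() {
        Some(added) // index only these new fields
    } else {
        None // fall back to re-indexing the attributes
    }
}
```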
33241a6b12 Fix condition mistake 2024-06-05 16:00:24 +02:00
ff87b4db26 Avoid running proximity when only the exact attributes changes 2024-06-05 12:48:44 +02:00
ba9fadc8f1 Put only_additional_fields to None if the difference gives an empty result. 2024-06-05 10:51:16 +02:00
d29d4f88da Skip iterating over documents when the faceted field list doesn't change 2024-06-04 15:31:24 +02:00
17c5ceeb9d iterate over the faceted fields instead of over the whole document 2024-06-04 14:04:20 +02:00
c32d746069 Rename the embeddings workloads 2024-05-30 16:46:57 +02:00
b9a0ff0dd6 Cache a lot of operations to know if a field must be indexed 2024-05-30 16:18:23 +02:00
75496af985 Add a span for the prepare_for_documents_reindexing 2024-05-30 12:14:22 +02:00
0e9eb9eedb Add a span for the settings diff creation 2024-05-30 12:08:27 +02:00
3a78e988da Reduce the number of complex calls to settings diff functions 2024-05-30 11:23:07 +02:00
d9e5074189 Introduce a new way to determine the operations to perform on the fields 2024-05-30 11:23:07 +02:00
bc210bdc00 Introduce a dedicated function to write proximity entries in database 2024-05-30 11:23:06 +02:00
4bf83f701c Give the settings diff to the write_typed_chunk_into_index function 2024-05-30 11:23:06 +02:00
db3887929f Fix an issue with settings diff and * in the searchable attributes 2024-05-30 11:22:50 +02:00
9af103a88e Introducing a new into_del_add_obkv_conditional_operation function 2024-05-30 11:22:49 +02:00
99211eb375 Introduce the SettingDiff only_additional_fields method 2024-05-30 11:22:49 +02:00
65 changed files with 2021 additions and 248 deletions


@@ -18,11 +18,9 @@ jobs:
    timeout-minutes: 180 # 3h
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      - name: Run benchmarks - workload ${WORKLOAD_NAME} - branch ${{ github.ref }} - commit ${{ github.sha }}
        run: |


@@ -35,11 +35,9 @@ jobs:
          fetch-depth: 0 # fetch full history to be able to get main commit sha
          ref: ${{ steps.comment-branch.outputs.head_ref }}
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      - name: Run benchmarks on PR ${{ github.event.issue.id }}
        run: |


@@ -12,11 +12,9 @@ jobs:
    timeout-minutes: 180 # 3h
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      # Run benchmarks
      - name: Run benchmarks - Dataset ${BENCH_NAME} - Branch main - Commit ${{ github.sha }}


@@ -18,11 +18,9 @@ jobs:
    timeout-minutes: 4320 # 72h
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      # Set variables
      - name: Set current branch name


@@ -13,11 +13,9 @@ jobs:
    runs-on: benchmarks
    timeout-minutes: 4320 # 72h
    steps:
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      - name: Check for Command
        id: command


@@ -16,11 +16,9 @@ jobs:
    timeout-minutes: 4320 # 72h
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      # Set variables
      - name: Set current branch name


@@ -15,11 +15,9 @@ jobs:
    runs-on: benchmarks
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      # Set variables
      - name: Set current branch name


@@ -15,11 +15,9 @@ jobs:
    runs-on: benchmarks
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      # Set variables
      - name: Set current branch name


@@ -15,11 +15,9 @@ jobs:
    runs-on: benchmarks
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      # Set variables
      - name: Set current branch name


@@ -16,10 +16,7 @@ jobs:
        run: |
          apt-get update && apt-get install -y curl
          apt-get install build-essential -y
-      - uses: actions-rs/toolchain@v1
-        with:
-          toolchain: stable
-          override: true
+      - uses: helix-editor/rust-toolchain@v1
      - name: Install cargo-flaky
        run: cargo install cargo-flaky
      - name: Run cargo flaky in the dumps


@@ -12,11 +12,9 @@ jobs:
    timeout-minutes: 4320 # 72h
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      # Run benchmarks
      - name: Run the fuzzer


@@ -25,10 +25,7 @@ jobs:
        run: |
          apt-get update && apt-get install -y curl
          apt-get install build-essential -y
-      - uses: actions-rs/toolchain@v1
-        with:
-          toolchain: stable
-          override: true
+      - uses: helix-editor/rust-toolchain@v1
      - name: Install cargo-deb
        run: cargo install cargo-deb
      - uses: actions/checkout@v3


@@ -45,10 +45,7 @@ jobs:
        run: |
          apt-get update && apt-get install -y curl
          apt-get install build-essential -y
-      - uses: actions-rs/toolchain@v1
-        with:
-          toolchain: stable
-          override: true
+      - uses: helix-editor/rust-toolchain@v1
      - name: Build
        run: cargo build --release --locked
      # No need to upload binaries for dry run (cron)
@@ -78,10 +75,7 @@ jobs:
            asset_name: meilisearch-windows-amd64.exe
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
-        with:
-          toolchain: stable
-          override: true
+      - uses: helix-editor/rust-toolchain@v1
      - name: Build
        run: cargo build --release --locked
      # No need to upload binaries for dry run (cron)
@@ -107,12 +101,10 @@ jobs:
      - name: Checkout repository
        uses: actions/checkout@v3
      - name: Installing Rust toolchain
-        uses: actions-rs/toolchain@v1
+        uses: helix-editor/rust-toolchain@v1
        with:
-          toolchain: stable
          profile: minimal
          target: ${{ matrix.target }}
-          override: true
      - name: Cargo build
        uses: actions-rs/cargo@v1
        with:
@@ -154,12 +146,10 @@ jobs:
          add-apt-repository "deb [arch=$(dpkg --print-architecture)] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
          apt-get update -y && apt-get install -y docker-ce
      - name: Installing Rust toolchain
-        uses: actions-rs/toolchain@v1
+        uses: helix-editor/rust-toolchain@v1
        with:
-          toolchain: stable
          profile: minimal
          target: ${{ matrix.target }}
-          override: true
      - name: Configure target aarch64 GNU
        ## Environment variable is not passed using env:
        ## LD gold won't work with MUSL


@@ -80,10 +80,11 @@ jobs:
            type=ref,event=tag
            type=raw,value=nightly,enable=${{ github.event_name != 'push' }}
            type=semver,pattern=v{{major}}.{{minor}},enable=${{ steps.check-tag-format.outputs.stable == 'true' }}
+            type=semver,pattern=v{{major}},enable=${{ steps.check-tag-format.outputs.stable == 'true' }}
            type=raw,value=latest,enable=${{ steps.check-tag-format.outputs.stable == 'true' && steps.check-tag-format.outputs.latest == 'true' }}
      - name: Build and push
-        uses: docker/build-push-action@v5
+        uses: docker/build-push-action@v6
        with:
          push: true
          platforms: linux/amd64,linux/arm64


@@ -31,10 +31,7 @@ jobs:
          apt-get update && apt-get install -y curl
          apt-get install build-essential -y
      - name: Setup test with Rust stable
-        uses: actions-rs/toolchain@v1
-        with:
-          toolchain: stable
-          override: true
+        uses: helix-editor/rust-toolchain@v1
      - name: Cache dependencies
        uses: Swatinem/rust-cache@v2.7.1
      - name: Run cargo check without any default features
@@ -59,10 +56,7 @@ jobs:
      - uses: actions/checkout@v3
      - name: Cache dependencies
        uses: Swatinem/rust-cache@v2.7.1
-      - uses: actions-rs/toolchain@v1
-        with:
-          toolchain: stable
-          override: true
+      - uses: helix-editor/rust-toolchain@v1
      - name: Run cargo check without any default features
        uses: actions-rs/cargo@v1
        with:
@@ -87,10 +81,7 @@ jobs:
        run: |
          apt-get update
          apt-get install --assume-yes build-essential curl
-      - uses: actions-rs/toolchain@v1
-        with:
-          toolchain: stable
-          override: true
+      - uses: helix-editor/rust-toolchain@v1
      - name: Run cargo build with almost all features
        run: |
          cargo build --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda)"
@@ -110,10 +101,7 @@ jobs:
        run: |
          apt-get update
          apt-get install --assume-yes build-essential curl
-      - uses: actions-rs/toolchain@v1
-        with:
-          toolchain: stable
-          override: true
+      - uses: helix-editor/rust-toolchain@v1
      - name: Run cargo tree without default features and check lindera is not present
        run: |
          if cargo tree -f '{p} {f}' -e normal --no-default-features | grep -qz lindera; then
@@ -137,10 +125,7 @@ jobs:
        run: |
          apt-get update && apt-get install -y curl
          apt-get install build-essential -y
-      - uses: actions-rs/toolchain@v1
-        with:
-          toolchain: stable
-          override: true
+      - uses: helix-editor/rust-toolchain@v1
      - name: Cache dependencies
        uses: Swatinem/rust-cache@v2.7.1
      - name: Run tests in debug
@@ -154,11 +139,9 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: 1.75.0
-          override: true
          components: clippy
      - name: Cache dependencies
        uses: Swatinem/rust-cache@v2.7.1
@@ -173,10 +156,10 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: nightly
+          toolchain: nightly-2024-06-25
          override: true
          components: rustfmt
      - name: Cache dependencies


@@ -18,11 +18,9 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
-      - uses: actions-rs/toolchain@v1
+      - uses: helix-editor/rust-toolchain@v1
        with:
          profile: minimal
-          toolchain: stable
-          override: true
      - name: Install sd
        run: cargo install sd
      - name: Update Cargo.toml file


@@ -109,6 +109,12 @@ They are JSON files with the following structure (comments are not actually supported):
    "run_count": 3,
    // List of arguments to add to the Meilisearch command line.
    "extra_cli_args": ["--max-indexing-threads=1"],
+    // An expression that can be parsed as a comma-separated list of targets and levels
+    // as described in [tracing_subscriber's documentation](https://docs.rs/tracing-subscriber/latest/tracing_subscriber/filter/targets/struct.Targets.html#examples).
+    // The expression is used to filter the spans that are measured for profiling purposes.
+    // Optional, defaults to "indexing::=trace" (for indexing workloads); another common value is
+    // "search::=trace".
+    "target": "indexing::=trace",
    // List of named assets that can be used in the commands.
    "assets": {
      // name of the asset.

Cargo.lock generated

@@ -2191,7 +2191,6 @@ dependencies = [
 "bytemuck",
 "byteorder",
 "rayon",
- "tempfile",
]

[[package]]
@@ -6080,12 +6079,13 @@ dependencies = [
[[package]]
name = "yaup"
-version = "0.2.1"
+version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "a59e7d27bed43f7c37c25df5192ea9d435a8092a902e02203359ac9ce3e429d9"
+checksum = "b0144f1a16a199846cb21024da74edd930b43443463292f536b7110b4855b5c6"
dependencies = [
- "form_urlencoded",
 "serde",
- "url",
+ "thiserror",
]

[[package]]


@@ -25,7 +25,7 @@
<p align="center">⚡ A lightning-fast search engine that fits effortlessly into your apps, websites, and workflow 🔍</p>

-[Meilisearch](https://www.meilisearch.com) helps you shape a delightful search experience in a snap, offering features that work out of the box to speed up your workflow.
+[Meilisearch](https://www.meilisearch.com?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=intro) helps you shape a delightful search experience in a snap, offering features that work out of the box to speed up your workflow.

<p align="center" name="demo">
  <a href="https://where2watch.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demo-gif#gh-light-mode-only" target="_blank">
@@ -36,11 +36,18 @@
  </a>
</p>

-🔥 [**Try it!**](https://where2watch.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demo-link) 🔥
+## 🖥 Examples
+
+- [**Movies**](https://where2watch.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=organization) — An application to help you find streaming platforms to watch movies using [hybrid search](https://www.meilisearch.com/solutions/hybrid-search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos).
+- [**Ecommerce**](https://ecommerce.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Ecommerce website using disjunctive [facets](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos), range and rating filtering, and pagination.
+- [**Songs**](https://music.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search through 47 million of songs.
+- [**SaaS**](https://saas.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search for contacts, deals, and companies in this [multi-tenant](https://www.meilisearch.com/docs/learn/security/multitenancy_tenant_tokens?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) CRM application.
+
+See the list of all our example apps in our [demos repository](https://github.com/meilisearch/demos).

## ✨ Features

-- **Hybrid search:** Combine the best of both [semantic](https://www.meilisearch.com/docs/learn/experimental/vector_search) & full-text search to get the most relevant results
-- **Search-as-you-type:** find & display results in less than 50 milliseconds to provide an intuitive experience
+- **Hybrid search:** Combine the best of both [semantic](https://www.meilisearch.com/docs/learn/experimental/vector_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features) & full-text search to get the most relevant results
+- **Search-as-you-type:** Find & display results in less than 50 milliseconds to provide an intuitive experience
 - **[Typo tolerance](https://www.meilisearch.com/docs/learn/configuration/typo_tolerance?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** get relevant matches even when queries contain typos and misspellings
 - **[Filtering](https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features) and [faceted search](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** enhance your users' search experience with custom filters and build a faceted search interface in a few lines of code
 - **[Sorting](https://www.meilisearch.com/docs/learn/fine_tuning_results/sorting?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** sort results based on price, date, or pretty much anything else your users need
@@ -59,7 +66,7 @@ You can consult Meilisearch's documentation at [meilisearch.com/docs](https://ww
## 🚀 Getting started

-For basic instructions on how to set up Meilisearch, add documents to an index, and search for documents, take a look at our [Quick Start](https://www.meilisearch.com/docs/learn/getting_started/quick_start?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=get-started) guide.
+For basic instructions on how to set up Meilisearch, add documents to an index, and search for documents, take a look at our [documentation](https://www.meilisearch.com/docs?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=get-started) guide.

## 🌍 Supercharge your Meilisearch experience
@@ -83,7 +90,7 @@ Finally, for more in-depth information, refer to our articles explaining fundame
## 📊 Telemetry

-Meilisearch collects **anonymized** data from users to help us improve our product. You can [deactivate this](https://www.meilisearch.com/docs/learn/what_is_meilisearch/telemetry?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=telemetry#how-to-disable-data-collection) whenever you want.
+Meilisearch collects **anonymized** user data to help us improve our product. You can [deactivate this](https://www.meilisearch.com/docs/learn/what_is_meilisearch/telemetry?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=telemetry#how-to-disable-data-collection) whenever you want.

To request deletion of collected data, please write to us at [privacy@meilisearch.com](mailto:privacy@meilisearch.com). Remember to include your `Instance UID` in the message, as this helps us quickly find and delete your data.
@@ -105,11 +112,11 @@ Thank you for your support!
## 👩‍💻 Contributing

-Meilisearch is, and will always be, open-source! If you want to contribute to the project, please take a look at [our contribution guidelines](CONTRIBUTING.md).
+Meilisearch is, and will always be, open-source! If you want to contribute to the project, please look at [our contribution guidelines](CONTRIBUTING.md).

## 📦 Versioning

-Meilisearch releases and their associated binaries are available [in this GitHub page](https://github.com/meilisearch/meilisearch/releases).
+Meilisearch releases and their associated binaries are available on the project's [releases page](https://github.com/meilisearch/meilisearch/releases).

The binaries are versioned following [SemVer conventions](https://semver.org/). To know more, read our [versioning policy](https://github.com/meilisearch/engine-team/blob/main/resources/versioning-policy.md).


@@ -1811,7 +1811,7 @@ mod tests {
            task_db_size: 1000 * 1000, // 1 MB, we don't use MiB on purpose.
            index_base_map_size: 1000 * 1000, // 1 MB, we don't use MiB on purpose.
            enable_mdb_writemap: false,
-            index_growth_amount: 1000 * 1000, // 1 MB
+            index_growth_amount: 1000 * 1000 * 1000 * 1000, // 1 TB
            index_count: 5,
            indexer_config,
            autobatching_enabled: true,


@@ -188,6 +188,12 @@ impl AuthFilter {
        self.allow_index_creation && self.is_index_authorized(index)
    }

+    #[inline]
+    /// Return true if a tenant token was used to generate the search rules.
+    pub fn is_tenant_token(&self) -> bool {
+        self.search_rules.is_some()
+    }
+
    pub fn with_allowed_indexes(allowed_indexes: HashSet<IndexUidPattern>) -> Self {
        Self {
            search_rules: None,
@@ -205,6 +211,7 @@ impl AuthFilter {
            .unwrap_or(true)
    }

+    /// Check if the index is authorized by the API key and the tenant token.
    pub fn is_index_authorized(&self, index: &str) -> bool {
        self.key_authorized_indexes.is_index_authorized(index)
            && self
@@ -214,6 +221,44 @@ impl AuthFilter {
            .unwrap_or(true)
    }

+    /// Only check if the index is authorized by the API key
+    pub fn api_key_is_index_authorized(&self, index: &str) -> bool {
+        self.key_authorized_indexes.is_index_authorized(index)
+    }
+
+    /// Only check if the index is authorized by the tenant token
+    pub fn tenant_token_is_index_authorized(&self, index: &str) -> bool {
+        self.search_rules
+            .as_ref()
+            .map(|search_rules| search_rules.is_index_authorized(index))
+            .unwrap_or(true)
+    }
+
+    /// Return the list of authorized indexes by the tenant token if any
+    pub fn tenant_token_list_index_authorized(&self) -> Vec<String> {
+        match self.search_rules {
+            Some(ref search_rules) => {
+                let mut indexes: Vec<_> = match search_rules {
+                    SearchRules::Set(set) => set.iter().map(|s| s.to_string()).collect(),
+                    SearchRules::Map(map) => map.keys().map(|s| s.to_string()).collect(),
+                };
+                indexes.sort_unstable();
+                indexes
+            }
+            None => Vec::new(),
+        }
+    }
+
+    /// Return the list of authorized indexes by the api key if any
+    pub fn api_key_list_index_authorized(&self) -> Vec<String> {
+        let mut indexes: Vec<_> = match self.key_authorized_indexes {
+            SearchRules::Set(ref set) => set.iter().map(|s| s.to_string()).collect(),
+            SearchRules::Map(ref map) => map.keys().map(|s| s.to_string()).collect(),
+        };
+        indexes.sort_unstable();
+        indexes
+    }
+
    pub fn get_index_search_rules(&self, index: &str) -> Option<IndexSearchRules> {
        if !self.is_index_authorized(index) {
            return None;


@@ -54,6 +54,8 @@ chinese-pinyin = ["milli/chinese-pinyin"]
hebrew = ["milli/hebrew"]
# japanese specialized tokenization
japanese = ["milli/japanese"]
+# korean specialized tokenization
+korean = ["milli/korean"]
# thai specialized tokenization
thai = ["milli/thai"]
# allow greek specialized tokenization


@@ -98,7 +98,6 @@ tokio-stream = "0.1.14"
toml = "0.8.8"
uuid = { version = "1.6.1", features = ["serde", "v4"] }
walkdir = "2.4.0"
-yaup = "0.2.1"
serde_urlencoded = "0.7.1"
termcolor = "1.4.1"
url = { version = "2.5.0", features = ["serde"] }
@@ -118,7 +117,7 @@ maplit = "1.0.2"
meili-snap = { path = "../meili-snap" }
temp-env = "0.3.6"
urlencoding = "2.1.3"
-yaup = "0.2.1"
+yaup = "0.3.1"

[build-dependencies]
anyhow = { version = "1.0.79", optional = true }
@@ -151,6 +150,7 @@ chinese = ["meilisearch-types/chinese"]
chinese-pinyin = ["meilisearch-types/chinese-pinyin"]
hebrew = ["meilisearch-types/hebrew"]
japanese = ["meilisearch-types/japanese"]
+korean = ["meilisearch-types/korean"]
thai = ["meilisearch-types/thai"]
greek = ["meilisearch-types/greek"]
khmer = ["meilisearch-types/khmer"]


@@ -98,14 +98,29 @@ impl From<MeilisearchHttpError> for aweb::Error {
 impl From<aweb::error::PayloadError> for MeilisearchHttpError {
     fn from(error: aweb::error::PayloadError) -> Self {
-        MeilisearchHttpError::Payload(PayloadError::Payload(error))
+        match error {
+            aweb::error::PayloadError::Incomplete(_) => MeilisearchHttpError::Payload(
+                PayloadError::Payload(ActixPayloadError::IncompleteError),
+            ),
+            _ => MeilisearchHttpError::Payload(PayloadError::Payload(
+                ActixPayloadError::OtherError(error),
+            )),
+        }
     }
 }

+#[derive(Debug, thiserror::Error)]
+pub enum ActixPayloadError {
+    #[error("The provided payload is incomplete and cannot be parsed")]
+    IncompleteError,
+    #[error(transparent)]
+    OtherError(aweb::error::PayloadError),
+}
+
 #[derive(Debug, thiserror::Error)]
 pub enum PayloadError {
     #[error(transparent)]
-    Payload(aweb::error::PayloadError),
+    Payload(ActixPayloadError),
     #[error(transparent)]
     Json(JsonPayloadError),
     #[error(transparent)]
@@ -122,7 +137,8 @@ impl ErrorCode for PayloadError {
     fn error_code(&self) -> Code {
         match self {
             PayloadError::Payload(e) => match e {
-                aweb::error::PayloadError::Incomplete(_) => Code::Internal,
+                ActixPayloadError::IncompleteError => Code::BadRequest,
+                ActixPayloadError::OtherError(error) => match error {
                     aweb::error::PayloadError::EncodingCorrupted => Code::Internal,
                     aweb::error::PayloadError::Overflow => Code::PayloadTooLarge,
                     aweb::error::PayloadError::UnknownLength => Code::Internal,
@@ -130,6 +146,7 @@ impl ErrorCode for PayloadError {
                     aweb::error::PayloadError::Io(_) => Code::Internal,
                     _ => todo!(),
                 },
+            },
             PayloadError::Json(err) => match err {
                 JsonPayloadError::Overflow { .. } => Code::PayloadTooLarge,
                 JsonPayloadError::ContentType => Code::UnsupportedMediaType,


@@ -12,6 +12,8 @@ use futures::Future;
 use meilisearch_auth::{AuthController, AuthFilter};
 use meilisearch_types::error::{Code, ResponseError};

+use self::policies::AuthError;
+
 pub struct GuardedData<P, D> {
     data: D,
     filters: AuthFilter,
@@ -35,12 +37,12 @@ impl<P, D> GuardedData<P, D> {
         let missing_master_key = auth.get_master_key().is_none();

         match Self::authenticate(auth, token, index).await? {
-            Some(filters) => match data {
+            Ok(filters) => match data {
                 Some(data) => Ok(Self { data, filters, _marker: PhantomData }),
                 None => Err(AuthenticationError::IrretrievableState.into()),
             },
-            None if missing_master_key => Err(AuthenticationError::MissingMasterKey.into()),
-            None => Err(AuthenticationError::InvalidToken.into()),
+            Err(_) if missing_master_key => Err(AuthenticationError::MissingMasterKey.into()),
+            Err(e) => Err(ResponseError::from_msg(e.to_string(), Code::InvalidApiKey)),
         }
     }
@@ -51,12 +53,12 @@ impl<P, D> GuardedData<P, D> {
         let missing_master_key = auth.get_master_key().is_none();

         match Self::authenticate(auth, String::new(), None).await? {
-            Some(filters) => match data {
+            Ok(filters) => match data {
                 Some(data) => Ok(Self { data, filters, _marker: PhantomData }),
                 None => Err(AuthenticationError::IrretrievableState.into()),
             },
-            None if missing_master_key => Err(AuthenticationError::MissingMasterKey.into()),
-            None => Err(AuthenticationError::MissingAuthorizationHeader.into()),
+            Err(_) if missing_master_key => Err(AuthenticationError::MissingMasterKey.into()),
+            Err(_) => Err(AuthenticationError::MissingAuthorizationHeader.into()),
         }
     }
@@ -64,7 +66,7 @@ impl<P, D> GuardedData<P, D> {
         auth: Data<AuthController>,
         token: String,
         index: Option<String>,
-    ) -> Result<Option<AuthFilter>, ResponseError>
+    ) -> Result<Result<AuthFilter, AuthError>, ResponseError>
     where
         P: Policy + 'static,
     {
@@ -127,13 +129,14 @@ pub trait Policy {
         auth: Data<AuthController>,
         token: &str,
         index: Option<&str>,
-    ) -> Option<AuthFilter>;
+    ) -> Result<AuthFilter, policies::AuthError>;
 }

 pub mod policies {
     use actix_web::web::Data;
     use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
     use meilisearch_auth::{AuthController, AuthFilter, SearchRules};
+    use meilisearch_types::error::{Code, ErrorCode};
     // reexport actions in policies in order to be used in routes configuration.
     pub use meilisearch_types::keys::{actions, Action};
     use serde::{Deserialize, Serialize};
@@ -144,11 +147,53 @@ pub mod policies {
     enum TenantTokenOutcome {
         NotATenantToken,
-        Invalid,
-        Expired,
         Valid(Uuid, SearchRules),
     }

+    #[derive(thiserror::Error, Debug)]
+    pub enum AuthError {
+        #[error("Tenant token expired. Was valid up to `{exp}` and we're now `{now}`.")]
+        ExpiredTenantToken { exp: i64, now: i64 },
+        #[error("The provided API key is invalid.")]
+        InvalidApiKey,
+        #[error("The provided tenant token cannot acces the index `{index}`, allowed indexes are {allowed:?}.")]
+        TenantTokenAccessingnUnauthorizedIndex { index: String, allowed: Vec<String> },
+        #[error(
+            "The API key used to generate this tenant token cannot acces the index `{index}`."
+        )]
+        TenantTokenApiKeyAccessingnUnauthorizedIndex { index: String },
+        #[error(
+            "The API key cannot acces the index `{index}`, authorized indexes are {allowed:?}."
+        )]
+        ApiKeyAccessingnUnauthorizedIndex { index: String, allowed: Vec<String> },
+        #[error("The provided tenant token is invalid.")]
+        InvalidTenantToken,
+        #[error("Could not decode tenant token, {0}.")]
+        CouldNotDecodeTenantToken(jsonwebtoken::errors::Error),
+        #[error("Invalid action `{0}`.")]
+        InternalInvalidAction(u8),
+    }
+
+    impl From<jsonwebtoken::errors::Error> for AuthError {
+        fn from(error: jsonwebtoken::errors::Error) -> Self {
+            use jsonwebtoken::errors::ErrorKind;
+
+            match error.kind() {
+                ErrorKind::InvalidToken => AuthError::InvalidTenantToken,
+                _ => AuthError::CouldNotDecodeTenantToken(error),
+            }
+        }
+    }
+
+    impl ErrorCode for AuthError {
+        fn error_code(&self) -> Code {
+            match self {
+                AuthError::InternalInvalidAction(_) => Code::Internal,
+                _ => Code::InvalidApiKey,
+            }
+        }
+    }
+
     fn tenant_token_validation() -> Validation {
         let mut validation = Validation::default();
         validation.validate_exp = false;
@@ -158,15 +203,15 @@ pub mod policies {
     }

     /// Extracts the key id used to sign the payload, without performing any validation.
-    fn extract_key_id(token: &str) -> Option<Uuid> {
+    fn extract_key_id(token: &str) -> Result<Uuid, AuthError> {
         let mut validation = tenant_token_validation();
         validation.insecure_disable_signature_validation();
         let dummy_key = DecodingKey::from_secret(b"secret");
-        let token_data = decode::<Claims>(token, &dummy_key, &validation).ok()?;
+        let token_data = decode::<Claims>(token, &dummy_key, &validation)?;

         // get token fields without validating it.
         let Claims { api_key_uid, .. } = token_data.claims;
-        Some(api_key_uid)
+        Ok(api_key_uid)
     }

     fn is_keys_action(action: u8) -> bool {
@@ -187,76 +232,102 @@ pub mod policies {
         auth: Data<AuthController>,
         token: &str,
         index: Option<&str>,
-    ) -> Option<AuthFilter> {
+    ) -> Result<AuthFilter, AuthError> {
         // authenticate if token is the master key.
         // Without a master key, all routes are accessible except the key-related routes.
         if auth.get_master_key().map_or_else(|| !is_keys_action(A), |mk| mk == token) {
-            return Some(AuthFilter::default());
+            return Ok(AuthFilter::default());
         }

         let (key_uuid, search_rules) =
             match ActionPolicy::<A>::authenticate_tenant_token(&auth, token) {
-                TenantTokenOutcome::Valid(key_uuid, search_rules) => {
+                Ok(TenantTokenOutcome::Valid(key_uuid, search_rules)) => {
(key_uuid, Some(search_rules)) (key_uuid, Some(search_rules))
} }
TenantTokenOutcome::Expired => return None, Ok(TenantTokenOutcome::NotATenantToken)
TenantTokenOutcome::Invalid => return None, | Err(AuthError::InvalidTenantToken) => (
TenantTokenOutcome::NotATenantToken => { auth.get_optional_uid_from_encoded_key(token.as_bytes())
(auth.get_optional_uid_from_encoded_key(token.as_bytes()).ok()??, None) .map_err(|_e| AuthError::InvalidApiKey)?
} .ok_or(AuthError::InvalidApiKey)?,
None,
),
Err(e) => return Err(e),
}; };
// check that the indexes are allowed // check that the indexes are allowed
let action = Action::from_repr(A)?; let action = Action::from_repr(A).ok_or(AuthError::InternalInvalidAction(A))?;
let auth_filter = auth.get_key_filters(key_uuid, search_rules).ok()?; let auth_filter = auth
if auth.is_key_authorized(key_uuid, action, index).unwrap_or(false) .get_key_filters(key_uuid, search_rules)
&& index.map(|index| auth_filter.is_index_authorized(index)).unwrap_or(true) .map_err(|_e| AuthError::InvalidApiKey)?;
{
return Some(auth_filter); // First check if the index is authorized in the tenant token, this is a public
// information, we can return a nice error message.
if let Some(index) = index {
if !auth_filter.tenant_token_is_index_authorized(index) {
return Err(AuthError::TenantTokenAccessingnUnauthorizedIndex {
index: index.to_string(),
allowed: auth_filter.tenant_token_list_index_authorized(),
});
}
if !auth_filter.api_key_is_index_authorized(index) {
if auth_filter.is_tenant_token() {
// If the error comes from a tenant token we cannot share the list
// of authorized indexes in the API key. This is not public information.
return Err(AuthError::TenantTokenApiKeyAccessingnUnauthorizedIndex {
index: index.to_string(),
});
} else {
// Otherwise we can share the list
// of authorized indexes in the API key.
return Err(AuthError::ApiKeyAccessingnUnauthorizedIndex {
index: index.to_string(),
allowed: auth_filter.api_key_list_index_authorized(),
});
}
}
}
if auth.is_key_authorized(key_uuid, action, index).unwrap_or(false) {
return Ok(auth_filter);
} }
None Err(AuthError::InvalidApiKey)
} }
} }
impl<const A: u8> ActionPolicy<A> { impl<const A: u8> ActionPolicy<A> {
fn authenticate_tenant_token(auth: &AuthController, token: &str) -> TenantTokenOutcome { fn authenticate_tenant_token(
auth: &AuthController,
token: &str,
) -> Result<TenantTokenOutcome, AuthError> {
// Only search action can be accessed by a tenant token. // Only search action can be accessed by a tenant token.
if A != actions::SEARCH { if A != actions::SEARCH {
return TenantTokenOutcome::NotATenantToken; return Ok(TenantTokenOutcome::NotATenantToken);
} }
let uid = if let Some(uid) = extract_key_id(token) { let uid = extract_key_id(token)?;
uid
} else {
return TenantTokenOutcome::NotATenantToken;
};
// Check if tenant token is valid. // Check if tenant token is valid.
let key = if let Some(key) = auth.generate_key(uid) { let key = if let Some(key) = auth.generate_key(uid) {
key key
} else { } else {
return TenantTokenOutcome::Invalid; return Err(AuthError::InvalidTenantToken);
}; };
let data = if let Ok(data) = decode::<Claims>( let data = decode::<Claims>(
token, token,
&DecodingKey::from_secret(key.as_bytes()), &DecodingKey::from_secret(key.as_bytes()),
&tenant_token_validation(), &tenant_token_validation(),
) { )?;
data
} else {
return TenantTokenOutcome::Invalid;
};
// Check if token is expired. // Check if token is expired.
if let Some(exp) = data.claims.exp { if let Some(exp) = data.claims.exp {
if OffsetDateTime::now_utc().unix_timestamp() > exp { let now = OffsetDateTime::now_utc().unix_timestamp();
return TenantTokenOutcome::Expired; if now > exp {
return Err(AuthError::ExpiredTenantToken { exp, now });
} }
} }
TenantTokenOutcome::Valid(uid, data.claims.search_rules) Ok(TenantTokenOutcome::Valid(uid, data.claims.search_rules))
} }
} }
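For readers skimming the diff: the net effect of the new `AuthError` is that every authentication failure keeps the generic `invalid_api_key` HTTP error code while the message becomes specific to the cause. Below is a minimal, self-contained sketch of that pattern, assuming only the `thiserror` crate and using a plain string where Meilisearch has its `Code` type:

```rust
// Sketch only: `AuthError` variants trimmed to three, `Code` replaced by &str.
#[derive(thiserror::Error, Debug)]
enum AuthError {
    #[error("Tenant token expired. Was valid up to `{exp}` and we're now `{now}`.")]
    ExpiredTenantToken { exp: i64, now: i64 },
    #[error("The provided API key is invalid.")]
    InvalidApiKey,
    #[error("Invalid action `{0}`.")]
    InternalInvalidAction(u8),
}

// Every variant except internal bugs maps to the same public error code.
fn error_code(e: &AuthError) -> &'static str {
    match e {
        AuthError::InternalInvalidAction(_) => "internal",
        _ => "invalid_api_key",
    }
}

fn main() {
    let err = AuthError::ExpiredTenantToken { exp: 1, now: 2 };
    // Specific message, generic code:
    println!("{} ({})", err, error_code(&err));
}
```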


@@ -752,10 +752,15 @@ fn prepare_search<'t>(
         SearchKind::SemanticOnly { embedder_name, embedder } => {
             let vector = match query.vector.clone() {
                 Some(vector) => vector,
-                None => embedder
-                    .embed_one(query.q.clone().unwrap())
-                    .map_err(milli::vector::Error::from)
-                    .map_err(milli::Error::from)?,
+                None => {
+                    let span = tracing::trace_span!(target: "search::vector", "embed_one");
+                    let _entered = span.enter();
+                    embedder
+                        .embed_one(query.q.clone().unwrap())
+                        .map_err(milli::vector::Error::from)
+                        .map_err(milli::Error::from)?
+                }
             };

             search.semantic(embedder_name.clone(), embedder.clone(), Some(vector));
@@ -1331,13 +1336,23 @@ fn insert_geo_distance(sorts: &[String], document: &mut Document) {
         // TODO: TAMO: milli encountered an internal error, what do we want to do?
         let base = [capture_group[1].parse().unwrap(), capture_group[2].parse().unwrap()];
         let geo_point = &document.get("_geo").unwrap_or(&json!(null));
-        if let Some((lat, lng)) = geo_point["lat"].as_f64().zip(geo_point["lng"].as_f64()) {
+        if let Some((lat, lng)) =
+            extract_geo_value(&geo_point["lat"]).zip(extract_geo_value(&geo_point["lng"]))
+        {
             let distance = milli::distance_between_two_points(&base, &[lat, lng]);
             document.insert("_geoDistance".to_string(), json!(distance.round() as usize));
         }
     }
 }

+fn extract_geo_value(value: &Value) -> Option<f64> {
+    match value {
+        Value::Number(n) => n.as_f64(),
+        Value::String(s) => s.parse().ok(),
+        _ => None,
+    }
+}
+
 fn compute_formatted_options(
     attr_to_highlight: &HashSet<String>,
     attr_to_crop: &[String],
@@ -1711,4 +1726,54 @@ mod test {
         insert_geo_distance(sorters, &mut document);
         assert_eq!(document.get("_geoDistance"), None);
     }
+
+    #[test]
+    fn test_insert_geo_distance_with_coords_as_string() {
+        let value: Document = serde_json::from_str(
+            r#"{
+              "_geo": {
+                "lat": "50",
+                "lng": 3
+              }
+            }"#,
+        )
+        .unwrap();
+        let sorters = &["_geoPoint(50,3):desc".to_string()];
+        let mut document = value.clone();
+        insert_geo_distance(sorters, &mut document);
+        assert_eq!(document.get("_geoDistance"), Some(&json!(0)));
+
+        let value: Document = serde_json::from_str(
+            r#"{
+              "_geo": {
+                "lat": "50",
+                "lng": "3"
+              },
+              "id": "1"
+            }"#,
+        )
+        .unwrap();
+        let sorters = &["_geoPoint(50,3):desc".to_string()];
+        let mut document = value.clone();
+        insert_geo_distance(sorters, &mut document);
+        assert_eq!(document.get("_geoDistance"), Some(&json!(0)));
+
+        let value: Document = serde_json::from_str(
+            r#"{
+              "_geo": {
+                "lat": 50,
+                "lng": "3"
+              },
+              "id": "1"
+            }"#,
+        )
+        .unwrap();
+        let sorters = &["_geoPoint(50,3):desc".to_string()];
+        let mut document = value.clone();
+        insert_geo_distance(sorters, &mut document);
+        assert_eq!(document.get("_geoDistance"), Some(&json!(0)));
+    }
 }
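A standalone restatement of the parsing rule the new `extract_geo_value` helper introduces (the tests above exercise it end to end): `_geo.lat` and `_geo.lng` may now arrive either as JSON numbers or as numeric strings. Only `serde_json` is assumed here:

```rust
use serde_json::{json, Value};

// Same logic as the helper in the diff: numbers pass through as f64,
// numeric strings are parsed, everything else is rejected.
fn extract_geo_value(value: &Value) -> Option<f64> {
    match value {
        Value::Number(n) => n.as_f64(),
        Value::String(s) => s.parse().ok(),
        _ => None,
    }
}

fn main() {
    assert_eq!(extract_geo_value(&json!(50)), Some(50.0));
    assert_eq!(extract_geo_value(&json!("50")), Some(50.0));
    assert_eq!(extract_geo_value(&json!(null)), None);
}
```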


@@ -78,7 +78,7 @@ pub static ALL_ACTIONS: Lazy<HashSet<&'static str>> = Lazy::new(|| {
 });

 static INVALID_RESPONSE: Lazy<Value> = Lazy::new(|| {
-    json!({"message": "The provided API key is invalid.",
+    json!({"message": null,
         "code": "invalid_api_key",
         "type": "auth",
         "link": "https://docs.meilisearch.com/errors#invalid_api_key"
@@ -119,7 +119,8 @@ async fn error_access_expired_key() {
     thread::sleep(time::Duration::new(1, 0));

     for (method, route) in AUTHORIZATIONS.keys() {
-        let (response, code) = server.dummy_request(method, route).await;
+        let (mut response, code) = server.dummy_request(method, route).await;
+        response["message"] = serde_json::json!(null);

         assert_eq!(response, INVALID_RESPONSE.clone(), "on route: {:?} - {:?}", method, route);
         assert_eq!(403, code, "{:?}", &response);
@@ -149,7 +150,8 @@ async fn error_access_unauthorized_index() {
         // filter `products` index routes
         .filter(|(_, route)| route.starts_with("/indexes/products"))
     {
-        let (response, code) = server.dummy_request(method, route).await;
+        let (mut response, code) = server.dummy_request(method, route).await;
+        response["message"] = serde_json::json!(null);

         assert_eq!(response, INVALID_RESPONSE.clone(), "on route: {:?} - {:?}", method, route);
         assert_eq!(403, code, "{:?}", &response);
@@ -176,7 +178,8 @@ async fn error_access_unauthorized_action() {
         let key = response["key"].as_str().unwrap();
         server.use_api_key(key);

-        let (response, code) = server.dummy_request(method, route).await;
+        let (mut response, code) = server.dummy_request(method, route).await;
+        response["message"] = serde_json::json!(null);

         assert_eq!(response, INVALID_RESPONSE.clone(), "on route: {:?} - {:?}", method, route);
         assert_eq!(403, code, "{:?}", &response);
@@ -280,7 +283,7 @@ async fn access_authorized_no_index_restriction() {
                 route,
                 action
             );
-            assert_ne!(code, 403);
+            assert_ne!(code, 403, "on route: {:?} - {:?} with action: {:?}", method, route, action);
         }
     }
 }


@@ -1,7 +1,10 @@
+use actix_web::test;
+use http::StatusCode;
+use jsonwebtoken::{EncodingKey, Header};
 use meili_snap::*;
 use uuid::Uuid;

-use crate::common::Server;
+use crate::common::{Server, Value};
 use crate::json;

 #[actix_rt::test]
@@ -436,3 +439,262 @@ async fn patch_api_keys_unknown_field() {
     }
     "###);
 }
+
+async fn send_request_with_custom_auth(
+    app: impl actix_web::dev::Service<
+        actix_http::Request,
+        Response = actix_web::dev::ServiceResponse<impl actix_web::body::MessageBody>,
+        Error = actix_web::Error,
+    >,
+    url: &str,
+    auth: &str,
+) -> (Value, StatusCode) {
+    let req = test::TestRequest::get().uri(url).insert_header(("Authorization", auth)).to_request();
+    let res = test::call_service(&app, req).await;
+    let status_code = res.status();
+    let body = test::read_body(res).await;
+    let response: Value = serde_json::from_slice(&body).unwrap_or_default();
+
+    (response, status_code)
+}
+
+#[actix_rt::test]
+async fn invalid_auth_format() {
+    let server = Server::new_auth().await;
+    let app = server.init_web_app().await;
+
+    let req = test::TestRequest::get().uri("/indexes/dog/documents").to_request();
+    let res = test::call_service(&app, req).await;
+    let status_code = res.status();
+    let body = test::read_body(res).await;
+    let response: Value = serde_json::from_slice(&body).unwrap_or_default();
+    snapshot!(status_code, @"401 Unauthorized");
+    snapshot!(response, @r###"
+    {
+      "message": "The Authorization header is missing. It must use the bearer authorization method.",
+      "code": "missing_authorization_header",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#missing_authorization_header"
+    }
+    "###);
+
+    let req = test::TestRequest::get().uri("/indexes/dog/documents").to_request();
+    let res = test::call_service(&app, req).await;
+    let status_code = res.status();
+    let body = test::read_body(res).await;
+    let response: Value = serde_json::from_slice(&body).unwrap_or_default();
+    snapshot!(status_code, @"401 Unauthorized");
+    snapshot!(response, @r###"
+    {
+      "message": "The Authorization header is missing. It must use the bearer authorization method.",
+      "code": "missing_authorization_header",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#missing_authorization_header"
+    }
+    "###);
+
+    let (response, status_code) =
+        send_request_with_custom_auth(&app, "/indexes/dog/documents", "Bearer").await;
+    snapshot!(status_code, @"403 Forbidden");
+    snapshot!(response, @r###"
+    {
+      "message": "The provided API key is invalid.",
+      "code": "invalid_api_key",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#invalid_api_key"
+    }
+    "###);
+}
+
+#[actix_rt::test]
+async fn invalid_api_key() {
+    let server = Server::new_auth().await;
+    let app = server.init_web_app().await;
+
+    let (response, status_code) =
+        send_request_with_custom_auth(&app, "/indexes/dog/search", "Bearer kefir").await;
+    snapshot!(status_code, @"403 Forbidden");
+    snapshot!(response, @r###"
+    {
+      "message": "The provided API key is invalid.",
+      "code": "invalid_api_key",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#invalid_api_key"
+    }
+    "###);
+
+    let uuid = Uuid::nil();
+    let key = json!({ "actions": ["search"], "indexes": ["dog"], "expiresAt": null, "uid": uuid.to_string() });
+    let req = test::TestRequest::post()
+        .uri("/keys")
+        .insert_header(("Authorization", "Bearer MASTER_KEY"))
+        .set_json(&key)
+        .to_request();
+    let res = test::call_service(&app, req).await;
+    let body = test::read_body(res).await;
+    let response: Value = serde_json::from_slice(&body).unwrap_or_default();
+    snapshot!(json_string!(response, { ".createdAt" => "[date]", ".updatedAt" => "[date]" }), @r###"
+    {
+      "name": null,
+      "description": null,
+      "key": "aeb94973e0b6e912d94165430bbe87dee91a7c4f891ce19050c3910ec96977e9",
+      "uid": "00000000-0000-0000-0000-000000000000",
+      "actions": [
+        "search"
+      ],
+      "indexes": [
+        "dog"
+      ],
+      "expiresAt": null,
+      "createdAt": "[date]",
+      "updatedAt": "[date]"
+    }
+    "###);
+    let key = response["key"].as_str().unwrap();
+
+    let (response, status_code) =
+        send_request_with_custom_auth(&app, "/indexes/doggo/search", &format!("Bearer {key}"))
+            .await;
+    snapshot!(status_code, @"403 Forbidden");
+    snapshot!(response, @r###"
+    {
+      "message": "The API key cannot access the index `doggo`, authorized indexes are [\"dog\"].",
+      "code": "invalid_api_key",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#invalid_api_key"
+    }
+    "###);
+}
+
+#[actix_rt::test]
+async fn invalid_tenant_token() {
+    let server = Server::new_auth().await;
+    let app = server.init_web_app().await;
+
+    // The tenant token won't be recognized at all if we're not on a search route
+    let claims = json!({ "tamo": "kefir" });
+    let jwt = jsonwebtoken::encode(&Header::default(), &claims, &EncodingKey::from_secret(b"tamo"))
+        .unwrap();
+    let (response, status_code) =
+        send_request_with_custom_auth(&app, "/indexes/dog/documents", &format!("Bearer {jwt}"))
+            .await;
+    snapshot!(status_code, @"403 Forbidden");
+    snapshot!(response, @r###"
+    {
+      "message": "The provided API key is invalid.",
+      "code": "invalid_api_key",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#invalid_api_key"
+    }
+    "###);
+
+    let claims = json!({ "tamo": "kefir" });
+    let jwt = jsonwebtoken::encode(&Header::default(), &claims, &EncodingKey::from_secret(b"tamo"))
+        .unwrap();
+    let (response, status_code) =
+        send_request_with_custom_auth(&app, "/indexes/dog/search", &format!("Bearer {jwt}")).await;
+    snapshot!(status_code, @"403 Forbidden");
+    snapshot!(response, @r###"
+    {
+      "message": "Could not decode tenant token, JSON error: missing field `searchRules` at line 1 column 16.",
+      "code": "invalid_api_key",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#invalid_api_key"
+    }
+    "###);
+
+    // The error messages are not ideal but that's expected since we cannot _yet_ use deserr
+    let claims = json!({ "searchRules": "kefir" });
+    let jwt = jsonwebtoken::encode(&Header::default(), &claims, &EncodingKey::from_secret(b"tamo"))
+        .unwrap();
+    let (response, status_code) =
+        send_request_with_custom_auth(&app, "/indexes/dog/search", &format!("Bearer {jwt}")).await;
+    snapshot!(status_code, @"403 Forbidden");
+    snapshot!(response, @r###"
+    {
+      "message": "Could not decode tenant token, JSON error: data did not match any variant of untagged enum SearchRules at line 1 column 23.",
+      "code": "invalid_api_key",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#invalid_api_key"
+    }
+    "###);
+
+    let uuid = Uuid::nil();
+    let claims = json!({ "searchRules": ["kefir"], "apiKeyUid": uuid.to_string() });
+    let jwt = jsonwebtoken::encode(&Header::default(), &claims, &EncodingKey::from_secret(b"tamo"))
+        .unwrap();
+    let (response, status_code) =
+        send_request_with_custom_auth(&app, "/indexes/dog/search", &format!("Bearer {jwt}")).await;
+    snapshot!(status_code, @"403 Forbidden");
+    snapshot!(response, @r###"
+    {
+      "message": "Could not decode tenant token, InvalidSignature.",
+      "code": "invalid_api_key",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#invalid_api_key"
+    }
+    "###);
+
+    // ~~ For the next tests we first need a valid API key
+    let key = json!({ "actions": ["search"], "indexes": ["dog"], "expiresAt": null, "uid": uuid.to_string() });
+    let req = test::TestRequest::post()
+        .uri("/keys")
+        .insert_header(("Authorization", "Bearer MASTER_KEY"))
+        .set_json(&key)
+        .to_request();
+    let res = test::call_service(&app, req).await;
+    let body = test::read_body(res).await;
+    let response: Value = serde_json::from_slice(&body).unwrap_or_default();
+    snapshot!(json_string!(response, { ".createdAt" => "[date]", ".updatedAt" => "[date]" }), @r###"
+    {
+      "name": null,
+      "description": null,
+      "key": "aeb94973e0b6e912d94165430bbe87dee91a7c4f891ce19050c3910ec96977e9",
+      "uid": "00000000-0000-0000-0000-000000000000",
+      "actions": [
+        "search"
+      ],
+      "indexes": [
+        "dog"
+      ],
+      "expiresAt": null,
+      "createdAt": "[date]",
+      "updatedAt": "[date]"
+    }
+    "###);
+    let key = response["key"].as_str().unwrap();
+
+    let claims = json!({ "searchRules": ["doggo", "catto"], "apiKeyUid": uuid.to_string() });
+    let jwt = jsonwebtoken::encode(
+        &Header::default(),
+        &claims,
+        &EncodingKey::from_secret(key.as_bytes()),
+    )
+    .unwrap();
+    // Try to access an index that is not authorized by the tenant token
+    let (response, status_code) =
+        send_request_with_custom_auth(&app, "/indexes/dog/search", &format!("Bearer {jwt}")).await;
+    snapshot!(status_code, @"403 Forbidden");
+    snapshot!(response, @r###"
+    {
+      "message": "The provided tenant token cannot access the index `dog`, allowed indexes are [\"catto\", \"doggo\"].",
+      "code": "invalid_api_key",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#invalid_api_key"
+    }
+    "###);
+
+    // Try to access an index that *is* authorized by the tenant token but not by the API key used to generate it
+    let (response, status_code) =
+        send_request_with_custom_auth(&app, "/indexes/doggo/search", &format!("Bearer {jwt}"))
+            .await;
+    snapshot!(status_code, @"403 Forbidden");
+    snapshot!(response, @r###"
+    {
+      "message": "The API key used to generate this tenant token cannot access the index `doggo`.",
+      "code": "invalid_api_key",
+      "type": "auth",
+      "link": "https://docs.meilisearch.com/errors#invalid_api_key"
+    }
+    "###);
+}
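The tenant tokens in these tests are signed by hand with `jsonwebtoken`. A minimal sketch of just that signing step, with the claim names (`searchRules`, `apiKeyUid`) taken from the tests above and a placeholder secret (in the tests the secret is the parent API key):

```rust
use jsonwebtoken::{encode, EncodingKey, Header};
use serde_json::json;

fn main() {
    // Claims as the tests build them; the UUID here is Uuid::nil() spelled out.
    let claims = json!({
        "searchRules": ["doggo", "catto"],
        "apiKeyUid": "00000000-0000-0000-0000-000000000000",
    });
    // Header::default() selects HS256; b"secret" stands in for the API key bytes.
    let jwt = encode(&Header::default(), &claims, &EncodingKey::from_secret(b"secret")).unwrap();
    println!("Authorization: Bearer {jwt}");
}
```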


@@ -53,7 +53,8 @@ static DOCUMENTS: Lazy<Value> = Lazy::new(|| {
 });

 static INVALID_RESPONSE: Lazy<Value> = Lazy::new(|| {
-    json!({"message": "The provided API key is invalid.",
+    json!({
+        "message": null,
         "code": "invalid_api_key",
         "type": "auth",
         "link": "https://docs.meilisearch.com/errors#invalid_api_key"
@@ -191,7 +192,9 @@ macro_rules! compute_forbidden_search {
         server.use_api_key(&web_token);
         let index = server.index("sales");
         index
-            .search(json!({}), |response, code| {
+            .search(json!({}), |mut response, code| {
+                // We don't assert anything on the message since it may change between cases
+                response["message"] = serde_json::json!(null);
                 assert_eq!(
                     response,
                     INVALID_RESPONSE.clone(),
@@ -495,7 +498,8 @@ async fn error_access_forbidden_routes() {
     for ((method, route), actions) in AUTHORIZATIONS.iter() {
         if !actions.contains("search") {
-            let (response, code) = server.dummy_request(method, route).await;
+            let (mut response, code) = server.dummy_request(method, route).await;
+            response["message"] = serde_json::json!(null);
             assert_eq!(response, INVALID_RESPONSE.clone());
             assert_eq!(code, 403);
         }
@@ -529,14 +533,16 @@ async fn error_access_expired_parent_key() {
     server.use_api_key(&web_token);

     // test search request while parent_key is not expired
-    let (response, code) = server.dummy_request("POST", "/indexes/products/search").await;
+    let (mut response, code) = server.dummy_request("POST", "/indexes/products/search").await;
+    response["message"] = serde_json::json!(null);
     assert_ne!(response, INVALID_RESPONSE.clone());
     assert_ne!(code, 403);

     // wait until the key is expired.
     thread::sleep(time::Duration::new(1, 0));

-    let (response, code) = server.dummy_request("POST", "/indexes/products/search").await;
+    let (mut response, code) = server.dummy_request("POST", "/indexes/products/search").await;
+    response["message"] = serde_json::json!(null);
     assert_eq!(response, INVALID_RESPONSE.clone());
     assert_eq!(code, 403);
 }
@@ -585,7 +591,8 @@ async fn error_access_modified_token() {
         .join(".");

     server.use_api_key(&altered_token);
-    let (response, code) = server.dummy_request("POST", "/indexes/products/search").await;
+    let (mut response, code) = server.dummy_request("POST", "/indexes/products/search").await;
+    response["message"] = serde_json::json!(null);
     assert_eq!(response, INVALID_RESPONSE.clone());
     assert_eq!(code, 403);
 }


@@ -109,9 +109,11 @@ static NESTED_DOCUMENTS: Lazy<Value> = Lazy::new(|| {

 fn invalid_response(query_index: Option<usize>) -> Value {
     let message = if let Some(query_index) = query_index {
-        format!("Inside `.queries[{query_index}]`: The provided API key is invalid.")
+        json!(format!("Inside `.queries[{query_index}]`: The provided API key is invalid."))
     } else {
-        "The provided API key is invalid.".to_string()
+        // if it's anything else we simply return null and will test all the
+        // error messages somewhere else
+        json!(null)
     };
     json!({"message": message,
         "code": "invalid_api_key",
@@ -414,7 +416,10 @@ macro_rules! compute_forbidden_single_search {
         for (tenant_token, failed_query_index) in $tenant_tokens.iter().zip(failed_query_indexes.into_iter()) {
             let web_token = generate_tenant_token(&uid, &key, tenant_token.clone());
             server.use_api_key(&web_token);
-            let (response, code) = server.multi_search(json!({"queries" : [{"indexUid": "sales"}]})).await;
+            let (mut response, code) = server.multi_search(json!({"queries" : [{"indexUid": "sales"}]})).await;
+            if failed_query_index.is_none() && !response["message"].is_null() {
+                response["message"] = serde_json::json!(null);
+            }
             assert_eq!(
                 response,
                 invalid_response(failed_query_index),
@@ -469,10 +474,13 @@ macro_rules! compute_forbidden_multiple_search {
         for (tenant_token, failed_query_index) in $tenant_tokens.iter().zip(failed_query_indexes.into_iter()) {
             let web_token = generate_tenant_token(&uid, &key, tenant_token.clone());
             server.use_api_key(&web_token);
-            let (response, code) = server.multi_search(json!({"queries" : [
+            let (mut response, code) = server.multi_search(json!({"queries" : [
                 {"indexUid": "sales"},
                 {"indexUid": "products"},
             ]})).await;
+            if failed_query_index.is_none() && !response["message"].is_null() {
+                response["message"] = serde_json::json!(null);
+            }
             assert_eq!(
                 response,
                 invalid_response(failed_query_index),
@@ -1073,18 +1081,20 @@ async fn error_access_expired_parent_key() {
     server.use_api_key(&web_token);

     // test search request while parent_key is not expired
-    let (response, code) = server
+    let (mut response, code) = server
         .multi_search(json!({"queries" : [{"indexUid": "sales"}, {"indexUid": "products"}]}))
         .await;
+    response["message"] = serde_json::json!(null);
     assert_ne!(response, invalid_response(None));
     assert_ne!(code, 403);

     // wait until the key is expired.
     thread::sleep(time::Duration::new(1, 0));

-    let (response, code) = server
+    let (mut response, code) = server
         .multi_search(json!({"queries" : [{"indexUid": "sales"}, {"indexUid": "products"}]}))
         .await;
+    response["message"] = serde_json::json!(null);
     assert_eq!(response, invalid_response(None));
     assert_eq!(code, 403);
 }
@@ -1134,8 +1144,9 @@ async fn error_access_modified_token() {
         .join(".");

     server.use_api_key(&altered_token);
-    let (response, code) =
-        server.multi_search(json!({"queries" : [{"indexUid": "products"}]})).await;
+    let (mut response, code) =
+        server.multi_search(json!({"queries" : [{"indexUid": "products"}]})).await;
+    response["message"] = serde_json::json!(null);
     assert_eq!(response, invalid_response(None));
     assert_eq!(code, 403);
 }


@@ -185,7 +185,7 @@ impl Index<'_> {
     pub async fn get_document(&self, id: u64, options: Option<Value>) -> (Value, StatusCode) {
         let mut url = format!("/indexes/{}/documents/{}", urlencode(self.uid.as_ref()), id);
         if let Some(options) = options {
-            write!(url, "?{}", yaup::to_string(&options).unwrap()).unwrap();
+            write!(url, "{}", yaup::to_string(&options).unwrap()).unwrap();
         }
         self.service.get(url).await
     }
@@ -202,7 +202,7 @@ impl Index<'_> {
     pub async fn get_all_documents(&self, options: GetAllDocumentsOptions) -> (Value, StatusCode) {
         let url = format!(
-            "/indexes/{}/documents?{}",
+            "/indexes/{}/documents{}",
             urlencode(self.uid.as_ref()),
             yaup::to_string(&options).unwrap()
         );
@@ -365,7 +365,7 @@ impl Index<'_> {
     }

     pub async fn search_get(&self, query: &str) -> (Value, StatusCode) {
-        let url = format!("/indexes/{}/search?{}", urlencode(self.uid.as_ref()), query);
+        let url = format!("/indexes/{}/search{}", urlencode(self.uid.as_ref()), query);
         self.service.get(url).await
     }
@@ -402,7 +402,7 @@ impl Index<'_> {
     }

     pub async fn similar_get(&self, query: &str) -> (Value, StatusCode) {
-        let url = format!("/indexes/{}/similar?{}", urlencode(self.uid.as_ref()), query);
+        let url = format!("/indexes/{}/similar{}", urlencode(self.uid.as_ref()), query);
         self.service.get(url).await
     }
@@ -427,8 +427,11 @@ impl Index<'_> {
 #[derive(Debug, Default, serde::Serialize)]
 #[serde(rename_all = "camelCase")]
 pub struct GetAllDocumentsOptions {
+    #[serde(skip_serializing_if = "Option::is_none")]
     pub limit: Option<usize>,
+    #[serde(skip_serializing_if = "Option::is_none")]
     pub offset: Option<usize>,
-    pub retrieve_vectors: bool,
+    #[serde(skip_serializing_if = "Option::is_none")]
     pub fields: Option<Vec<&'static str>>,
+    pub retrieve_vectors: bool,
 }
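Why the hard-coded `?` disappears from every URL in this file: the sketch below assumes `yaup` now returns either an empty string or a query string that already starts with `?`, so the result can be glued directly onto the path in both cases (the struct and derive attributes are copied from the diff; the expected output is an assumption about yaup's serializer):

```rust
#[derive(Debug, Default, serde::Serialize)]
#[serde(rename_all = "camelCase")]
struct GetAllDocumentsOptions {
    #[serde(skip_serializing_if = "Option::is_none")]
    limit: Option<usize>,
    #[serde(skip_serializing_if = "Option::is_none")]
    offset: Option<usize>,
    retrieve_vectors: bool,
}

fn main() {
    let opts = GetAllDocumentsOptions { limit: Some(10), ..Default::default() };
    // Expected to print something like "/indexes/test/documents?limit=10&retrieveVectors=false";
    // with all fields skipped, yaup would emit nothing and the bare path survives.
    println!("/indexes/test/documents{}", yaup::to_string(&opts).unwrap());
}
```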


@@ -42,6 +42,12 @@ impl std::ops::Deref for Value {
     }
 }

+impl std::ops::DerefMut for Value {
+    fn deref_mut(&mut self) -> &mut Self::Target {
+        &mut self.0
+    }
+}
+
 impl PartialEq<serde_json::Value> for Value {
     fn eq(&self, other: &serde_json::Value) -> bool {
         &self.0 == other
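This `DerefMut` impl is what lets the tests elsewhere in this changeset write `response["message"] = json!(null)`: indexing auto-derefs through the wrapper to `serde_json::Value`'s `IndexMut`. A minimal standalone restatement of the pattern (only `serde_json` assumed):

```rust
use serde_json::json;

// Newtype around serde_json::Value, like the test helper's `Value`.
struct Value(serde_json::Value);

impl std::ops::Deref for Value {
    type Target = serde_json::Value;
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl std::ops::DerefMut for Value {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

fn main() {
    let mut response = Value(json!({ "message": "The provided API key is invalid." }));
    // Indexing resolves through DerefMut to serde_json::Value's IndexMut.
    response["message"] = json!(null);
    assert_eq!(response["message"], json!(null));
}
```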


@@ -183,6 +183,58 @@ async fn add_single_document_gzip_encoded() {
     }
     "###);
 }

+#[actix_rt::test]
+async fn add_single_document_gzip_encoded_with_incomplete_error() {
+    let document = json!("kefir");
+
+    // this is what is expected and should work
+    let server = Server::new().await;
+    let app = server.init_web_app().await;
+    // post
+    let document = serde_json::to_string(&document).unwrap();
+    let req = test::TestRequest::post()
+        .uri("/indexes/dog/documents")
+        .set_payload(document.to_string())
+        .insert_header(("content-type", "application/json"))
+        .insert_header(("content-encoding", "gzip"))
+        .to_request();
+    let res = test::call_service(&app, req).await;
+    let status_code = res.status();
+    let body = test::read_body(res).await;
+    let response: Value = serde_json::from_slice(&body).unwrap_or_default();
+    snapshot!(status_code, @"400 Bad Request");
+    snapshot!(json_string!(response),
+        @r###"
+    {
+      "message": "The provided payload is incomplete and cannot be parsed",
+      "code": "bad_request",
+      "type": "invalid_request",
+      "link": "https://docs.meilisearch.com/errors#bad_request"
+    }
+    "###);
+
+    // put
+    let req = test::TestRequest::put()
+        .uri("/indexes/dog/documents")
+        .set_payload(document.to_string())
+        .insert_header(("content-type", "application/json"))
+        .insert_header(("content-encoding", "gzip"))
+        .to_request();
+    let res = test::call_service(&app, req).await;
+    let status_code = res.status();
+    let body = test::read_body(res).await;
+    let response: Value = serde_json::from_slice(&body).unwrap_or_default();
+    snapshot!(status_code, @"400 Bad Request");
+    snapshot!(json_string!(response),
+        @r###"
+    {
+      "message": "The provided payload is incomplete and cannot be parsed",
+      "code": "bad_request",
+      "type": "invalid_request",
+      "link": "https://docs.meilisearch.com/errors#bad_request"
+    }
+    "###);
+}
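The test above deliberately sends a plain JSON body labelled `content-encoding: gzip` to trigger the "incomplete payload" error. For contrast, here is a hedged sketch of what a well-formed gzip body would look like, assuming the `flate2` crate (the endpoint and header names come from the test itself):

```rust
use std::io::Write;

use flate2::write::GzEncoder;
use flate2::Compression;

// Compress a JSON payload the way a correct client would before sending it
// with `content-encoding: gzip`.
fn gzip_body(json: &str) -> Vec<u8> {
    let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
    encoder.write_all(json.as_bytes()).unwrap();
    encoder.finish().unwrap()
}

fn main() {
    let body = gzip_body(r#"{"id": 1, "name": "kefir"}"#);
    // Unlike the test's payload, this body really is gzip data.
    assert!(body.starts_with(&[0x1f, 0x8b])); // gzip magic bytes
}
```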
 /// Here we try a document request with every encoding
 #[actix_rt::test]
@@ -1040,6 +1092,52 @@ async fn document_addition_with_primary_key() {
     "###);
 }

+#[actix_rt::test]
+async fn document_addition_with_huge_int_primary_key() {
+    let server = Server::new().await;
+    let index = server.index("test");
+
+    let documents = json!([
+        {
+            "primary": 14630868576586246730u64,
+            "content": "foo",
+        }
+    ]);
+    let (response, code) = index.add_documents(documents, Some("primary")).await;
+    snapshot!(code, @"202 Accepted");
+
+    let response = index.wait_task(response.uid()).await;
+    snapshot!(response,
+        @r###"
+    {
+      "uid": 0,
+      "indexUid": "test",
+      "status": "succeeded",
+      "type": "documentAdditionOrUpdate",
+      "canceledBy": null,
+      "details": {
+        "receivedDocuments": 1,
+        "indexedDocuments": 1
+      },
+      "error": null,
+      "duration": "[duration]",
+      "enqueuedAt": "[date]",
+      "startedAt": "[date]",
+      "finishedAt": "[date]"
+    }
+    "###);
+
+    let (response, code) = index.get_document(14630868576586246730u64, None).await;
+    snapshot!(code, @"200 OK");
+    snapshot!(json_string!(response),
+        @r###"
+    {
+      "primary": 14630868576586246730,
+      "content": "foo"
+    }
+    "###);
+}

 #[actix_rt::test]
 async fn replace_document() {
     let server = Server::new().await;


@@ -719,7 +719,7 @@ async fn fetch_document_by_filter() {

     let (response, code) = index.get_document_by_filter(json!(null)).await;
     snapshot!(code, @"400 Bad Request");
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Invalid value type: expected an object, but found null",
       "code": "bad_request",
@@ -730,7 +730,7 @@ async fn fetch_document_by_filter() {

     let (response, code) = index.get_document_by_filter(json!({ "offset": "doggo" })).await;
     snapshot!(code, @"400 Bad Request");
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Invalid value type at `.offset`: expected a positive integer, but found a string: `\"doggo\"`",
       "code": "invalid_document_offset",
@@ -741,7 +741,7 @@ async fn fetch_document_by_filter() {

     let (response, code) = index.get_document_by_filter(json!({ "limit": "doggo" })).await;
     snapshot!(code, @"400 Bad Request");
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Invalid value type at `.limit`: expected a positive integer, but found a string: `\"doggo\"`",
       "code": "invalid_document_limit",
@@ -752,7 +752,7 @@ async fn fetch_document_by_filter() {

     let (response, code) = index.get_document_by_filter(json!({ "fields": "doggo" })).await;
     snapshot!(code, @"400 Bad Request");
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Invalid value type at `.fields`: expected an array, but found a string: `\"doggo\"`",
       "code": "invalid_document_fields",
@@ -763,7 +763,7 @@ async fn fetch_document_by_filter() {

     let (response, code) = index.get_document_by_filter(json!({ "filter": true })).await;
     snapshot!(code, @"400 Bad Request");
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Invalid syntax for the filter parameter: `expected String, Array, found: true`.",
       "code": "invalid_document_filter",
@@ -774,7 +774,7 @@ async fn fetch_document_by_filter() {

     let (response, code) = index.get_document_by_filter(json!({ "filter": "cool doggo" })).await;
     snapshot!(code, @"400 Bad Request");
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `cool doggo`.\n1:11 cool doggo",
       "code": "invalid_document_filter",
@@ -786,7 +786,7 @@ async fn fetch_document_by_filter() {

     let (response, code) =
         index.get_document_by_filter(json!({ "filter": "doggo = bernese" })).await;
     snapshot!(code, @"400 Bad Request");
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Attribute `doggo` is not filterable. Available filterable attributes are: `color`.\n1:6 doggo = bernese",
       "code": "invalid_document_filter",
@@ -803,7 +803,7 @@ async fn retrieve_vectors() {
     // GET ALL DOCUMENTS BY QUERY
     let (response, _code) = index.get_all_documents_raw("?retrieveVectors=tamo").await;
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Invalid value in parameter `retrieveVectors`: could not parse `tamo` as a boolean, expected either `true` or `false`",
       "code": "invalid_document_retrieve_vectors",
@@ -812,7 +812,7 @@ async fn retrieve_vectors() {
     }
     "###);

     let (response, _code) = index.get_all_documents_raw("?retrieveVectors=true").await;
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Passing `retrieveVectors` as a parameter requires enabling the `vector store` experimental feature. See https://github.com/meilisearch/product/discussions/677",
       "code": "feature_not_enabled",
@@ -824,7 +824,7 @@ async fn retrieve_vectors() {
     // FETCH ALL DOCUMENTS BY POST
     let (response, _code) =
         index.get_document_by_filter(json!({ "retrieveVectors": "tamo" })).await;
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Invalid value type at `.retrieveVectors`: expected a boolean, but found a string: `\"tamo\"`",
       "code": "invalid_document_retrieve_vectors",
@@ -833,7 +833,7 @@ async fn retrieve_vectors() {
     }
     "###);

     let (response, _code) = index.get_document_by_filter(json!({ "retrieveVectors": true })).await;
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Passing `retrieveVectors` as a parameter requires enabling the `vector store` experimental feature. See https://github.com/meilisearch/product/discussions/677",
       "code": "feature_not_enabled",
@@ -844,7 +844,7 @@ async fn retrieve_vectors() {
     // GET A SINGLE DOCUMENT
     let (response, _code) = index.get_document(0, Some(json!({"retrieveVectors": "tamo"}))).await;
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Invalid value in parameter `retrieveVectors`: could not parse `tamo` as a boolean, expected either `true` or `false`",
       "code": "invalid_document_retrieve_vectors",
@@ -853,7 +853,7 @@ async fn retrieve_vectors() {
     }
     "###);

     let (response, _code) = index.get_document(0, Some(json!({"retrieveVectors": true}))).await;
-    snapshot!(json_string!(response), @r###"
+    snapshot!(response, @r###"
     {
       "message": "Passing `retrieveVectors` as a parameter requires enabling the `vector store` experimental feature. See https://github.com/meilisearch/product/discussions/677",
       "code": "feature_not_enabled",


@@ -71,7 +71,7 @@ async fn search_bad_offset() {
     }
     "###);

-    let (response, code) = index.search_get("offset=doggo").await;
+    let (response, code) = index.search_get("?offset=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -99,7 +99,7 @@ async fn search_bad_limit() {
     }
     "###);

-    let (response, code) = index.search_get("limit=doggo").await;
+    let (response, code) = index.search_get("?limit=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -127,7 +127,7 @@ async fn search_bad_page() {
     }
     "###);

-    let (response, code) = index.search_get("page=doggo").await;
+    let (response, code) = index.search_get("?page=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -155,7 +155,7 @@ async fn search_bad_hits_per_page() {
     }
     "###);

-    let (response, code) = index.search_get("hitsPerPage=doggo").await;
+    let (response, code) = index.search_get("?hitsPerPage=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -212,7 +212,7 @@ async fn search_bad_retrieve_vectors() {
     }
     "###);

-    let (response, code) = index.search_get("retrieveVectors=").await;
+    let (response, code) = index.search_get("?retrieveVectors=").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -223,7 +223,7 @@ async fn search_bad_retrieve_vectors() {
     }
     "###);

-    let (response, code) = index.search_get("retrieveVectors=doggo").await;
+    let (response, code) = index.search_get("?retrieveVectors=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -269,7 +269,7 @@ async fn search_bad_crop_length() {
     }
     "###);

-    let (response, code) = index.search_get("cropLength=doggo").await;
+    let (response, code) = index.search_get("?cropLength=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -359,7 +359,7 @@ async fn search_bad_show_matches_position() {
     }
     "###);

-    let (response, code) = index.search_get("showMatchesPosition=doggo").await;
+    let (response, code) = index.search_get("?showMatchesPosition=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -442,7 +442,7 @@ async fn search_non_filterable_facets() {
     }
     "###);

-    let (response, code) = index.search_get("facets=doggo").await;
+    let (response, code) = index.search_get("?facets=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -472,7 +472,7 @@ async fn search_non_filterable_facets_multiple_filterable() {
     }
     "###);

-    let (response, code) = index.search_get("facets=doggo").await;
+    let (response, code) = index.search_get("?facets=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -502,7 +502,7 @@ async fn search_non_filterable_facets_no_filterable() {
     }
     "###);

-    let (response, code) = index.search_get("facets=doggo").await;
+    let (response, code) = index.search_get("?facets=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -532,7 +532,7 @@ async fn search_non_filterable_facets_multiple_facets() {
     }
     "###);

-    let (response, code) = index.search_get("facets=doggo,neko").await;
+    let (response, code) = index.search_get("?facets=doggo,neko").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -625,7 +625,7 @@ async fn search_bad_matching_strategy() {
     }
     "###);

-    let (response, code) = index.search_get("matchingStrategy=doggo").await;
+    let (response, code) = index.search_get("?matchingStrategy=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {


@@ -150,7 +150,8 @@ async fn bug_4640() {
           "_geo": {
             "lat": "45.4777599",
             "lng": "9.1967508"
-          }
+          },
+          "_geoDistance": 0
         },
         {
           "id": 1,


@@ -150,6 +150,35 @@ async fn simple_search() {
     snapshot!(response["semanticHitCount"], @"3");
 }

+#[actix_rt::test]
+async fn limit_offset() {
+    let server = Server::new().await;
+    let index = index_with_documents_user_provided(&server, &SIMPLE_SEARCH_DOCUMENTS_VEC).await;
+
+    let (response, code) = index
+        .search_post(
+            json!({"q": "Captain", "vector": [1.0, 1.0], "hybrid": {"semanticRatio": 0.2}, "retrieveVectors": true, "offset": 1, "limit": 1}),
+        )
+        .await;
+    snapshot!(code, @"200 OK");
+    snapshot!(response["hits"], @r###"[{"title":"Captain Marvel","desc":"a Shazam ersatz","id":"3","_vectors":{"default":{"embeddings":[[2.0,3.0]],"regenerate":false}}}]"###);
+    snapshot!(response["semanticHitCount"], @"0");
+    assert_eq!(response["hits"].as_array().unwrap().len(), 1);
+
+    let server = Server::new().await;
+    let index = index_with_documents_user_provided(&server, &SIMPLE_SEARCH_DOCUMENTS_VEC).await;
+
+    let (response, code) = index
+        .search_post(
+            json!({"q": "Captain", "vector": [1.0, 1.0], "hybrid": {"semanticRatio": 0.9}, "retrieveVectors": true, "offset": 1, "limit": 1}),
+        )
+        .await;
+    snapshot!(code, @"200 OK");
+    snapshot!(response["hits"], @r###"[{"title":"Captain Planet","desc":"He's not part of the Marvel Cinematic Universe","id":"2","_vectors":{"default":{"embeddings":[[1.0,2.0]],"regenerate":false}}}]"###);
+    snapshot!(response["semanticHitCount"], @"1");
+    assert_eq!(response["hits"].as_array().unwrap().len(), 1);
+}

 #[actix_rt::test]
 async fn simple_search_hf() {
     let server = Server::new().await;


@@ -241,7 +241,7 @@ async fn similar_bad_offset() {
     }
     "###);

-    let (response, code) = index.similar_get("id=287947&offset=doggo").await;
+    let (response, code) = index.similar_get("?id=287947&offset=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -283,7 +283,7 @@ async fn similar_bad_limit() {
     }
     "###);

-    let (response, code) = index.similar_get("id=287946&limit=doggo").await;
+    let (response, code) = index.similar_get("?id=287946&limit=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -785,7 +785,7 @@ async fn similar_bad_retrieve_vectors() {
     }
     "###);

-    let (response, code) = index.similar_get("retrieveVectors=").await;
+    let (response, code) = index.similar_get("?retrieveVectors=").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {
@@ -796,7 +796,7 @@ async fn similar_bad_retrieve_vectors() {
     }
     "###);

-    let (response, code) = index.similar_get("retrieveVectors=doggo").await;
+    let (response, code) = index.similar_get("?retrieveVectors=doggo").await;
     snapshot!(code, @"400 Bad Request");
     snapshot!(json_string!(response), @r###"
     {


@@ -2,6 +2,7 @@ mod errors;
 mod webhook;

 use meili_snap::insta::assert_json_snapshot;
+use meili_snap::snapshot;
 use time::format_description::well_known::Rfc3339;
 use time::OffsetDateTime;
@@ -738,11 +739,9 @@ async fn test_summarized_index_creation() {
 async fn test_summarized_index_deletion() {
     let server = Server::new().await;
     let index = server.index("test");
-    index.delete().await;
-    index.wait_task(0).await;
-    let (task, _) = index.get_task(0).await;
-    assert_json_snapshot!(task,
-        { ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
+    let (ret, _code) = index.delete().await;
+    let task = index.wait_task(ret.uid()).await;
+    snapshot!(task,
         @r###"
     {
       "uid": 0,
@@ -767,12 +766,34 @@ async fn test_summarized_index_deletion() {
     "###);

     // is the details correctly set when documents are actually deleted.
-    index.add_documents(json!({ "id": 42, "content": "doggos & fluff" }), Some("id")).await;
-    index.delete().await;
-    index.wait_task(2).await;
-    let (task, _) = index.get_task(2).await;
-    assert_json_snapshot!(task,
-        { ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
+    // /!\ We need to wait for the document addition to be processed; otherwise, if the test runs too slowly,
+    // both tasks may get autobatched and the deleted documents count will be wrong.
+    let (ret, _code) =
+        index.add_documents(json!({ "id": 42, "content": "doggos & fluff" }), Some("id")).await;
+    let task = index.wait_task(ret.uid()).await;
+    snapshot!(task,
+        @r###"
+    {
+      "uid": 1,
+      "indexUid": "test",
+      "status": "succeeded",
+      "type": "documentAdditionOrUpdate",
+      "canceledBy": null,
+      "details": {
+        "receivedDocuments": 1,
+        "indexedDocuments": 1
+      },
+      "error": null,
+      "duration": "[duration]",
+      "enqueuedAt": "[date]",
+      "startedAt": "[date]",
+      "finishedAt": "[date]"
+    }
+    "###);
+
+    let (ret, _code) = index.delete().await;
+    let task = index.wait_task(ret.uid()).await;
+    snapshot!(task,
         @r###"
     {
       "uid": 2,
@@ -792,22 +813,25 @@ async fn test_summarized_index_deletion() {
     "###);

     // What happens when you delete an index that doesn't exist.
-    index.delete().await;
-    index.wait_task(2).await;
-    let (task, _) = index.get_task(2).await;
-    assert_json_snapshot!(task,
-        { ".duration" => "[duration]", ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]" },
+    let (ret, _code) = index.delete().await;
+    let task = index.wait_task(ret.uid()).await;
+    snapshot!(task,
        @r###"
     {
-      "uid": 2,
+      "uid": 3,
       "indexUid": "test",
-      "status": "succeeded",
+      "status": "failed",
       "type": "indexDeletion",
       "canceledBy": null,
       "details": {
-        "deletedDocuments": 1
+        "deletedDocuments": 0
+      },
+      "error": {
+        "message": "Index `test` not found.",
+        "code": "index_not_found",
+        "type": "invalid_request",
+        "link": "https://docs.meilisearch.com/errors#index_not_found"
       },
-      "error": null,
       "duration": "[duration]",
       "enqueuedAt": "[date]",
       "startedAt": "[date]",

View File

@@ -27,8 +27,7 @@ fst = "0.4.7"
 fxhash = "0.2.1"
 geoutils = "0.5.1"
 grenad = { version = "0.4.6", default-features = false, features = [
-    "rayon",
-    "tempfile",
+    "rayon"
 ] }
 heed = { version = "0.20.1", default-features = false, features = [
     "serde-json",

View File

@@ -166,7 +166,7 @@ pub fn validate_document_id_value(document_id: Value) -> StdResult<String, UserE
             Some(s) => Ok(s.to_string()),
             None => Err(UserError::InvalidDocumentId { document_id: Value::String(string) }),
         },
-        Value::Number(number) if number.is_i64() => Ok(number.to_string()),
+        Value::Number(number) if !number.is_f64() => Ok(number.to_string()),
        content => Err(UserError::InvalidDocumentId { document_id: content }),
    }
}
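
The guard change above widens the accepted numeric document ids: `is_i64()` refused u64 values above `i64::MAX`, while `!is_f64()` accepts any integer and keeps rejecting floats. A minimal standalone sketch of the difference, using plain serde_json rather than milli code:

use serde_json::json;

fn main() {
    // u64::MAX is a legal integer id but does not fit in an i64
    let big = json!(u64::MAX);
    assert!(!big.is_i64()); // the old `is_i64()` guard would have refused it
    assert!(!big.is_f64()); // the new `!is_f64()` guard lets it through

    // floats are still rejected as document ids under either guard
    let float = json!(1.5);
    assert!(float.is_f64());
}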

View File

@@ -17,6 +17,7 @@ struct ScoreWithRatioResult {
 type ScoreWithRatio = (Vec<ScoreDetails>, f32);

+#[tracing::instrument(level = "trace", skip_all, target = "search::hybrid")]
 fn compare_scores(
     &(ref left_scores, left_ratio): &ScoreWithRatio,
     &(ref right_scores, right_ratio): &ScoreWithRatio,
@@ -84,6 +85,7 @@ impl ScoreWithRatioResult {
         }
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::hybrid")]
     fn merge(
         vector_results: Self,
         keyword_results: Self,
@@ -150,6 +152,7 @@ impl ScoreWithRatioResult {
 }

 impl<'a> Search<'a> {
+    #[tracing::instrument(level = "trace", skip_all, target = "search::hybrid")]
     pub fn execute_hybrid(&self, semantic_ratio: f32) -> Result<(SearchResult, Option<u32>)> {
         // TODO: find classier way to achieve that than to reset vector and query params
         // create separate keyword and semantic searches
@@ -178,22 +181,25 @@ impl<'a> Search<'a> {
         // completely skip semantic search if the results of the keyword search are good enough
         if self.results_good_enough(&keyword_results, semantic_ratio) {
-            return Ok((keyword_results, Some(0)));
+            return Ok(return_keyword_results(self.limit, self.offset, keyword_results));
         }

         // no vector search against placeholder search
         let Some(query) = search.query.take() else {
-            return Ok((keyword_results, Some(0)));
+            return Ok(return_keyword_results(self.limit, self.offset, keyword_results));
         };

         // no embedder, no semantic search
         let Some(SemanticSearch { vector, embedder_name, embedder }) = semantic else {
-            return Ok((keyword_results, Some(0)));
+            return Ok(return_keyword_results(self.limit, self.offset, keyword_results));
         };

         let vector_query = match vector {
             Some(vector_query) => vector_query,
             None => {
                 // attempt to embed the vector
+                let span = tracing::trace_span!(target: "search::hybrid", "embed_one");
+                let _entered = span.enter();
+
                 match embedder.embed_one(query) {
                     Ok(embedding) => embedding,
                     Err(error) => {
@@ -239,3 +245,44 @@ impl<'a> Search<'a> {
         true
     }
 }
+
+fn return_keyword_results(
+    limit: usize,
+    offset: usize,
+    SearchResult {
+        matching_words,
+        candidates,
+        mut documents_ids,
+        mut document_scores,
+        degraded,
+        used_negative_operator,
+    }: SearchResult,
+) -> (SearchResult, Option<u32>) {
+    let (documents_ids, document_scores) = if offset >= documents_ids.len() ||
+        // technically redundant because documents_ids.len() == document_scores.len(),
+        // defensive programming
+        offset >= document_scores.len()
+    {
+        (vec![], vec![])
+    } else {
+        // PANICS: offset < len
+        documents_ids.rotate_left(offset);
+        documents_ids.truncate(limit);
+
+        // PANICS: offset < len
+        document_scores.rotate_left(offset);
+        document_scores.truncate(limit);
+
+        (documents_ids, document_scores)
+    };
+
+    (
+        SearchResult {
+            matching_words,
+            candidates,
+            documents_ids,
+            document_scores,
+            degraded,
+            used_negative_operator,
+        },
+        Some(0),
+    )
+}
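
The offset/limit handling in `return_keyword_results` is worth spelling out: rotating the first `offset` elements to the back and then truncating to `limit` is equivalent to `v.into_iter().skip(offset).take(limit)`, but reuses the existing allocation. A standalone sketch of just that logic, with the same defensive empty-result branch:

fn paginate<T>(mut v: Vec<T>, offset: usize, limit: usize) -> Vec<T> {
    if offset >= v.len() {
        return Vec::new(); // mirrors the defensive branch above
    }
    v.rotate_left(offset); // rotate_left panics if offset > len, hence the check
    v.truncate(limit);
    v
}

fn main() {
    assert_eq!(paginate(vec![1, 2, 3, 4, 5], 1, 2), vec![2, 3]);
    assert_eq!(paginate(vec![1, 2, 3], 5, 2), Vec::<i32>::new());
}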

View File

@@ -371,4 +371,28 @@ mod test {
         assert_eq!(documents_ids, vec![1]);
     }
+
+    #[cfg(feature = "korean")]
+    #[test]
+    fn test_hangul_language_detection() {
+        use crate::index::tests::TempIndex;
+
+        let index = TempIndex::new();
+
+        index
+            .add_documents(documents!([
+                { "id": 0, "title": "The quick (\"brown\") fox can't jump 32.3 feet, right? Brr, it's 29.3°F!" },
+                { "id": 1, "title": "김밥먹을래。" },
+                { "id": 2, "title": "הַשּׁוּעָל הַמָּהִיר (״הַחוּם״) לֹא יָכוֹל לִקְפֹּץ 9.94 מֶטְרִים, נָכוֹן? ברר, 1.5°C- בַּחוּץ!" }
+            ]))
+            .unwrap();
+
+        let txn = index.write_txn().unwrap();
+        let mut search = Search::new(&txn, &index);
+        search.query("김밥");
+        let SearchResult { documents_ids, .. } = search.execute().unwrap();
+        assert_eq!(documents_ids, vec![1]);
+    }
 }

View File

@@ -213,9 +213,6 @@ pub fn bucket_sort<'ctx, Q: RankingRuleQueryTrait>(
             continue;
         }

-        let span = tracing::trace_span!(target: "search::bucket_sort", "next_bucket", id = ranking_rules[cur_ranking_rule_index].id());
-        let entered = span.enter();
-
         let Some(next_bucket) = ranking_rules[cur_ranking_rule_index].next_bucket(
             ctx,
             logger,
@@ -225,7 +222,6 @@ pub fn bucket_sort<'ctx, Q: RankingRuleQueryTrait>(
             back!();
             continue;
         };
-        drop(entered);

         ranking_rule_scores.push(next_bucket.score);

View File

@@ -27,6 +27,7 @@ impl<'ctx> RankingRule<'ctx, QueryGraph> for ExactAttribute {
         "exact_attribute".to_owned()
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::exact_attribute")]
     fn start_iteration(
         &mut self,
         ctx: &mut SearchContext<'ctx>,
@@ -38,6 +39,7 @@ impl<'ctx> RankingRule<'ctx, QueryGraph> for ExactAttribute {
         Ok(())
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::exact_attribute")]
     fn next_bucket(
         &mut self,
         _ctx: &mut SearchContext<'ctx>,
@@ -51,6 +53,7 @@ impl<'ctx> RankingRule<'ctx, QueryGraph> for ExactAttribute {
         Ok(output)
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::exact_attribute")]
     fn end_iteration(
         &mut self,
         _ctx: &mut SearchContext<'ctx>,
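
This file, and the ranking-rule files that follow, all receive the same treatment: the hand-rolled span in bucket_sort is dropped in favor of a `#[tracing::instrument]` attribute per trait method, each with its own `target`. A hedged, self-contained sketch (not meilisearch code) of what such an attribute does, assuming the `tracing` crate and `tracing-subscriber` with its `env-filter` feature:

use tracing::instrument;
use tracing_subscriber::EnvFilter;

#[instrument(level = "trace", skip_all, target = "search::example")]
fn next_bucket(universe: &[u32]) -> Option<u32> {
    // `skip_all` keeps the arguments out of the span fields; the span still
    // carries the function name and the target for a profiler to match on.
    universe.first().copied()
}

fn main() {
    // a directive of the same shape as the workloads' `search::=trace`
    tracing_subscriber::fmt()
        .with_env_filter(EnvFilter::new("search::example=trace"))
        .init();
    next_bucket(&[1, 2, 3]);
}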

View File

@@ -209,6 +209,7 @@ impl<'ctx, Q: RankingRuleQueryTrait> RankingRule<'ctx, Q> for GeoSort<Q> {
         "geo_sort".to_owned()
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::geo_sort")]
     fn start_iteration(
         &mut self,
         ctx: &mut SearchContext<'ctx>,
@@ -234,6 +235,7 @@ impl<'ctx, Q: RankingRuleQueryTrait> RankingRule<'ctx, Q> for GeoSort<Q> {
         Ok(())
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::geo_sort")]
     #[allow(clippy::only_used_in_recursion)]
     fn next_bucket(
         &mut self,
@@ -285,6 +287,7 @@ impl<'ctx, Q: RankingRuleQueryTrait> RankingRule<'ctx, Q> for GeoSort<Q> {
         self.next_bucket(ctx, logger, universe)
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::geo_sort")]
     fn end_iteration(&mut self, _ctx: &mut SearchContext<'ctx>, _logger: &mut dyn SearchLogger<Q>) {
         // we do not reset the rtree here, it could be used in a next iteration
         self.query = None;

View File

@@ -127,6 +127,8 @@ impl<'ctx, G: RankingRuleGraphTrait> RankingRule<'ctx, QueryGraph> for GraphBase
     fn id(&self) -> String {
         self.id.clone()
     }
+
+    #[tracing::instrument(level = "trace", skip_all, target = "search::graph_based")]
     fn start_iteration(
         &mut self,
         ctx: &mut SearchContext<'ctx>,
@@ -209,6 +211,7 @@ impl<'ctx, G: RankingRuleGraphTrait> RankingRule<'ctx, QueryGraph> for GraphBase
         Ok(())
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::graph_based")]
     fn next_bucket(
         &mut self,
         ctx: &mut SearchContext<'ctx>,
@@ -358,6 +361,7 @@ impl<'ctx, G: RankingRuleGraphTrait> RankingRule<'ctx, QueryGraph> for GraphBase
         Ok(Some(RankingRuleOutput { query: next_query_graph, candidates: bucket, score }))
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::graph_based")]
     fn end_iteration(
         &mut self,
         _ctx: &mut SearchContext<'ctx>,

View File

@@ -212,7 +212,7 @@ fn resolve_maximally_reduced_query_graph(
     Ok(docids)
 }

-#[tracing::instrument(level = "trace", skip_all, target = "search")]
+#[tracing::instrument(level = "trace", skip_all, target = "search::universe")]
 fn resolve_universe(
     ctx: &mut SearchContext,
     initial_universe: &RoaringBitmap,
@@ -229,7 +229,7 @@ fn resolve_universe(
     )
 }

-#[tracing::instrument(level = "trace", skip_all, target = "search")]
+#[tracing::instrument(level = "trace", skip_all, target = "search::query")]
 fn resolve_negative_words(
     ctx: &mut SearchContext,
     negative_words: &[Word],
@@ -243,7 +243,7 @@ fn resolve_negative_words(
     Ok(negative_bitmap)
 }

-#[tracing::instrument(level = "trace", skip_all, target = "search")]
+#[tracing::instrument(level = "trace", skip_all, target = "search::query")]
 fn resolve_negative_phrases(
     ctx: &mut SearchContext,
     negative_phrases: &[LocatedQueryTerm],
@@ -548,7 +548,7 @@ fn resolve_sort_criteria<'ctx, Query: RankingRuleQueryTrait>(
     Ok(())
 }

-#[tracing::instrument(level = "trace", skip_all, target = "search")]
+#[tracing::instrument(level = "trace", skip_all, target = "search::universe")]
 pub fn filtered_universe(
     index: &Index,
     txn: &RoTxn<'_>,
@@ -620,7 +620,7 @@ pub fn execute_vector_search(
 }

 #[allow(clippy::too_many_arguments)]
-#[tracing::instrument(level = "trace", skip_all, target = "search")]
+#[tracing::instrument(level = "trace", skip_all, target = "search::main")]
 pub fn execute_search(
     ctx: &mut SearchContext,
     query: Option<&str>,
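
These renames split the single blanket `search` target into finer ones, so a profiling run can select one subsystem instead of capturing everything. A hedged sketch of the corresponding filter directives (the `::=` form prefix-matches, as the workloads below rely on):

use tracing_subscriber::EnvFilter;

fn main() {
    // "search::=trace" matches every `search::*` target introduced in this PR;
    // the narrower directives each pick out a single phase.
    for directive in ["search::=trace", "search::universe=trace", "search::main=trace"] {
        // EnvFilter::new ignores invalid directives rather than failing
        let filter = EnvFilter::new(directive);
        println!("{directive} -> {filter}");
    }
}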

View File

@@ -44,6 +44,7 @@ fn compute_docids(
 impl RankingRuleGraphTrait for ExactnessGraph {
     type Condition = ExactnessCondition;

+    #[tracing::instrument(level = "trace", skip_all, target = "search::exactness")]
     fn resolve_condition(
         ctx: &mut SearchContext,
         condition: &Self::Condition,
@@ -71,6 +72,7 @@ impl RankingRuleGraphTrait for ExactnessGraph {
         })
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::exactness")]
     fn build_edges(
         _ctx: &mut SearchContext,
         conditions_interner: &mut DedupInterner<Self::Condition>,
@@ -86,6 +88,7 @@ impl RankingRuleGraphTrait for ExactnessGraph {
         Ok(vec![(0, exact_condition), (dest_node.term_ids.len() as u32, skip_condition)])
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::exactness")]
     fn rank_to_score(rank: Rank) -> ScoreDetails {
         ScoreDetails::ExactWords(score_details::ExactWords::from_rank(rank))
     }

View File

@@ -20,6 +20,7 @@ pub enum FidGraph {}
 impl RankingRuleGraphTrait for FidGraph {
     type Condition = FidCondition;

+    #[tracing::instrument(level = "trace", skip_all, target = "search::fid")]
     fn resolve_condition(
         ctx: &mut SearchContext,
         condition: &Self::Condition,
@@ -44,6 +45,7 @@ impl RankingRuleGraphTrait for FidGraph {
         })
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::fid")]
     fn build_edges(
         ctx: &mut SearchContext,
         conditions_interner: &mut DedupInterner<Self::Condition>,
@@ -101,6 +103,7 @@ impl RankingRuleGraphTrait for FidGraph {
         Ok(edges)
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::fid")]
     fn rank_to_score(rank: Rank) -> ScoreDetails {
         ScoreDetails::Fid(rank)
     }

View File

@@ -20,6 +20,7 @@ pub enum PositionGraph {}
 impl RankingRuleGraphTrait for PositionGraph {
     type Condition = PositionCondition;

+    #[tracing::instrument(level = "trace", skip_all, target = "search::position")]
     fn resolve_condition(
         ctx: &mut SearchContext,
         condition: &Self::Condition,
@@ -44,6 +45,7 @@ impl RankingRuleGraphTrait for PositionGraph {
         })
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::position")]
     fn build_edges(
         ctx: &mut SearchContext,
         conditions_interner: &mut DedupInterner<Self::Condition>,
@@ -117,6 +119,7 @@ impl RankingRuleGraphTrait for PositionGraph {
         Ok(edges)
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::position")]
     fn rank_to_score(rank: Rank) -> ScoreDetails {
         ScoreDetails::Position(rank)
     }

View File

@@ -21,6 +21,7 @@ pub enum ProximityGraph {}
 impl RankingRuleGraphTrait for ProximityGraph {
     type Condition = ProximityCondition;

+    #[tracing::instrument(level = "trace", skip_all, target = "search::proximity")]
     fn resolve_condition(
         ctx: &mut SearchContext,
         condition: &Self::Condition,
@@ -29,6 +30,7 @@ impl RankingRuleGraphTrait for ProximityGraph {
         compute_docids::compute_docids(ctx, condition, universe)
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::proximity")]
     fn build_edges(
         ctx: &mut SearchContext,
         conditions_interner: &mut DedupInterner<Self::Condition>,
@@ -38,6 +40,7 @@ impl RankingRuleGraphTrait for ProximityGraph {
         build::build_edges(ctx, conditions_interner, source_term, dest_term)
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::proximity")]
     fn rank_to_score(rank: Rank) -> ScoreDetails {
         ScoreDetails::Proximity(rank)
     }

View File

@@ -19,6 +19,7 @@ pub enum TypoGraph {}
 impl RankingRuleGraphTrait for TypoGraph {
     type Condition = TypoCondition;

+    #[tracing::instrument(level = "trace", skip_all, target = "search::typo")]
     fn resolve_condition(
         ctx: &mut SearchContext,
         condition: &Self::Condition,
@@ -37,6 +38,7 @@ impl RankingRuleGraphTrait for TypoGraph {
         })
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::typo")]
     fn build_edges(
         ctx: &mut SearchContext,
         conditions_interner: &mut DedupInterner<Self::Condition>,
@@ -77,6 +79,7 @@ impl RankingRuleGraphTrait for TypoGraph {
         Ok(edges)
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::typo")]
     fn rank_to_score(rank: Rank) -> ScoreDetails {
         ScoreDetails::Typo(score_details::Typo::from_rank(rank))
     }

View File

@@ -18,6 +18,7 @@ pub enum WordsGraph {}
 impl RankingRuleGraphTrait for WordsGraph {
     type Condition = WordsCondition;

+    #[tracing::instrument(level = "trace", skip_all, target = "search::words")]
     fn resolve_condition(
         ctx: &mut SearchContext,
         condition: &Self::Condition,
@@ -36,6 +37,7 @@ impl RankingRuleGraphTrait for WordsGraph {
         })
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::words")]
     fn build_edges(
         _ctx: &mut SearchContext,
         conditions_interner: &mut DedupInterner<Self::Condition>,
@@ -45,6 +47,7 @@ impl RankingRuleGraphTrait for WordsGraph {
         Ok(vec![(0, conditions_interner.insert(WordsCondition { term: to_term.clone() }))])
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::words")]
     fn rank_to_score(rank: Rank) -> ScoreDetails {
         ScoreDetails::Words(score_details::Words::from_rank(rank))
     }

View File

@@ -88,6 +88,8 @@ impl<'ctx, Query: RankingRuleQueryTrait> RankingRule<'ctx, Query> for Sort<'ctx,
         let Self { field_name, is_ascending, .. } = self;
         format!("{field_name}:{}", if *is_ascending { "asc" } else { "desc" })
     }
+
+    #[tracing::instrument(level = "trace", skip_all, target = "search::sort")]
     fn start_iteration(
         &mut self,
         ctx: &mut SearchContext<'ctx>,
@@ -186,6 +188,7 @@ impl<'ctx, Query: RankingRuleQueryTrait> RankingRule<'ctx, Query> for Sort<'ctx,
         Ok(())
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::sort")]
     fn next_bucket(
         &mut self,
         _ctx: &mut SearchContext<'ctx>,
@@ -211,6 +214,7 @@ impl<'ctx, Query: RankingRuleQueryTrait> RankingRule<'ctx, Query> for Sort<'ctx,
         }
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::sort")]
     fn end_iteration(
         &mut self,
         _ctx: &mut SearchContext<'ctx>,

View File

@@ -73,6 +73,7 @@ impl<'ctx, Q: RankingRuleQueryTrait> RankingRule<'ctx, Q> for VectorSort<Q> {
         "vector_sort".to_owned()
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::vector_sort")]
     fn start_iteration(
         &mut self,
         ctx: &mut SearchContext<'ctx>,
@@ -89,6 +90,7 @@ impl<'ctx, Q: RankingRuleQueryTrait> RankingRule<'ctx, Q> for VectorSort<Q> {
     }

     #[allow(clippy::only_used_in_recursion)]
+    #[tracing::instrument(level = "trace", skip_all, target = "search::vector_sort")]
     fn next_bucket(
         &mut self,
         ctx: &mut SearchContext<'ctx>,
@@ -139,6 +141,7 @@ impl<'ctx, Q: RankingRuleQueryTrait> RankingRule<'ctx, Q> for VectorSort<Q> {
         self.next_bucket(ctx, _logger, universe)
     }

+    #[tracing::instrument(level = "trace", skip_all, target = "search::vector_sort")]
     fn end_iteration(&mut self, _ctx: &mut SearchContext<'ctx>, _logger: &mut dyn SearchLogger<Q>) {
         self.query = None;
     }

View File

@@ -325,7 +325,7 @@ where
         let documents_chunk_size = match self.indexer_config.documents_chunk_size {
             Some(chunk_size) => chunk_size,
             None => {
-                let default_chunk_size = 1024 * 1024 * 4; // 4MiB
+                let default_chunk_size = 1024 * 1024 * 1024 * 2; // 2 GiB
                 let min_chunk_size = 1024 * 512; // 512KiB

                 // compute the chunk size from the number of available threads and the inputed data size.
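
The default documents chunk size jumps from 4 MiB to 2 GiB here, a 512× increase. A quick arithmetic check of both constants (a standalone sketch, not project code):

fn main() {
    let default_chunk_size: u64 = 1024 * 1024 * 1024 * 2; // 2 GiB
    let min_chunk_size: u64 = 1024 * 512; // 512 KiB

    assert_eq!(default_chunk_size, 2_147_483_648);
    assert_eq!(min_chunk_size, 524_288);
    assert_eq!(default_chunk_size / (1024 * 1024 * 4), 512); // 512x the old 4 MiB default
}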

rust-toolchain.toml (new file, 3 lines)
View File

@@ -0,0 +1,3 @@
[toolchain]
channel = "1.75.0"
components = ["clippy"]

View File

@@ -0,0 +1,171 @@
{
"name": "search-movies-subset-hf-embeddings",
"run_count": 2,
"target": "search::=trace",
"extra_cli_args": [
"--max-indexing-threads=4"
],
"assets": {
"movies-100.json": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/movies-100.json",
"sha256": "d215e395e4240f12f03b8f1f68901eac82d9e7ded5b462cbf4a6b8efde76c6c6"
}
},
"precommands": [
{
"route": "experimental-features",
"method": "PATCH",
"body": {
"inline": {
"vectorStore": true
}
},
"synchronous": "DontWait"
},
{
"route": "indexes/movies/settings",
"method": "PATCH",
"body": {
"inline": {
"searchableAttributes": [
"title",
"overview"
],
"filterableAttributes": [
"genres",
"release_date"
],
"sortableAttributes": [
"release_date"
],
"searchCutoffMs": 15000
}
},
"synchronous": "WaitForTask"
},
{
"route": "indexes/movies/settings",
"method": "PATCH",
"body": {
"inline": {
"embedders": {
"default": {
"source": "huggingFace",
"documentTemplate": "A movie titled '{{doc.title}}' whose description starts with {{doc.overview|truncatewords: 20}}"
}
}
}
},
"synchronous": "WaitForTask"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "movies-100.json"
},
"synchronous": "WaitForTask"
}
],
"commands": [
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "puppy cute comforting movie",
"limit": 100,
"hybrid": {
"semanticRatio": 0.1
}
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "puppy cute comforting movie",
"limit": 100,
"hybrid": {
"semanticRatio": 0.5
}
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "puppy cute comforting movie",
"limit": 100,
"hybrid": {
"semanticRatio": 0.9
}
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "puppy cute comforting movie",
"limit": 100,
"hybrid": {
"semanticRatio": 1.0
}
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "shrek",
"limit": 100,
"hybrid": {
"semanticRatio": 1.0
}
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "shrek",
"limit": 100,
"hybrid": {
"semanticRatio": 0.5
}
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "shrek",
"limit": 100,
"hybrid": {
"semanticRatio": 0.1
}
}
},
"synchronous": "WaitForResponse"
}
]
}
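
The commands above replay the same query at semanticRatio 0.1, 0.5, 0.9, and 1.0, sweeping from keyword-dominated to pure vector search. A hedged sketch of how such a ratio weights the two score sources (the real fusion lives in ScoreWithRatioResult::merge above and compares ranked lists, not a single scalar; this is an illustrative simplification):

fn global_score(keyword_score: f32, semantic_score: f32, semantic_ratio: f32) -> f32 {
    keyword_score * (1.0 - semantic_ratio) + semantic_score * semantic_ratio
}

fn main() {
    // a strong keyword match dominates at low ratios and fades at high ones
    assert!(global_score(0.8, 0.2, 0.1) > global_score(0.8, 0.2, 0.9));
    // ratio 1.0 discards the keyword side entirely
    assert_eq!(global_score(0.8, 0.2, 1.0), 0.2);
}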

View File

@@ -0,0 +1,94 @@
{
"name": "search-sortable-movies.json",
"run_count": 10,
"target": "search::=trace",
"extra_cli_args": [],
"assets": {
"movies.json": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/movies.json",
"sha256": "5b6e4cb660bc20327776e8a33ea197b43d9ec84856710ead1cc87ab24df77de1"
}
},
"precommands": [
{
"route": "indexes/movies/settings",
"method": "PATCH",
"body": {
"inline": {
"searchableAttributes": [
"title",
"overview"
],
"filterableAttributes": [
"genres",
"release_date"
],
"sortableAttributes": [
"release_date"
],
"searchCutoffMs": 15000
}
},
"synchronous": "DontWait"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "movies.json"
},
"synchronous": "WaitForTask"
}
],
"commands": [
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "",
"limit": 100,
"filter": "genres IN [action, comedy, adventure] AND release_date = 233366400"
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "Batman returns",
"limit": 100,
"filter": "genres IN [action, comedy, adventure] AND release_date > 233366400"
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "the",
"limit": 100,
"filter": "genres IN [animation, comedy, adventure] AND release_date < 233366400"
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "t",
"limit": 100,
"filter": "genres = Family AND release_date <= 233366400 OR release_date >= 1054252800"
}
},
"synchronous": "WaitForResponse"
}
]
}

View File

@@ -0,0 +1,340 @@
{
"name": "search-geosort.jsonl_1M",
"run_count": 3,
"target": "search::=trace",
"extra_cli_args": [],
"assets": {
"smol-all-countries-100k.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-100k.jsonl",
"sha256": "d00924689abc02d09ec4667cc5a18364ff7bc236bad51367f34b9184b945ece3"
},
"smol-all-countries-200k.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-200k.jsonl",
"sha256": "2a215b43b35d596d9da4f1071deab9002a93602e6dbf1308fba53eb89d9c5a9e"
},
"smol-all-countries-300k.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-300k.jsonl",
"sha256": "91d94d78eeb10d631557a5ccf775e74a41d14ccaff4d7121dd90c7aa35534f2b"
},
"smol-all-countries-400k.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-400k.jsonl",
"sha256": "ee883a353b571f35f4abb79b95cfa628f3f1c582919dd658a388b220f97fe035"
},
"smol-all-countries-500k.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-500k.jsonl",
"sha256": "5be254ce4c50db12b7f1795859b8bbdcbc2ec22bccb3a1898899bd4c4765a1bf"
},
"smol-all-countries-600k.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-600k.jsonl",
"sha256": "3aa91afe3361f5185c142125dfcdc8ddcb7d39fdeeeb4f5e67439511905e9826"
},
"smol-all-countries-700k.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-700k.jsonl",
"sha256": "5a864a1e9d89736147a8da594e2cbce5264979326d38655d0945d8447f3867b3"
},
"smol-all-countries-800k.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-800k.jsonl",
"sha256": "d85eb9c85a612fd7b77623e162ecd0f8265ba3be97054e26b9cff7c48735809b"
},
"smol-all-countries-900k.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-900k.jsonl",
"sha256": "4fd6662e8b9bfcd9fad7d5dcd691a47ec985d810d1e340465c056ee84e9c40f3"
},
"smol-all-countries-1M.jsonl": {
"local_location": null,
"format": "NdJson",
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/smol-all-countries/smol-all-countries-1M.jsonl",
"sha256": "585a713b489b154b94e7c07707bd369f888c7fe24eb90bf604578d7adf51a9e6"
}
},
"precommands": [
{
"route": "indexes/movies/settings",
"method": "PATCH",
"body": {
"inline": {
"displayedAttributes": [
"geonameid",
"name",
"asciiname",
"alternatenames",
"_geo",
"population"
],
"searchableAttributes": [
"name",
"alternatenames",
"elevation"
],
"filterableAttributes": [
"_geo",
"population",
"elevation"
],
"sortableAttributes": [
"_geo",
"population",
"elevation"
],
"searchCutoffMs": 15000
}
},
"synchronous": "DontWait"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-100k.jsonl"
},
"synchronous": "WaitForTask"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-200k.jsonl"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-300k.jsonl"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-400k.jsonl"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-500k.jsonl"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-600k.jsonl"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-700k.jsonl"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-800k.jsonl"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-900k.jsonl"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "smol-all-countries-1M.jsonl"
},
"synchronous": "WaitForTask"
}
],
"commands": [
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "",
"limit": 100
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"sort": [
"_geoPoint(50.62999333378238, 3.086269263384099):asc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"sort": [
"_geoPoint(50.62999333378238, 3.086269263384099):desc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"sort": [
"_geoPoint(35.749512532692144, 139.61664952543356):asc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"sort": [
"_geoPoint(35.749512532692144, 139.61664952543356):desc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"sort": [
"_geoPoint(-48.87561645055408, -123.39275749319793):asc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"sort": [
"_geoPoint(-48.87561645055408, -123.39275749319793):desc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"filter": "_geoRadius(50.62999333378238, 3.086269263384099, 100000)"
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"filter": "_geoRadius(50.62999333378238, 3.086269263384099, 1000)"
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"filter": "_geoRadius(35.749512532692144, 139.61664952543356, 100000)"
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"filter": "_geoRadius(35.749512532692144, 139.61664952543356, 1000)"
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"filter": "_geoRadius(-48.87561645055408, -123.39275749319793, 100000)"
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"filter": "_geoRadius(-48.87561645055408, -123.39275749319793, 1000)"
}
},
"synchronous": "WaitForResponse"
}
]
}

View File

@@ -0,0 +1,255 @@
{
"name": "search-hackernews.ndjson_1M",
"run_count": 3,
"target": "search::=trace",
"extra_cli_args": [],
"assets": {
"hackernews-100_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-100_000.ndjson",
"sha256": "60ecd23485d560edbd90d9ca31f0e6dba1455422f2a44e402600fbb5f7f1b213"
},
"hackernews-200_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-200_000.ndjson",
"sha256": "785b0271fdb47cba574fab617d5d332276b835c05dd86e4a95251cf7892a1685"
},
"hackernews-300_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-300_000.ndjson",
"sha256": "de73c7154652eddfaf69cdc3b2f824d5c452f095f40a20a1c97bb1b5c4d80ab2"
},
"hackernews-400_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-400_000.ndjson",
"sha256": "c1b00a24689110f366447e434c201c086d6f456d54ed1c4995894102794d8fe7"
},
"hackernews-500_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-500_000.ndjson",
"sha256": "ae98f9dbef8193d750e3e2dbb6a91648941a1edca5f6e82c143e7996f4840083"
},
"hackernews-600_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-600_000.ndjson",
"sha256": "b495fdc72c4a944801f786400f22076ab99186bee9699f67cbab2f21f5b74dbe"
},
"hackernews-700_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-700_000.ndjson",
"sha256": "4b2c63974f3dabaa4954e3d4598b48324d03c522321ac05b0d583f36cb78a28b"
},
"hackernews-800_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-800_000.ndjson",
"sha256": "cb7b6afe0e6caa1be111be256821bc63b0771b2a0e1fad95af7aaeeffd7ba546"
},
"hackernews-900_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-900_000.ndjson",
"sha256": "e1154ddcd398f1c867758a93db5bcb21a07b9e55530c188a2917fdef332d3ba9"
},
"hackernews-1_000_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-1_000_000.ndjson",
"sha256": "27e25efd0b68b159b8b21350d9af76938710cb29ce0393fa71b41c4f3c630ffe"
}
},
"precommands": [
{
"route": "indexes/movies/settings",
"method": "PATCH",
"body": {
"inline": {
"displayedAttributes": [
"title",
"by",
"score",
"time"
],
"searchableAttributes": [
"title"
],
"filterableAttributes": [
"by"
],
"sortableAttributes": [
"score",
"time"
],
"rankingRules": [
"sort",
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"searchCutoffMs": 15000
}
},
"synchronous": "WaitForTask"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-100_000.ndjson"
},
"synchronous": "WaitForTask"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-200_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-300_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-400_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-500_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-600_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-700_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-800_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-900_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-1_000_000.ndjson"
},
"synchronous": "WaitForTask"
}
],
"commands": [
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "rust meilisearch",
"limit": 100,
"filter": "by = tpayet",
"sort": [
"score:desc",
"time:asc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "rust meilisearch",
"limit": 100,
"filter": "NOT by = tpayet",
"sort": [
"score:desc",
"time:asc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "meilisearch",
"limit": 100,
"sort": [
"score:desc",
"time:desc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "rust",
"limit": 100,
"filter": "by = dang",
"sort": [
"score:desc",
"time:asc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "combinator YC",
"limit": 100,
"filter": "by = dang",
"sort": [
"score:desc",
"time:asc"
]
}
},
"synchronous": "WaitForResponse"
}
]
}

View File

@@ -0,0 +1,90 @@
{
"name": "search-movies.json",
"run_count": 10,
"target": "search::=trace",
"extra_cli_args": [],
"assets": {
"movies.json": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/movies.json",
"sha256": "5b6e4cb660bc20327776e8a33ea197b43d9ec84856710ead1cc87ab24df77de1"
}
},
"precommands": [
{
"route": "indexes/movies/settings",
"method": "PATCH",
"body": {
"inline": {
"searchableAttributes": [
"title",
"overview"
],
"filterableAttributes": [
"genres",
"release_date"
],
"sortableAttributes": [
"release_date"
],
"searchCutoffMs": 15000
}
},
"synchronous": "DontWait"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "movies.json"
},
"synchronous": "WaitForTask"
}
],
"commands": [
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "",
"limit": 100
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "Batman returns",
"limit": 100
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"q": "the"
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"limit": 100,
"q": "t"
}
},
"synchronous": "WaitForResponse"
}
]
}

View File

@@ -0,0 +1,110 @@
{
"name": "search-sortable-movies.json",
"run_count": 10,
"target": "search::=trace",
"extra_cli_args": [],
"assets": {
"movies.json": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/movies.json",
"sha256": "5b6e4cb660bc20327776e8a33ea197b43d9ec84856710ead1cc87ab24df77de1"
}
},
"precommands": [
{
"route": "indexes/movies/settings",
"method": "PATCH",
"body": {
"inline": {
"searchableAttributes": [
"title",
"overview"
],
"filterableAttributes": [
"genres",
"release_date"
],
"sortableAttributes": [
"release_date"
],
"rankingRules": [
"sort",
"words",
"typo",
"proximity",
"attribute",
"exactness"
],
"searchCutoffMs": 15000
}
},
"synchronous": "DontWait"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "movies.json"
},
"synchronous": "WaitForTask"
}
],
"commands": [
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "",
"limit": 100,
"sort": [
"release_date:asc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "Batman returns",
"limit": 100,
"sort": [
"release_date:desc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "the",
"limit": 100,
"sort": [
"release_date:asc"
]
}
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/search",
"method": "POST",
"body": {
"inline": {
"q": "t",
"limit": 100,
"sort": [
"release_date:asc"
]
}
},
"synchronous": "WaitForResponse"
}
]
}

View File

@@ -23,6 +23,8 @@ pub struct Workload {
     pub extra_cli_args: Vec<String>,
     pub assets: BTreeMap<String, Asset>,
     #[serde(default)]
+    pub target: String,
+    #[serde(default)]
     pub precommands: Vec<super::command::Command>,
     pub commands: Vec<super::command::Command>,
 }
@@ -54,7 +56,7 @@ async fn run_commands(
     let trace_filename = format!("{report_folder}/{workload_name}-{run_number}-trace.json");
     let report_filename = format!("{report_folder}/{workload_name}-{run_number}-report.json");

-    let report_handle = start_report(logs_client, trace_filename).await?;
+    let report_handle = start_report(logs_client, trace_filename, &workload.target).await?;

     for batch in workload
         .commands
@@ -160,7 +162,11 @@ async fn execute_run(
 async fn start_report(
     logs_client: &Client,
     filename: String,
+    target: &str,
 ) -> anyhow::Result<tokio::task::JoinHandle<anyhow::Result<std::fs::File>>> {
+    const DEFAULT_TARGET: &str = "indexing::=trace";
+    let target = if target.is_empty() { DEFAULT_TARGET } else { target };
+
     let report_file = std::fs::File::options()
         .create(true)
         .truncate(true)
@@ -174,7 +180,7 @@ async fn start_report(
         .post("")
         .json(&json!({
             "mode": "profile",
-            "target": "indexing::=trace"
+            "target": target,
         }))
         .send()
         .await
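
Taken together, a workload may omit `target` entirely: `#[serde(default)]` deserializes the missing field to an empty string, and `start_report` then falls back to the historical indexing filter. A condensed, standalone sketch of that defaulting path, mirroring the lines above:

fn effective_target(target: &str) -> &str {
    const DEFAULT_TARGET: &str = "indexing::=trace";
    if target.is_empty() { DEFAULT_TARGET } else { target }
}

fn main() {
    assert_eq!(effective_target(""), "indexing::=trace"); // legacy indexing workloads
    assert_eq!(effective_target("search::=trace"), "search::=trace"); // new search workloads
}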