Compare commits

...

109 Commits

Author SHA1 Message Date
98f0618065 Bump docker/setup-qemu-action from 2 to 3
Bumps [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) from 2 to 3.
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](https://github.com/docker/setup-qemu-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: docker/setup-qemu-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-01 17:18:20 +00:00
86b314626d Merge #4080
4080: Bring back changes from v1.4.0 into main r=Kerollmops a=curquiza



Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: curquiza <clementine@meilisearch.com>
Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
Co-authored-by: dogukanakkaya <doguakkaya27@hotmail.com>
2023-09-26 08:13:49 +00:00
bb79bdb3f8 Merge #4074
4074: Enable analytics in debug builds r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4072

## What does this PR do?
- Stop disabling the analytics if meilisearch has been compiled in debug mode

Co-authored-by: Tamo <tamo@meilisearch.com>
2023-09-21 15:54:41 +00:00
d429e7da99 make clippy happy 2023-09-21 17:41:12 +02:00
584b772248 enable metrics in debug builds 2023-09-21 17:01:05 +02:00
1806c04a9a Merge #4065
4065: Dependency issue every 6 months r=curquiza a=curquiza

To avoid spending too much time on it (1 every two sprints)

If you disagree `@Kerollmops,` for security or any reason, please close the PR

Co-authored-by: Clémentine U. - curqui <clementine@meilisearch.com>
2023-09-19 15:58:06 +00:00
3485e8f1c4 Update .github/workflows/dependency-issue.yml 2023-09-18 09:59:22 +02:00
fe697a6685 Dependency issue every 6 months 2023-09-18 09:57:58 +02:00
eb4135f8ae Merge #4044
4044: Add more integrations to SDK CI r=curquiza a=curquiza

For the integration scope management, but also to anticipate bugs and breaking changes for engine team, we need to add more SDKs tests into the CI

Co-authored-by: curquiza <clementine@meilisearch.com>
2023-09-13 14:05:41 +00:00
ec4844c3a6 Add dart, swift, dotnet, and java test
Display docker image

Add strapi and firebase

Add rails and symfony tests

Remove strapi and firestore tests

Fix dotnet SDK CI

Use specific dart SDK version

Disable coverage for ruby SDK

Prevent pushing coverage information to codecov

Remove codecoverage token

Trigger Build

Trigger Build

Trigger Build

Trigger Build

Trigger Build
2023-09-12 17:31:17 +02:00
77c3787b78 Merge #4056
4056: Rewrite segment_analytics module with the destructuring syntax r=Kerollmops a=vivek-26

# Pull Request

## Related issue
Fixes #3928

## What does this PR do?
- This PR uses Rust Destructuring syntax in the `segment_analytics` module, such that adding or deleting fields causes an error at compile time.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
2023-09-12 08:07:50 +00:00
4f902490b9 struct destructuring for DocumentsFetchAggregator 2023-09-12 10:39:28 +05:30
1faee92748 struct destructuring for HealthAggregator 2023-09-12 10:39:28 +05:30
5831466525 struct destructuring for DocumentsDeletionAggregator and TasksAggregator 2023-09-12 10:39:28 +05:30
3cdb3e4eaf struct destructuring for DocumentsAggregator 2023-09-12 10:39:27 +05:30
26f34ec7a2 struct destructuring for FacetSearchAggregator 2023-09-12 10:39:27 +05:30
07d36180ad struct destructuring for MultiSearchAggregator 2023-09-12 10:39:27 +05:30
4c641b79a2 use rust struct destructuring for SearchAggregator 2023-09-12 10:39:27 +05:30
76c05d1b20 Merge #4053
4053: Fix the stats of the documents deletion by filter r=Kerollmops a=irevoire

# Pull Request

The issue was that the operation « DocumentDeletionByFilter » was not declared as an index operation. That means the index stats were not reprocessed after the application of the operation.

## Related issue
Fixes #4018

## What does this PR do?
- Move the `DocumentDeletionByFilter` internal operation into the category of the `IndexOperation`. This means that the stats will automatically be re-processed after a batch is processed.
- Update a test to ensure that the stats are valid after each operation

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Tamo <tamo@meilisearch.com>
2023-09-11 15:53:26 +00:00
ef31ab52a4 Merge #4051
4051: Implement the snapshots on demand r=Kerollmops a=irevoire

# Pull Request
Private link: [PRD available here](https://www.notion.so/meilisearch/On-demand-snapshots-5676e542b905459d96eec228da133b00#847ff0cafeb64fe09e8ee7150852b474)
Specification here: https://github.com/meilisearch/specifications/pull/258

## Prototype
A prototype is available under the name: `prototype-snapshot-on-demand-0`.

## Related issue
Fixes #4052
## What does this PR do?
- Introduce a new route, `POST /snapshots` to create snapshots on demand
- Introduce a new api-key action `snapshot.create`
- Introduce a new analytic `Snapshot Created` sent every time a snapshot is created.

## Notes for the team

I made a prototype so users can test the feature before the v1.5 comes out. But we can merge the PR as-is.

Co-authored-by: Tamo <tamo@meilisearch.com>
2023-09-11 15:16:08 +00:00
34fac115d5 fix clippy 2023-09-11 17:15:57 +02:00
791c5cd874 makes clippy happy 2023-09-11 17:02:01 +02:00
5bea1092fb fix the flaky test 2023-09-11 16:56:26 +02:00
056b2c387d refactor the tests suite slightly 2023-09-11 16:56:26 +02:00
a09686fcbd Merge #3997
3997: Refactor empty arrays/objects should return empty instead of null r=Kerollmops a=dogukanakkaya

# Pull Request

## What does this PR do?
At the moment if we select empty objects and array of object properties with dot notations like:
```json
{
  "array": [],
  "object": {}
}
```
```rs
GetDocumentOptions { fields: Some(vec!["array.name", "object.name"]) }
```
returns null if the array/object has no property yet.

I am not sure if this is expected or it's the correct behaviour but I add my document with a property that is assigned to an empty array/object, later on when I select it, returns null which is kinda weird and unexpected in my opinion.

This PR fixes that issue by returning an empty vector if the array is empty or an empty map if object is empty. This is not added for `permissive-json-pointer/src/lib.rs:224` because `create_array` loops over each item. Selecting a single property that is an object, in an array of objects would result other objects to be empty maps instead of none. 
```json
"doggos": [
  {
    "jean": {
      "race": {
        "name": "bernese mountain",
      }
    }
  },
  {
    "marc": {
       "age": 4,
       "race": {
          "name": "golden retriever",
        }
     }
   }
]
```
```rs
GetDocumentOptions { fields: Some(vec!["doggos.jean"]) }
```
Would result in `jean` object and an extra empty object for `marc`.

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: dogukanakkaya <doguakkaya27@hotmail.com>
2023-09-11 13:46:02 +00:00
b4c44603db Merge #4009
4009: Bump rustls-webpki from 0.100.1 to 0.100.2 r=Kerollmops a=dependabot[bot]

Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.100.1 to 0.100.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/rustls/webpki/releases">rustls-webpki's releases</a>.</em></p>
<blockquote>
<h2>v/0.100.2</h2>
<h2>Release notes</h2>
<ul>
<li>certificate path building and verification is now capped at 100 signature validation operations to avoid the risk of CPU usage denial-of-service attack when validating crafted certificate chains producing quadratic runtime. This risk affected both clients, as well as servers that verified client certificates.</li>
</ul>
<h2>What's Changed</h2>
<ul>
<li>v0.100.2 prep by <a href="https://github.com/cpu"><code>`@​cpu</code></a>` in <a href="https://redirect.github.com/rustls/webpki/pull/154">rustls/webpki#154</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/rustls/webpki/compare/v/0.100.1...v/0.100.2">https://github.com/rustls/webpki/compare/v/0.100.1...v/0.100.2</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="c8b821450b"><code>c8b8214</code></a> Bump MSRV to 1.60</li>
<li><a href="855752292e"><code>8557522</code></a> Avoid testing MSRV of dev-dependencies</li>
<li><a href="73a7f0c7d7"><code>73a7f0c</code></a> Cargo: version 0.100.1 -&gt; 0.100.2</li>
<li><a href="4ea052366f"><code>4ea0523</code></a> verify_cert: enforce maximum number of signatures.</li>
<li>See full diff in <a href="https://github.com/rustls/webpki/compare/v/0.100.1...v/0.100.2">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=rustls-webpki&package-manager=cargo&previous-version=0.100.1&new-version=0.100.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/meilisearch/meilisearch/network/alerts).

</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-11 13:11:07 +00:00
393be40179 Refactor empty arrays/objects should return empty instead of null 2023-09-11 15:56:15 +03:00
2c1d60f79b get rid of a warning 2023-09-11 14:40:22 +02:00
487d493f49 Merge #4043
4043: Bring back hotfixes from v1.3.3 into v1.4.0 r=Kerollmops a=curquiza



Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: curquiza <clementine@meilisearch.com>
2023-09-11 12:27:34 +00:00
08af69a33b improve a test to understand what's going on with the ci 2023-09-11 14:23:57 +02:00
9258e5b5bf Fix the stats of the documents deletion by filter
The issue was that the operation « DocumentDeletionByFilter » was not
declared as an index operation. That means the indexes stats were not
reprocessed after the application of the operation.
2023-09-11 14:04:10 +02:00
ddd34a488a update the api-key tests 2023-09-11 13:52:07 +02:00
526c2b3602 Merge #4050
4050: Bump webpki from 0.22.0 to 0.22.1 r=Kerollmops a=dependabot[bot]

Bumps [webpki](https://github.com/briansmith/webpki) from 0.22.0 to 0.22.1.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a href="https://github.com/briansmith/webpki/commits">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=webpki&package-manager=cargo&previous-version=0.22.0&new-version=0.22.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/meilisearch/meilisearch/network/alerts).

</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-11 11:46:22 +00:00
e8c9367686 implement the snapshots on demand 2023-09-11 12:35:57 +02:00
9636c5f558 Bump webpki from 0.22.0 to 0.22.1
Bumps [webpki](https://github.com/briansmith/webpki) from 0.22.0 to 0.22.1.
- [Commits](https://github.com/briansmith/webpki/commits)

---
updated-dependencies:
- dependency-name: webpki
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-11 10:32:34 +00:00
b310830b5d Improve test-suite.yml for CI failing when disabling tokenization (#4005)
* [Update] test-suite.yml

Added New run command for cargo tree without default features using if-then block

* [Updated] test-disabled-tokenization in test-suite.yml

* [Updated] test-suite.yml

* Update .github/workflows/test-suite.yml

---------

Co-authored-by: Clémentine U. - curqui <clementine@meilisearch.com>
2023-09-11 12:30:53 +02:00
462b4654c4 Merge #4028
4028: Fix highlighting bug when searching for a phrase with cropping r=ManyTheFish a=vivek-26

# Pull Request

## Related issue
Fixes #3975

## What does this PR do?
This PR -
- Fixes the bug where searching **only** for a phrase (containing multiple words) along with cropping, highlighted only the first word of the phrase.
- Adds unit test case for the above mentioned scenario.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
2023-09-11 07:58:41 +00:00
abfa7ded25 use a new temp index in the test 2023-09-08 12:32:47 +05:30
f2837aaec2 add another test case 2023-09-08 11:39:54 +05:30
11df155598 fix highlighting bug when searching for a phrase with cropping 2023-09-08 11:39:52 +05:30
651657c03e Fix git conflicts 2023-09-07 16:48:13 +02:00
b9ad59c969 Merge #4041
4041: Register the swap indexe task in a spawn blocking to be sure to never… r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4040

## What does this PR do?
- Register the swap indexes task in a spawn blocking task

Co-authored-by: Tamo <tamo@meilisearch.com>
2023-09-07 10:22:01 +00:00
66aa682e23 Register the swap indexe task in a spawn blocking to be sure to never block the main thread 2023-09-07 11:37:02 +02:00
256cf33bca Merge #4039
4039: Fix multiple vectors dimensions r=ManyTheFish a=Kerollmops

This PR fixes #4035, making providing multiple vectors in documents possible. This is fixed by extracting the vectors from the non-flattened version of the documents.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-09-07 09:25:58 +00:00
9945cbf9db Merge #4038
4038: Fix filter escaping issues r=ManyTheFish a=Kerollmops

This PR fixes #4034 by always escaping the sequences. Users must always put quotes (simple or double) to escape the filter values.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-09-06 12:29:29 +00:00
03d0f628bd Use the unescaper crate to unescape any char sequence 2023-09-06 13:59:45 +02:00
ea78060916 Fix tests that were supposed to escape characters 2023-09-06 13:59:45 +02:00
b42d48187a Add a test case scenario 2023-09-06 13:59:44 +02:00
679c0b0f97 Extract the vectors from the non-flattened version of the documents 2023-09-06 12:26:00 +02:00
e02d0064bd Add a test case scenario 2023-09-06 12:26:00 +02:00
7ef3572f11 Merge #4037
4037: Update version for the next release (v1.3.3) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2023-09-06 09:50:58 +00:00
93285041a9 Update version for the next release (v1.3.3) in Cargo.toml 2023-09-06 09:23:20 +00:00
dc3d9c90d9 Merge #3994
3994: Fix synonyms with separators r=Kerollmops a=ManyTheFish

# Pull Request

## Related issue
Fixes #3977

## Available prototype
```
$ docker pull getmeili/meilisearch:prototype-fix-synonyms-with-separators-0
```

## What does this PR do?
- add a new test
- filter the empty synonyms after normalization


Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-09-05 14:42:46 +00:00
287cf25d39 Merge #4033
4033: Fix thai synonyms r=Kerollmops a=Kerollmops

Fixes #4031

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-09-05 13:54:33 +00:00
66aa6d5871 Ignore tokens with empty normalized value during indexing process 2023-09-05 15:44:14 +02:00
8ac5b765bc Fix synonyms normalization 2023-09-04 16:12:48 +02:00
cea93e9a37 Merge #4016
4016: Define the full Homebrew formula path r=curquiza a=Kerollmops

This PR fixes #4015 by defining the full Homebrew formula path.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-09-04 13:10:28 +00:00
085aad0a94 Add a test 2023-09-04 14:39:33 +02:00
e9b62aacb3 Merge #4025
4025: Bump Swatinem/rust-cache from 2.5.1 to 2.6.2 r=curquiza a=dependabot[bot]

Bumps [Swatinem/rust-cache](https://github.com/swatinem/rust-cache) from 2.5.1 to 2.6.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/swatinem/rust-cache/releases">Swatinem/rust-cache's releases</a>.</em></p>
<blockquote>
<h2>v2.6.2</h2>
<h2>What's Changed</h2>
<ul>
<li>dep: Use <code>smol-toml</code> instead of <code>toml</code> by <a href="https://github.com/NobodyXu"><code>`@​NobodyXu</code></a>` in <a href="https://redirect.github.com/Swatinem/rust-cache/pull/164">Swatinem/rust-cache#164</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/Swatinem/rust-cache/compare/v2...v2.6.2">https://github.com/Swatinem/rust-cache/compare/v2...v2.6.2</a></p>
<h2>v2.6.1</h2>
<ul>
<li>Fix hash contributions of <code>Cargo.lock</code>/<code>Cargo.toml</code> files.</li>
</ul>
<h2>v2.6.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Add &quot;buildjet&quot; as a second <code>cache-provider</code> backend <a href="https://github.com/joroshiba"><code>`@​joroshiba</code></a>` in <a href="https://redirect.github.com/Swatinem/rust-cache/pull/154">Swatinem/rust-cache#154</a></li>
<li>Clean up sparse registry index.</li>
<li>Do not clean up src of <code>-sys</code> crates.</li>
<li>Remove <code>.cargo/credentials.toml</code> before saving.</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/joroshiba"><code>`@​joroshiba</code></a>` made their first contribution in <a href="https://redirect.github.com/Swatinem/rust-cache/pull/154">Swatinem/rust-cache#154</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/Swatinem/rust-cache/compare/v2.5.1...v2.6.0">https://github.com/Swatinem/rust-cache/compare/v2.5.1...v2.6.0</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/Swatinem/rust-cache/blob/master/CHANGELOG.md">Swatinem/rust-cache's changelog</a>.</em></p>
<blockquote>
<h2>2.6.2</h2>
<ul>
<li>Fix <code>toml</code> parsing.</li>
</ul>
<h2>2.6.1</h2>
<ul>
<li>Fix hash contributions of <code>Cargo.lock</code>/<code>Cargo.toml</code> files.</li>
</ul>
<h2>2.6.0</h2>
<ul>
<li>Add &quot;buildjet&quot; as a second <code>cache-provider</code> backend.</li>
<li>Clean up sparse registry index.</li>
<li>Do not clean up src of <code>-sys</code> crates.</li>
<li>Remove <code>.cargo/credentials.toml</code> before saving.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="e207df5d26"><code>e207df5</code></a> 2.6.2</li>
<li><a href="decb69d790"><code>decb69d</code></a> Update dependencies and add changelog</li>
<li><a href="ab6b2769d1"><code>ab6b276</code></a> dep: Use <code>smol-toml</code> instead of <code>toml</code> (<a href="https://redirect.github.com/swatinem/rust-cache/issues/164">#164</a>)</li>
<li><a href="578b235f6e"><code>578b235</code></a> 2.6.1</li>
<li><a href="5113490c3f"><code>5113490</code></a> prepare 2.6.1</li>
<li><a href="c0e052c18c"><code>c0e052c</code></a> Fix hashing of parsed <code>Cargo.toml</code> (<a href="https://redirect.github.com/swatinem/rust-cache/issues/160">#160</a>)</li>
<li><a href="4e0f4b19dd"><code>4e0f4b1</code></a> Fix typo in hashing parsed <code>Cargo.lock</code> (<a href="https://redirect.github.com/swatinem/rust-cache/issues/159">#159</a>)</li>
<li><a href="b919e1427f"><code>b919e14</code></a> feat: Add logging to <code>Cargo.lock</code>/<code>Cargo.toml</code> hashing (<a href="https://redirect.github.com/swatinem/rust-cache/issues/156">#156</a>)</li>
<li><a href="b8a6852b4f"><code>b8a6852</code></a> 2.6.0</li>
<li><a href="80c47cc945"><code>80c47cc</code></a> Clean up <code>credentials.toml</code></li>
<li>Additional commits viewable in <a href="https://github.com/swatinem/rust-cache/compare/v2.5.1...v2.6.2">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=Swatinem/rust-cache&package-manager=github_actions&previous-version=2.5.1&new-version=2.6.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-04 12:30:53 +00:00
456960d2c7 Bump Swatinem/rust-cache from 2.5.1 to 2.6.2
Bumps [Swatinem/rust-cache](https://github.com/swatinem/rust-cache) from 2.5.1 to 2.6.2.
- [Release notes](https://github.com/swatinem/rust-cache/releases)
- [Changelog](https://github.com/Swatinem/rust-cache/blob/master/CHANGELOG.md)
- [Commits](https://github.com/swatinem/rust-cache/compare/v2.5.1...v2.6.2)

---
updated-dependencies:
- dependency-name: Swatinem/rust-cache
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-01 17:17:39 +00:00
3dda176723 Merge #4020
4020: Update version for the next release (v1.4.0) in Cargo.toml r=Kerollmops a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-08-28 13:51:23 +00:00
af0f6f0bf0 Merge branch 'main' into update-version-v1.4.0 2023-08-28 15:08:59 +02:00
ccf3ba3f32 Merge #4019
4019: Bringing back changes from `v1.3.2` onto `main` r=irevoire a=Kerollmops



Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: irevoire <irevoire@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-08-28 12:14:11 +00:00
65528a3e06 Update version for the next release (v1.4.0) in Cargo.toml 2023-08-28 11:52:28 +00:00
6db80b0836 Define the full Homebrew formula path 2023-08-24 11:24:47 +02:00
cdb4b3e024 Merge #4013
4013: Fix the ranking rule by temporarily disabling an assert in the bucket sort algorithm r=Kerollmops a=Kerollmops

This PR temporarily disables an assertion, making the search crash. [I created a tracking issue](https://github.com/meilisearch/meilisearch/issues/4012) to find a better way to fix this.

It no longer reverts a20e4d447c, which seemed to generate unreachable graphs and make the bucket sort ranking algorithm panic because of entering an unreachable state. We discussed that below in the comments.

Temporary fixes #4002, fixes #4006, and fixes #3995.

---

It took me approximately 2 days to find the first bad commit just because I'm bad in `git bisect` x `bash`, i.e. [I misused `%1` with `$!` to kill the most recently backgrounded job](https://unix.stackexchange.com/a/340084/212574)...

<details>
  <summary>Here is the script I used to find the invalid commit</summary>

```bash
#!/usr/bin/env bash

set -x

# remove the data
rm -rf data.ms

# build meilisearch
cargo build --release
# ignore this commit if it doesn't compile
if [[ $? != 0 ]]; then
    exit 125
fi

# index the dump and start from it
./target/release/meilisearch \
--http-addr 'localhost:7705' \
--import-dump $HOME/Downloads/modified-20230822-083016113.dump &

# wait 10 sec while it indexes the docs
sleep 5

# check if the server crashes on requests
echo '{
    "q": "rtx 305",
    "attributesToHighlight": [
        "*"
    ],
    "highlightPreTag": "<ais-highlight-0000000000>",
    "highlightPostTag": "</ais-highlight-0000000000>",
    "limit": 21,
    "offset": 0
}' | xh 'localhost:7705/indexes/arvutitark_local_orderables/search'

last_exit_code=$?

# Now kill Meilisearch
kill $!

# Clean the potential Cargo.lock
git checkout .

exit $last_exit_code
```
</details>

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-08-23 15:30:56 +00:00
8c0ebd1331 Update milli/src/search/new/bucket_sort.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-08-23 16:40:39 +02:00
5130e06b41 Temporarily disable an assert in the ranking rules 2023-08-23 16:11:54 +02:00
08e27ef73f Merge pull request #4008 from meilisearch/fix-highlighting-panic
Bump charabia to 0.8.3
2023-08-23 11:56:45 +02:00
914b125c5f Merge #3945
3945: Do not leak field information on error r=Kerollmops a=vivek-26

# Pull Request

## Related issue
Fixes #3865

## What does this PR do?
This PR ensures that `InvalidSortableAttribute`and `InvalidFacetSearchFacetName` errors do not leak field information i.e. fields which are not part of `displayedAttributes` in the settings are hidden from the error message.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
2023-08-22 18:55:27 +00:00
e59d7f238c Bump rustls-webpki from 0.100.1 to 0.100.2
Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.100.1 to 0.100.2.
- [Release notes](https://github.com/rustls/webpki/releases)
- [Commits](https://github.com/rustls/webpki/compare/v/0.100.1...v/0.100.2)

---
updated-dependencies:
- dependency-name: rustls-webpki
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-08-22 18:10:53 +00:00
717b069907 Bump charabia to 0.8.3 2023-08-22 16:25:00 +02:00
7ea154673a Merge #4000
4000: Update version for the next release (v1.3.2) in Cargo.toml r=irevoire a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: irevoire <irevoire@users.noreply.github.com>
2023-08-16 10:41:33 +00:00
b947f3bb9d Update version for the next release (v1.3.2) in Cargo.toml 2023-08-16 08:20:36 +00:00
4c35817c5f Merge #3998
3998: Accept the `null` JSON value as a value of the `_vectors` field r=irevoire a=Kerollmops

This PR fixes #3979 by accepting `null` JSON values in the `_vectors` fields provided by the user.

Can the reviewer please verify that I am merging in the right branch?
I think we must create a new _release-v1.3.2_.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-08-16 08:12:24 +00:00
c53841e166 Accept the null JSON value as the value of _vectors 2023-08-14 16:03:55 +02:00
fd81945597 Merge #3987
3987: Update dependencies for v1.4 r=curquiza a=ManyTheFish

# Pull Request

## Related issue
Fixes #3870 

## What does this PR do?
- [Update dependencies](d7ff5368b4)
- [upgrade itertools = "0.10.5"](d0582d01f4)
- [upgrade sysinfo = "0.29.7"](507c661352)
- [upgrade memmap2 = "0.7.1"](489e0d5cd0)
- [upgrade rstar = "0.11.0"](3d9d08e3b2)
- [upgrade fastrand = "2.0.0"](1af7083c48)
- [upgrade deserr = "0.6.0"](7fe77045af)
- [upgrade indexmap = "2.0.0"](95e4960b0c)
- [update rust toolchain = "1.71.1"](937b7b5da5)

## Remaining un-upgraded dependencies
- vergen 7.5.1 --> 8.2.4: I wasn't able to quickly understand the changes in the lib API to upgrade the dependency
- rustls 0.20.8 --> 0.21.6: Meilisearch doesn't have any direct dependency on it


Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-08-10 16:46:17 +00:00
794e491152 update rust toolchain 2023-08-10 18:09:02 +02:00
cab27c2ab4 upgrade indexmap = "2.0.0" 2023-08-10 18:09:02 +02:00
624fa9052f upgrade deserr = "0.6.0" 2023-08-10 18:09:02 +02:00
359ede4862 upgrade fastrand = "2.0.0" 2023-08-10 18:09:02 +02:00
60c11dbdbd upgrade rstar - "0.11.0" 2023-08-10 18:09:02 +02:00
dacee40ebc upgrade memmap2 = "0.7.1" 2023-08-10 18:09:02 +02:00
6089083a8e upgrade sysinfo = "0.29.7" 2023-08-10 18:09:02 +02:00
cc2c19d4c3 upgrade itertools = "0.10.5" 2023-08-10 18:09:02 +02:00
a5c56fac8a Update dependencies 2023-08-10 18:09:02 +02:00
e4e49e63d0 Merge #3993
3993: Bringing back changes from v1.3.1 to `main` r=irevoire a=curquiza



Co-authored-by: irevoire <irevoire@users.noreply.github.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-08-10 14:30:02 +00:00
00bd7bd19a Merge #3990
3990: Removed unnecessary borrow call that failed nightly tests r=irevoire a=JannisK89

# Pull Request

## Related issue
Fixes #3988

## What does this PR do?
- Removes unnecessary borrow call that was causing warnings when running tests on nightly.

## PR checklist
Please check if your PR fulfills the following requirements:
- [ x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ x] Have you read the contributing guidelines?
- [ x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

Please let me know if there is anything else I can do to improve this PR.
Thank you.

Co-authored-by: JannisK89 <jannis.karanikis@gmail.com>
2023-08-10 11:42:19 +00:00
ef3d098b4d Merge #3976
3976: Fix the get stats method r=ManyTheFish a=irevoire

# Pull Request

- The get stats method of the index-scheduler was not using at all the processing tasks. That was returning a wrong number of enqueued tasks and 0 processing tasks.
- Added a test
- Currently this method was **ONLY** used to compute the `meilisearch_nb_tasks` field of the **experimental feature** metrics.

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/3972


Co-authored-by: Tamo <tamo@meilisearch.com>
2023-08-10 10:55:50 +00:00
8084cf29f3 Merge #3946
3946: Settings customizing tokenization r=irevoire a=ManyTheFish

# Pull Request
This pull Request allows the User to customize Meilisearch Tokenization by providing specialized settings.

## Small documentation
All the new settings can be set and reset like the other index settings by calling the route `/indexes/:name/settings`

### `nonSeparatorTokens`
The Meilisearch word segmentation uses a default list of separators to segment words, however, for specific use cases some of the default separators shouldn't be considered separators, the `nonSeparatorTokens` setting allows to remove of some tokens from the default list of separators.

***Request payload `PUT`- `/indexes/articles/settings/non-separator-tokens`***
```json
["`@",` "#", "&"]
```

### `separatorTokens`
Some use cases need to define additional separators, some are related to a specific way of parsing technical documents some others are related to encodings in documents,  the `separatorTokens` setting allows adding some tokens to the list of separators.

***Request payload `PUT`- `/indexes/articles/settings/separator-tokens`***
```json
["&sect;", "&sep"]
```

### `dictionary`
The Meilisearch word segmentation relies on separators and language-based word-dictionaries to segment words, however, this segmentation is inaccurate on technical or use-case specific vocabulary (like `G/Box` to say `Gear Box`), or on proper nouns (like `J. R. R.` when parsing `J. R. R. Tolkien`), the `dictionary` setting allows defining a list of words that would be segmented as described in the list.

***Request payload `PUT`- `/indexes/articles/settings/dictionary`***
```json
["J. R. R.", "J.R.R."]
```

these last feature synergies well with the `stopWords` setting or the `synonyms` setting allowing to segment words and correctly retrieve the synonyms:
***Request payload `PATCH`- `/indexes/articles/settings`***
```json
{
    "dictionary": ["J. R. R.", "J.R.R."],
    "synonyms": {
            "J.R.R.": ["jrr", "J. R. R."],
            "J. R. R.": ["jrr", "J.R.R."],
            "jrr": ["J.R.R.", "J. R. R."],
    }
}
```

### Related specifications:
- https://github.com/meilisearch/specifications/pull/255
- https://github.com/meilisearch/specifications/pull/254

### Try it with Docker

```bash
$ docker pull getmeili/meilisearch:prototype-tokenizer-customization-3
```

## Related issue
Fixes #3610
Fixes #3917
Fixes https://github.com/meilisearch/product/discussions/468
Fixes https://github.com/meilisearch/product/discussions/160
Fixes https://github.com/meilisearch/product/discussions/260
Fixes https://github.com/meilisearch/product/discussions/381
Fixes https://github.com/meilisearch/product/discussions/131
Related to https://github.com/meilisearch/meilisearch/issues/2879

Fixes #2760

## What does this PR do?
- Add a setting `nonSeparatorTokens` allowing to remove a token from the default separator tokens
- Add a setting `separatorTokens` allowing to add a token in the separator tokens
- Add a setting `dictionary` allowing to override the segmentation on specific words
- add new error code `invalid_settings_non_separator_tokens` (invalid_request)
- add new error code `invalid_settings_separator_tokens` (invalid_request)
- add new error code `invalid_settings_dictionary` (invalid_request)

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2023-08-10 10:01:18 +00:00
5a7c1bde84 Fix clippy 2023-08-10 11:27:56 +02:00
6b2d671be7 Fix PR comments 2023-08-10 10:44:07 +02:00
43c13faeda Update milli/src/update/index_documents/extract/extract_docid_word_positions.rs
Co-authored-by: Tamo <tamo@meilisearch.com>
2023-08-10 10:05:03 +02:00
29adfc2f68 Merge #3989
3989: Improve test suite CI for manual trigger events r=irevoire a=curquiza

# Why?

To be able to test https://github.com/meilisearch/meilisearch/issues/3988 before merging the PR solving it

# How do we ensure this PR works?

I triggered `workflow_dispatch` (i.e. manual trigger) on this branch, and we can see all the jobs have been triggered (even if some of them are failing -> it's another issue)
https://github.com/meilisearch/meilisearch/actions/runs/5810609073

We can see the tests triggered by the PR are restricted as expected: https://github.com/meilisearch/meilisearch/actions/runs/5810605977

Co-authored-by: curquiza <clementine@meilisearch.com>
2023-08-10 07:55:48 +00:00
064ee95b1c removed unnecessary borrow call 2023-08-10 08:41:25 +02:00
604d533b31 Improve test suite CI for workflow_dispatch event 2023-08-09 16:47:28 +02:00
44c1900f36 Merge #3986
3986: Fix geo bounding box with strings r=ManyTheFish a=irevoire

# Pull Request

When sending a document with one geofield of type string (i.e.: `{ "_geo": { "lat": 12, "lng": "13" }}`), the geobounding box would exclude this document.

This PR fixes this issue by automatically parsing the string value in case we're working on a geofield.

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/3973

## What does this PR do?
- Automatically parse the facet value iif we're working on a geofield.
- Make insta works with snapshots in loops or closure executed multiple times. (you may need to update your cli if it panics after this PR: `cargo install cargo-insta`).
- Add one integration test in milli and in meilisearch to ensure it works forever.
- Add three snapshots for the dump that mysteriously disappeared I don't know how


Co-authored-by: Tamo <tamo@meilisearch.com>
2023-08-09 07:58:15 +00:00
04671d0751 Merge #3981
3981: Truncate the normalized long facets used in the search for facet value r=irevoire a=ManyTheFish

# Pull Request
 Truncate the normalized long facets used in the search for facet value

## targeted release

v1.3.1

## Related issue
Fixes #3978


Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-08-08 15:07:07 +00:00
4f4c669d50 add back some dump snapshots that disappeared. it's completely unrelated to this PR 2023-08-08 16:58:14 +02:00
8dc5acf998 Try fix 2023-08-08 16:52:36 +02:00
fc2590fc9d Add a test 2023-08-08 16:43:08 +02:00
35758db9ec Truncate the the normalized long facets used in search for facet value 2023-08-08 16:38:30 +02:00
4988199bb9 ensure the geoboundingbox works with strings and int geofields in milli and meilisearch 2023-08-08 16:29:25 +02:00
83991ee770 enable the multi-snapshot attribute in insta. This will let us use insta in loops 2023-08-08 16:28:38 +02:00
9d061cec26 automatically parse the filterable attribute to float if it's a geo field 2023-08-08 16:28:07 +02:00
fe819a9d80 fix the get stats method
It was not taking into account the processing tasks at all
2023-08-08 13:21:15 +02:00
e338ceb97f Merge #3982
3982: Update version for the next release (v1.3.1) in Cargo.toml r=irevoire a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: irevoire <irevoire@users.noreply.github.com>
2023-08-08 10:30:56 +00:00
75c87d5391 Update version for the next release (v1.3.1) in Cargo.toml 2023-08-08 10:30:06 +00:00
dd57873f8e hide fields not in the displayedAttributes list from errors 2023-08-05 16:03:10 +05:30
88 changed files with 2243 additions and 1016 deletions

View File

@ -2,8 +2,8 @@ name: Create issue to upgrade dependencies
on:
schedule:
# Run the first of the month, every 3 month
- cron: '0 0 1 */3 *'
# Run the first of the month, every 6 month
- cron: '0 0 1 */6 *'
workflow_dispatch:
jobs:

View File

@ -53,5 +53,6 @@ jobs:
uses: mislav/bump-homebrew-formula-action@v2
with:
formula-name: meilisearch
formula-path: Formula/m/meilisearch.rb
env:
COMMITTER_TOKEN: ${{ secrets.HOMEBREW_COMMITTER_TOKEN }}

View File

@ -57,7 +57,7 @@ jobs:
echo "date=$commit_date" >> $GITHUB_OUTPUT
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2

View File

@ -14,6 +14,7 @@ on:
env:
MEILI_MASTER_KEY: 'masterKey'
MEILI_NO_ANALYTICS: 'true'
DISABLE_COVERAGE: 'true'
jobs:
define-docker-image:
@ -30,6 +31,117 @@ jobs:
if [[ $event == 'workflow_dispatch' ]]; then
echo "docker-image=${{ github.event.inputs.docker_image }}" >> $GITHUB_OUTPUT
fi
- name: Docker image is ${{ steps.define-image.outputs.docker-image }}
run: echo "Docker image is ${{ steps.define-image.outputs.docker-image }}"
##########
## SDKs ##
##########
meilisearch-dotnet-tests:
needs: define-docker-image
name: .NET SDK tests
runs-on: ubuntu-latest
env:
MEILISEARCH_VERSION: ${{ needs.define-docker-image.outputs.docker-image }}
steps:
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-dotnet
- name: Setup .NET Core
uses: actions/setup-dotnet@v3
with:
dotnet-version: "6.0.x"
- name: Install dependencies
run: dotnet restore
- name: Build
run: dotnet build --configuration Release --no-restore
- name: Meilisearch (latest version) setup with Docker
run: docker compose up -d
- name: Run tests
run: dotnet test --no-restore --verbosity normal
meilisearch-dart-tests:
needs: define-docker-image
name: Dart SDK tests
runs-on: ubuntu-latest
services:
meilisearch:
image: getmeili/meilisearch:${{ needs.define-docker-image.outputs.docker-image }}
env:
MEILI_MASTER_KEY: ${{ env.MEILI_MASTER_KEY }}
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-dart
- uses: dart-lang/setup-dart@v1
with:
sdk: 3.1.1
- name: Install dependencies
run: dart pub get
- name: Run integration tests
run: dart test --concurrency=4
meilisearch-go-tests:
needs: define-docker-image
name: Go SDK tests
runs-on: ubuntu-latest
services:
meilisearch:
image: getmeili/meilisearch:${{ needs.define-docker-image.outputs.docker-image }}
env:
MEILI_MASTER_KEY: ${{ env.MEILI_MASTER_KEY }}
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
steps:
- name: Set up Go
uses: actions/setup-go@v4
with:
go-version: stable
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-go
- name: Get dependencies
run: |
go get -v -t -d ./...
if [ -f Gopkg.toml ]; then
curl https://raw.githubusercontent.com/golang/dep/master/install.sh | sh
dep ensure
fi
- name: Run integration tests
run: go test -v ./...
meilisearch-java-tests:
needs: define-docker-image
name: Java SDK tests
runs-on: ubuntu-latest
services:
meilisearch:
image: getmeili/meilisearch:${{ needs.define-docker-image.outputs.docker-image }}
env:
MEILI_MASTER_KEY: ${{ env.MEILI_MASTER_KEY }}
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-java
- name: Set up Java
uses: actions/setup-java@v3
with:
java-version: 8
distribution: 'zulu'
cache: gradle
- name: Grant execute permission for gradlew
run: chmod +x gradlew
- name: Build and run unit and integration tests
run: ./gradlew build integrationTest
meilisearch-js-tests:
needs: define-docker-image
@ -66,33 +178,6 @@ jobs:
- name: Run Browser env
run: yarn test:env:browser
instant-meilisearch-tests:
needs: define-docker-image
name: instant-meilisearch tests
runs-on: ubuntu-latest
services:
meilisearch:
image: getmeili/meilisearch:${{ needs.define-docker-image.outputs.docker-image }}
env:
MEILI_MASTER_KEY: ${{ env.MEILI_MASTER_KEY }}
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
with:
repository: meilisearch/instant-meilisearch
- name: Setup node
uses: actions/setup-node@v3
with:
cache: yarn
- name: Install dependencies
run: yarn install
- name: Run tests
run: yarn test
- name: Build all the playgrounds and the packages
run: yarn build
meilisearch-php-tests:
needs: define-docker-image
name: PHP SDK tests
@ -111,8 +196,6 @@ jobs:
repository: meilisearch/meilisearch-php
- name: Install PHP
uses: shivammathur/setup-php@v2
with:
coverage: none
- name: Validate composer.json and composer.lock
run: composer validate
- name: Install dependencies
@ -149,36 +232,6 @@ jobs:
- name: Test with pytest
run: pipenv run pytest
meilisearch-go-tests:
needs: define-docker-image
name: Go SDK tests
runs-on: ubuntu-latest
services:
meilisearch:
image: getmeili/meilisearch:${{ needs.define-docker-image.outputs.docker-image }}
env:
MEILI_MASTER_KEY: ${{ env.MEILI_MASTER_KEY }}
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
steps:
- name: Set up Go
uses: actions/setup-go@v4
with:
go-version: stable
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-go
- name: Get dependencies
run: |
go get -v -t -d ./...
if [ -f Gopkg.toml ]; then
curl https://raw.githubusercontent.com/golang/dep/master/install.sh | sh
dep ensure
fi
- name: Run integration tests
run: go test -v ./...
meilisearch-ruby-tests:
needs: define-docker-image
name: Ruby SDK tests
@ -224,3 +277,110 @@ jobs:
run: cargo build --verbose
- name: Run tests
run: cargo test --verbose
meilisearch-swift-tests:
needs: define-docker-image
name: Swift SDK tests
runs-on: ubuntu-latest
services:
meilisearch:
image: getmeili/meilisearch:${{ needs.define-docker-image.outputs.docker-image }}
env:
MEILI_MASTER_KEY: ${{ env.MEILI_MASTER_KEY }}
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-swift
- name: Run tests
run: swift test
########################
## FRONT-END PLUGINS ##
########################
meilisearch-js-plugins-tests:
needs: define-docker-image
name: meilisearch-js-plugins tests
runs-on: ubuntu-latest
services:
meilisearch:
image: getmeili/meilisearch:${{ needs.define-docker-image.outputs.docker-image }}
env:
MEILI_MASTER_KEY: ${{ env.MEILI_MASTER_KEY }}
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-js-plugins
- name: Setup node
uses: actions/setup-node@v3
with:
cache: yarn
- name: Install dependencies
run: yarn install
- name: Run tests
run: yarn test
- name: Build all the playgrounds and the packages
run: yarn build
########################
## BACK-END PLUGINS ###
########################
meilisearch-rails-tests:
needs: define-docker-image
name: meilisearch-rails tests
runs-on: ubuntu-latest
services:
meilisearch:
image: getmeili/meilisearch:${{ needs.define-docker-image.outputs.docker-image }}
env:
MEILI_MASTER_KEY: ${{ env.MEILI_MASTER_KEY }}
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-rails
- name: Set up Ruby 3
uses: ruby/setup-ruby@v1
with:
ruby-version: 3
bundler-cache: true
- name: Run tests
run: bundle exec rspec
meilisearch-symfony-tests:
needs: define-docker-image
name: meilisearch-symfony tests
runs-on: ubuntu-latest
services:
meilisearch:
image: getmeili/meilisearch:${{ needs.define-docker-image.outputs.docker-image }}
env:
MEILI_MASTER_KEY: ${{ env.MEILI_MASTER_KEY }}
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-symfony
- name: Install PHP
uses: shivammathur/setup-php@v2
with:
tools: composer:v2, flex
- name: Validate composer.json and composer.lock
run: composer validate
- name: Install dependencies
run: composer install --prefer-dist --no-progress --quiet
- name: Remove doctrine/annotations
run: composer remove --dev doctrine/annotations
- name: Run test suite
run: composer test:unit

View File

@ -37,13 +37,13 @@ jobs:
toolchain: stable
override: true
- name: Setup test with Rust nightly
if: github.event_name == 'schedule'
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
uses: actions-rs/toolchain@v1
with:
toolchain: nightly
override: true
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.5.1
uses: Swatinem/rust-cache@v2.6.2
- name: Run cargo check without any default features
uses: actions-rs/cargo@v1
with:
@ -65,7 +65,7 @@ jobs:
steps:
- uses: actions/checkout@v3
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.5.1
uses: Swatinem/rust-cache@v2.6.2
- name: Run cargo check without any default features
uses: actions-rs/cargo@v1
with:
@ -78,12 +78,12 @@ jobs:
args: --locked --release --all
test-all-features:
name: Tests all features on cron schedule only
name: Tests all features
runs-on: ubuntu-latest
container:
# Use ubuntu-18.04 to compile with glibc 2.27, which are the production expectations
image: ubuntu:18.04
if: github.event_name == 'schedule'
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies
@ -110,7 +110,7 @@ jobs:
runs-on: ubuntu-latest
container:
image: ubuntu:18.04
if: github.event_name == 'schedule'
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies
@ -123,7 +123,10 @@ jobs:
override: true
- name: Run cargo tree without default features and check lindera is not present
run: |
cargo tree -f '{p} {f}' -e normal --no-default-features | grep lindera -vqz
if cargo tree -f '{p} {f}' -e normal --no-default-features | grep -vqz lindera; then
echo "lindera has been found in the sources and it shouldn't"
exit 1
fi
- name: Run cargo tree with default features and check lindera is pressent
run: |
cargo tree -f '{p} {f}' -e normal | grep lindera -qz
@ -146,7 +149,7 @@ jobs:
toolchain: stable
override: true
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.5.1
uses: Swatinem/rust-cache@v2.6.2
- name: Run tests in debug
uses: actions-rs/cargo@v1
with:
@ -161,11 +164,11 @@ jobs:
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: 1.69.0
toolchain: 1.71.1
override: true
components: clippy
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.5.1
uses: Swatinem/rust-cache@v2.6.2
- name: Run cargo clippy
uses: actions-rs/cargo@v1
with:
@ -184,7 +187,7 @@ jobs:
override: true
components: rustfmt
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.5.1
uses: Swatinem/rust-cache@v2.6.2
- name: Run cargo fmt
# Since we never ran the `build.rs` script in the benchmark directory we are missing one auto-generated import file.
# Since we want to trigger (and fail) this action as fast as possible, instead of building the benchmark crate

678
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@ -18,7 +18,7 @@ members = [
]
[workspace.package]
version = "1.3.0"
version = "1.4.0"
authors = ["Quentin de Quelen <quentin@dequelen.me>", "Clément Renault <clement@meilisearch.com>"]
description = "Meilisearch HTTP server"
homepage = "https://meilisearch.com"

View File

@ -0,0 +1,24 @@
---
source: dump/src/reader/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View File

@ -0,0 +1,38 @@
---
source: dump/src/reader/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View File

@ -0,0 +1,31 @@
---
source: dump/src/reader/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"genres",
"id"
],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View File

@ -14,6 +14,7 @@ license.workspace = true
[dependencies]
nom = "7.1.3"
nom_locate = "4.1.0"
unescaper = "0.1.2"
[dev-dependencies]
insta = "1.29.0"

View File

@ -62,6 +62,7 @@ pub enum ErrorKind<'a> {
MisusedGeoRadius,
MisusedGeoBoundingBox,
InvalidPrimary,
InvalidEscapedNumber,
ExpectedEof,
ExpectedValue(ExpectedValueKind),
MalformedValue,
@ -147,6 +148,9 @@ impl<'a> Display for Error<'a> {
let text = if input.trim().is_empty() { "but instead got nothing.".to_string() } else { format!("at `{}`.", escaped_input) };
writeln!(f, "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` {}", text)?
}
ErrorKind::InvalidEscapedNumber => {
writeln!(f, "Found an invalid escaped sequence number: `{}`.", escaped_input)?
}
ErrorKind::ExpectedEof => {
writeln!(f, "Found unexpected characters at the end of the filter: `{}`. You probably forgot an `OR` or an `AND` rule.", escaped_input)?
}

View File

@ -545,6 +545,8 @@ impl<'a> std::fmt::Display for Token<'a> {
#[cfg(test)]
pub mod tests {
use FilterCondition as Fc;
use super::*;
/// Create a raw [Token]. You must specify the string that appear BEFORE your element followed by your element
@ -556,14 +558,22 @@ pub mod tests {
unsafe { Span::new_from_raw_offset(offset, lines as u32, value, "") }.into()
}
fn p(s: &str) -> impl std::fmt::Display + '_ {
Fc::parse(s).unwrap().unwrap()
}
#[test]
fn parse_escaped() {
insta::assert_display_snapshot!(p(r#"title = 'foo\\'"#), @r#"{title} = {foo\}"#);
insta::assert_display_snapshot!(p(r#"title = 'foo\\\\'"#), @r#"{title} = {foo\\}"#);
insta::assert_display_snapshot!(p(r#"title = 'foo\\\\\\'"#), @r#"{title} = {foo\\\}"#);
insta::assert_display_snapshot!(p(r#"title = 'foo\\\\\\\\'"#), @r#"{title} = {foo\\\\}"#);
// but it also works with other sequencies
insta::assert_display_snapshot!(p(r#"title = 'foo\x20\n\t\"\'"'"#), @"{title} = {foo \n\t\"\'\"}");
}
#[test]
fn parse() {
use FilterCondition as Fc;
fn p(s: &str) -> impl std::fmt::Display + '_ {
Fc::parse(s).unwrap().unwrap()
}
// Test equal
insta::assert_display_snapshot!(p("channel = Ponce"), @"{channel} = {Ponce}");
insta::assert_display_snapshot!(p("subscribers = 12"), @"{subscribers} = {12}");

View File

@ -171,7 +171,24 @@ pub fn parse_value(input: Span) -> IResult<Token> {
})
})?;
Ok((input, value))
match unescaper::unescape(value.value()) {
Ok(content) => {
if content.len() != value.value().len() {
Ok((input, Token::new(value.original_span(), Some(content))))
} else {
Ok((input, value))
}
}
Err(unescaper::Error::IncompleteStr(_)) => Err(nom::Err::Incomplete(nom::Needed::Unknown)),
Err(unescaper::Error::ParseIntError { .. }) => Err(nom::Err::Error(Error::new_from_kind(
value.original_span(),
ErrorKind::InvalidEscapedNumber,
))),
Err(unescaper::Error::InvalidChar { .. }) => Err(nom::Err::Error(Error::new_from_kind(
value.original_span(),
ErrorKind::MalformedValue,
))),
}
}
fn is_value_component(c: char) -> bool {
@ -318,17 +335,17 @@ pub mod test {
("\"cha'nnel\"", "cha'nnel", false),
("I'm tamo", "I", false),
// escaped thing but not quote
(r#""\\""#, r#"\\"#, false),
(r#""\\\\\\""#, r#"\\\\\\"#, false),
(r#""aa\\aa""#, r#"aa\\aa"#, false),
(r#""\\""#, r#"\"#, true),
(r#""\\\\\\""#, r#"\\\"#, true),
(r#""aa\\aa""#, r#"aa\aa"#, true),
// with double quote
(r#""Hello \"world\"""#, r#"Hello "world""#, true),
(r#""Hello \\\"world\\\"""#, r#"Hello \\"world\\""#, true),
(r#""Hello \\\"world\\\"""#, r#"Hello \"world\""#, true),
(r#""I'm \"super\" tamo""#, r#"I'm "super" tamo"#, true),
(r#""\"\"""#, r#""""#, true),
// with simple quote
(r#"'Hello \'world\''"#, r#"Hello 'world'"#, true),
(r#"'Hello \\\'world\\\''"#, r#"Hello \\'world\\'"#, true),
(r#"'Hello \\\'world\\\''"#, r#"Hello \'world\'"#, true),
(r#"'I\'m "super" tamo'"#, r#"I'm "super" tamo"#, true),
(r#"'\'\''"#, r#"''"#, true),
];
@ -350,7 +367,14 @@ pub mod test {
"Filter `{}` was not supposed to be escaped",
input
);
assert_eq!(token.value(), expected, "Filter `{}` failed.", input);
assert_eq!(
token.value(),
expected,
"Filter `{}` failed by giving `{}` instead of `{}`.",
input,
token.value(),
expected
);
}
}

View File

@ -13,7 +13,7 @@ license.workspace = true
[dependencies]
arbitrary = { version = "1.3.0", features = ["derive"] }
clap = { version = "4.3.0", features = ["derive"] }
fastrand = "1.9.0"
fastrand = "2.0.0"
milli = { path = "../milli" }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }

View File

@ -67,10 +67,6 @@ pub(crate) enum Batch {
op: IndexOperation,
must_create_index: bool,
},
IndexDocumentDeletionByFilter {
index_uid: String,
task: Task,
},
IndexCreation {
index_uid: String,
primary_key: Option<String>,
@ -114,6 +110,10 @@ pub(crate) enum IndexOperation {
documents: Vec<Vec<String>>,
tasks: Vec<Task>,
},
IndexDocumentDeletionByFilter {
index_uid: String,
task: Task,
},
DocumentClear {
index_uid: String,
tasks: Vec<Task>,
@ -155,7 +155,6 @@ impl Batch {
| Batch::TaskDeletion(task)
| Batch::Dump(task)
| Batch::IndexCreation { task, .. }
| Batch::IndexDocumentDeletionByFilter { task, .. }
| Batch::IndexUpdate { task, .. } => vec![task.uid],
Batch::SnapshotCreation(tasks) | Batch::IndexDeletion { tasks, .. } => {
tasks.iter().map(|task| task.uid).collect()
@ -167,6 +166,7 @@ impl Batch {
| IndexOperation::DocumentClear { tasks, .. } => {
tasks.iter().map(|task| task.uid).collect()
}
IndexOperation::IndexDocumentDeletionByFilter { task, .. } => vec![task.uid],
IndexOperation::SettingsAndDocumentOperation {
document_import_tasks: tasks,
settings_tasks: other,
@ -194,8 +194,7 @@ impl Batch {
IndexOperation { op, .. } => Some(op.index_uid()),
IndexCreation { index_uid, .. }
| IndexUpdate { index_uid, .. }
| IndexDeletion { index_uid, .. }
| IndexDocumentDeletionByFilter { index_uid, .. } => Some(index_uid),
| IndexDeletion { index_uid, .. } => Some(index_uid),
}
}
}
@ -205,6 +204,7 @@ impl IndexOperation {
match self {
IndexOperation::DocumentOperation { index_uid, .. }
| IndexOperation::DocumentDeletion { index_uid, .. }
| IndexOperation::IndexDocumentDeletionByFilter { index_uid, .. }
| IndexOperation::DocumentClear { index_uid, .. }
| IndexOperation::Settings { index_uid, .. }
| IndexOperation::DocumentClearAndSetting { index_uid, .. }
@ -239,9 +239,12 @@ impl IndexScheduler {
let task = self.get_task(rtxn, id)?.ok_or(Error::CorruptedTaskQueue)?;
match &task.kind {
KindWithContent::DocumentDeletionByFilter { index_uid, .. } => {
Ok(Some(Batch::IndexDocumentDeletionByFilter {
index_uid: index_uid.clone(),
task,
Ok(Some(Batch::IndexOperation {
op: IndexOperation::IndexDocumentDeletionByFilter {
index_uid: index_uid.clone(),
task,
},
must_create_index: false,
}))
}
_ => unreachable!(),
@ -896,51 +899,6 @@ impl IndexScheduler {
Ok(tasks)
}
Batch::IndexDocumentDeletionByFilter { mut task, index_uid: _ } => {
let (index_uid, filter) =
if let KindWithContent::DocumentDeletionByFilter { index_uid, filter_expr } =
&task.kind
{
(index_uid, filter_expr)
} else {
unreachable!()
};
let index = {
let rtxn = self.env.read_txn()?;
self.index_mapper.index(&rtxn, index_uid)?
};
let deleted_documents = delete_document_by_filter(filter, index);
let original_filter = if let Some(Details::DocumentDeletionByFilter {
original_filter,
deleted_documents: _,
}) = task.details
{
original_filter
} else {
// In the case of a `documentDeleteByFilter` the details MUST be set
unreachable!();
};
match deleted_documents {
Ok(deleted_documents) => {
task.status = Status::Succeeded;
task.details = Some(Details::DocumentDeletionByFilter {
original_filter,
deleted_documents: Some(deleted_documents),
});
}
Err(e) => {
task.status = Status::Failed;
task.details = Some(Details::DocumentDeletionByFilter {
original_filter,
deleted_documents: Some(0),
});
task.error = Some(e.into());
}
}
Ok(vec![task])
}
Batch::IndexCreation { index_uid, primary_key, task } => {
let wtxn = self.env.write_txn()?;
if self.index_mapper.exists(&wtxn, &index_uid)? {
@ -1299,6 +1257,47 @@ impl IndexScheduler {
Ok(tasks)
}
IndexOperation::IndexDocumentDeletionByFilter { mut task, index_uid: _ } => {
let filter =
if let KindWithContent::DocumentDeletionByFilter { filter_expr, .. } =
&task.kind
{
filter_expr
} else {
unreachable!()
};
let deleted_documents = delete_document_by_filter(index_wtxn, filter, index);
let original_filter = if let Some(Details::DocumentDeletionByFilter {
original_filter,
deleted_documents: _,
}) = task.details
{
original_filter
} else {
// In the case of a `documentDeleteByFilter` the details MUST be set
unreachable!();
};
match deleted_documents {
Ok(deleted_documents) => {
task.status = Status::Succeeded;
task.details = Some(Details::DocumentDeletionByFilter {
original_filter,
deleted_documents: Some(deleted_documents),
});
}
Err(e) => {
task.status = Status::Failed;
task.details = Some(Details::DocumentDeletionByFilter {
original_filter,
deleted_documents: Some(0),
});
task.error = Some(e.into());
}
}
Ok(vec![task])
}
IndexOperation::Settings { index_uid: _, settings, mut tasks } => {
let indexer_config = self.index_mapper.indexer_config();
let mut builder = milli::update::Settings::new(index_wtxn, index, indexer_config);
@ -1498,23 +1497,22 @@ impl IndexScheduler {
}
}
fn delete_document_by_filter(filter: &serde_json::Value, index: Index) -> Result<u64> {
fn delete_document_by_filter<'a>(
wtxn: &mut RwTxn<'a, '_>,
filter: &serde_json::Value,
index: &'a Index,
) -> Result<u64> {
let filter = Filter::from_json(filter)?;
Ok(if let Some(filter) = filter {
let mut wtxn = index.write_txn()?;
let candidates = filter.evaluate(&wtxn, &index).map_err(|err| match err {
let candidates = filter.evaluate(wtxn, index).map_err(|err| match err {
milli::Error::UserError(milli::UserError::InvalidFilter(_)) => {
Error::from(err).with_custom_error_code(Code::InvalidDocumentFilter)
}
e => e.into(),
})?;
let mut delete_operation = DeleteDocuments::new(&mut wtxn, &index)?;
let mut delete_operation = DeleteDocuments::new(wtxn, index)?;
delete_operation.delete_documents(&candidates);
let deleted_documents =
delete_operation.execute().map(|result| result.deleted_documents)?;
wtxn.commit()?;
deleted_documents
delete_operation.execute().map(|result| result.deleted_documents)?
} else {
0
})

View File

@ -790,10 +790,19 @@ impl IndexScheduler {
let mut res = BTreeMap::new();
let processing_tasks = { self.processing_tasks.read().unwrap().processing.len() };
res.insert(
"statuses".to_string(),
enum_iterator::all::<Status>()
.map(|s| Ok((s.to_string(), self.get_status(&rtxn, s)?.len())))
.map(|s| {
let tasks = self.get_status(&rtxn, s)?.len();
match s {
Status::Enqueued => Ok((s.to_string(), tasks - processing_tasks)),
Status::Processing => Ok((s.to_string(), processing_tasks)),
s => Ok((s.to_string(), tasks)),
}
})
.collect::<Result<BTreeMap<String, u64>>>()?,
);
res.insert(
@ -4131,4 +4140,154 @@ mod tests {
snapshot!(json_string!(tasks, { "[].enqueuedAt" => "[date]", "[].startedAt" => "[date]", "[].finishedAt" => "[date]", ".**.original_filter" => "[filter]", ".**.query" => "[query]" }), name: "everything_has_been_processed");
drop(rtxn);
}
#[test]
fn basic_get_stats() {
let (index_scheduler, mut handle) = IndexScheduler::test(true, vec![]);
let kind = index_creation_task("catto", "mouse");
let _task = index_scheduler.register(kind).unwrap();
let kind = index_creation_task("doggo", "sheep");
let _task = index_scheduler.register(kind).unwrap();
let kind = index_creation_task("whalo", "fish");
let _task = index_scheduler.register(kind).unwrap();
snapshot!(json_string!(index_scheduler.get_stats().unwrap()), @r###"
{
"indexes": {
"catto": 1,
"doggo": 1,
"whalo": 1
},
"statuses": {
"canceled": 0,
"enqueued": 3,
"failed": 0,
"processing": 0,
"succeeded": 0
},
"types": {
"documentAdditionOrUpdate": 0,
"documentDeletion": 0,
"dumpCreation": 0,
"indexCreation": 3,
"indexDeletion": 0,
"indexSwap": 0,
"indexUpdate": 0,
"settingsUpdate": 0,
"snapshotCreation": 0,
"taskCancelation": 0,
"taskDeletion": 0
}
}
"###);
handle.advance_till([Start, BatchCreated]);
snapshot!(json_string!(index_scheduler.get_stats().unwrap()), @r###"
{
"indexes": {
"catto": 1,
"doggo": 1,
"whalo": 1
},
"statuses": {
"canceled": 0,
"enqueued": 2,
"failed": 0,
"processing": 1,
"succeeded": 0
},
"types": {
"documentAdditionOrUpdate": 0,
"documentDeletion": 0,
"dumpCreation": 0,
"indexCreation": 3,
"indexDeletion": 0,
"indexSwap": 0,
"indexUpdate": 0,
"settingsUpdate": 0,
"snapshotCreation": 0,
"taskCancelation": 0,
"taskDeletion": 0
}
}
"###);
handle.advance_till([
InsideProcessBatch,
InsideProcessBatch,
ProcessBatchSucceeded,
AfterProcessing,
Start,
BatchCreated,
]);
snapshot!(json_string!(index_scheduler.get_stats().unwrap()), @r###"
{
"indexes": {
"catto": 1,
"doggo": 1,
"whalo": 1
},
"statuses": {
"canceled": 0,
"enqueued": 1,
"failed": 0,
"processing": 1,
"succeeded": 1
},
"types": {
"documentAdditionOrUpdate": 0,
"documentDeletion": 0,
"dumpCreation": 0,
"indexCreation": 3,
"indexDeletion": 0,
"indexSwap": 0,
"indexUpdate": 0,
"settingsUpdate": 0,
"snapshotCreation": 0,
"taskCancelation": 0,
"taskDeletion": 0
}
}
"###);
// now we make one more batch, the started_at field of the new tasks will be past `second_start_time`
handle.advance_till([
InsideProcessBatch,
InsideProcessBatch,
ProcessBatchSucceeded,
AfterProcessing,
Start,
BatchCreated,
]);
snapshot!(json_string!(index_scheduler.get_stats().unwrap()), @r###"
{
"indexes": {
"catto": 1,
"doggo": 1,
"whalo": 1
},
"statuses": {
"canceled": 0,
"enqueued": 0,
"failed": 0,
"processing": 1,
"succeeded": 2
},
"types": {
"documentAdditionOrUpdate": 0,
"documentDeletion": 0,
"dumpCreation": 0,
"indexCreation": 3,
"indexDeletion": 0,
"indexSwap": 0,
"indexUpdate": 0,
"settingsUpdate": 0,
"snapshotCreation": 0,
"taskCancelation": 0,
"taskDeletion": 0
}
}
"###);
}
}

View File

@ -167,7 +167,9 @@ macro_rules! snapshot {
let (settings, snap_name, _) = $crate::default_snapshot_settings_for_test(test_name, Some(&snap_name));
settings.bind(|| {
let snap = format!("{}", $value);
meili_snap::insta::assert_snapshot!(format!("{}", snap_name), snap);
insta::allow_duplicates! {
meili_snap::insta::assert_snapshot!(format!("{}", snap_name), snap);
}
});
};
($value:expr, @$inline:literal) => {
@ -176,7 +178,9 @@ macro_rules! snapshot {
let (settings, _, _) = $crate::default_snapshot_settings_for_test("", Some("_dummy_argument"));
settings.bind(|| {
let snap = format!("{}", $value);
meili_snap::insta::assert_snapshot!(snap, @$inline);
insta::allow_duplicates! {
meili_snap::insta::assert_snapshot!(snap, @$inline);
}
});
};
($value:expr) => {
@ -194,7 +198,9 @@ macro_rules! snapshot {
let (settings, snap_name, _) = $crate::default_snapshot_settings_for_test(test_name, None);
settings.bind(|| {
let snap = format!("{}", $value);
meili_snap::insta::assert_snapshot!(format!("{}", snap_name), snap);
insta::allow_duplicates! {
meili_snap::insta::assert_snapshot!(format!("{}", snap_name), snap);
}
});
};
}

View File

@ -129,6 +129,9 @@ impl HeedAuthStore {
Action::DumpsAll => {
actions.insert(Action::DumpsCreate);
}
Action::SnapshotsAll => {
actions.insert(Action::SnapshotsCreate);
}
Action::TasksAll => {
actions.extend([Action::TasksGet, Action::TasksDelete, Action::TasksCancel]);
}

View File

@ -15,13 +15,13 @@ actix-web = { version = "4.3.1", default-features = false }
anyhow = "1.0.70"
convert_case = "0.6.0"
csv = "1.2.1"
deserr = "0.5.0"
deserr = { version = "0.6.0", features = ["actix-web"]}
either = { version = "1.8.1", features = ["serde"] }
enum-iterator = "1.4.0"
file-store = { path = "../file-store" }
flate2 = "1.0.25"
fst = "0.4.7"
memmap2 = "0.5.10"
memmap2 = "0.7.1"
milli = { path = "../milli" }
roaring = { version = "0.10.1", features = ["serde"] }
serde = { version = "1.0.160", features = ["derive"] }

View File

@ -1,4 +1,3 @@
use std::borrow::Borrow;
use std::fmt::{self, Debug, Display};
use std::fs::File;
use std::io::{self, Seek, Write};
@ -42,7 +41,7 @@ impl Display for DocumentFormatError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::Io(e) => write!(f, "{e}"),
Self::MalformedPayload(me, b) => match me.borrow() {
Self::MalformedPayload(me, b) => match me {
Error::Json(se) => {
let mut message = match se.classify() {
Category::Data => {

View File

@ -257,6 +257,12 @@ pub enum Action {
#[serde(rename = "dumps.create")]
#[deserr(rename = "dumps.create")]
DumpsCreate,
#[serde(rename = "snapshots.*")]
#[deserr(rename = "snapshots.*")]
SnapshotsAll,
#[serde(rename = "snapshots.create")]
#[deserr(rename = "snapshots.create")]
SnapshotsCreate,
#[serde(rename = "version")]
#[deserr(rename = "version")]
Version,
@ -309,6 +315,7 @@ impl Action {
METRICS_GET => Some(Self::MetricsGet),
DUMPS_ALL => Some(Self::DumpsAll),
DUMPS_CREATE => Some(Self::DumpsCreate),
SNAPSHOTS_CREATE => Some(Self::SnapshotsCreate),
VERSION => Some(Self::Version),
KEYS_CREATE => Some(Self::KeysAdd),
KEYS_GET => Some(Self::KeysGet),
@ -353,6 +360,7 @@ pub mod actions {
pub const METRICS_GET: u8 = MetricsGet.repr();
pub const DUMPS_ALL: u8 = DumpsAll.repr();
pub const DUMPS_CREATE: u8 = DumpsCreate.repr();
pub const SNAPSHOTS_CREATE: u8 = SnapshotsCreate.repr();
pub const VERSION: u8 = Version.repr();
pub const KEYS_CREATE: u8 = KeysAdd.repr();
pub const KEYS_GET: u8 = KeysGet.repr();

View File

@ -39,7 +39,7 @@ byte-unit = { version = "4.0.19", default-features = false, features = [
bytes = "1.4.0"
clap = { version = "4.2.1", features = ["derive", "env"] }
crossbeam-channel = "0.5.8"
deserr = "0.5.0"
deserr = { version = "0.6.0", features = ["actix-web"]}
dump = { path = "../dump" }
either = "1.8.1"
env_logger = "0.10.0"
@ -50,9 +50,9 @@ futures = "0.3.28"
futures-util = "0.3.28"
http = "0.2.9"
index-scheduler = { path = "../index-scheduler" }
indexmap = { version = "1.9.3", features = ["serde-1"] }
indexmap = { version = "2.0.0", features = ["serde"] }
is-terminal = "0.4.8"
itertools = "0.10.5"
itertools = "0.11.0"
jsonwebtoken = "8.3.0"
lazy_static = "1.4.0"
log = "0.4.17"
@ -87,7 +87,7 @@ sha2 = "0.10.6"
siphasher = "0.3.10"
slice-group-by = "0.3.0"
static-files = { version = "0.2.3", optional = true }
sysinfo = "0.28.4"
sysinfo = "0.29.7"
tar = "0.4.38"
tempfile = "3.5.0"
thiserror = "1.0.40"

View File

@ -20,7 +20,7 @@ pub struct SearchAggregator;
#[allow(dead_code)]
impl SearchAggregator {
pub fn from_query(_: &dyn Any, _: &dyn Any) -> Self {
Self::default()
Self
}
pub fn succeed(&mut self, _: &dyn Any) {}
@ -32,7 +32,7 @@ pub struct MultiSearchAggregator;
#[allow(dead_code)]
impl MultiSearchAggregator {
pub fn from_queries(_: &dyn Any, _: &dyn Any) -> Self {
Self::default()
Self
}
pub fn succeed(&mut self) {}
@ -44,7 +44,7 @@ pub struct FacetSearchAggregator;
#[allow(dead_code)]
impl FacetSearchAggregator {
pub fn from_query(_: &dyn Any, _: &dyn Any) -> Self {
Self::default()
Self
}
pub fn succeed(&mut self, _: &dyn Any) {}

View File

@ -1,6 +1,5 @@
mod mock_analytics;
// if we are in release mode and the feature analytics was enabled
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[cfg(feature = "analytics")]
mod segment_analytics;
use std::fs;
@ -17,26 +16,25 @@ use serde_json::Value;
use crate::routes::indexes::documents::UpdateDocumentsQuery;
use crate::routes::tasks::TasksFilterQuery;
// if we are in debug mode OR the analytics feature is disabled
// if the analytics feature is disabled
// the `SegmentAnalytics` point to the mock instead of the real analytics
#[cfg(any(debug_assertions, not(feature = "analytics")))]
#[cfg(not(feature = "analytics"))]
pub type SegmentAnalytics = mock_analytics::MockAnalytics;
#[cfg(any(debug_assertions, not(feature = "analytics")))]
#[cfg(not(feature = "analytics"))]
pub type SearchAggregator = mock_analytics::SearchAggregator;
#[cfg(any(debug_assertions, not(feature = "analytics")))]
#[cfg(not(feature = "analytics"))]
pub type MultiSearchAggregator = mock_analytics::MultiSearchAggregator;
#[cfg(any(debug_assertions, not(feature = "analytics")))]
#[cfg(not(feature = "analytics"))]
pub type FacetSearchAggregator = mock_analytics::FacetSearchAggregator;
// if we are in release mode and the feature analytics was enabled
// we use the real analytics
#[cfg(all(not(debug_assertions), feature = "analytics"))]
// if the feature analytics is enabled we use the real analytics
#[cfg(feature = "analytics")]
pub type SegmentAnalytics = segment_analytics::SegmentAnalytics;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[cfg(feature = "analytics")]
pub type SearchAggregator = segment_analytics::SearchAggregator;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[cfg(feature = "analytics")]
pub type MultiSearchAggregator = segment_analytics::MultiSearchAggregator;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[cfg(feature = "analytics")]
pub type FacetSearchAggregator = segment_analytics::FacetSearchAggregator;
/// The Meilisearch config dir:

File diff suppressed because it is too large Load Diff

View File

@ -28,7 +28,7 @@ const MEILI_DB_PATH: &str = "MEILI_DB_PATH";
const MEILI_HTTP_ADDR: &str = "MEILI_HTTP_ADDR";
const MEILI_MASTER_KEY: &str = "MEILI_MASTER_KEY";
const MEILI_ENV: &str = "MEILI_ENV";
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[cfg(feature = "analytics")]
const MEILI_NO_ANALYTICS: &str = "MEILI_NO_ANALYTICS";
const MEILI_HTTP_PAYLOAD_SIZE_LIMIT: &str = "MEILI_HTTP_PAYLOAD_SIZE_LIMIT";
const MEILI_SSL_CERT_PATH: &str = "MEILI_SSL_CERT_PATH";
@ -159,7 +159,7 @@ pub struct Opt {
/// Meilisearch automatically collects data from all instances that do not opt out using this flag.
/// All gathered data is used solely for the purpose of improving Meilisearch, and can be deleted
/// at any time.
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[cfg(feature = "analytics")]
#[serde(default)] // we can't send true
#[clap(long, env = MEILI_NO_ANALYTICS)]
pub no_analytics: bool,
@ -390,7 +390,7 @@ impl Opt {
ignore_missing_dump: _,
ignore_dump_if_db_exists: _,
config_file_path: _,
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[cfg(feature = "analytics")]
no_analytics,
experimental_enable_metrics: enable_metrics_route,
experimental_reduce_indexing_memory_usage: reduce_indexing_memory_usage,
@ -401,7 +401,7 @@ impl Opt {
export_to_env_if_not_present(MEILI_MASTER_KEY, master_key);
}
export_to_env_if_not_present(MEILI_ENV, env);
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[cfg(feature = "analytics")]
{
export_to_env_if_not_present(MEILI_NO_ANALYTICS, no_analytics.to_string());
}

View File

@ -24,6 +24,7 @@ pub mod features;
pub mod indexes;
mod metrics;
mod multi_search;
mod snapshot;
mod swap_indexes;
pub mod tasks;
@ -32,6 +33,7 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
.service(web::resource("/health").route(web::get().to(get_health)))
.service(web::scope("/keys").configure(api_key::configure))
.service(web::scope("/dumps").configure(dump::configure))
.service(web::scope("/snapshots").configure(snapshot::configure))
.service(web::resource("/stats").route(web::get().to(get_stats)))
.service(web::resource("/version").route(web::get().to(get_version)))
.service(web::scope("/indexes").configure(indexes::configure))

View File

@ -0,0 +1,32 @@
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::error::ResponseError;
use meilisearch_types::tasks::KindWithContent;
use serde_json::json;
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::routes::SummarizedTaskView;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("").route(web::post().to(SeqHandler(create_snapshot))));
}
pub async fn create_snapshot(
index_scheduler: GuardedData<ActionPolicy<{ actions::SNAPSHOTS_CREATE }>, Data<IndexScheduler>>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
analytics.publish("Snapshot Created".to_string(), json!({}), Some(&req));
let task = KindWithContent::SnapshotCreation;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}

View File

@ -60,8 +60,7 @@ pub async fn swap_indexes(
}
let task = KindWithContent::IndexSwap { swaps };
let task = index_scheduler.register(task)?;
let task: SummarizedTaskView = task.into();
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
Ok(HttpResponse::Accepted().json(task))
}

View File

@ -680,6 +680,7 @@ fn compute_semantic_score(query: &[f32], vectors: Value) -> milli::Result<Option
.map_err(InternalError::SerdeJson)?;
Ok(vectors
.into_iter()
.flatten()
.map(|v| OrderedFloat(dot_product_similarity(query, &v)))
.max()
.map(OrderedFloat::into_inner))

View File

@ -1,8 +1,7 @@
use std::{thread, time};
use serde_json::{json, Value};
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
#[actix_rt::test]
async fn add_valid_api_key() {
@ -162,7 +161,7 @@ async fn add_valid_api_key_null_description() {
server.use_api_key("MASTER_KEY");
let content = json!({
"description": Value::Null,
"description": json!(null),
"indexes": ["products"],
"actions": ["documents.add"],
"expiresAt": "2050-11-13T00:00:00"
@ -365,7 +364,7 @@ async fn error_add_api_key_invalid_index_uids() {
server.use_api_key("MASTER_KEY");
let content = json!({
"description": Value::Null,
"description": json!(null),
"indexes": ["invalid index # / \\name with spaces"],
"actions": [
"documents.add"
@ -422,7 +421,7 @@ async fn error_add_api_key_invalid_parameters_actions() {
meili_snap::snapshot!(code, @"400 Bad Request");
meili_snap::snapshot!(meili_snap::json_string!(response, { ".createdAt" => "[ignored]", ".updatedAt" => "[ignored]" }), @r###"
{
"message": "Unknown value `doc.add` at `.actions[0]`: expected one of `*`, `search`, `documents.*`, `documents.add`, `documents.get`, `documents.delete`, `indexes.*`, `indexes.create`, `indexes.get`, `indexes.update`, `indexes.delete`, `indexes.swap`, `tasks.*`, `tasks.cancel`, `tasks.delete`, `tasks.get`, `settings.*`, `settings.get`, `settings.update`, `stats.*`, `stats.get`, `metrics.*`, `metrics.get`, `dumps.*`, `dumps.create`, `version`, `keys.create`, `keys.get`, `keys.update`, `keys.delete`, `experimental.get`, `experimental.update`",
"message": "Unknown value `doc.add` at `.actions[0]`: expected one of `*`, `search`, `documents.*`, `documents.add`, `documents.get`, `documents.delete`, `indexes.*`, `indexes.create`, `indexes.get`, `indexes.update`, `indexes.delete`, `indexes.swap`, `tasks.*`, `tasks.cancel`, `tasks.delete`, `tasks.get`, `settings.*`, `settings.get`, `settings.update`, `stats.*`, `stats.get`, `metrics.*`, `metrics.get`, `dumps.*`, `dumps.create`, `snapshots.*`, `snapshots.create`, `version`, `keys.create`, `keys.get`, `keys.update`, `keys.delete`, `experimental.get`, `experimental.update`",
"code": "invalid_api_key_actions",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#invalid_api_key_actions"
@ -507,7 +506,7 @@ async fn error_add_api_key_invalid_parameters_uid() {
async fn error_add_api_key_parameters_uid_already_exist() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
let content = json!({
let content: Value = json!({
"uid": "4bc0887a-0e41-4f3b-935d-0c451dcee9c8",
"indexes": ["products"],
"actions": ["search"],
@ -1146,7 +1145,7 @@ async fn patch_api_key_description() {
meili_snap::snapshot!(code, @"200 OK");
// Remove the description
let content = json!({ "description": serde_json::Value::Null });
let content = json!({ "description": null });
let (response, code) = server.patch_api_key(&uid, content).await;
meili_snap::snapshot!(meili_snap::json_string!(response, { ".createdAt" => "[ignored]", ".updatedAt" => "[ignored]", ".uid" => "[ignored]", ".key" => "[ignored]" }), @r###"

View File

@ -3,10 +3,10 @@ use std::collections::{HashMap, HashSet};
use ::time::format_description::well_known::Rfc3339;
use maplit::{hashmap, hashset};
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use time::{Duration, OffsetDateTime};
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
pub static AUTHORIZATIONS: Lazy<HashMap<(&'static str, &'static str), HashSet<&'static str>>> =
Lazy::new(|| {
@ -54,6 +54,7 @@ pub static AUTHORIZATIONS: Lazy<HashMap<(&'static str, &'static str), HashSet<&'
("GET", "/indexes/products/stats") => hashset!{"stats.get", "stats.*", "*"},
("GET", "/stats") => hashset!{"stats.get", "stats.*", "*"},
("POST", "/dumps") => hashset!{"dumps.create", "dumps.*", "*"},
("POST", "/snapshots") => hashset!{"snapshots.create", "snapshots.*", "*"},
("GET", "/version") => hashset!{"version", "*"},
("GET", "/metrics") => hashset!{"metrics.get", "metrics.*", "*"},
("PATCH", "/keys/mykey/") => hashset!{"keys.update", "*"},

View File

@ -1,8 +1,8 @@
use meili_snap::*;
use serde_json::json;
use uuid::Uuid;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn create_api_key_bad_description() {
@ -90,7 +90,7 @@ async fn create_api_key_bad_actions() {
snapshot!(code, @"400 Bad Request");
snapshot!(json_string!(response), @r###"
{
"message": "Unknown value `doggo` at `.actions[0]`: expected one of `*`, `search`, `documents.*`, `documents.add`, `documents.get`, `documents.delete`, `indexes.*`, `indexes.create`, `indexes.get`, `indexes.update`, `indexes.delete`, `indexes.swap`, `tasks.*`, `tasks.cancel`, `tasks.delete`, `tasks.get`, `settings.*`, `settings.get`, `settings.update`, `stats.*`, `stats.get`, `metrics.*`, `metrics.get`, `dumps.*`, `dumps.create`, `version`, `keys.create`, `keys.get`, `keys.update`, `keys.delete`, `experimental.get`, `experimental.update`",
"message": "Unknown value `doggo` at `.actions[0]`: expected one of `*`, `search`, `documents.*`, `documents.add`, `documents.get`, `documents.delete`, `indexes.*`, `indexes.create`, `indexes.get`, `indexes.update`, `indexes.delete`, `indexes.swap`, `tasks.*`, `tasks.cancel`, `tasks.delete`, `tasks.get`, `settings.*`, `settings.get`, `settings.update`, `stats.*`, `stats.get`, `metrics.*`, `metrics.get`, `dumps.*`, `dumps.create`, `snapshots.*`, `snapshots.create`, `version`, `keys.create`, `keys.get`, `keys.update`, `keys.delete`, `experimental.get`, `experimental.update`",
"code": "invalid_api_key_actions",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#invalid_api_key_actions"

View File

@ -7,9 +7,9 @@ mod tenant_token;
mod tenant_token_multi_search;
use actix_web::http::StatusCode;
use serde_json::{json, Value};
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
impl Server {
pub fn use_api_key(&mut self, api_key: impl AsRef<str>) {

View File

@ -3,11 +3,11 @@ use std::collections::HashMap;
use ::time::format_description::well_known::Rfc3339;
use maplit::hashmap;
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use time::{Duration, OffsetDateTime};
use super::authorization::{ALL_ACTIONS, AUTHORIZATIONS};
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
fn generate_tenant_token(
parent_uid: impl AsRef<str>,
@ -233,31 +233,31 @@ async fn search_authorized_simple_token() {
},
hashmap! {
"searchRules" => json!({"*": {}}),
"exp" => Value::Null
"exp" => json!(null)
},
hashmap! {
"searchRules" => json!({"*": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"*": null}),
"exp" => json!(null)
},
hashmap! {
"searchRules" => json!(["*"]),
"exp" => Value::Null
"exp" => json!(null)
},
hashmap! {
"searchRules" => json!({"sales": {}}),
"exp" => Value::Null
"exp" => json!(null)
},
hashmap! {
"searchRules" => json!({"sales": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"sales": null}),
"exp" => json!(null)
},
hashmap! {
"searchRules" => json!(["sales"]),
"exp" => Value::Null
"exp" => json!(null)
},
hashmap! {
"searchRules" => json!(["sa*"]),
"exp" => Value::Null
"exp" => json!(null)
},
];
@ -386,7 +386,7 @@ async fn error_search_token_forbidden_parent_key() {
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"*": Value::Null}),
"searchRules" => json!({"*": null}),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -398,7 +398,7 @@ async fn error_search_token_forbidden_parent_key() {
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"sales": Value::Null}),
"searchRules" => json!({"sales": null}),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -428,15 +428,15 @@ async fn error_search_forbidden_token() {
},
hashmap! {
"searchRules" => json!({"products": {}}),
"exp" => Value::Null
"exp" => json!(null)
},
hashmap! {
"searchRules" => json!({"products": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"products": null}),
"exp" => json!(null)
},
hashmap! {
"searchRules" => json!(["products"]),
"exp" => Value::Null
"exp" => json!(null)
},
// expired token
hashmap! {
@ -444,7 +444,7 @@ async fn error_search_forbidden_token() {
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"*": Value::Null}),
"searchRules" => json!({"*": null}),
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -456,7 +456,7 @@ async fn error_search_forbidden_token() {
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"sales": Value::Null}),
"searchRules" => json!({"sales": null}),
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {

View File

@ -3,11 +3,11 @@ use std::collections::HashMap;
use ::time::format_description::well_known::Rfc3339;
use maplit::hashmap;
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use time::{Duration, OffsetDateTime};
use super::authorization::ALL_ACTIONS;
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
fn generate_tenant_token(
parent_uid: impl AsRef<str>,
@ -512,31 +512,31 @@ async fn single_search_authorized_simple_token() {
},
hashmap! {
"searchRules" => json!({"*": {}}),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"*": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"*": null}),
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!(["*"]),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"sales": {}}),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"sales": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"sales": null}),
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!(["sales"]),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!(["sa*"]),
"exp" => Value::Null
"exp" => json!(null),
},
];
@ -564,31 +564,31 @@ async fn multi_search_authorized_simple_token() {
},
hashmap! {
"searchRules" => json!({"*": {}}),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"*": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"*": null}),
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!(["*"]),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"sales": {}, "products": {}}),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"sales": Value::Null, "products": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"sales": null, "products": null}),
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!(["sales", "products"]),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!(["sa*", "pro*"]),
"exp" => Value::Null
"exp" => json!(null),
},
];
@ -823,7 +823,7 @@ async fn error_single_search_token_forbidden_parent_key() {
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"*": Value::Null}),
"searchRules" => json!({"*": null}),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -835,7 +835,7 @@ async fn error_single_search_token_forbidden_parent_key() {
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"sales": Value::Null}),
"searchRules" => json!({"sales": null}),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -864,7 +864,7 @@ async fn error_multi_search_token_forbidden_parent_key() {
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"*": Value::Null}),
"searchRules" => json!({"*": null}),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -876,7 +876,7 @@ async fn error_multi_search_token_forbidden_parent_key() {
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"sales": Value::Null, "products": Value::Null}),
"searchRules" => json!({"sales": null, "products": null}),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -919,15 +919,15 @@ async fn error_single_search_forbidden_token() {
},
hashmap! {
"searchRules" => json!({"products": {}}),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"products": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"products": null}),
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!(["products"]),
"exp" => Value::Null
"exp" => json!(null),
},
// expired token
hashmap! {
@ -935,7 +935,7 @@ async fn error_single_search_forbidden_token() {
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"*": Value::Null}),
"searchRules" => json!({"*": null}),
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -947,7 +947,7 @@ async fn error_single_search_forbidden_token() {
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"sales": Value::Null}),
"searchRules" => json!({"sales": null}),
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -978,15 +978,15 @@ async fn error_multi_search_forbidden_token() {
},
hashmap! {
"searchRules" => json!({"products": {}}),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"products": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"products": null}),
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!(["products"]),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"sales": {}}),
@ -998,15 +998,15 @@ async fn error_multi_search_forbidden_token() {
},
hashmap! {
"searchRules" => json!({"sales": {}}),
"exp" => Value::Null
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!({"sales": Value::Null}),
"exp" => Value::Null
"searchRules" => json!({"sales": null}),
"exp" => json!(null),
},
hashmap! {
"searchRules" => json!(["sales"]),
"exp" => Value::Null
"exp" => json!(null),
},
// expired token
hashmap! {
@ -1014,7 +1014,7 @@ async fn error_multi_search_forbidden_token() {
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"*": Value::Null}),
"searchRules" => json!({"*": null}),
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
@ -1026,7 +1026,7 @@ async fn error_multi_search_forbidden_token() {
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({"sales": Value::Null, "products": {}}),
"searchRules" => json!({"sales": null, "products": {}}),
"exp" => json!((OffsetDateTime::now_utc() - Duration::hours(1)).unix_timestamp())
},
hashmap! {

View File

@ -3,12 +3,13 @@ use std::panic::{catch_unwind, resume_unwind, UnwindSafe};
use std::time::Duration;
use actix_web::http::StatusCode;
use serde_json::{json, Value};
use tokio::time::sleep;
use urlencoding::encode as urlencode;
use super::encoder::Encoder;
use super::service::Service;
use super::Value;
use crate::json;
pub struct Index<'a> {
pub uid: String,
@ -242,7 +243,9 @@ impl Index<'_> {
pub async fn delete_batch(&self, ids: Vec<u64>) -> (Value, StatusCode) {
let url = format!("/indexes/{}/documents/delete-batch", urlencode(self.uid.as_ref()));
self.service.post_encoded(url, serde_json::to_value(&ids).unwrap(), self.encoder).await
self.service
.post_encoded(url, serde_json::to_value(&ids).unwrap().into(), self.encoder)
.await
}
pub async fn delete_batch_raw(&self, body: Value) -> (Value, StatusCode) {

View File

@ -3,9 +3,83 @@ pub mod index;
pub mod server;
pub mod service;
use std::fmt::{self, Display};
pub use index::{GetAllDocumentsOptions, GetDocumentOptions};
use meili_snap::json_string;
use serde::{Deserialize, Serialize};
pub use server::{default_settings, Server};
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq)]
pub struct Value(pub serde_json::Value);
impl Value {
pub fn uid(&self) -> u64 {
if let Some(uid) = self["uid"].as_u64() {
uid
} else if let Some(uid) = self["taskUid"].as_u64() {
uid
} else {
panic!("Didn't find any task id in: {self}");
}
}
}
impl From<serde_json::Value> for Value {
fn from(value: serde_json::Value) -> Self {
Value(value)
}
}
impl std::ops::Deref for Value {
type Target = serde_json::Value;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl PartialEq<serde_json::Value> for Value {
fn eq(&self, other: &serde_json::Value) -> bool {
&self.0 == other
}
}
impl PartialEq<Value> for serde_json::Value {
fn eq(&self, other: &Value) -> bool {
self == &other.0
}
}
impl PartialEq<&str> for Value {
fn eq(&self, other: &&str) -> bool {
self.0.eq(other)
}
}
impl Display for Value {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(
f,
"{}",
json_string!(self, { ".enqueuedAt" => "[date]", ".processedAt" => "[date]", ".finishedAt" => "[date]", ".duration" => "[duration]" })
)
}
}
impl From<Vec<Value>> for Value {
fn from(value: Vec<Value>) -> Self {
Self(value.into_iter().map(|value| value.0).collect::<serde_json::Value>())
}
}
#[macro_export]
macro_rules! json {
($($json:tt)+) => {
$crate::common::Value(serde_json::json!($($json)+))
};
}
/// Performs a search test on both post and get routes
#[macro_export]
macro_rules! test_post_get_search {

View File

@ -11,13 +11,14 @@ use clap::Parser;
use meilisearch::option::{IndexerOpts, MaxMemory, Opt};
use meilisearch::{analytics, create_app, setup_meilisearch};
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use tempfile::TempDir;
use tokio::time::sleep;
use super::index::Index;
use super::service::Service;
use crate::common::encoder::Encoder;
use crate::common::Value;
use crate::json;
pub struct Server {
pub service: Service,
@ -156,6 +157,10 @@ impl Server {
self.service.post("/dumps", json!(null)).await
}
pub async fn create_snapshot(&self) -> (Value, StatusCode) {
self.service.post("/snapshots", json!(null)).await
}
pub async fn index_swap(&self, value: Value) -> (Value, StatusCode) {
self.service.post("/swap-indexes", value).await
}
@ -204,7 +209,7 @@ pub fn default_settings(dir: impl AsRef<Path>) -> Opt {
db_path: dir.as_ref().join("db"),
dump_dir: dir.as_ref().join("dumps"),
env: "development".to_owned(),
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[cfg(feature = "analytics")]
no_analytics: true,
max_index_size: Byte::from_unit(100.0, ByteUnit::MiB).unwrap(),
max_task_db_size: Byte::from_unit(1.0, ByteUnit::GiB).unwrap(),

View File

@ -7,9 +7,9 @@ use actix_web::test::TestRequest;
use index_scheduler::IndexScheduler;
use meilisearch::{analytics, create_app, Opt};
use meilisearch_auth::AuthController;
use serde_json::Value;
use crate::common::encoder::Encoder;
use crate::common::Value;
pub struct Service {
pub index_scheduler: Arc<IndexScheduler>,

View File

@ -3,9 +3,8 @@
mod common;
use actix_web::test;
use serde_json::{json, Value};
use crate::common::Server;
use crate::common::{Server, Value};
enum HttpVerb {
Put,

View File

@ -1,11 +1,11 @@
use actix_web::test;
use meili_snap::{json_string, snapshot};
use serde_json::{json, Value};
use time::format_description::well_known::Rfc3339;
use time::OffsetDateTime;
use crate::common::encoder::Encoder;
use crate::common::{GetAllDocumentsOptions, Server};
use crate::common::{GetAllDocumentsOptions, Server, Value};
use crate::json;
/// This is the basic usage of our API and every other tests uses the content-type application/json
#[actix_rt::test]

View File

@ -1,7 +1,7 @@
use meili_snap::{json_string, snapshot};
use serde_json::json;
use crate::common::{GetAllDocumentsOptions, Server};
use crate::json;
#[actix_rt::test]
async fn delete_one_document_unexisting_index() {
@ -154,6 +154,19 @@ async fn delete_document_by_filter() {
)
.await;
index.wait_task(1).await;
let (stats, _) = index.stats().await;
snapshot!(json_string!(stats), @r###"
{
"numberOfDocuments": 4,
"isIndexing": false,
"fieldDistribution": {
"color": 3,
"id": 4
}
}
"###);
let (response, code) =
index.delete_document_by_filter(json!({ "filter": "color = blue"})).await;
snapshot!(code, @"202 Accepted");
@ -188,6 +201,18 @@ async fn delete_document_by_filter() {
}
"###);
let (stats, _) = index.stats().await;
snapshot!(json_string!(stats), @r###"
{
"numberOfDocuments": 2,
"isIndexing": false,
"fieldDistribution": {
"color": 1,
"id": 2
}
}
"###);
let (documents, code) = index.get_all_documents(GetAllDocumentsOptions::default()).await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(documents), @r###"
@ -241,6 +266,18 @@ async fn delete_document_by_filter() {
}
"###);
let (stats, _) = index.stats().await;
snapshot!(json_string!(stats), @r###"
{
"numberOfDocuments": 1,
"isIndexing": false,
"fieldDistribution": {
"color": 1,
"id": 1
}
}
"###);
let (documents, code) = index.get_all_documents(GetAllDocumentsOptions::default()).await;
snapshot!(code, @"200 OK");
snapshot!(json_string!(documents), @r###"

View File

@ -1,8 +1,8 @@
use meili_snap::*;
use serde_json::json;
use urlencoding::encode;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn get_all_documents_bad_offset() {

View File

@ -1,11 +1,11 @@
use actix_web::test;
use http::header::ACCEPT_ENCODING;
use meili_snap::*;
use serde_json::{json, Value};
use urlencoding::encode as urlencode;
use crate::common::encoder::Encoder;
use crate::common::{GetAllDocumentsOptions, GetDocumentOptions, Server};
use crate::common::{GetAllDocumentsOptions, GetDocumentOptions, Server, Value};
use crate::json;
// TODO: partial test since we are testing error, amd error is not yet fully implemented in
// transplant
@ -40,7 +40,7 @@ async fn get_document() {
let server = Server::new().await;
let index = server.index("test");
index.create(None).await;
let documents = serde_json::json!([
let documents = json!([
{
"id": 0,
"nested": { "content": "foobar" },
@ -53,7 +53,7 @@ async fn get_document() {
assert_eq!(code, 200);
assert_eq!(
response,
serde_json::json!({
json!({
"id": 0,
"nested": { "content": "foobar" },
})
@ -64,7 +64,7 @@ async fn get_document() {
assert_eq!(code, 200);
assert_eq!(
response,
serde_json::json!({
json!({
"id": 0,
})
);
@ -75,7 +75,7 @@ async fn get_document() {
assert_eq!(code, 200);
assert_eq!(
response,
serde_json::json!({
json!({
"nested": { "content": "foobar" },
})
);
@ -122,7 +122,7 @@ async fn get_all_documents_no_options() {
assert_eq!(code, 200);
let arr = response["results"].as_array().unwrap();
assert_eq!(arr.len(), 20);
let first = serde_json::json!({
let first = json!({
"id":0,
"isActive":false,
"balance":"$2,668.55",

View File

@ -1,7 +1,8 @@
use serde_json::json;
use meili_snap::snapshot;
use crate::common::encoder::Encoder;
use crate::common::{GetAllDocumentsOptions, Server};
use crate::json;
#[actix_rt::test]
async fn error_document_update_create_index_bad_uid() {
@ -84,7 +85,13 @@ async fn update_document() {
let (response, code) = index.get_document(1, None).await;
assert_eq!(code, 200);
assert_eq!(response.to_string(), r##"{"doc_id":1,"content":"foo","other":"bar"}"##);
snapshot!(response, @r###"
{
"doc_id": 1,
"content": "foo",
"other": "bar"
}
"###);
}
#[actix_rt::test]
@ -122,7 +129,13 @@ async fn update_document_gzip_encoded() {
let (response, code) = index.get_document(1, None).await;
assert_eq!(code, 200);
assert_eq!(response.to_string(), r##"{"doc_id":1,"content":"foo","other":"bar"}"##);
snapshot!(response, @r###"
{
"doc_id": 1,
"content": "foo",
"other": "bar"
}
"###);
}
#[actix_rt::test]

View File

@ -2,10 +2,10 @@ mod data;
use meili_snap::{json_string, snapshot};
use meilisearch::Opt;
use serde_json::json;
use self::data::GetDump;
use crate::common::{default_settings, GetAllDocumentsOptions, Server};
use crate::json;
// all the following test are ignored on windows. See #2364
#[actix_rt::test]

View File

@ -1,6 +1,5 @@
use serde_json::json;
use crate::common::Server;
use crate::json;
/// Feature name to test against.
/// This will have to be changed by a different one when that feature is stabilized.

View File

@ -2,10 +2,10 @@ use actix_web::http::header::ContentType;
use actix_web::test;
use http::header::ACCEPT_ENCODING;
use meili_snap::{json_string, snapshot};
use serde_json::{json, Value};
use crate::common::encoder::Encoder;
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
#[actix_rt::test]
async fn create_index_no_primary_key() {
@ -21,7 +21,7 @@ async fn create_index_no_primary_key() {
assert_eq!(response["status"], "succeeded");
assert_eq!(response["type"], "indexCreation");
assert_eq!(response["details"]["primaryKey"], Value::Null);
assert_eq!(response["details"]["primaryKey"], json!(null));
}
#[actix_rt::test]
@ -38,7 +38,7 @@ async fn create_index_with_gzip_encoded_request() {
assert_eq!(response["status"], "succeeded");
assert_eq!(response["type"], "indexCreation");
assert_eq!(response["details"]["primaryKey"], Value::Null);
assert_eq!(response["details"]["primaryKey"], json!(null));
}
#[actix_rt::test]
@ -86,7 +86,7 @@ async fn create_index_with_zlib_encoded_request() {
assert_eq!(response["status"], "succeeded");
assert_eq!(response["type"], "indexCreation");
assert_eq!(response["details"]["primaryKey"], Value::Null);
assert_eq!(response["details"]["primaryKey"], json!(null));
}
#[actix_rt::test]
@ -103,7 +103,7 @@ async fn create_index_with_brotli_encoded_request() {
assert_eq!(response["status"], "succeeded");
assert_eq!(response["type"], "indexCreation");
assert_eq!(response["details"]["primaryKey"], Value::Null);
assert_eq!(response["details"]["primaryKey"], json!(null));
}
#[actix_rt::test]
@ -136,7 +136,7 @@ async fn create_index_with_invalid_primary_key() {
let (response, code) = index.get().await;
assert_eq!(code, 200);
assert_eq!(response["primaryKey"], Value::Null);
assert_eq!(response["primaryKey"], json!(null));
}
#[actix_rt::test]

View File

@ -1,6 +1,5 @@
use serde_json::json;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn create_and_delete_index() {

View File

@ -1,7 +1,7 @@
use meili_snap::*;
use serde_json::json;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn get_indexes_bad_offset() {

View File

@ -1,6 +1,5 @@
use serde_json::json;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn stats() {

View File

@ -1,9 +1,9 @@
use serde_json::json;
use time::format_description::well_known::Rfc3339;
use time::OffsetDateTime;
use crate::common::encoder::Encoder;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn update_primary_key() {

View File

@ -1,8 +1,8 @@
use meili_snap::*;
use serde_json::json;
use super::DOCUMENTS;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn search_unexisting_index() {

View File

@ -1,8 +1,8 @@
use meili_snap::snapshot;
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
pub(self) static DOCUMENTS: Lazy<Value> = Lazy::new(|| {
json!([

View File

@ -1,8 +1,8 @@
use insta::{allow_duplicates, assert_json_snapshot};
use serde_json::json;
use super::*;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn formatted_contain_wildcard() {

View File

@ -1,7 +1,8 @@
use meili_snap::{json_string, snapshot};
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
pub(self) static DOCUMENTS: Lazy<Value> = Lazy::new(|| {
json!([
@ -60,3 +61,59 @@ async fn geo_sort_with_geo_strings() {
)
.await;
}
#[actix_rt::test]
async fn geo_bounding_box_with_string_and_number() {
let server = Server::new().await;
let index = server.index("test");
let documents = DOCUMENTS.clone();
index.update_settings_filterable_attributes(json!(["_geo"])).await;
index.update_settings_sortable_attributes(json!(["_geo"])).await;
index.add_documents(documents, None).await;
index.wait_task(2).await;
index
.search(
json!({
"filter": "_geoBoundingBox([89, 179], [-89, -179])",
}),
|response, code| {
assert_eq!(code, 200, "{}", response);
snapshot!(json_string!(response, { ".processingTimeMs" => "[time]" }), @r###"
{
"hits": [
{
"id": 1,
"name": "Taco Truck",
"address": "444 Salsa Street, Burritoville",
"type": "Mexican",
"rating": 9,
"_geo": {
"lat": 34.0522,
"lng": -118.2437
}
},
{
"id": 2,
"name": "La Bella Italia",
"address": "456 Elm Street, Townsville",
"type": "Italian",
"rating": 9,
"_geo": {
"lat": "45.4777599",
"lng": "9.1967508"
}
}
],
"query": "",
"processingTimeMs": "[time]",
"limit": 20,
"offset": 0,
"estimatedTotalHits": 2
}
"###);
},
)
.await;
}

View File

@ -10,9 +10,9 @@ mod pagination;
mod restrict_searchable;
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
pub(self) static DOCUMENTS: Lazy<Value> = Lazy::new(|| {
json!([
@ -1104,3 +1104,59 @@ async fn camelcased_words() {
})
.await;
}
#[actix_rt::test]
async fn simple_search_with_strange_synonyms() {
let server = Server::new().await;
let index = server.index("test");
index.update_settings(json!({ "synonyms": {"&": ["to"], "to": ["&"]} })).await;
let r = index.wait_task(0).await;
meili_snap::snapshot!(r["status"], @r###""succeeded""###);
let documents = DOCUMENTS.clone();
index.add_documents(documents, None).await;
index.wait_task(1).await;
index
.search(json!({"q": "How to train"}), |response, code| {
meili_snap::snapshot!(code, @"200 OK");
meili_snap::snapshot!(meili_snap::json_string!(response["hits"]), @r###"
[
{
"title": "How to Train Your Dragon: The Hidden World",
"id": "166428"
}
]
"###);
})
.await;
index
.search(json!({"q": "How & train"}), |response, code| {
meili_snap::snapshot!(code, @"200 OK");
meili_snap::snapshot!(meili_snap::json_string!(response["hits"]), @r###"
[
{
"title": "How to Train Your Dragon: The Hidden World",
"id": "166428"
}
]
"###);
})
.await;
index
.search(json!({"q": "to"}), |response, code| {
meili_snap::snapshot!(code, @"200 OK");
meili_snap::snapshot!(meili_snap::json_string!(response["hits"]), @r###"
[
{
"title": "How to Train Your Dragon: The Hidden World",
"id": "166428"
}
]
"###);
})
.await;
}

View File

@ -1,8 +1,8 @@
use meili_snap::{json_string, snapshot};
use serde_json::json;
use super::{DOCUMENTS, NESTED_DOCUMENTS};
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn search_empty_list() {

View File

@ -1,6 +1,5 @@
use serde_json::json;
use crate::common::Server;
use crate::json;
use crate::search::DOCUMENTS;
#[actix_rt::test]

View File

@ -1,9 +1,9 @@
use meili_snap::{json_string, snapshot};
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use crate::common::index::Index;
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
async fn index_with_documents<'a>(server: &'a Server, documents: &Value) -> Index<'a> {
let index = server.index("test");

View File

@ -1,6 +1,5 @@
use serde_json::json;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn set_and_reset_distinct_attribute() {

View File

@ -1,7 +1,7 @@
use meili_snap::*;
use serde_json::json;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn settings_bad_displayed_attributes() {

View File

@ -1,16 +1,16 @@
use std::collections::HashMap;
use once_cell::sync::Lazy;
use serde_json::{json, Value};
use crate::common::Server;
use crate::common::{Server, Value};
use crate::json;
static DEFAULT_SETTINGS_VALUES: Lazy<HashMap<&'static str, Value>> = Lazy::new(|| {
let mut map = HashMap::new();
map.insert("displayed_attributes", json!(["*"]));
map.insert("searchable_attributes", json!(["*"]));
map.insert("filterable_attributes", json!([]));
map.insert("distinct_attribute", json!(Value::Null));
map.insert("distinct_attribute", json!(null));
map.insert(
"ranking_rules",
json!(["words", "typo", "proximity", "attribute", "sort", "exactness"]),
@ -229,7 +229,7 @@ macro_rules! test_setting_routes {
.chars()
.map(|c| if c == '_' { '-' } else { c })
.collect::<String>());
let (response, code) = server.service.$write_method(url, serde_json::Value::Null).await;
let (response, code) = server.service.$write_method(url, serde_json::Value::Null.into()).await;
assert_eq!(code, 202, "{}", response);
server.index("").wait_task(0).await;
let (response, code) = server.index("test").get().await;

View File

@ -1,7 +1,7 @@
use meili_snap::{json_string, snapshot};
use serde_json::json;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn set_and_reset() {

View File

@ -1,11 +1,13 @@
use std::time::Duration;
use actix_rt::time::sleep;
use meili_snap::{json_string, snapshot};
use meilisearch::option::ScheduleSnapshot;
use meilisearch::Opt;
use crate::common::server::default_settings;
use crate::common::{GetAllDocumentsOptions, Server};
use crate::json;
macro_rules! verify_snapshot {
(
@ -44,7 +46,7 @@ async fn perform_snapshot() {
let index = server.index("test");
index
.update_settings(serde_json::json! ({
.update_settings(json! ({
"searchableAttributes": [],
}))
.await;
@ -90,3 +92,95 @@ async fn perform_snapshot() {
server.index("test1").settings(),
);
}
#[actix_rt::test]
async fn perform_on_demand_snapshot() {
let temp = tempfile::tempdir().unwrap();
let snapshot_dir = tempfile::tempdir().unwrap();
let options =
Opt { snapshot_dir: snapshot_dir.path().to_owned(), ..default_settings(temp.path()) };
let server = Server::new_with_options(options).await.unwrap();
let index = server.index("catto");
index
.update_settings(json! ({
"searchableAttributes": [],
}))
.await;
index.load_test_set().await;
server.index("doggo").create(Some("bone")).await;
index.wait_task(2).await;
server.index("doggo").create(Some("bone")).await;
index.wait_task(2).await;
let (task, code) = server.create_snapshot().await;
snapshot!(code, @"202 Accepted");
snapshot!(json_string!(task, { ".enqueuedAt" => "[date]" }), @r###"
{
"taskUid": 4,
"indexUid": null,
"status": "enqueued",
"type": "snapshotCreation",
"enqueuedAt": "[date]"
}
"###);
let task = index.wait_task(task.uid()).await;
snapshot!(json_string!(task, { ".enqueuedAt" => "[date]", ".startedAt" => "[date]", ".finishedAt" => "[date]", ".duration" => "[duration]" }), @r###"
{
"uid": 4,
"indexUid": null,
"status": "succeeded",
"type": "snapshotCreation",
"canceledBy": null,
"error": null,
"duration": "[duration]",
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]"
}
"###);
let temp = tempfile::tempdir().unwrap();
let snapshots: Vec<String> = std::fs::read_dir(&snapshot_dir)
.unwrap()
.map(|entry| entry.unwrap().path().file_name().unwrap().to_str().unwrap().to_string())
.collect();
meili_snap::snapshot!(format!("{snapshots:?}"), @r###"["db.snapshot"]"###);
let snapshot_path = snapshot_dir.path().to_owned().join("db.snapshot");
#[cfg_attr(windows, allow(unused))]
let snapshot_meta = std::fs::metadata(&snapshot_path).unwrap();
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
let mode = snapshot_meta.permissions().mode();
// rwxrwxrwx
meili_snap::snapshot!(format!("{:b}", mode), @"1000000100100100");
}
let options = Opt { import_snapshot: Some(snapshot_path), ..default_settings(temp.path()) };
let snapshot_server = Server::new_with_options(options).await.unwrap();
verify_snapshot!(server, snapshot_server, |server| =>
server.list_indexes(None, None),
// for some reason the db sizes differ. this may be due to the compaction options we have
// set when performing the snapshot
//server.stats(),
// The original instance contains the snapshotCreation task, while the snapshotted-instance does not. For this reason we need to compare the task queue **after** the task 4
server.tasks_filter("?from=2"),
server.index("catto").get_all_documents(GetAllDocumentsOptions::default()),
server.index("catto").settings(),
server.index("doggo").get_all_documents(GetAllDocumentsOptions::default()),
server.index("doggo").settings(),
);
}

View File

@ -1,8 +1,8 @@
use serde_json::json;
use time::format_description::well_known::Rfc3339;
use time::OffsetDateTime;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn get_settings_unexisting_index() {

View File

@ -1,7 +1,7 @@
use meili_snap::*;
use serde_json::json;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn swap_indexes_bad_format() {

View File

@ -1,9 +1,9 @@
mod errors;
use meili_snap::{json_string, snapshot};
use serde_json::json;
use crate::common::{GetAllDocumentsOptions, Server};
use crate::json;
#[actix_rt::test]
async fn swap_indexes() {

View File

@ -1,11 +1,11 @@
mod errors;
use meili_snap::insta::assert_json_snapshot;
use serde_json::json;
use time::format_description::well_known::Rfc3339;
use time::OffsetDateTime;
use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn error_get_unexisting_task_status() {
@ -33,7 +33,7 @@ async fn get_task_status() {
index.create(None).await;
index
.add_documents(
serde_json::json!([{
json!([{
"id": 1,
"content": "foobar",
}]),

View File

@ -17,10 +17,10 @@ bincode = "1.3.3"
bstr = "1.4.0"
bytemuck = { version = "1.13.1", features = ["extern_crate_alloc"] }
byteorder = "1.4.3"
charabia = { version = "0.8.2", default-features = false }
charabia = { version = "0.8.3", default-features = false }
concat-arrays = "0.1.2"
crossbeam-channel = "0.5.8"
deserr = "0.5.0"
deserr = { version = "0.6.0", features = ["actix-web"]}
either = { version = "1.8.1", features = ["serde"] }
flatten-serde-json = { path = "../flatten-serde-json" }
fst = "0.4.7"
@ -32,18 +32,18 @@ grenad = { version = "0.4.4", default-features = false, features = [
heed = { git = "https://github.com/meilisearch/heed", tag = "v0.12.7", default-features = false, features = [
"lmdb", "read-txn-no-tls"
] }
indexmap = { version = "1.9.3", features = ["serde"] }
indexmap = { version = "2.0.0", features = ["serde"] }
instant-distance = { version = "0.6.1", features = ["with-serde"] }
json-depth-checker = { path = "../json-depth-checker" }
levenshtein_automata = { version = "0.2.1", features = ["fst_automaton"] }
memmap2 = "0.5.10"
memmap2 = "0.7.1"
obkv = "0.2.0"
once_cell = "1.17.1"
ordered-float = "3.6.0"
rand_pcg = { version = "0.3.1", features = ["serde1"] }
rayon = "1.7.0"
roaring = "0.10.1"
rstar = { version = "0.10.0", features = ["serde"] }
rstar = { version = "0.11.0", features = ["serde"] }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
slice-group-by = "0.3.0"
@ -63,7 +63,7 @@ uuid = { version = "1.3.1", features = ["v4"] }
filter-parser = { path = "../filter-parser" }
# documents words self-join
itertools = "0.10.5"
itertools = "0.11.0"
# profiling
puffin = "0.16.0"

View File

@ -122,22 +122,28 @@ only composed of alphanumeric characters (a-z A-Z 0-9), hyphens (-) and undersco
.field,
match .valid_fields.is_empty() {
true => "This index does not have configured sortable attributes.".to_string(),
false => format!("Available sortable attributes are: `{}`.",
valid_fields.iter().map(AsRef::as_ref).collect::<Vec<&str>>().join(", ")
false => format!("Available sortable attributes are: `{}{}`.",
valid_fields.iter().map(AsRef::as_ref).collect::<Vec<&str>>().join(", "),
.hidden_fields.then_some(", <..hidden-attributes>").unwrap_or(""),
),
}
)]
InvalidSortableAttribute { field: String, valid_fields: BTreeSet<String> },
InvalidSortableAttribute { field: String, valid_fields: BTreeSet<String>, hidden_fields: bool },
#[error("Attribute `{}` is not facet-searchable. {}",
.field,
match .valid_fields.is_empty() {
true => "This index does not have configured facet-searchable attributes. To make it facet-searchable add it to the `filterableAttributes` index settings.".to_string(),
false => format!("Available facet-searchable attributes are: `{}`. To make it facet-searchable add it to the `filterableAttributes` index settings.",
valid_fields.iter().map(AsRef::as_ref).collect::<Vec<&str>>().join(", ")
false => format!("Available facet-searchable attributes are: `{}{}`. To make it facet-searchable add it to the `filterableAttributes` index settings.",
valid_fields.iter().map(AsRef::as_ref).collect::<Vec<&str>>().join(", "),
.hidden_fields.then_some(", <..hidden-attributes>").unwrap_or(""),
),
}
)]
InvalidFacetSearchFacetName { field: String, valid_fields: BTreeSet<String> },
InvalidFacetSearchFacetName {
field: String,
valid_fields: BTreeSet<String>,
hidden_fields: bool,
},
#[error("Attribute `{}` is not searchable. Available searchable attributes are: `{}{}`.",
.field,
.valid_fields.iter().map(AsRef::as_ref).collect::<Vec<&str>>().join(", "),
@ -340,8 +346,11 @@ fn conditionally_lookup_for_error_message() {
];
for (list, suffix) in messages {
let err =
UserError::InvalidSortableAttribute { field: "name".to_string(), valid_fields: list };
let err = UserError::InvalidSortableAttribute {
field: "name".to_string(),
valid_fields: list,
hidden_fields: false,
};
assert_eq!(err.to_string(), format!("{} {}", prefix, suffix));
}

View File

@ -655,6 +655,26 @@ impl Index {
}
}
/* remove hidden fields */
pub fn remove_hidden_fields(
&self,
rtxn: &RoTxn,
fields: impl IntoIterator<Item = impl AsRef<str>>,
) -> Result<(BTreeSet<String>, bool)> {
let mut valid_fields =
fields.into_iter().map(|f| f.as_ref().to_string()).collect::<BTreeSet<String>>();
let fields_len = valid_fields.len();
if let Some(dn) = self.displayed_fields(rtxn)? {
let displayable_names = dn.iter().map(|s| s.to_string()).collect();
valid_fields = &valid_fields & &displayable_names;
}
let hidden_fields = fields_len > valid_fields.len();
Ok((valid_fields, hidden_fields))
}
/* searchable fields */
/// Write the user defined searchable fields and generate the real searchable fields from the specified fields ids map.
@ -1820,11 +1840,11 @@ pub(crate) mod tests {
.unwrap();
index
.add_documents(documents!([
{ "id": 0, "_geo": { "lat": 0, "lng": 0 } },
{ "id": 1, "_geo": { "lat": 0, "lng": -175 } },
{ "id": 2, "_geo": { "lat": 0, "lng": 175 } },
{ "id": 0, "_geo": { "lat": "0", "lng": "0" } },
{ "id": 1, "_geo": { "lat": 0, "lng": "-175" } },
{ "id": 2, "_geo": { "lat": "0", "lng": 175 } },
{ "id": 3, "_geo": { "lat": 85, "lng": 0 } },
{ "id": 4, "_geo": { "lat": -85, "lng": 0 } },
{ "id": 4, "_geo": { "lat": "-85", "lng": "0" } },
]))
.unwrap();

View File

@ -97,7 +97,7 @@ const MAX_LMDB_KEY_LENGTH: usize = 500;
///
/// This number is determined by the keys of the different facet databases
/// and adding a margin of safety.
pub const MAX_FACET_VALUE_LENGTH: usize = MAX_LMDB_KEY_LENGTH - 20;
pub const MAX_FACET_VALUE_LENGTH: usize = MAX_LMDB_KEY_LENGTH - 32;
/// The maximum length a word can be
pub const MAX_WORD_LENGTH: usize = MAX_LMDB_KEY_LENGTH / 2;
@ -293,15 +293,15 @@ pub fn normalize_facet(original: &str) -> String {
#[derive(serde::Serialize, serde::Deserialize, Debug)]
#[serde(transparent)]
pub struct VectorOrArrayOfVectors {
#[serde(with = "either::serde_untagged")]
inner: either::Either<Vec<f32>, Vec<Vec<f32>>>,
#[serde(with = "either::serde_untagged_optional")]
inner: Option<either::Either<Vec<f32>, Vec<Vec<f32>>>>,
}
impl VectorOrArrayOfVectors {
pub fn into_array_of_vectors(self) -> Vec<Vec<f32>> {
match self.inner {
either::Either::Left(vector) => vec![vector],
either::Either::Right(vectors) => vectors,
pub fn into_array_of_vectors(self) -> Option<Vec<Vec<f32>>> {
match self.inner? {
either::Either::Left(vector) => Some(vec![vector]),
either::Either::Right(vectors) => Some(vectors),
}
}
}

View File

@ -280,9 +280,13 @@ impl<'a> SearchForFacetValues<'a> {
let filterable_fields = index.filterable_fields(rtxn)?;
if !filterable_fields.contains(&self.facet) {
let (valid_fields, hidden_fields) =
index.remove_hidden_fields(rtxn, filterable_fields)?;
return Err(UserError::InvalidFacetSearchFacetName {
field: self.facet.clone(),
valid_fields: filterable_fields.into_iter().collect(),
valid_fields,
hidden_fields,
}
.into());
}

View File

@ -91,11 +91,12 @@ pub fn bucket_sort<'ctx, Q: RankingRuleQueryTrait>(
/// Update the universes accordingly and inform the logger.
macro_rules! back {
() => {
assert!(
ranking_rule_universes[cur_ranking_rule_index].is_empty(),
"The ranking rule {} did not sort its bucket exhaustively",
ranking_rules[cur_ranking_rule_index].id()
);
// FIXME: temporarily disabled assert: see <https://github.com/meilisearch/meilisearch/pull/4013>
// assert!(
// ranking_rule_universes[cur_ranking_rule_index].is_empty(),
// "The ranking rule {} did not sort its bucket exhaustively",
// ranking_rules[cur_ranking_rule_index].id()
// );
logger.end_iteration_ranking_rule(
cur_ranking_rule_index,
ranking_rules[cur_ranking_rule_index].as_ref(),

View File

@ -418,19 +418,11 @@ impl<'t> Matcher<'t, '_> {
} else {
match &self.matches {
Some((tokens, matches)) => {
// If the text has to be cropped,
// compute the best interval to crop around.
let matches = match format_options.crop {
Some(crop_size) if crop_size > 0 => {
self.find_best_match_interval(matches, crop_size)
}
_ => matches,
};
// If the text has to be cropped,
// crop around the best interval.
let (byte_start, byte_end) = match format_options.crop {
Some(crop_size) if crop_size > 0 => {
let matches = self.find_best_match_interval(matches, crop_size);
self.crop_bounds(tokens, matches, crop_size)
}
_ => (0, self.text.len()),
@ -450,6 +442,11 @@ impl<'t> Matcher<'t, '_> {
for m in matches {
let token = &tokens[m.token_position];
// skip matches out of the crop window.
if token.byte_start < byte_start || token.byte_end > byte_end {
continue;
}
if byte_index < token.byte_start {
formatted.push(&self.text[byte_index..token.byte_start]);
}
@ -800,6 +797,37 @@ mod tests {
);
}
#[test]
fn format_highlight_crop_phrase_query() {
//! testing: https://github.com/meilisearch/meilisearch/issues/3975
let temp_index = TempIndex::new();
temp_index
.add_documents(documents!([
{ "id": 1, "text": "The groundbreaking invention had the power to split the world between those who embraced progress and those who resisted change!" }
]))
.unwrap();
let rtxn = temp_index.read_txn().unwrap();
let format_options = FormatOptions { highlight: true, crop: Some(10) };
let text = "The groundbreaking invention had the power to split the world between those who embraced progress and those who resisted change!";
let builder = MatcherBuilder::new_test(&rtxn, &temp_index, "\"the world\"");
let mut matcher = builder.build(text);
// should return 10 words with a marker at the start as well the end, and the highlighted matches.
insta::assert_snapshot!(
matcher.format(format_options),
@"…had the power to split <em>the</em> <em>world</em> between those who…"
);
let builder = MatcherBuilder::new_test(&rtxn, &temp_index, "those \"and those\"");
let mut matcher = builder.build(text);
// should highlight "those" and the phrase "and those".
insta::assert_snapshot!(
matcher.format(format_options),
@"…world between <em>those</em> who embraced progress <em>and</em> <em>those</em> who resisted…"
);
}
#[test]
fn smaller_crop_size() {
//! testing: https://github.com/meilisearch/specifications/pull/120#discussion_r836536295

View File

@ -20,7 +20,7 @@ mod sort;
#[cfg(test)]
mod tests;
use std::collections::{BTreeSet, HashSet};
use std::collections::HashSet;
use bucket_sort::{bucket_sort, BucketSortOutput};
use charabia::TokenizerBuilder;
@ -108,24 +108,11 @@ impl<'ctx> SearchContext<'ctx> {
(None, None) => continue,
// The field is not searchable => User error
(_fid, Some(false)) => {
let mut valid_fields: BTreeSet<_> =
fids_map.names().map(String::from).collect();
let (valid_fields, hidden_fields) = match searchable_names {
Some(sn) => self.index.remove_hidden_fields(self.txn, sn)?,
None => self.index.remove_hidden_fields(self.txn, fids_map.names())?,
};
// Filter by the searchable names
if let Some(sn) = searchable_names {
let searchable_names = sn.iter().map(|s| s.to_string()).collect();
valid_fields = &valid_fields & &searchable_names;
}
let searchable_count = valid_fields.len();
// Remove hidden fields
if let Some(dn) = self.index.displayed_fields(self.txn)? {
let displayable_names = dn.iter().map(|s| s.to_string()).collect();
valid_fields = &valid_fields & &displayable_names;
}
let hidden_fields = searchable_count > valid_fields.len();
let field = field_name.to_string();
return Err(UserError::InvalidSearchableAttribute {
field,
@ -604,16 +591,24 @@ fn check_sort_criteria(ctx: &SearchContext, sort_criteria: Option<&Vec<AscDesc>>
for asc_desc in sort_criteria {
match asc_desc.member() {
Member::Field(ref field) if !crate::is_faceted(field, &sortable_fields) => {
let (valid_fields, hidden_fields) =
ctx.index.remove_hidden_fields(ctx.txn, sortable_fields)?;
return Err(UserError::InvalidSortableAttribute {
field: field.to_string(),
valid_fields: sortable_fields.into_iter().collect(),
})?
valid_fields,
hidden_fields,
})?;
}
Member::Geo(_) if !sortable_fields.contains("_geo") => {
let (valid_fields, hidden_fields) =
ctx.index.remove_hidden_fields(ctx.txn, sortable_fields)?;
return Err(UserError::InvalidSortableAttribute {
field: "_geo".to_string(),
valid_fields: sortable_fields.into_iter().collect(),
})?
valid_fields,
hidden_fields,
})?;
}
_ => (),
}

View File

@ -94,7 +94,7 @@ use crate::heed_codec::facet::{FacetGroupKey, FacetGroupKeyCodec, FacetGroupValu
use crate::heed_codec::ByteSliceRefCodec;
use crate::update::index_documents::create_sorter;
use crate::update::merge_btreeset_string;
use crate::{BEU16StrCodec, Index, Result, BEU16};
use crate::{BEU16StrCodec, Index, Result, BEU16, MAX_FACET_VALUE_LENGTH};
pub mod bulk;
pub mod delete;
@ -191,7 +191,16 @@ impl<'i> FacetsUpdate<'i> {
for result in database.iter(wtxn)? {
let (facet_group_key, ()) = result?;
if let FacetGroupKey { field_id, level: 0, left_bound } = facet_group_key {
let normalized_facet = left_bound.normalize(&options);
let mut normalized_facet = left_bound.normalize(&options);
let normalized_truncated_facet: String;
if normalized_facet.len() > MAX_FACET_VALUE_LENGTH {
normalized_truncated_facet = normalized_facet
.char_indices()
.take_while(|(idx, _)| *idx < MAX_FACET_VALUE_LENGTH)
.map(|(_, c)| c)
.collect();
normalized_facet = normalized_truncated_facet.into();
}
let set = BTreeSet::from_iter(std::iter::once(left_bound));
let key = (field_id, normalized_facet.as_ref());
let key = BEU16StrCodec::bytes_encode(&key).ok_or(heed::Error::Encoding)?;

View File

@ -28,8 +28,8 @@ pub fn extract_docid_word_positions<R: io::Read + io::Seek>(
indexer: GrenadParameters,
searchable_fields: &Option<HashSet<FieldId>>,
stop_words: Option<&fst::Set<&[u8]>>,
allowed_separators: Option<&Vec<&str>>,
dictionary: Option<&Vec<&str>>,
allowed_separators: Option<&[&str]>,
dictionary: Option<&[&str]>,
max_positions_per_attributes: Option<u32>,
) -> Result<(RoaringBitmap, grenad::Reader<File>, ScriptLanguageDocidsMap)> {
puffin::profile_function!();
@ -55,12 +55,10 @@ pub fn extract_docid_word_positions<R: io::Read + io::Seek>(
tokenizer_builder.stop_words(stop_words);
}
if let Some(dictionary) = dictionary {
// let dictionary: Vec<_> = dictionary.iter().map(String::as_str).collect();
tokenizer_builder.words_dict(dictionary.as_slice());
tokenizer_builder.words_dict(dictionary);
}
if let Some(separators) = allowed_separators {
// let separators: Vec<_> = separators.iter().map(String::as_str).collect();
tokenizer_builder.separators(separators.as_slice());
tokenizer_builder.separators(separators);
}
let tokenizer = tokenizer_builder.build();
@ -228,9 +226,9 @@ fn process_tokens<'a>(
) -> impl Iterator<Item = (usize, Token<'a>)> {
tokens
.skip_while(|token| token.is_separator())
.scan((0, None), |(offset, prev_kind), token| {
.scan((0, None), |(offset, prev_kind), mut token| {
match token.kind {
TokenKind::Word | TokenKind::StopWord | TokenKind::Unknown => {
TokenKind::Word | TokenKind::StopWord if !token.lemma().is_empty() => {
*offset += match *prev_kind {
Some(TokenKind::Separator(SeparatorKind::Hard)) => 8,
Some(_) => 1,
@ -246,7 +244,7 @@ fn process_tokens<'a>(
{
*prev_kind = Some(token.kind);
}
_ => (),
_ => token.kind = TokenKind::Unknown,
}
Some((*offset, token))
})

View File

@ -46,7 +46,7 @@ pub fn extract_facet_string_docids<R: io::Read + io::Seek>(
if normalised_value.len() > MAX_FACET_VALUE_LENGTH {
normalised_truncated_value = normalised_value
.char_indices()
.take_while(|(idx, _)| idx + 4 < MAX_FACET_VALUE_LENGTH)
.take_while(|(idx, _)| *idx < MAX_FACET_VALUE_LENGTH)
.map(|(_, c)| c)
.collect();
normalised_value = normalised_truncated_value.as_str();

View File

@ -28,11 +28,13 @@ pub struct ExtractedFacetValues {
///
/// Returns the generated grenad reader containing the docid the fid and the orginal value as key
/// and the normalized value as value extracted from the given chunk of documents.
/// We need the fid of the geofields to correctly parse them as numbers if they were sent as strings initially.
#[logging_timer::time]
pub fn extract_fid_docid_facet_values<R: io::Read + io::Seek>(
obkv_documents: grenad::Reader<R>,
indexer: GrenadParameters,
faceted_fields: &HashSet<FieldId>,
geo_fields_ids: Option<(FieldId, FieldId)>,
) -> Result<ExtractedFacetValues> {
puffin::profile_function!();
@ -84,7 +86,10 @@ pub fn extract_fid_docid_facet_values<R: io::Read + io::Seek>(
let value = from_slice(field_bytes).map_err(InternalError::SerdeJson)?;
match extract_facet_values(&value) {
match extract_facet_values(
&value,
geo_fields_ids.map_or(false, |(lat, lng)| field_id == lat || field_id == lng),
) {
FilterableValues::Null => {
facet_is_null_docids.entry(field_id).or_default().insert(document);
}
@ -177,12 +182,13 @@ enum FilterableValues {
Values { numbers: Vec<f64>, strings: Vec<(String, String)> },
}
fn extract_facet_values(value: &Value) -> FilterableValues {
fn extract_facet_values(value: &Value, geo_field: bool) -> FilterableValues {
fn inner_extract_facet_values(
value: &Value,
can_recurse: bool,
output_numbers: &mut Vec<f64>,
output_strings: &mut Vec<(String, String)>,
geo_field: bool,
) {
match value {
Value::Null => (),
@ -193,13 +199,30 @@ fn extract_facet_values(value: &Value) -> FilterableValues {
}
}
Value::String(original) => {
// if we're working on a geofield it MUST be something we can parse or else there was an internal error
// in the enrich pipeline. But since the enrich pipeline worked, we want to avoid crashing at all costs.
if geo_field {
if let Ok(float) = original.parse() {
output_numbers.push(float);
} else {
log::warn!(
"Internal error, could not parse a geofield that has been validated. Please open an issue."
)
}
}
let normalized = crate::normalize_facet(original);
output_strings.push((normalized, original.clone()));
}
Value::Array(values) => {
if can_recurse {
for value in values {
inner_extract_facet_values(value, false, output_numbers, output_strings);
inner_extract_facet_values(
value,
false,
output_numbers,
output_strings,
geo_field,
);
}
}
}
@ -215,7 +238,7 @@ fn extract_facet_values(value: &Value) -> FilterableValues {
otherwise => {
let mut numbers = Vec::new();
let mut strings = Vec::new();
inner_extract_facet_values(otherwise, true, &mut numbers, &mut strings);
inner_extract_facet_values(otherwise, true, &mut numbers, &mut strings, geo_field);
FilterableValues::Values { numbers, strings }
}
}

View File

@ -35,7 +35,7 @@ pub fn extract_vector_points<R: io::Read + io::Seek>(
// lazily get it when needed
let document_id = || -> Value {
let document_id = obkv.get(primary_key_id).unwrap();
serde_json::from_slice(document_id).unwrap()
from_slice(document_id).unwrap()
};
// first we retrieve the _vectors field
@ -52,12 +52,14 @@ pub fn extract_vector_points<R: io::Read + io::Seek>(
}
};
for (i, vector) in vectors.into_iter().enumerate().take(u16::MAX as usize) {
let index = u16::try_from(i).unwrap();
let mut key = docid_bytes.to_vec();
key.extend_from_slice(&index.to_be_bytes());
let bytes = cast_slice(&vector);
writer.insert(key, bytes)?;
if let Some(vectors) = vectors {
for (i, vector) in vectors.into_iter().enumerate().take(u16::MAX as usize) {
let index = u16::try_from(i).unwrap();
let mut key = docid_bytes.to_vec();
key.extend_from_slice(&index.to_be_bytes());
let bytes = cast_slice(&vector);
writer.insert(key, bytes)?;
}
}
}
// else => the `_vectors` object was `null`, there is nothing to do

View File

@ -49,8 +49,8 @@ pub(crate) fn data_from_obkv_documents(
geo_fields_ids: Option<(FieldId, FieldId)>,
vectors_field_id: Option<FieldId>,
stop_words: Option<fst::Set<&[u8]>>,
allowed_separators: Option<Vec<&str>>,
dictionary: Option<Vec<&str>>,
allowed_separators: Option<&[&str]>,
dictionary: Option<&[&str]>,
max_positions_per_attributes: Option<u32>,
exact_attributes: HashSet<FieldId>,
) -> Result<()> {
@ -59,7 +59,13 @@ pub(crate) fn data_from_obkv_documents(
original_obkv_chunks
.par_bridge()
.map(|original_documents_chunk| {
send_original_documents_data(original_documents_chunk, lmdb_writer_sx.clone())
send_original_documents_data(
original_documents_chunk,
indexer,
lmdb_writer_sx.clone(),
vectors_field_id,
primary_key_id,
)
})
.collect::<Result<()>>()?;
@ -76,7 +82,6 @@ pub(crate) fn data_from_obkv_documents(
&faceted_fields,
primary_key_id,
geo_fields_ids,
vectors_field_id,
&stop_words,
&allowed_separators,
&dictionary,
@ -265,11 +270,33 @@ fn spawn_extraction_task<FE, FS, M>(
/// - documents
fn send_original_documents_data(
original_documents_chunk: Result<grenad::Reader<File>>,
indexer: GrenadParameters,
lmdb_writer_sx: Sender<Result<TypedChunk>>,
vectors_field_id: Option<FieldId>,
primary_key_id: FieldId,
) -> Result<()> {
let original_documents_chunk =
original_documents_chunk.and_then(|c| unsafe { as_cloneable_grenad(&c) })?;
if let Some(vectors_field_id) = vectors_field_id {
let documents_chunk_cloned = original_documents_chunk.clone();
let lmdb_writer_sx_cloned = lmdb_writer_sx.clone();
rayon::spawn(move || {
let result = extract_vector_points(
documents_chunk_cloned,
indexer,
primary_key_id,
vectors_field_id,
);
let _ = match result {
Ok(vector_points) => {
lmdb_writer_sx_cloned.send(Ok(TypedChunk::VectorPoints(vector_points)))
}
Err(error) => lmdb_writer_sx_cloned.send(Err(error)),
};
});
}
// TODO: create a custom internal error
lmdb_writer_sx.send(Ok(TypedChunk::Documents(original_documents_chunk))).unwrap();
Ok(())
@ -291,10 +318,9 @@ fn send_and_extract_flattened_documents_data(
faceted_fields: &HashSet<FieldId>,
primary_key_id: FieldId,
geo_fields_ids: Option<(FieldId, FieldId)>,
vectors_field_id: Option<FieldId>,
stop_words: &Option<fst::Set<&[u8]>>,
allowed_separators: &Option<Vec<&str>>,
dictionary: &Option<Vec<&str>>,
allowed_separators: &Option<&[&str]>,
dictionary: &Option<&[&str]>,
max_positions_per_attributes: Option<u32>,
) -> Result<(
grenad::Reader<CursorClonableMmap>,
@ -322,25 +348,6 @@ fn send_and_extract_flattened_documents_data(
});
}
if let Some(vectors_field_id) = vectors_field_id {
let documents_chunk_cloned = flattened_documents_chunk.clone();
let lmdb_writer_sx_cloned = lmdb_writer_sx.clone();
rayon::spawn(move || {
let result = extract_vector_points(
documents_chunk_cloned,
indexer,
primary_key_id,
vectors_field_id,
);
let _ = match result {
Ok(vector_points) => {
lmdb_writer_sx_cloned.send(Ok(TypedChunk::VectorPoints(vector_points)))
}
Err(error) => lmdb_writer_sx_cloned.send(Err(error)),
};
});
}
let (docid_word_positions_chunk, docid_fid_facet_values_chunks): (Result<_>, Result<_>) =
rayon::join(
|| {
@ -350,8 +357,8 @@ fn send_and_extract_flattened_documents_data(
indexer,
searchable_fields,
stop_words.as_ref(),
allowed_separators.as_ref(),
dictionary.as_ref(),
*allowed_separators,
*dictionary,
max_positions_per_attributes,
)?;
@ -378,6 +385,7 @@ fn send_and_extract_flattened_documents_data(
flattened_documents_chunk.clone(),
indexer,
faceted_fields,
geo_fields_ids,
)?;
// send docid_fid_facet_numbers_chunk to DB writer

View File

@ -359,8 +359,8 @@ where
geo_fields_ids,
vectors_field_id,
stop_words,
separators,
dictionary,
separators.as_deref(),
dictionary.as_deref(),
max_positions_per_attributes,
exact_attributes,
)
@ -2550,6 +2550,25 @@ mod tests {
db_snap!(index, word_position_docids, 3, @"74f556b91d161d997a89468b4da1cb8f");
}
/// Index multiple different number of vectors in documents.
/// Vectors must be of the same length.
#[test]
fn test_multiple_vectors() {
let index = TempIndex::new();
index.add_documents(documents!([{"id": 0, "_vectors": [[0, 1, 2], [3, 4, 5]] }])).unwrap();
index.add_documents(documents!([{"id": 1, "_vectors": [6, 7, 8] }])).unwrap();
index
.add_documents(
documents!([{"id": 2, "_vectors": [[9, 10, 11], [12, 13, 14], [15, 16, 17]] }]),
)
.unwrap();
let rtxn = index.read_txn().unwrap();
let res = index.search(&rtxn).vector([0.0, 1.0, 2.0]).execute().unwrap();
assert_eq!(res.documents_ids.len(), 3);
}
#[test]
fn reproduce_the_bug() {
/*

View File

@ -573,7 +573,7 @@ impl<'a, 't, 'u, 'i> Settings<'a, 't, 'u, 'i> {
tokenizer
.tokenize(text)
.filter_map(|token| {
if token.is_word() {
if token.is_word() && !token.lemma().is_empty() {
Some(token.lemma().to_string())
} else {
None
@ -608,13 +608,18 @@ impl<'a, 't, 'u, 'i> Settings<'a, 't, 'u, 'i> {
for (word, synonyms) in user_synonyms {
// Normalize both the word and associated synonyms.
let normalized_word = normalize(&tokenizer, word);
let normalized_synonyms =
synonyms.iter().map(|synonym| normalize(&tokenizer, synonym));
let normalized_synonyms: Vec<_> = synonyms
.iter()
.map(|synonym| normalize(&tokenizer, synonym))
.filter(|synonym| !synonym.is_empty())
.collect();
// Store the normalized synonyms under the normalized word,
// merging the possible duplicate words.
let entry = new_synonyms.entry(normalized_word).or_insert_with(Vec::new);
entry.extend(normalized_synonyms);
if !normalized_word.is_empty() && !normalized_synonyms.is_empty() {
let entry = new_synonyms.entry(normalized_word).or_insert_with(Vec::new);
entry.extend(normalized_synonyms.into_iter());
}
}
// Make sure that we don't have duplicate synonyms.
@ -1422,6 +1427,43 @@ mod tests {
assert!(result.documents_ids.is_empty());
}
#[test]
fn thai_synonyms() {
let mut index = TempIndex::new();
index.index_documents_config.autogenerate_docids = true;
let mut wtxn = index.write_txn().unwrap();
// Send 3 documents with ids from 1 to 3.
index
.add_documents_using_wtxn(
&mut wtxn,
documents!([
{ "name": "ยี่ปุ่น" },
{ "name": "ญี่ปุ่น" },
]),
)
.unwrap();
// In the same transaction provide some synonyms
index
.update_settings_using_wtxn(&mut wtxn, |settings| {
settings.set_synonyms(btreemap! {
"japanese".to_string() => vec![S("ญี่ปุ่น"), S("ยี่ปุ่น")],
});
})
.unwrap();
wtxn.commit().unwrap();
// Ensure synonyms are effectively stored
let rtxn = index.read_txn().unwrap();
let synonyms = index.synonyms(&rtxn).unwrap();
assert!(!synonyms.is_empty()); // at this point the index should return something
// Check that we can use synonyms
let result = index.search(&rtxn).query("japanese").execute().unwrap();
assert_eq!(result.documents_ids.len(), 2);
}
#[test]
fn setting_searchable_recomputes_other_settings() {
let index = TempIndex::new();

View File

@ -186,12 +186,16 @@ fn create_value(value: &Document, mut selectors: HashSet<&str>) -> Document {
let array = create_array(array, &sub_selectors);
if !array.is_empty() {
new_value.insert(key.to_string(), array.into());
} else {
new_value.insert(key.to_string(), Value::Array(vec![]));
}
}
Value::Object(object) => {
let object = create_value(object, sub_selectors);
if !object.is_empty() {
new_value.insert(key.to_string(), object.into());
} else {
new_value.insert(key.to_string(), Value::Object(Map::new()));
}
}
_ => (),
@ -211,6 +215,8 @@ fn create_array(array: &[Value], selectors: &HashSet<&str>) -> Vec<Value> {
let array = create_array(array, selectors);
if !array.is_empty() {
res.push(array.into());
} else {
res.push(Value::Array(vec![]));
}
}
Value::Object(object) => {
@ -637,6 +643,24 @@ mod tests {
);
}
#[test]
fn empty_array_object_return_empty() {
let value: Value = json!({
"array": [],
"object": {},
});
let value: &Document = value.as_object().unwrap();
let res: Value = select_values(value, vec!["array.name", "object.name"]).into();
assert_eq!(
res,
json!({
"array": [],
"object": {},
})
);
}
#[test]
fn all_conflict_variation() {
let value: Value = json!({