Compare commits

...

1132 Commits

Author SHA1 Message Date
13025594a8 Merge #5041
5041: Update version for the next release (v1.11.1) in Cargo.toml r=dureuill a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: dureuill <dureuill@users.noreply.github.com>
2024-11-06 11:35:26 +00:00
2c1c33166d Merge #5039
5039: Add 3s timeout to embedding requests made during search r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #5032 

## What does this PR do?
- Add a 3-second timeout to embedding requests made against a remote embedder in the context of search. The timeout triggers when requests are failing due to rate limiting (see the sketch below).
- Add a test of that timeout.
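
A minimal sketch of how such a deadline could be enforced, assuming a blocking HTTP client like `ureq`; the function, endpoint, and payload shape are illustrative and not Meilisearch's actual embedder code.

```rust
// Illustrative only: give the remote embedder 3 seconds before giving up,
// instead of letting a rate-limited endpoint stall the whole search.
use std::time::Duration;

fn embed_with_deadline(url: &str, input: &str) -> Result<ureq::Response, ureq::Error> {
    let agent = ureq::AgentBuilder::new()
        // Overall request timeout of 3 seconds.
        .timeout(Duration::from_secs(3))
        .build();
    agent.post(url).send_json(serde_json::json!({ "input": input }))
}
```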

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-06 10:56:50 +00:00
cdb6e3f45a Update version for the next release (v1.11.1) in Cargo.toml 2024-11-06 08:35:51 +00:00
1d574bd443 Add test 2024-11-06 09:25:41 +01:00
37a4fd7f99 Add deadline of 3 seconds to embedding requests made in the context of hybrid search 2024-11-06 09:25:24 +01:00
3753f87fd8 Merge #5011
5011: Revamp analytics r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5009

## What does this PR do?
- Force every analytics event to go through a trait that forces you to handle aggregation correctly (see the sketch after this list)
- Put the code that retrieves the `user-agent`, `timestamp` and `requests.total_received` in common between all aggregates, so there is no room for mistakes
- Get rid of all the different channels for each kind of event in favor of an any-map
- Ensure that we never [send an empty event ever again](https://github.com/meilisearch/meilisearch/pull/5001)
- Merge all the sub-settings routes into one global « Settings Updated » event.
- Fix: when using one of the three following features, we were not sending any analytics if they were set from the global route:
  - /non-separator-tokens
  - /separator-tokens
  - /dictionary
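
A hypothetical sketch of what such an aggregation trait could look like; names and signatures are illustrative, not Meilisearch's actual API.

```rust
// Every analytics event goes through this trait, so aggregation cannot be forgotten.
pub trait Aggregate: 'static + Send {
    /// Name of the event sent to the analytics backend.
    fn event_name(&self) -> &'static str;

    /// Merge a newly received event into the aggregate built so far.
    fn aggregate(self: Box<Self>, new: Box<Self>) -> Box<Self>;

    /// Turn the final aggregate into the JSON payload that gets sent.
    fn into_event(self: Box<Self>) -> serde_json::Value;
}
```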

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-10-21 15:08:49 +00:00
5675585fe8 move all the searches structures to new modules 2024-10-20 17:54:43 +02:00
af589c85ec reverse all the settings to keep the last one received instead of the first one received in case we receive the same setting multiple times 2024-10-20 17:40:31 +02:00
ac919df37d simplify the trait a bit more by getting rid of the downcast_aggregate method 2024-10-20 17:36:29 +02:00
73b5722896 rename the other parameter of the aggregate method to new to avoid confusion 2024-10-20 17:31:35 +02:00
c94679bde6 apply review comments 2024-10-20 17:24:12 +02:00
89e2d2b2b9 fix the doctest 2024-10-17 13:55:49 +02:00
3a7a20c716 remove the segment feature and always import segment 2024-10-17 11:21:14 +02:00
fa1db6b721 fix the tests 2024-10-17 09:55:30 +02:00
1ab6fec903 send all experimental features in the info event including the runtime one 2024-10-17 09:49:21 +02:00
18ac4032aa Remove the experimental feature seen 2024-10-17 09:35:11 +02:00
d9115b74f0 move the analytics settings code to a dedicated file 2024-10-17 09:32:54 +02:00
0fde49640a make clippy happy 2024-10-17 09:18:25 +02:00
4ee65d870e remove a lot of unused code 2024-10-17 09:14:34 +02:00
ef77c7699b add the required shared values between all the events and fix the timestamp 2024-10-17 09:06:23 +02:00
7382fb21e4 fix the main 2024-10-17 08:38:11 +02:00
e4ace98004 fix all the routes + move to a better version of mopa 2024-10-17 01:04:25 +02:00
aa7a34ffe8 make the aggregate method send 2024-10-17 00:43:34 +02:00
6728cfbfac fix the analytics 2024-10-17 00:38:18 +02:00
ea6883189e finish the analytics in all the routes 2024-10-16 21:17:06 +02:00
fdeb47fb54 implements all routes 2024-10-16 17:16:33 +02:00
e66fccc3f2 get rid of the analytics closure 2024-10-16 15:51:48 +02:00
73e87c152a rewrite most of the analytics especially the settings 2024-10-16 15:43:27 +02:00
75b2f22add Merge #5008
5008: Display vectors when no custom vectors were ever provided r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes the issue reported on [Discord](https://discord.com/channels/1006923006964154428/1294653031958446080/1295336784896589967).

## What does this PR do?
- The normal behavior of Meilisearch is to hide `_vectors`, even when `retrieveVectors: true`, when there is an explicit list of displayed attributes that does not contain vectors
- However, this relied on a field id existing for the `_vectors` field, which wasn't the case when no `_vectors` was ever manually provided in documents. This is often the case for people using auto-embedders such as the OpenAI integration.
- This PR fixes the behavior by looking for the `_vectors` string in `displayedAttributes` when there is no `_vectors` field id (see the sketch below)
- This PR also adds a test for this specific situation, which failed before the PR and passes after it
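
A sketch of that fallback with made-up names; the real code works on field ids and the index's fields map.

```rust
// When no field id exists for `_vectors` (no document ever contained it),
// decide whether to show vectors by looking for the literal string in
// `displayedAttributes`.
fn show_vectors_without_fid(displayed_attributes: Option<&[String]>) -> bool {
    match displayed_attributes {
        // No explicit list: everything, including `_vectors`, is displayed.
        None => true,
        // Explicit list: only show `_vectors` if it is listed by name.
        Some(displayed) => displayed.iter().any(|attr| attr == "_vectors"),
    }
}
```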


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-10-15 13:08:47 +00:00
5a74d4729c Add test failing before this PR, OK now 2024-10-14 16:23:28 +02:00
e44e7b5e81 Fix retrieveVectors when explicitly passed in displayed attributes without any document containing _vectors 2024-10-14 16:17:19 +02:00
a0b3887709 Merge #5006
5006: Bring back changes from v1.10.3 r=Kerollmops a=irevoire

# Pull Request

## Related issue
Port the following PR to the latest version: https://github.com/meilisearch/meilisearch/pull/5000
See its description for more information

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-10-14 14:06:35 +00:00
4b4a6c7863 Update meilisearch/src/option.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-10-14 14:39:34 +02:00
3085092e04 Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-10-14 14:39:34 +02:00
c4efd1df4e Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-10-14 14:39:34 +02:00
c32282acb1 improve doc 2024-10-14 14:39:34 +02:00
92070a3578 Implement the experimental drop search after and nb search per core 2024-10-14 14:39:33 +02:00
a90563df3f Merge #5001
5001: Do not send empty edit document by function r=Kerollmops a=irevoire

# Pull Request

We realized that we had huge reported usage of the feature from users who hadn't enabled the feature at all. That shouldn't be possible.
After a thorough investigation with `@gmourier` 
![image](https://github.com/user-attachments/assets/eae3e851-dc5b-4616-80ee-7237a4871522)
we found the issue: it was in the engine.

## What does this PR do?
- Do not send the edit-documents-by-function event to Segment if no such event was received during this batch
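
A minimal sketch of the guard, with a hypothetical aggregate type; only the "skip empty batches" check matters here.

```rust
// Hypothetical aggregate for the edit-documents-by-function analytics event.
#[derive(Default)]
struct EditDocumentsByFunctionAggregate {
    total_received: usize,
}

fn event_to_send(aggregate: &EditDocumentsByFunctionAggregate) -> Option<serde_json::Value> {
    // No edit-by-function request was received during this batch:
    // do not send anything to Segment.
    if aggregate.total_received == 0 {
        return None;
    }
    Some(serde_json::json!({
        "requests": { "total_received": aggregate.total_received }
    }))
}
```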

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-10-11 08:27:16 +00:00
466604725e Do not send empty edit document by function 2024-10-10 23:47:15 +02:00
995394a516 Merge #4993
4993: Update mini-dashboard r=ManyTheFish a=curquiza

Remove the forced capitalization of attribute names

Co-authored-by: curquiza <clementine@meilisearch.com>
2024-10-10 05:57:45 +00:00
6e37ae8619 Update mini-dashboard 2024-10-09 19:13:14 +02:00
657c645603 Merge #4992
4992: fix the bad experimental search queue size r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes #4991 

## What does this PR do?
- Set the right default value for the experimental search queue size in the config file


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-10-09 10:45:48 +00:00
7f5d0837c3 fix the bad experimental search queue size 2024-10-09 11:46:57 +02:00
0566f2549d Merge #4972
4972: Add binary quantized to error messages r=irevoire a=dureuill

was missing in error messages

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-10-02 09:23:55 +00:00
0c2661ea90 Fix tests 2024-10-02 11:20:29 +02:00
62dfbd6255 Add binary quantized to allowed fields for source adds its sources 2024-10-02 11:20:02 +02:00
cc669f90d5 Merge #4971
4971: update arroy r=dureuill a=irevoire

# Pull Request

Fix part of https://github.com/meilisearch/meilisearch/issues/3715


## What does this PR do?
- Update arroy to the latest version; most changes are maintenance changes
- The performance of adding vectors to arroy should slightly improve
- Forward the build cancellation function to arroy so it can stop building trees when we have to stop an indexing process
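
An illustrative sketch of the cancellation forwarding in the last bullet; it does not use arroy's real API, only the closure-passing pattern.

```rust
// Forward an indexing "must stop" flag as the cancellation callback that the
// vector-index builder polls between trees.
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

fn build_trees(cancel: &dyn Fn() -> bool, nb_trees: usize) -> usize {
    let mut built = 0;
    for _ in 0..nb_trees {
        if cancel() {
            break; // the indexing process was aborted: stop building trees early
        }
        built += 1; // stands in for building one tree
    }
    built
}

fn main() {
    let must_stop = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&must_stop);
    let built = build_trees(&move || flag.load(Ordering::Relaxed), 10);
    println!("built {built} trees");
}
```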


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-10-02 05:53:51 +00:00
b1dc10e771 uses the new cancellation method in arroy 2024-10-01 17:45:49 +02:00
4b598fa648 update arroy 2024-10-01 17:31:12 +02:00
71b364286b Merge #4957
4957: Update charabia feature flags r=dureuill a=ManyTheFish

# Pull Request

Add charabia's `turkish` feature flag to Meilisearch's default tokenization features



[All tests pipeline](https://github.com/meilisearch/meilisearch/actions/runs/11030036031)

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-26 20:19:21 +00:00
86183e0807 Merge #4960
4960: Update rhai r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4956

A fix has been implemented in https://github.com/rhaiscript/rhai/issues/916

## What does this PR do?
- Use the latest version of rhai containing the fix

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-26 15:03:01 +00:00
78a4b7949d update rhai to a version that shouldn’t panic 2024-09-26 15:04:03 +02:00
dc2cb58cf1 use charabia default for all-tokenization 2024-09-25 11:12:30 +02:00
e9580fe619 Add turkish normalization 2024-09-25 11:03:17 +02:00
8205254f4c Merge #4955
4955: Upgrade "batch failed" log to error level r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4916 


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-25 08:18:44 +00:00
efdc5739d7 Merge #4953
4953: Move the multi arroy index logic to the arroy wrapper r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4948

## What does this PR do?
- Make the `ArroyWrapper` we introduced in the last PR handle all the embedders for a specific docid itself.


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-24 15:02:24 +00:00
b31e9bea26 while retrieving the readers on an arroywrapper, stops at the first empty reader 2024-09-24 16:33:17 +02:00
7f048b9732 early exit in the clear and contains 2024-09-24 15:02:38 +02:00
8b4e2c7b17 Remove now unused method 2024-09-24 15:00:25 +02:00
645a55317a merge the build and quantize method 2024-09-24 14:54:24 +02:00
8caf97db86 Merge #4954
4954: Fix bench by adding embedder r=ManyTheFish a=dureuill

Fix benchmark workloads following breaking change on embedders

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-24 12:53:34 +00:00
b8a74e0464 fix comments 2024-09-24 10:59:15 +02:00
fd8447c521 fix the del items thing 2024-09-24 10:52:05 +02:00
f2d187ba3e rename the index method to embedder_index 2024-09-24 10:39:40 +02:00
79d8a7a51a rename the embedder index for clarity 2024-09-24 10:36:28 +02:00
86da0e83fe Upgrade "batch failed" log to ERROR level 2024-09-24 10:02:53 +02:00
0704fb71e9 Fix bench by adding embedder 2024-09-24 09:56:47 +02:00
1e4d4e69c4 finish the arroywrapper 2024-09-23 18:56:15 +02:00
6ba4baecbf first ugly step 2024-09-23 15:15:26 +02:00
7f20c13f3f Merge #4943
4943: Correct broken links in README r=curquiza a=iornstein

# Pull Request

## Related issue
Fixes #4942

## What does this PR do?
- Corrects some broken links in the README. My suspicion is that some of these documentation articles were moved around without someone updating links in the README.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? _(well the contributing guidelines led me to create an issue first)_
- [x] Have you read the contributing guidelines? _yes_
- [x] Have you made sure that the title is accurate and descriptive of the changes? _yes_

Thank you so much for contributing to Meilisearch!


Co-authored-by: Ian Ornstein <ian.ornstein@gmail.com>
2024-09-19 19:22:04 +00:00
462a2329f1 Merge #4941
4941: Implement the binary quantization in meilisearch r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4873

## What does this PR do?
- Add a setting for binary quantization
- Once enabled, binary quantization cannot be disabled
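
For illustration, enabling it could look like the settings payload below, assuming the camelCase setting is named `binaryQuantized`; the embedder name and source are made up.

```rust
// Sketch of a settings payload enabling binary quantization on an embedder.
fn binary_quantization_settings() -> serde_json::Value {
    serde_json::json!({
        "embedders": {
            "default": {
                "source": "huggingFace",
                "binaryQuantized": true
            }
        }
    })
}
```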

TODO:
- [ ] Missing a bunch of tests

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-19 15:50:24 +00:00
afa3ae0cbd WIP 2024-09-19 17:42:52 +02:00
f6483cf15d apply review comment 2024-09-19 16:47:06 +02:00
bd34ed01d9 Merge #4945
4945: Add swedish in default pipelines r=dureuill a=ManyTheFish

# Summary
## Fix Swedish support

In Swedish, the characters `å`/`ä`/`ö` are completely different from `a` or `o` and should not be normalized to the same character.
Because the specialized Swedish pipeline was not activated by default, these characters were normalized even with the following settings:
```json
{
  "localizedAttributes": [ { "locales": ["swe"], "attributePatterns": ["*"] } ]
}
```

## Update Charabia adding German support

German segmentation will now be activated using the setting:
```json
{
  "localizedAttributes": [ { "locales": ["deu"], "attributePatterns": ["*"] } ]
}
```

# TODO

- [x] Activate Swedish Pipeline
- [x] Add a test to avoid future regressions
- [x] Update Charabia


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-19 14:42:03 +00:00
74199f328d Make clippy happy 2024-09-19 16:27:34 +02:00
1113c42de0 fix broken comments 2024-09-19 16:18:36 +02:00
465afe01b2 Add test for German 2024-09-19 16:09:01 +02:00
7d6768e4c4 Add german tokenization pipeline 2024-09-19 16:09:01 +02:00
f77661ec44 Update Charabia v0.9.1 2024-09-19 16:08:59 +02:00
b8fd85a46d Get rid of a useless collect before an iteration on the readers 2024-09-19 15:57:38 +02:00
fd43c6c404 Improve the error message explaining you can't un-bq an embedder 2024-09-19 15:51:29 +02:00
2564ec1496 Update milli/src/index.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 15:41:44 +02:00
b6b73fe41c Update milli/src/update/settings.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 15:41:14 +02:00
6dde41cc46 stop using a local version of arroy and instead point to the git repo with the rev 2024-09-19 15:25:38 +02:00
163f8023a1 remove debug println 2024-09-19 12:13:25 +02:00
2b120b89e4 update the test now that the embedder must be specified 2024-09-19 12:08:59 +02:00
84f842233d snapshots the embedder settings in the dump import with vector test 2024-09-19 12:00:58 +02:00
633537ccd7 fix updating documents without updating the settings 2024-09-19 12:00:58 +02:00
e8d7c00d30 add a test on the settings value 2024-09-19 12:00:58 +02:00
3f6301dbc9 fix the missing embedder name in the error message when trying to disable the binary quantization 2024-09-19 12:00:58 +02:00
ca71b63ed1 adds integration tests 2024-09-19 12:00:58 +02:00
2b6952eda1 rename the ArroyReader to an ArroyWrapper since it can read and write 2024-09-19 12:00:58 +02:00
79f29eed3c fix the tests and the arroy_readers method 2024-09-19 12:00:58 +02:00
cc45e264ca implement the binary quantization in meilisearch 2024-09-19 12:00:56 +02:00
5f474a640d Merge #4938
4938: Remove default embedder r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #4738 

## What does this PR do?

[See public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#1044b06b651f80edb9d4ef6dc367bad0)

- Remove `hybrid.embedder` boolean from analytics because embedder is now mandatory and so the boolean would always be `true`
- Rework search kind so that a search without query but with vector is a vector search regardless of (non-zero) semantic ratio


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 09:17:14 +00:00
bbaee3dbc6 Add Swedish pipeline in all-tokenization feature 2024-09-19 08:34:51 +02:00
877717cb26 Add a test using Swedish documents 2024-09-19 08:34:04 +02:00
716817122a Correct broken links in README 2024-09-18 16:30:29 -05:00
ff523a2357 Merge #4939
4939: Introduce the `STARTS WITH` filter operator r=irevoire a=Kerollmops

This PR fixes #4872 by introducing the `STARTS WITH` filter operator and gating it under the _contains filter_ experimental feature along with the `CONTAINS` one. I also updated [the experimental feature discussion page](https://github.com/orgs/meilisearch/discussions/763).
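
For illustration, once the experimental feature is enabled a search request could use the operator like this; the attribute and values are made up.

```rust
// Sketch of a search payload using the new STARTS WITH filter operator.
fn starts_with_search() -> serde_json::Value {
    serde_json::json!({
        "q": "sneakers",
        "filter": "brand STARTS WITH 'nik'"
    })
}
```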

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-09-18 10:19:48 +00:00
29c3aca72a Merge #4929
4929: Add facets support to federated r=Kerollmops a=dureuill

# Pull Request

## Related issue 

- Fixes #4932 (sprint issue)
- Fixes  #4913 (user-opened issue)

## What does this PR do?

See [public usage](https://meilisearch.notion.site/v1-11-Federated-search-59b30e03383c40729d7541a3dffb0069)

> [!CAUTION]
> This PR introduces a 🚨**breaking change**🚨: `queries.facets` when `federation` is present and non-`null` is now **an error**

### Implementation standpoint:

- Facet distribution: fix issue where truncated facet distribution would have a wrong order
- facet distribution: implement Display for OrderBy


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-18 09:47:20 +00:00
00f8d03f43 Use f32::min and f32::max 2024-09-18 11:46:10 +02:00
50981ea778 Update the error messages 2024-09-18 11:44:29 +02:00
c2caff1716 Remove obsolete enum 2024-09-18 11:26:43 +02:00
4c355bede7 Merge #4937
4937: Support iso 639 1 r=ManyTheFish a=ManyTheFish

# Pull Request

## Related issue
Fixes #4827

## What does this PR do?
- Add iso-639-1 variants to the Locales enum
- Convert iso-639-1 into iso-639-3
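
A toy illustration of that conversion; the real enum covers far more languages.

```rust
// Illustrative mapping of a few ISO 639-1 codes to their ISO 639-3 variants.
fn iso639_1_to_iso639_3(code: &str) -> Option<&'static str> {
    match code {
        "en" => Some("eng"),
        "fr" => Some("fra"),
        "de" => Some("deu"),
        "sv" => Some("swe"),
        _ => None, // unknown or unsupported two-letter code
    }
}
```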


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-18 05:29:32 +00:00
174d69ff72 Don't override max value in indexes 2024-09-17 18:16:14 +02:00
52a52f97cf Update tests 2024-09-17 17:49:12 +02:00
5de4b48552 Fixup error messages 2024-09-17 17:49:00 +02:00
df648ce7a6 Update tests 2024-09-17 17:40:14 +02:00
af8edab21d Remove mention of sort order and recommend changing index settings on inconsistent order error 2024-09-17 17:39:51 +02:00
c42746c4cd Update tests 2024-09-17 17:22:14 +02:00
98b77aec66 Remove runtime sortFacetValuesBy 2024-09-17 17:22:03 +02:00
54d3ba3357 Fix tests that check error message content 2024-09-17 17:14:39 +02:00
6e058709f2 Rustfmt 2024-09-17 17:02:06 +02:00
0fbf9ea5b1 Factorize using macro 2024-09-17 17:00:03 +02:00
9f1fb4b425 Introduce the STARTS WITH filter operator gated under an experimental feature 2024-09-17 16:44:11 +02:00
1120a5296c Update tests 2024-09-17 16:30:43 +02:00
a35a339c3d Touchup error message 2024-09-17 16:30:43 +02:00
cac5836f6f Remove hybrid.embedder boolean from analytics because embedder is now mandatory 2024-09-17 16:30:43 +02:00
5239ae0297 Rework search kind so that a search without query but with vector is a vector search regardless of semantic ratio 2024-09-17 16:30:43 +02:00
2fdb1d8018 SearchQueryGet can fail 2024-09-17 16:30:43 +02:00
3c5e363554 Remove default embedders 2024-09-17 16:30:43 +02:00
da0dd6febf Make embedder mandatory 2024-09-17 16:30:43 +02:00
a197d63ab6 simplify tests 2024-09-17 15:30:12 +02:00
390eadb733 Support iso-639-1 2024-09-17 15:01:01 +02:00
93f0317b94 Merge #4936
4936: Update version for the next release (v1.11.0) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2024-09-17 11:47:08 +00:00
29ff02f3ff Update version for the next release (v1.11.0) in Cargo.toml 2024-09-17 11:45:48 +00:00
d9e0df74ea update test 2024-09-17 10:39:48 +02:00
dc8a662209 federated queries: adjust error message 2024-09-17 10:39:48 +02:00
6732dd95d7 Update tests 2024-09-17 10:39:48 +02:00
95da428dc8 Use route in federated 2024-09-17 10:39:48 +02:00
38c4be1c8e compute_facets accepts Route argument to fixup error code 2024-09-17 10:39:48 +02:00
91dfab317f New error 2024-09-17 10:39:48 +02:00
47e3c4b5c3 Add new tests 2024-09-17 10:39:48 +02:00
533f1d4345 Federated search: support facets 2024-09-17 10:39:48 +02:00
7b55462610 BREAKING CHANGE: errors if queries.facets in federated search 2024-09-17 10:39:48 +02:00
f6114a1ff2 Introduce ComputedFacets and compute_facet_distribution_stats 2024-09-17 10:39:48 +02:00
7c084b1286 SearchQueriesWithIndex changes 2024-09-17 10:39:47 +02:00
57f9517a98 Required changes to IndexUid 2024-09-17 10:39:47 +02:00
72cc573e0a Add new error types 2024-09-17 10:39:47 +02:00
a48b1d5a79 Update existing tests following error message changes 2024-09-17 10:39:47 +02:00
a94a87ee54 Slightly changes existing error messages 2024-09-17 10:39:47 +02:00
4b55ba68bc Merge #4911
4911: Bump quinn-proto from 0.11.3 to 0.11.8 r=Kerollmops a=dependabot[bot]

Bumps [quinn-proto](https://github.com/quinn-rs/quinn) from 0.11.3 to 0.11.8.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/quinn-rs/quinn/releases">quinn-proto's releases</a>.</em></p>
<blockquote>
<h2>quinn-proto 0.11.5</h2>
<h2>What's Changed</h2>
<ul>
<li>No workspace lints by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1955">quinn-rs/quinn#1955</a></li>
</ul>
<h2>quinn-proto 0.11.4</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix panic in example due to unset default crypto provider by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1882">quinn-rs/quinn#1882</a></li>
<li>Fix zero-length connection IDs by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1883">quinn-rs/quinn#1883</a></li>
<li>Add support for NetBSD, fix OpenBSD by <a href="https://github.com/flub"><code>`@​flub</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1884">quinn-rs/quinn#1884</a></li>
<li>docs(udp): replace AsRawFd and AsRawSocket with AsFd and AsSocket by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1890">quinn-rs/quinn#1890</a></li>
<li>Resolve stopped/received_reset futures on lost connections by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1886">quinn-rs/quinn#1886</a></li>
<li>Bump version numbers (quinn 0.11.2, -proto 0.11.3) by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1891">quinn-rs/quinn#1891</a></li>
<li>udp: bump version to 0.5.2 by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1892">quinn-rs/quinn#1892</a></li>
<li>docs(quinn): Clarify effects of setting AckFrequencyConfig by <a href="https://github.com/gretchenfrage"><code>`@​gretchenfrage</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1894">quinn-rs/quinn#1894</a></li>
<li>Apply clippy suggestions from Rust 1.79 by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1895">quinn-rs/quinn#1895</a></li>
<li>Only send MAX_STREAMS when &gt;1/8 of flow control window is consumed  by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1898">quinn-rs/quinn#1898</a></li>
<li>fix: remove unused dependency tracing-attributes by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1903">quinn-rs/quinn#1903</a></li>
<li>proto: make initial destination cid configurable  by <a href="https://github.com/thynson"><code>`@​thynson</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1897">quinn-rs/quinn#1897</a></li>
<li>Allow configuring rng seed through <code>EndpointConfig</code> by <a href="https://github.com/aochagavia"><code>`@​aochagavia</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1901">quinn-rs/quinn#1901</a></li>
<li>quinn: introduce waking helpers by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1908">quinn-rs/quinn#1908</a></li>
<li>Wake blocked streams on 0-RTT rejection by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1905">quinn-rs/quinn#1905</a></li>
<li>Upgrade to rustc-hash 2 by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1909">quinn-rs/quinn#1909</a></li>
<li>Fix unnecessary Incoming warning on Endpoint drop by <a href="https://github.com/gretchenfrage"><code>`@​gretchenfrage</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1907">quinn-rs/quinn#1907</a></li>
<li>Revise and add additional 0-rtt doc comments by <a href="https://github.com/gretchenfrage"><code>`@​gretchenfrage</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1826">quinn-rs/quinn#1826</a></li>
<li>docs: remove reference to sendmmsg by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1914">quinn-rs/quinn#1914</a></li>
<li>Fix debug assert with reordered ACKs by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1893">quinn-rs/quinn#1893</a></li>
<li>quinn: Make <code>Endpoint::client</code> dual-stack V6 by default by <a href="https://github.com/gretchenfrage"><code>`@​gretchenfrage</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1913">quinn-rs/quinn#1913</a></li>
<li>bench(udp): measure non-GSO &amp; GSO on localhost by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1915">quinn-rs/quinn#1915</a></li>
<li>proto: avoid overflow in handshake done statistic by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1918">quinn-rs/quinn#1918</a></li>
<li>Use workspace dependencies for all external dependencies by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1919">quinn-rs/quinn#1919</a></li>
<li>Fix lack of reexport of ConnectionStats and ResetError by <a href="https://github.com/TirushOne"><code>`@​TirushOne</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1920">quinn-rs/quinn#1920</a></li>
<li>[non-breaking] deps(udp): make tracing optional and add optional log by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1923">quinn-rs/quinn#1923</a></li>
<li>fix(udp): feature flag tracing in windows.rs by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1932">quinn-rs/quinn#1932</a></li>
<li>Bump MSRV to 1.70 following tokio 1.39 by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1939">quinn-rs/quinn#1939</a></li>
<li>Raise default idle timeout to 30 seconds by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1938">quinn-rs/quinn#1938</a></li>
<li>Discard pre-handshake packets after the handshake by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1937">quinn-rs/quinn#1937</a></li>
<li>Apply suggestions from Clippy 1.80 by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1941">quinn-rs/quinn#1941</a></li>
<li>chore(quinn): feature flag socket2 imports by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1933">quinn-rs/quinn#1933</a></li>
<li>refactor: move rust-version to workspace Cargo.toml by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1940">quinn-rs/quinn#1940</a></li>
<li>chore: move common package data to workspace Cargo.toml by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1943">quinn-rs/quinn#1943</a></li>
<li>Endpoint stats interface by <a href="https://github.com/ryleung-solana"><code>`@​ryleung-solana</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1900">quinn-rs/quinn#1900</a></li>
<li>Expose the Handshake Confirmed state by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1944">quinn-rs/quinn#1944</a></li>
<li>Exclude metrics with freestanding getters from EndpointStats by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1945">quinn-rs/quinn#1945</a></li>
<li>Fix incorrect initial DCID indexing on retried connections by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1946">quinn-rs/quinn#1946</a></li>
<li>Add expect message to unwrap in PacketBuilder by <a href="https://github.com/casey"><code>`@​casey</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1951">quinn-rs/quinn#1951</a></li>
<li>Revert &quot;proto: yield transport error for Initial packets with no CRYPTO&quot; by <a href="https://github.com/Ralith"><code>`@​Ralith</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1952">quinn-rs/quinn#1952</a></li>
<li>refactor(udp): introduce log facade by <a href="https://github.com/mxinden"><code>`@​mxinden</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1935">quinn-rs/quinn#1935</a></li>
<li>Update cargo-deny-action to v2 by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/quinn-rs/quinn/pull/1953">quinn-rs/quinn#1953</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="7c09b02073"><code>7c09b02</code></a> proto: bump version to 0.11.8 for release (<a href="https://redirect.github.com/quinn-rs/quinn/issues/1981">#1981</a>)</li>
<li><a href="59bccd2e7e"><code>59bccd2</code></a> Version bump <code>quinn</code> to enforce patched <code>quinn-proto</code></li>
<li><a href="a8ec510fd1"><code>a8ec510</code></a> proto: avoid panicking on rustls server config errors</li>
<li><a href="c26e8cd2f7"><code>c26e8cd</code></a> Bump versions</li>
<li><a href="e01609ccd8"><code>e01609c</code></a> Merge commit from fork</li>
<li><a href="c292a3c6a6"><code>c292a3c</code></a> Fix and test validation of IDCID length</li>
<li><a href="bb02a12a84"><code>bb02a12</code></a> fix(.github/android): use API level 26</li>
<li><a href="5e5cc93645"><code>5e5cc93</code></a> fix(.github/android): pass matrix.target and increase api to v26</li>
<li><a href="cef42cccef"><code>cef42cc</code></a> fix(udp): typo in sendmsg error log</li>
<li><a href="edf16a6f10"><code>edf16a6</code></a> ci(rust.yml): add workflow testing feature permutations</li>
<li>Additional commits viewable in <a href="https://github.com/quinn-rs/quinn/compare/quinn-proto-0.11.3...quinn-proto-0.11.8">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=quinn-proto&package-manager=cargo&previous-version=0.11.3&new-version=0.11.8)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/meilisearch/meilisearch/network/alerts).

</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 13:32:32 +00:00
23e14138bb facet distribution: implement Display for OrderBy 2024-09-12 17:43:50 +02:00
e44325683a Facet distribution: fix issue where truncated facet distribution would have a wrong order 2024-09-12 17:43:49 +02:00
02c2b660f8 Merge #4920
4920: Change OpenAI default model r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4856

See also [public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#b4685a48c4784262a149ec307ec58671)

## What does this PR do?
- make the `text-embedding-3-small` the default model for OpenAI instead of `text-embedding-ada-002`. Existing embedders are not impacted
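
Setups that want to keep the old model for new embedders can still pin it explicitly; a hedged sketch of such a payload (the embedder name is illustrative).

```rust
// Sketch of pinning an OpenAI embedder to the previous default model.
fn pinned_openai_embedder() -> serde_json::Value {
    serde_json::json!({
        "embedders": {
            "default": {
                "source": "openAi",
                "model": "text-embedding-ada-002"
            }
        }
    })
}
```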


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-11 07:08:39 +00:00
f18e9cb7b3 Change openai default model 2024-09-09 13:09:35 +02:00
db0cf3b2ed Merge #4912
4912: Allow Meilitool to dumplessly, offline upgrade v1.9 -> v1.10 in some conditions r=Kerollmops a=dureuill

- bail early if the DB contains at least 1 REST embedder, providing the list of detected REST embedders, and without modifying the DB
- Might depend on the feature set that meilitool was compiled with and the feature set that the Meilisearch which created the DB was compiled with 💀. In case of a runtime error, try again with a different feature set (passing or not passing `-p meilitool` when building after a `cargo clean`)

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-05 09:11:23 +00:00
f6abf01d2c Check REST embedders before touching the DB 2024-09-05 10:49:59 +02:00
28da759f11 meilitool: Support dumpless upgrade from v1.9 to v1.10 when there are no REST embedders 2024-09-05 10:08:38 +02:00
ea96d19525 Change versioning in meili 2024-09-05 10:08:06 +02:00
d352b1ee83 Add serde to meilitool 2024-09-05 10:07:33 +02:00
3f3cebf5f9 Bump quinn-proto from 0.11.3 to 0.11.8
Bumps [quinn-proto](https://github.com/quinn-rs/quinn) from 0.11.3 to 0.11.8.
- [Release notes](https://github.com/quinn-rs/quinn/releases)
- [Commits](https://github.com/quinn-rs/quinn/compare/quinn-proto-0.11.3...quinn-proto-0.11.8)

---
updated-dependencies:
- dependency-name: quinn-proto
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-03 20:50:30 +00:00
b278815617 Merge #4908
4908: Bring back changes from release v1.10.1 to main r=dureuill a=irevoire

# Pull Request

Following the [latest release](https://github.com/meilisearch/meilisearch/releases/tag/v1.10.1), this PR brings back the changes to main.

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: irevoire <irevoire@users.noreply.github.com>
2024-09-03 14:28:12 +00:00
40e13ceef3 Merge #4892
4892:  Add a documentTemplateMaxBytes parameter to limit the max length of document templates r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #4885 

See [public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#a3d63628129e40adba943ae7b8ec06c2)
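
A hedged sketch of what the new parameter could look like in embedder settings; the embedder name, template, and limit are illustrative.

```rust
// Sketch of limiting the rendered document template to a maximum size in bytes.
fn document_template_limit() -> serde_json::Value {
    serde_json::json!({
        "embedders": {
            "default": {
                "source": "openAi",
                "documentTemplate": "A movie titled {{doc.title}}",
                "documentTemplateMaxBytes": 400
            }
        }
    })
}
```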



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-03 11:50:07 +00:00
18a2c13e4e add analytics 2024-09-03 12:07:59 +02:00
ed19b7c3c3 Only reindex if the size increased 2024-09-03 12:07:59 +02:00
66bda2ce8a fix tests 2024-09-03 12:07:58 +02:00
1ac008926b Add maxBytes parameter 2024-09-03 12:07:15 +02:00
c49d892c82 Changes to prompt 2024-09-03 12:07:10 +02:00
de962a26f3 New error type when maxBytes is null 2024-09-03 12:01:04 +02:00
005204e9e5 make the code of init_web_app in common between most tests 2024-09-03 11:40:05 +02:00
1040e5e2b4 spawn one search queue per test 2024-09-03 11:20:25 +02:00
80408c92dc Merge #4906
4906: Add searchable fields to template r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4886 

See [public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#1dd6f0eee5a1422888e1c5d48e107cd1)

## What does this PR do?
- `Prompt::render` now requires and uses metadata to indicate if the fields are searchable or not
- Changes the default template (see the sketch after this list)
- Updated tests
- Correctly reindex vectors when the list of searchable fields changes in a settings update.
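
A rough sketch of a default-style template using the new `field.is_searchable` metadata; the exact default shipped by Meilisearch may differ.

```rust
// Sketch of a document template that only renders searchable, non-empty fields.
const DEFAULT_TEMPLATE: &str = "{% for field in fields %}\
{% if field.is_searchable and field.value != nil %}\
{{ field.name }}: {{ field.value }}\n\
{% endif %}\
{% endfor %}";
```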


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-03 07:14:58 +00:00
fa1a0beb0c fix conflicts after rebase 2024-09-02 18:15:42 +02:00
5aefe7cd17 add the snapshots 2024-09-02 16:27:51 +02:00
e6dd66e4a0 Do not fail the whole batch when a single document deletion by filter fails 2024-09-02 16:27:51 +02:00
6e3839d8b6 autobatch document deletion by filter 2024-09-02 16:27:51 +02:00
cd271b8762 stop trying to process searches after one minute 2024-09-02 16:27:51 +02:00
3ce8500d4c ensure we never early exit when we have a permit and remove the warning when we implicitly drop a permit 2024-09-02 16:27:51 +02:00
588000d398 add a warning to help us find when we forget to explicitly drop a permit 2024-09-02 16:27:51 +02:00
92b151607c explicitly drop the search permit 2024-09-02 16:27:51 +02:00
42e7499260 Update version for the next release (v1.10.1) in Cargo.toml 2024-09-02 16:27:51 +02:00
41aa1e1424 Only spawn one search queue in actix-web 2024-09-02 16:27:50 +02:00
24ace5c381 Add reindexing test 2024-09-02 13:37:01 +02:00
21296190a3 Reindex embedders 2024-09-02 13:00:53 +02:00
03fda78901 update other tests 2024-09-02 11:31:31 +02:00
30a143f149 Test new facilities 2024-09-02 11:31:23 +02:00
4464d319af Change default template to use the new facility 2024-09-02 11:30:59 +02:00
580ea2f450 Pass the fields <-> ids map with metadata to render 2024-09-02 11:30:10 +02:00
915cf4bae5 Add field.is_searchable property to fields 2024-09-02 11:28:53 +02:00
9a756cf2c5 Merge #4888
4888: bring back v1.10.0 into main r=Kerollmops a=ManyTheFish



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-08-27 14:02:08 +00:00
36d8684dc8 Merge #4881
4881: Infer locales from index settings r=curquiza a=ManyTheFish

# Pull Request

## Related issue
Fixes #4828
Fixes #4816
## What does this PR do?
- Add some test using `AttributesToSearchOn`
- Make the search infer the language based on the index settings when the `locales` field is not precise


CI is now working:
https://github.com/meilisearch/meilisearch/actions/runs/10490050545/job/29055955667



Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-08-21 14:18:16 +00:00
b12e997c8a Add pinyin flag 2024-08-21 14:38:04 +02:00
8bf89ec394 Infer locales from index settings 2024-08-21 10:47:40 +02:00
ee62d9ce30 Merge #4845
4845: Fix perf regression facet strings r=ManyTheFish a=dureuill

Benchmarks between v1.9 and v1.10 show a performance regression of about x2 (+3dB regression) for most indexing workloads (+44s for hackernews).

[Benchmark interpretation in the engine weekly meeting](https://www.notion.so/meilisearch/Engine-weekly-4d49560d374c4a87b4e3d126a261d4a0?pvs=4#98a709683276450295fcfe1f8ea5cef3).

- Initial investigation pointed to #4819 as the origin of the regression.
- Further investigation points towards the hypernormalization of each facet value in `extract_facet_string_docids`
- Most of the slowdown is in `normalize_facet_strings`, and precisely in `detection.language()`.

This PR improves the situation (-10s compared with `main` for hackernews, so only +34s regression compared with `v1.9`) by skipping normalization when it isn't needed.

I'm not sure how to fix the root cause though. Should we skip facet locale normalization for now? Cc `@ManyTheFish` 

---

Tentative resolution options:

1. remove locale normalization from facets. I'm not sure why this is required; I believe we weren't doing this before, so maybe we can stop doing it again.
2. don't do language detection when it can be avoided: this won't help with the regressions in the benchmark, but maybe we can skip language detection when the locales contain only one language?
3. use a faster language detection library: `@Kerollmops` told me about https://github.com/quickwit-oss/whichlang, which boasts x10 to x100 the throughput of whatlang. Should we consider replacing whatlang with whichlang? Now, I understand whichlang supports fewer languages than whatlang, so I also suggest:
4. use whichlang when the list of locales is empty (autodetection), or when it only contains locales that whichlang can detect. If the list of locales contains locales that whichlang *cannot* detect, **then** use whatlang instead.
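
Option 4 could look roughly like the sketch below; the `Locale` enum, `supported_by_whichlang`, and the two detector functions are placeholders, not the real libraries.

```rust
// Use the fast detector when every requested locale is one it can detect,
// otherwise fall back to the slower, broader one.
#[derive(Clone, Copy)]
enum Locale { Eng, Fra, Swe, Cmn }

fn supported_by_whichlang(locale: Locale) -> bool {
    // Placeholder: whichlang covers fewer languages than whatlang.
    matches!(locale, Locale::Eng | Locale::Fra | Locale::Cmn)
}

fn detect_language(text: &str, locales: &[Locale]) -> Option<Locale> {
    if locales.is_empty() || locales.iter().all(|l| supported_by_whichlang(*l)) {
        detect_with_whichlang(text) // fast path: auto-detection or supported locales only
    } else {
        detect_with_whatlang(text) // broader coverage, slower
    }
}

// Placeholders standing in for the real detection calls.
fn detect_with_whichlang(_text: &str) -> Option<Locale> { None }
fn detect_with_whatlang(_text: &str) -> Option<Locale> { None }
```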

---

> [!CAUTION]
> this PR contains a commit that adds detailed spans, that were used to detect which part of `extract_facet_string_docids` was taking too much time. As this commit adds spans that are called too often and adds 7s overhead, it should be removed before landing.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-08-19 06:29:48 +00:00
0f965d3574 Remove hotloop's spans 2024-08-14 14:33:36 +02:00
ade54493ab Only detect language for a facet if several locales have been specified by the user in the settings 2024-08-14 12:03:52 +02:00
07c8ed0459 Merge #4864
4864: Don't remove facet value when multiple original values map to the same normalized value r=ManyTheFish a=dureuill

# Pull Request

## Related issue

Fixes #4860 

> [!WARNING]  
> This PR contains a fix to the immediate issue, but it looks like the underlying data model is faulty: there is only one possible "original" value for each normalized value in a facet of a document, whereas, because of array values (or manually written nested fields, if you're evil), it is technically possible to have multiple, distinct original values mapping to the same normalized value.
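
A simplified sketch of the fix's idea: only drop a normalized facet value when no remaining original value still normalizes to it. Types and names are simplified for illustration.

```rust
use std::collections::BTreeSet;

// Return the normalized values that can safely be removed from the facet
// database after some original values were deleted from a document.
fn normalized_values_to_remove(
    remaining: &BTreeSet<String>, // original values still present in the document
    deleted: &BTreeSet<String>,   // original values removed from the document
    normalize: impl Fn(&str) -> String,
) -> BTreeSet<String> {
    let still_used: BTreeSet<String> = remaining.iter().map(|v| normalize(v)).collect();
    deleted
        .iter()
        .map(|v| normalize(v))
        .filter(|normalized| !still_used.contains(normalized))
        .collect()
}
```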

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-08-13 14:04:17 +00:00
c3cdc407ec Avoid unnecessary clone() 2024-08-08 14:57:02 +02:00
2f10273d14 Group by normalized values, make sure you don't remove a value while at least one remaining value still normalizes to it 2024-08-08 14:02:53 +02:00
321639364f Merge #4861
4861: Make sure the index scheduler never stops running r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4748

## What does this PR do?
- Whatever happens, we always try to process tasks once every minute (if no tasks are enqueued that's practically free)
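
A minimal sketch of that idea using tokio primitives; the real scheduler's synchronization is more involved.

```rust
use std::time::Duration;
use tokio::sync::mpsc::Receiver;

// Wait for a wake-up signal, but never longer than one minute, so the
// scheduler always ends up ticking even if a notification is lost.
async fn run(mut wake_up: Receiver<()>) {
    loop {
        let _ = tokio::time::timeout(Duration::from_secs(60), wake_up.recv()).await;
        // tick(): process enqueued tasks; practically free when the queue is empty.
    }
}
```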

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-08-07 16:21:54 +00:00
442d06dce7 ensure the run function doesn't panic even if the tick function does 2024-08-07 17:50:32 +02:00
8f6a98df07 make sure the index scheduler never stops running 2024-08-07 17:06:43 +02:00
b44e17c4c3 Merge #4858
4858: also intersect the universe for searchOnAttributes r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4857 

## What does this PR do?
- intersect with the universe (which does not contain the filtered out ids) when looking up documents for words, even when using `searchOnAttributes`


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-08-07 13:15:26 +00:00
e3ef0ae19e also intersect the universe for searchOnAttributes 2024-08-06 14:06:56 +02:00
57f7af77c7 Merge #4846
4846: Add OpenAI tests r=dureuill a=dureuill

# Pull Request

## Related issue
Part of fixing #4757 

## What does this PR do?
- OpenAI embedder: don't pass apiKey when it is empty (slightly improves error messages)
- rest embedder and rest-based embedders: specialize the authorization denied error message depending on the configuration source
- fix existing tests
- Adds assets containing prerecorded texts to embed and the embeddings obtained from OpenAI
- Adds an asset containing a tokenized long document and the embedding obtained from OpenAI for this token
- Uses the wiremock crate to mock the OpenAI API: parse the openai request, lookup the response in assets, craft an openai response


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-08-05 10:49:28 +00:00
2d16d0aea1 Merge #4839
4839: In prometheus metrics return the route pattern instead of the real route when returning the HTTP requests total r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4825

## What does this PR do?
- return the route pattern instead of the real route when returning the HTTP requests total
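
A sketch of how the label can be derived with actix-web's `match_pattern()`; the metric bookkeeping itself is omitted.

```rust
use actix_web::HttpRequest;

// Label the HTTP requests total with the matched route pattern rather than
// the concrete path.
fn http_requests_total_label(req: &HttpRequest) -> String {
    // e.g. "/indexes/{index_uid}/search" instead of "/indexes/movies/search"
    req.match_pattern().unwrap_or_else(|| req.path().to_string())
}
```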


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-08-05 10:14:51 +00:00
c817718e07 Merge #4853
4853: Fix rhai deletion r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4849 

## What does this PR do?
- insert inside of the bitmap instead of pushing into it.
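
The distinction matters because `RoaringBitmap::push` only appends values greater than the current maximum, while `insert` accepts any value; a small demonstration:

```rust
use roaring::RoaringBitmap;

fn main() {
    let mut bitmap = RoaringBitmap::new();
    bitmap.insert(10);
    assert!(!bitmap.push(5));  // rejected: 5 is not greater than the current max (10)
    assert!(bitmap.insert(5)); // accepted regardless of ordering
    assert_eq!(bitmap.len(), 2);
}
```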


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-08-01 16:34:31 +00:00
e64d0e0ca8 use insert instead of push for bitmaps 2024-08-01 18:32:45 +02:00
21aa430b5e Fix openai tests 2024-07-31 17:57:55 +02:00
8535dc0be2 Fix existing tests 2024-07-31 17:57:32 +02:00
72b9005344 Redact uid for Value 2024-07-31 17:57:13 +02:00
420c33132c Merge #4850
4850: Use a fixed date format regardless of features r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4844 

## What does this PR do?

Given the following script: 
```
cargo run -- --db-path meili.ms
sleep 3
curl -s -X POST http://127.0.0.1:7700/indexes -H 'Content-Type: application/json' --data-binary '{"uid": "movies", "primaryKey": "id"}'
sleep 3
cargo run  -p meilisearch --db-path meili.ms
sleep 3
curl -s -X POST http://127.0.0.1:7700/indexes/movies/search -H 'Content-Type: application/json' --data-binary '{}'
```

- Before this PR, the final search returns a decoding error.
- After this PR, the search completes successfully

### Technical standpoint

This PR fixes two locations where the formatting of dates were dependent on the feature set of the `time` crate.

1. The `IndexStats` had two fields without the serialization format specified
2. More subtly, the index dates (`createdAt`, `updatedAt`) were using value remapping in the main DB to `SerdeJson<OffsetDateTime>`, which was using whatever default format was available. This was fixed by creating a local `OffsetDateTime` wrapper that specifies the serialization format 
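
A sketch of the wrapper approach from point 2, pinning the serialized format with the `time` crate's RFC 3339 serde helpers; the wrapper name is illustrative.

```rust
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;

// Always (de)serializes its date in RFC 3339, independently of which default
// formats the `time` crate was compiled with.
// (Requires time's `serde-well-known` feature.)
#[derive(Serialize, Deserialize)]
struct IndexDateTime(#[serde(with = "time::serde::rfc3339")] OffsetDateTime);
```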

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-31 15:32:26 +00:00
9ef710cad4 Use wrapper that forces the desired date format 2024-07-31 17:12:19 +02:00
48f7329a83 Specify index_mapper on IndexStats 2024-07-31 17:11:28 +02:00
ab1ec9ca21 Add tokenized test 2024-07-31 15:03:45 +02:00
9d6efd92d2 new assets for tokenized test 2024-07-31 15:03:45 +02:00
abdb337fd6 Add openai tests 2024-07-31 15:03:45 +02:00
1c755c8899 Add openai responses 2024-07-31 15:03:45 +02:00
3a42c3134e update tests after changing authorized error message 2024-07-31 15:03:45 +02:00
5aa6cb3600 Specialize authorized error message depending on config source 2024-07-31 15:03:44 +02:00
9b7764575b openai: don't pass apiKey when it is empty 2024-07-31 15:03:44 +02:00
0e68718027 Add detailed spans 2024-07-31 13:05:47 +02:00
7c3fc8c655 Split settings and document facet string extractions 2024-07-31 10:57:46 +02:00
8acd3f50bb skip normalization when the locales and values are the same 2024-07-31 09:53:00 +02:00
25791e3f46 Merge #4836
4836: Attach declared localized-attributes subroutes r=dureuill a=dureuill

RC.0 unexpectedly doesn't contain the `GET /indexes/{indexUid}/localized-attributes` and `PUT /indexes/{indexUid}/localized-attributes` subroutes.

This PR makes them available.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-30 19:01:54 +00:00
866922ecc3 Merge #4808
4808: Make the tests run faster r=irevoire a=irevoire

## Index-Scheduler

### Only check the consistency of the index-scheduler on snapshots when running in release mode

This saves 12s on the tests, and since the tests run in release mode in the CI, we don't lose any information.
From 28s to 16s

### We were snapshotting the index for no reason in `advance_till`, so I removed this call

This saved an additional 8s on the tests, going from 16s to 8s.

----

After these two optimizations, the test suite as a whole executes 14% quicker

## Meilisearch integration tests

While profiling this test suite, nothing stood out. The only noticeable thing is that we're losing most of our time creating and dropping threads.
My theory was that by sharing a single common instance between all integration tests I would gain some time again.

In 355a7acd1c I saved another 15s by only testing this theory on the module that tests the error messages. 
But we can do it on many more tests. **We must take care of not making any test flaky, though**.

## Use two indexing threads

By moving from one to two indexing threads, we gain an additional 30% in performance.

# Conclusion

## Before

The execution of the test suite was taking around:
- 4m40s on my computer
- 15 minutes on the debug CI with cache
- 29 minutes on the Windows CI with cache

## After

The execution of the test suite is taking around:
- 2m20 on my computer
- 8 minutes on the debug CI with cache
- 29 minutes on the Windows CI with cache

## This means the test suite should now run ~50% faster on your computer; the CI may report errors twice as fast, but we'll still wait for ~the same amount of time to merge a PR


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-30 15:11:30 +00:00
f05ea04879 In prometheus metrics return the route pattern instead of the real route when returning the HTTP requests total 2024-07-30 16:24:49 +02:00
b1b3a1a98b add a get, set and put test for the localized attributes setting 2024-07-30 15:51:02 +02:00
143d6cde10 Merge #4835
4835: Log error from main using tracing r=irevoire a=dureuill

Engine follow-up to https://github.com/meilisearch/meilisearch-support/issues/252#issuecomment-2251288276 (private link)

> `@meilisearch/engine-team` we need to open a PR to tracing::error! when an error occurs in the Meilisearch main. It would be nice to have it included in the second RC

<img width="1349" alt="Error logged when launching Meilisearch to import dump on path where the dump doesn't exist" src="https://github.com/user-attachments/assets/e5d2ae6e-f810-4029-9787-3b6ea9d47cfd">

---

<img width="1349" alt="Error logges when launching Meilisearch with a db path that is not writeable" src="https://github.com/user-attachments/assets/f672d78d-04b0-4d02-9402-259eaa6e2b62">



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-30 13:43:50 +00:00
c457069367 ensure a test is 100% not flaky 2024-07-30 15:41:51 +02:00
bb1283222e make clippy happy 2024-07-30 15:10:56 +02:00
7a5a38f870 fix a sync issue on empty indexes 2024-07-30 15:09:12 +02:00
ded3cd0dd6 an additional 30% of perf for the tests 2024-07-30 15:03:54 +02:00
68f885f1c4 fix two snapshots 2024-07-30 14:45:59 +02:00
9372c34dab prepare the tests to share indexes with api key 2024-07-30 14:34:11 +02:00
6666c57880 reduce the number of threads spawned by milli 2024-07-30 14:34:10 +02:00
b53a019b07 fix the initialization problem over the shared indexes with documents 2024-07-30 14:24:57 +02:00
d262b1df32 craft an API over the Shared Server and Shared index to avoid hard to debug mistakes 2024-07-30 14:24:57 +02:00
ed795bc837 fmt 2024-07-30 14:24:57 +02:00
993264227d reuse an index with already indexed documents instead of reindexing from scratch 2024-07-30 14:24:57 +02:00
953d3a44bd make the new_shared function synchronous and stop indexing documents when it's not required 2024-07-30 14:24:57 +02:00
e5345fb0eb shave off 15s by providing a shared instance to the integration tests 2024-07-30 14:24:55 +02:00
2d9a055fb9 stops snapshotting in advance_till when we don't need to 2024-07-30 13:57:12 +02:00
110dc01f40 only check the consistency of the index-scheduler on snapshots when running in release mode 2024-07-30 13:57:12 +02:00
9719dec443 Attach declared attributes-localized subroutes 2024-07-29 16:19:35 +02:00
fa77a949aa Log error from main using tracing 2024-07-29 14:58:39 +02:00
abe128476f Merge #4830
4830: Use the dtolnay's Rust Toolchain r=dureuill a=Kerollmops

Fixes the CI by using another rust-toolchain GitHub repo.

Note: the [helix-editor/rust-toolchain repository](https://github.com/helix-editor/rust-toolchain) has been deleted, so we moved to the [dtolnay/rust-toolchain](https://github.com/dtolnay/rust-toolchain) one. However, dtolnay's one doesn't support `rust-toolchain.toml`, and the version goes directly in `rust-toolchain@version`. We keep the `rust-toolchain.toml` for local builds only.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-29 08:33:59 +00:00
a663e408ad Move to the right rust toolchain version 2024-07-29 10:06:34 +02:00
986991277f Use the dtolnay rust toolchain 2024-07-29 10:00:40 +02:00
c2c1ba39ee Merge #4826
4826: Update Charabia v0.9.0 r=dureuill a=ManyTheFish

# Pull Request

## Related Changelog
https://github.com/meilisearch/charabia/releases/tag/v0.9.0

## Notable Change for Meilisearch
Adds all math symbols from https://www.compart.com/en/unicode/category/Sm to the default separator list.



Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-07-25 14:08:38 +00:00
35567b2137 Update Charabia v0.9.0 2024-07-25 16:02:14 +02:00
00c97c7152 Merge #4818
4818: Custom headers and QoL improvements r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #4734 
Depends on #4815 

## What does this PR do?
- Adds custom headers for rest embedders ([public usage](https://meilisearch.notion.site/v1-10-AI-search-changes-737c9d7d010d4dd685582bf5dab579e2#41354652885242c899def07e36a66d49))
- Quality of life: allow specifying `dimensions` for `ollama` embedders ([public usage](https://meilisearch.notion.site/v1-10-AI-search-changes-737c9d7d010d4dd685582bf5dab579e2#37218531431343dab3d2d3a9a1937e9d)). As for `rest` embedders, specifying `dimensions` disables the "test" embedding when the embedder is spawned (see the sketch after this list).
- Improve the error message again when indexing documents that don't have a vector for a user-provided embedder:
  1. Remove the contents of the document
  2. Display the docid of the first document that triggered the error
  3. Indicate how many documents in that chunk suffered from the same issue for that embedder
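
A hedged illustration of the first two bullets above; the embedder names, URL, and header values are made up, and the `request`/`response` templates follow the syntax introduced in #4815.

```rust
// Sketch of embedder settings using custom headers on a rest embedder and an
// explicit dimensions value on an ollama embedder.
fn embedder_settings() -> serde_json::Value {
    serde_json::json!({
        "embedders": {
            "rest": {
                "source": "rest",
                "url": "http://localhost:12345/embed",
                "headers": { "Authorization": "Bearer s3cr3t" },
                "request": "{{text}}",
                "response": "{{embedding}}"
            },
            "ollama": {
                "source": "ollama",
                "model": "nomic-embed-text",
                "dimensions": 768
            }
        }
    })
}
```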


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-25 13:33:11 +00:00
d4ea7cc2a9 fix clippy 👉👈 2024-07-25 12:10:32 +02:00
8532fe8afc Fix tests 2024-07-25 12:10:32 +02:00
2413592bbf Display docid when there are documents without manual embeddings for a manual embedder 2024-07-25 12:10:32 +02:00
553440632e Introduce Setting::some_or_not_set 2024-07-25 12:01:52 +02:00
7a347966da Allow explicit dimensions for ollama 2024-07-25 12:01:51 +02:00
6c598fa06d test custom headers 2024-07-25 12:01:51 +02:00
8338df0dbe Fix tests 2024-07-25 12:01:51 +02:00
4654d51e05 Add custom headers for REST embedder 2024-07-25 12:01:51 +02:00
22ef2d877f Ensure test server has a single indexing thread 2024-07-25 12:01:51 +02:00
76bc2c18e8 Merge #4819
4819: Language settings r=dureuill a=ManyTheFish

# Pull Request

## Related issue
Fixes #4749 

## What does this PR do?
- [Implement localized search](c0c6955c0d)
- [Implement localized attributes settings](bde827b055) (a settings sketch follows below)

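As a rough sketch of the two features linked above, the index settings and the search request could look roughly like this. The exact field names (`localizedAttributes`, `attributePatterns`, `locales`) are assumptions drawn from the feature name and are not confirmed by this changelog.

```rust
// Hedged sketch of language settings and a localized search, built with serde_json.
use serde_json::json;

fn main() {
    // Settings side: declare which locales apply to which attribute patterns.
    let settings = json!({
        "localizedAttributes": [
            { "attributePatterns": ["title", "overview_*"], "locales": ["fra"] }
        ]
    });

    // Search side: restrict query interpretation to a given locale.
    let search = json!({ "q": "héros", "locales": ["fra"] });

    println!("{settings}\n{search}");
}
```
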
## Related PRD

- [PRD](https://www.notion.so/meilisearch/Define-language-settings-to-impact-relevancy-bee62e18b7584c4f87d18a7654855329)
- [Public usage](https://www.notion.so/meilisearch/v1-10-Language-settings-usage-26c5d98b553349d9abacbe7aff698e4e)


Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-25 09:00:33 +00:00
59115fd058 Fix tests 2024-07-25 10:52:57 +02:00
a918561ac1 Fix PR comments 2024-07-25 10:52:56 +02:00
70d71581ee fix clippy 2024-07-25 10:52:56 +02:00
4fbe048cbf Update Cargo.lock 2024-07-25 10:52:56 +02:00
e06fbcc607 Update snapshots 2024-07-25 10:52:56 +02:00
04fa44e7eb Implement localized attributes settings 2024-07-25 10:51:27 +02:00
90c0a6db7d Implement localized search 2024-07-25 10:51:27 +02:00
d82f8fd904 Add tests 2024-07-25 10:51:27 +02:00
cc02920f2b Update charabia 2024-07-25 10:51:27 +02:00
c26bd68de5 Merge #4815
4815: Rest embedder api mk2 r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4756

- [x] [REST API parameter names and behavior are unclear](https://github.com/meilisearch/documentation/pull/2824#issuecomment-2124073720)
  - unclear names are removed. There remain only two parameters: `request`, a template of what Meilisearch's request to the embedding server should be, and `response`, a template of what the embedding server's response to Meilisearch should look like
- [x] [Bad error message or bad default value when we don't specify the `query` parameter](85d8455c11/meilisearch/tests/vector/rest.rs (L105-L140))
  - The replacement for `query`, which is `request`, is now a mandatory parameter. Omitting it will result in the following error message: "`.embedders.rest`: Missing field `request` (note: this field is mandatory for source rest)", which is clear
- [x] [Bad error message when both `pathToEmbeddings` and `embeddingObject` are missing](2141cb3b69/meilisearch/tests/vector/rest.rs (L142-L178))
  - These parameters no longer exist. Now, the point of extraction is given directly by the location of an `{{embedding}}` placeholder in the `response` parameter.
- [x] [Unexpected error when we don't specify both `pathToEmbeddings` and `embeddingObject` (only once should be required)](2141cb3b69/meilisearch/tests/vector/rest.rs (L180-L260))
  - These parameters no longer exist. Now, the point of extraction is given directly by the location of an `{{embedding}}` placeholder in the `response` parameter.
- [x] [Should not panic when the dimensions specified do not work with the model](2141cb3b69/meilisearch/tests/vector/rest.rs (L262-L299))
  - This no longer panics, instead returns "While embedding documents for embedder `rest`: runtime error: was expecting embeddings of dimension `2`, got embeddings of dimensions `3`"
- [x] [Be more flexible on the type of data that is accepted](https://github.com/meilisearch/meilisearch/issues/4757#issuecomment-2201948531)
  - [x] Always accept arrays of embeddings even if `inputType` is set to `text`
    - This is controlled by the repeat placeholder `"{..}"`; an array of embeddings can be configured even if the input is not an array (see the sketch after this list).
  - [x] Accept arrays of results at the root level and texts/arrays of texts at the root level.
    - doable with `request: "{{text}}"` and `response: "{{embedding}}"` or `response: ["{{embedding}}"]` (see test `vector::rest::server_raw`)

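To make the checklist above concrete, here is a hedged sketch of a `rest` embedder configuration using the two templates, the `{{text}}` and `{{embedding}}` injection placeholders, and the `"{..}"` repeat placeholder. The surrounding field names (`input`, `data`) describe a hypothetical embedding server, not anything mandated by Meilisearch.

```rust
// Hedged sketch of the request/response templates described above (serde_json for readability).
use serde_json::json;

fn main() {
    let embedder = json!({
        "source": "rest",
        "url": "https://example.com/v1/embeddings",   // placeholder endpoint
        // What Meilisearch sends: the "{..}" repeat placeholder turns this into one entry per text.
        "request": { "input": ["{{text}}", "{..}"] },
        // Where Meilisearch extracts the embeddings from the server's answer.
        "response": { "data": [{ "embedding": "{{embedding}}" }, "{..}"] }
    });
    println!("{}", serde_json::to_string_pretty(&embedder).unwrap());
}
```
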
## What does this PR do?
- [See public usage](https://meilisearch.notion.site/v1-10-AI-search-changes-737c9d7d010d4dd685582bf5dab579e2#8de842673ffa4a139210094a89c1ec3e)
- Add new `milli::vector::json_template` module to parse JSON templates with an injection placeholder and a repeat placeholder
- Change rest embedder to use two JSON templates
- Change ollama and openai embedders to use the new rest embedder
- Update settings
- Update and add tests

## Breaking change

> [!CAUTION]
> This PR is a breaking change to the REST embedder.
> Importing a dump containing a REST embedder configuration will fail in v1.10 with an error: "Error: unknown field `query`, expected one of `source`, `model`, `revision`, `apiKey`, `dimensions`, `documentTemplate`, `url`, `request`, `response`, `distribution` at line 1 column 752".

Upgrade procedure:

1. Remove any embedder with source "rest"
2. Create a dump
3. Import that dump in a v1.10
4. Re-add any removed embedder, using the new settings.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Louis Dureuil <louis.dureuil@xinra.net>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-24 16:32:52 +00:00
80fdea9afc Merge pull request #4823 from meilisearch/explicit-check-bench
Explicitly check permissions when receiving a slash command
2024-07-24 17:34:07 +02:00
e3faacd160 Explicitly check permissions when receiving a slash command 2024-07-24 17:09:25 +02:00
988552e178 add tests on the rest embedder 2024-07-24 14:34:17 +02:00
0d8199f3b7 Change parameters in milli settings 2024-07-24 14:34:17 +02:00
4b74803dae Change parameters in vector settings 2024-07-24 14:34:17 +02:00
d731fa661b ollama and openai use new EmbedderOptions 2024-07-24 14:34:17 +02:00
a1beddd5d9 rest embedder: use json_template 2024-07-24 14:34:17 +02:00
4109182ca4 Add json_template module 2024-07-24 14:34:12 +02:00
1a297c048e Error changes 2024-07-24 14:34:12 +02:00
ecee0c922f Merge #4822
4822: HuggingFace: Clearer error message when a model is not supported r=Kerollmops a=dureuill

# Pull Request

## Related issue
Context: <https://github.com/meilisearch/meilisearch/discussions/4820>

## What does this PR do?
- Improve error message when a model configuration cannot be loaded and its "architectures" field does not contain "BertModel"

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-23 14:09:47 +00:00
303e601b87 HuggingFace: Clearer error message when a model is not supported 2024-07-23 15:13:22 +02:00
f6d2c59bca Merge #4817
4817: Update version for the next release (v1.10.0) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2024-07-22 15:51:20 +00:00
50b7093f8e Update version for the next release (v1.10.0) in Cargo.toml 2024-07-22 13:54:38 +00:00
48bc797dce Merge #4812
4812: Allow `MEILI_NO_VERGEN` env var to skip vergen r=irevoire a=dureuill

- vergen checks the state of the `.git` directory to embed commit information into the `meilisearch` binary and the `cargo xtask bench` invocations.
- This check unfortunately results in too many recompilations of the `meilisearch` binary.
- This PR allows skipping vergen when the `MEILI_NO_VERGEN` variable is present in the environment (a build.rs sketch follows below)

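A minimal build.rs sketch of the opt-out described above; only the gating logic is shown, and the vergen invocation itself is elided since the environment check is the relevant part here.

```rust
// build.rs sketch: skip the vergen step entirely when MEILI_NO_VERGEN is set,
// so unrelated changes do not keep recompiling the binary.
fn main() {
    // Re-run this script when the variable appears or disappears.
    println!("cargo:rerun-if-env-changed=MEILI_NO_VERGEN");

    if std::env::var_os("MEILI_NO_VERGEN").is_some() {
        println!("cargo:warning=vergen is disabled by MEILI_NO_VERGEN");
        return;
    }

    // ... here the real build script would ask vergen to emit the git/commit information ...
}
```
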
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-18 16:16:01 +00:00
c6b33fd407 Allow MEILI_NO_VERGEN env var to skip vergen 2024-07-18 17:28:01 +02:00
6e9d0de8b7 Merge #4806
4806: Update rustls as much as possible r=Kerollmops a=irevoire

# Pull Request

## Related issue
Part of https://github.com/meilisearch/meilisearch/issues/4753

## What does this PR do?
- Update rustls as much as possible

## What is missing

In rustls 0.22.0, two structures we were using have been removed with no explanation or workaround.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 17:00:01 +00:00
1bfb16386c Update rustls as much as possible 2024-07-17 18:21:26 +02:00
ea73615abf Merge #4804
4804: Implements the experimental contains filter operator r=irevoire a=irevoire

# Pull Request
Related PRD: (private link) https://www.notion.so/meilisearch/Contains-Like-Filter-Operator-0d8ad53c6761466f913432eb1d843f1e
Public usage page: https://meilisearch.notion.site/Contains-filter-operator-usage-3e7421b0aacf45f48ab09abe259a1de6

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/3613

## What does this PR do?
- Extract the contains operator from this PR: https://github.com/meilisearch/meilisearch/pull/3751
- Gate it behind a feature flag
- Add tests (a usage sketch follows this list)

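As a usage sketch only: the exact spelling and quoting of the operator are assumptions based on its name, and the experimental feature must be enabled before such a filter is accepted.

```rust
// Hedged sketch of a search request using the experimental contains operator.
use serde_json::json;

fn main() {
    let search = json!({
        "q": "",
        // Match every document whose `title` contains the substring "wind".
        "filter": "title CONTAINS 'wind'"
    });
    println!("{search}");
}
```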

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 15:47:11 +00:00
02c61eabfa fix the range reported when the experimental feature has not been set 2024-07-17 16:54:33 +02:00
56b60ec7a0 apply review comment 2024-07-17 16:13:40 +02:00
8f416e8f34 Merge #4805
4805: Log the time to index a batch of tasks r=Kerollmops a=irevoire

This was proposed by `@qdequele` in a private conversation and I think it’s a nice addition.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 11:45:39 +00:00
cf760cbfb1 Log the time to index a batch of tasks 2024-07-17 11:56:57 +02:00
2af9481804 Implements the experimental contains filter operator 2024-07-17 11:13:37 +02:00
7a292b572a Merge #4801
4801: AI quality-of-life improvements r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4802 

## What does this PR do?
This PR implements several quality-of-life improvements described in the [public usage](https://meilisearch.notion.site/v1-10-AI-search-changes-737c9d7d010d4dd685582bf5dab579e2#ece824a1814e47a0a986d786baff1be9)


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-17 09:00:47 +00:00
8d6ac261ae Add tests on various failure modes for embedders 2024-07-16 13:39:02 +02:00
b4c8b01c88 Update existing snapshots 2024-07-16 13:39:01 +02:00
24240934f9 Improve errors when indexing documents with a user provided embedder 2024-07-16 13:39:01 +02:00
f4c94ac57f manual embedders: limit max size of errors to 250 2024-07-16 13:39:01 +02:00
4087a88dbe rest|ollama|openai: increase tries to 10 + randomize retry duration 2024-07-16 13:39:00 +02:00
5adacf2f45 OpenAI: embed only the first MAX_TOKENS tokens 2024-07-16 13:39:00 +02:00
65d0c32aa7 Allow overriding OpenAI's url 2024-07-16 13:39:00 +02:00
82647bcded When retrieveVectors is true, retrieve _vectors.embedder even if there are no vector for that embedder 2024-07-16 13:39:00 +02:00
1582c7e788 Merge #4769
4769: Federated search r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #4747 

[Usage](https://meilisearch.notion.site/v1-10-federated-search-698dfe36ab6b4668b044f735fb40f0b2)

## What does this PR do?
- multi-search now allows a top-level federation object. When not `null`, the results of multi-search are modified to be a single list of results rather than a list of lists of results (see the payload sketch after this list)
- changed lifetimes around tokenizer et al. to be able to make hits one by one rather than using a vector
- adds `roaring` to Meilisearch itself. As the federated search happens at the Meilisearch level (it reuses the search functions declared at the Meilisearch level, and the merge happens after the hits were created), `RoaringBitmap`s are needed to track the hits that were already seen and all the candidates.
- Refactor `make_hits` to allow for an individual, optimized `make_hit`
- Score details comparison no longer fails when sorting on different field names or target points (for geo)

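A hedged sketch of the payload shape described in the first bullet above: a top-level `federation` object next to the usual `queries` array. The fields inside `federation` (`limit`, `offset`) are assumptions, not confirmed by this log.

```rust
// Hedged sketch of a federated multi-search payload (serde_json for readability).
use serde_json::json;

fn main() {
    let multi_search = json!({
        // When `federation` is not null, all hits come back as a single merged list.
        "federation": { "limit": 20, "offset": 0 },
        "queries": [
            { "indexUid": "movies", "q": "batman" },
            { "indexUid": "comics", "q": "batman" }
        ]
    });
    println!("{}", serde_json::to_string_pretty(&multi_search).unwrap());
}
```
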
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-16 08:14:46 +00:00
20094eba06 Apply review comments 2024-07-15 12:43:29 +02:00
c35904d6e8 search::federated::ranking_rules -> search::ranking_rules 2024-07-15 08:43:22 +02:00
2cacc448b6 Rename src/search.rs -> src/search/mod.rs 2024-07-15 08:43:21 +02:00
a61b852695 Add tests 2024-07-15 08:43:21 +02:00
3167411e98 Analytics 2024-07-15 08:43:21 +02:00
83d71662aa Changes to multi_search route 2024-07-15 08:43:21 +02:00
5c323cecc7 search: introduce federated search 2024-07-15 08:43:21 +02:00
77b9347fff Merge #4783
4783: Update minimal ubuntu version used from 18.04 to 20.04 r=curquiza a=curquiza

Fixes #4782 

Co-authored-by: curquiza <clementine@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-11 16:44:30 +00:00
c85dd9f635 install a default stable toolchain before cargo build tries to install cross 2024-07-11 18:43:47 +02:00
7da95d62e2 Add DEBIAN_FRONTEND to avoid interaction with tzdata 2024-07-11 18:43:47 +02:00
2cda1360ee Remove ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION in CI 2024-07-11 18:43:47 +02:00
5f9c05b944 Update minimal ubuntu version used from 18.04 to 20.04 2024-07-11 18:43:47 +02:00
d3a6d2a6fa search: introduce hitmaker 2024-07-11 16:35:59 +02:00
2123d76089 search: introduce "search_from_kind" 2024-07-11 16:35:11 +02:00
edab4e75b0 Make SearchKind cloneable 2024-07-11 16:33:24 +02:00
b9982587d4 Add new errors to meilisearch 2024-07-11 16:31:44 +02:00
e83da00446 Milli changes to match to allow for more flexible lifetimes 2024-07-11 16:29:35 +02:00
7fb3e378ff Do not fail sort comparisons when the field name or target point are different 2024-07-11 16:28:14 +02:00
12a7a45930 Add roaring to meilisearch 2024-07-11 16:27:50 +02:00
677ed6bbf6 Merge #4787
4787: Add index exists function in index_scheduler which stops opening indexes to only check if they exist. r=Kerollmops a=Karribalu

# Pull Request

## Related issue
Fixes #4784

## What does this PR do?
- Added index_exists function in the index_scheduler.
- Stopped opening indexes just to check whether they exist (a sketch of the idea follows this list).
- Made changes to existing tests to test this function.

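A self-contained sketch of the idea: answer "does this index exist?" from the name-to-UUID mapping alone, without opening the index environment. `IndexMapping` is a hypothetical stand-in, not the real index_scheduler type.

```rust
// Hedged sketch of the existence check described above.
use std::collections::HashMap;

struct IndexMapping {
    name_to_uuid: HashMap<String, u128>,
}

impl IndexMapping {
    // A plain lookup: no index is opened just to answer the question.
    fn index_exists(&self, name: &str) -> bool {
        self.name_to_uuid.contains_key(name)
    }
}

fn main() {
    let mapping = IndexMapping {
        name_to_uuid: HashMap::from([("movies".to_string(), 42u128)]),
    };
    assert!(mapping.index_exists("movies"));
    assert!(!mapping.index_exists("does-not-exist"));
}
```
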
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: karribalu <karri.balu123456@gmail.com>
2024-07-11 13:05:20 +00:00
29b44e5541 Merge #4626
4626: Edit Documents with Rhai r=ManyTheFish a=Kerollmops

This PR introduces a first version of [the _Update Documents with Function_ (internal)](https://www.notion.so/meilisearch/Update-Documents-by-Function-45f87b13e61c4435b73943768a490808). It uses [the Rhai programming language](https://rhai.rs/) to let users express the modifications they want to apply.

You can read more about how to use this feature on [the Usage PRD Page](https://meilisearch.notion.site/Edit-Documents-with-Rhai-0cff8fea7655436592e7c8a6de932062?pvs=25). The [prototype is available](https://github.com/meilisearch/meilisearch/actions/runs/9038384483) through Docker by using the following command:

```
docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-edit-documents-with-rhai-3
```

## TODO
 - [x] Support the `DocumentEdition` task in dumps.
 - [x] Remove the unwraps and panics.
 - [x] Improve error codes for the `function` parameter.
 - [x] [Update Rhai to v1.19.0](https://github.com/rhaiscript/rhai/releases/tag/v1.19.0) 🚀
 - [x] Make it an experimental feature (only restrict the HTTP calls).
 - [x] It must be possible not to send a context.
 - [x] Rebase on main.
 - [x] Check that the script cannot do any io.
 - [x] ~Introduce a `Documents.edit` action or~ require the `Documents.all` action.
 - [x] Change the `editionCode` to the clearer `function` field name in the tasks.
 - [x] Support a user provided context and maybe more (but keep function execution isolated for reproducibility).
 - [x] Support deleting documents when the `doc` is `()` (nil, null).
 - [x] Support canceling document edition.
 - [x] Multithread document edition by using rayon (and [rayon-par-bridge](https://docs.rs/rayon-par-bridge/latest/rayon_par_bridge/)).
 - [x] Limit the number of instructions per function execution.
 - [ ] ~Expose the limit of instructions in the settings.~ Not sure, in fact.
 - [x] Ignore unmodified documents in the tasks count.
 - [x] Make the `filter` field optional (not forced to be `null`). A sketch of the resulting task payload follows this list.

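A hedged sketch of the task payload, using the `function`, `context`, and `filter` field names mentioned in the TODO list above. The Rhai snippet and the filter expression are illustrative; the script receives each matching document as `doc` and can read user-provided values from the context.

```rust
// Hedged sketch of an "edit documents by function" payload (serde_json for readability).
use serde_json::json;

fn main() {
    let edit_by_function = json!({
        // Optional: only apply the function to the matching documents.
        "filter": "genre = 'comedy'",
        // Optional user-provided values exposed to the script.
        "context": { "discount": 0.5 },
        // The Rhai function applied to every selected document.
        "function": "doc.price = doc.price * context.discount;"
    });
    println!("{}", serde_json::to_string_pretty(&edit_by_function).unwrap());
}
```
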
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-11 09:02:55 +00:00
6e80364c50 Apply review comments 2024-07-11 11:00:27 +02:00
603676cb3b Address PR review changes 2024-07-10 19:42:16 +01:00
23e102ca71 Address PR review changes 2024-07-10 19:33:16 +01:00
f36f34c2f7 Merge #4717
4717: Implement intersection at end on the search pipeline r=Kerollmops a=Kerollmops

This PR is akin to #4713 and #4682 because it uses the new RoaringBitmap method to do the intersections directly on the serialized bytes that LMDB/heed returns. More work related to this issue can be done; I listed it in #4780.

Running the following command shows where we use bitand/intersection operations and where we can potentially apply this optimization.
```sh
rg --type rust --vimgrep '\s&[=\s]' milli/src/search
```

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-10 15:01:33 +00:00
3bac22fd87 We do not do intersections with the universe when it is related to cache 2024-07-10 16:49:36 +02:00
ce61cb7fe6 Simplify and speedup an intersection pass 2024-07-10 16:49:36 +02:00
1693d1a311 Simplify the check to decide to stop a loop 2024-07-10 16:49:36 +02:00
febea735ca Remove the unused universe parameter from resolve_negative_phrases 2024-07-10 16:49:36 +02:00
93ba051094 Remove the invalid get_phrases_docids universe parameter 2024-07-10 16:49:35 +02:00
cd7a20fa32 Make it work by avoid storing invalid stuff in the cache 2024-07-10 16:49:35 +02:00
41f51adbec Do less useless intersections 2024-07-10 16:49:35 +02:00
0ca1a4e805 Always do the intersections with the universe 2024-07-10 16:49:34 +02:00
50a7393c55 Modify the compute_query_term_subset_docids function to accept the universe 2024-07-10 16:49:34 +02:00
837274f853 Restrict even more the Rhai engine 2024-07-10 16:30:18 +02:00
487997f6ad Support the new editDocumentsByFunction experimental feature 2024-07-10 16:29:18 +02:00
94809090a3 Support not specifying a context 2024-07-10 16:29:18 +02:00
01144b2c74 Make the edit documents by function route experimental 2024-07-10 16:29:18 +02:00
e97600eead Improve the analytics for the document edition by function 2024-07-10 16:29:18 +02:00
767553519d Create errors for the HTTP route issues 2024-07-10 16:29:18 +02:00
aace587dd1 Create errors for the internal processing ones 2024-07-10 16:29:18 +02:00
e706023969 Fix some analytics issues 2024-07-10 16:29:17 +02:00
bcd0c5f5a4 Support DocumentEdition in dumps 2024-07-10 16:29:17 +02:00
f35d6710f3 Update rhai to v1.19.0 2024-07-10 16:29:17 +02:00
b7b8f564c3 delete-me: Simply support generating dump 2024-07-10 16:29:05 +02:00
862d49e4af Editing documents requires the documents.all action (add, get, and del) 2024-07-10 16:29:05 +02:00
81ec0abad1 Use the new rayon-par-bridge library 2024-07-10 16:29:04 +02:00
b67d385cf0 Parallelize the edition functions 2024-07-10 16:28:54 +02:00
dfecb25814 Disable the time package 2024-07-10 16:28:37 +02:00
2eae2015d7 Support aborting documents edition by function 2024-07-10 16:28:15 +02:00
33fa17bf12 Support deleting documents with functions 2024-07-10 16:28:15 +02:00
400e6b93ce Support user-provided context for documents edition 2024-07-10 16:28:15 +02:00
f32e6c32fc Rename editionCode to function 2024-07-10 16:28:15 +02:00
f4add93043 Limit the number of script operations 2024-07-10 16:28:14 +02:00
f07256971a Fix tests 2024-07-10 16:28:14 +02:00
2fae96ac14 Show the actual number of actually edited documents 2024-07-10 16:28:14 +02:00
246f0e7130 Make the filter field really optional 2024-07-10 16:28:14 +02:00
45af18ae9c Check the Rhai syntax before accepting the script 2024-07-10 16:28:13 +02:00
2d97164d9f It works perfectly with some Rhai 2024-07-10 16:28:13 +02:00
efc156a4a4 Executing Lua works correctly 2024-07-10 16:27:36 +02:00
ba85959642 Support filtering the documents to edit with lua 2024-07-10 16:23:21 +02:00
1702b5cf44 Prepare for processing documents edition 2024-07-10 16:23:21 +02:00
2099b4f0dd Merge #4786
4786: Update dependencies r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes #4753

## What does this PR do?
- Update all dependencies except rustls
- [x] Release charabia
- [x] Update charabia
- [x] Double check that the docker build works after updating charabia



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-10 13:23:54 +00:00
0d5bc4578e Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-10 15:21:43 +02:00
8f60ad0a23 apply review comments 2024-07-10 14:38:19 +02:00
9570139eeb update contributing.md with the new lindera update 2024-07-10 14:28:43 +02:00
9d6885793e Upgrade dependencies 2024-07-10 13:46:24 +02:00
98cd6a865c Update dependencies after removing useless ones 2024-07-10 13:37:24 +02:00
5f4530ce57 Remove more unused dependencies 2024-07-10 13:36:34 +02:00
0ecaf861fa fix ci 2024-07-10 10:06:59 +02:00
4d5005b01a make clippy happy 2024-07-10 10:06:59 +02:00
952e742321 update charabia 2024-07-09 23:41:29 +02:00
ee9aa63044 update rust version 2024-07-09 23:41:29 +02:00
43db4f4242 update fxprof_processed_profile 2024-07-09 23:41:29 +02:00
9feba5028d update byte-unit 2024-07-09 23:41:29 +02:00
0a40a98bb6 Make milli use edition 2021 (#4770)
* Make milli use edition 2021

* Add lifetime annotations to milli.

* Run cargo fmt
2024-07-09 17:25:39 +02:00
aac15f6719 Merge #4781
4781: Correct apk usages in Dockerfile r=curquiza a=PeterDaveHello


# Pull Request

## Related issue

No issue was created because this is very trivial.

## What does this PR do?

Correct apk usages in Dockerfile

There is no need to use apk with `update` or `--update-cache` when `--no-cache` is used, which will make sure the index is the latest, and leave no temporary files behind.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Peter Dave Hello <hsu@peterdavehello.org>
2024-07-09 08:51:29 +00:00
ea21b948b1 Address PR review changes 2024-07-09 09:18:57 +01:00
53a359286c Merge #4785
4785: Bump zerovec from 0.10.1 to 0.10.4 r=dureuill a=dependabot[bot]

Bumps [zerovec](https://github.com/unicode-org/icu4x) from 0.10.1 to 0.10.4.
Changelog highlights for zerovec and zerovec_derive (sourced from the icu4x changelog):

- zerovec 0.10.3: Fix size regression by making the `twox-hash` dependency `no_std` (unicode-org/icu4x#5007)
- zerovec 0.10.3: Enforce C,packed, not just packed, on ULE types, fixing for incoming changes to `repr(Rust)` (unicode-org/icu4x#5049)
- zerovec 0.10.4: Enforce C,packed on OptionVarULE (unicode-org/icu4x#5143)
- zerovec_derive 0.10.3: Enforce C,packed, not just packed, on ULE types (unicode-org/icu4x#5049)

Full diff: https://github.com/unicode-org/icu4x/commits/ind/zerovec@0.10.4

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-09 08:02:04 +00:00
47e526f5ea Add index exists function in index_scheduler 2024-07-08 22:27:10 +01:00
4aa7d386d8 remove http and uses actix_web::http instead 2024-07-08 21:17:10 +02:00
84fabb9314 Bump zerovec from 0.10.1 to 0.10.4
Bumps [zerovec](https://github.com/unicode-org/icu4x) from 0.10.1 to 0.10.4.
- [Release notes](https://github.com/unicode-org/icu4x/releases)
- [Changelog](https://github.com/unicode-org/icu4x/blob/main/CHANGELOG.md)
- [Commits](https://github.com/unicode-org/icu4x/commits/ind/zerovec@0.10.4)

---
updated-dependencies:
- dependency-name: zerovec
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-08 18:38:44 +00:00
cd46ebd6b5 remove insta deprecating 2024-07-08 18:38:05 +02:00
ef8d9a20f8 update actix-web 2024-07-08 18:36:32 +02:00
6afa578688 update most incompatible dependencies 2024-07-08 18:31:15 +02:00
300bdfc2a7 update most dependencies 2024-07-08 18:09:12 +02:00
e7e74c0099 Correct apk usages in Dockerfile
There is no need to use apk with `update` or `--update-cache` when `--no-cache` is used, which will make sure the index is the latest, and leave no temporary files behind.
2024-07-08 21:53:58 +08:00
05cc2d1fac Merge #4779
4779: CI: Add workaround to keep using Ubuntu 18.04 r=Kerollmops a=dureuill

Uses `ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true`

Refs: https://github.com/actions/checkout/issues/1590#issuecomment-2207052044

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-08 09:58:28 +00:00
22b9c277d0 CI: Add ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION workaround to keep using Ubuntu 18.04 2024-07-08 11:04:11 +02:00
16bde973aa Merge pull request #4778 from meilisearch/meilisearch-kawaii-logo
Change the Meilisearch logo to the kawaii version
2024-07-07 18:18:32 +02:00
13d1d78a2d Change the Meilisearch logo to the kawaii version 2024-07-07 18:14:02 +02:00
b2b7a633a6 Merge #4774
4774: Rename the sortable into the filterable movies workload r=dureuill a=Kerollmops

Fixes the name of one of the movies workloads, which was labeled sortable instead of filterable.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-04 10:07:01 +00:00
7be109cafe Rename the sortable into the filterable movies workload 2024-07-04 11:53:18 +02:00
6ebefd1067 Merge #4773
4773: New workload to ignore the initial compression phase r=dureuill a=Kerollmops

This PR introduces a new workload to ignore the time spent initially compressing the documents.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-04 09:02:02 +00:00
d25ae36e22 Introduce a new workload to ignore the initial compression phase 2024-07-04 10:58:16 +02:00
b64b4ab6ca Merge #4762
4762: Add search benchmarks r=Kerollmops a=dureuill

# Pull Request

## What does this PR do?
- [x] Modifies `xtask bench` so that workloads support an optional `target` argument. `target` defaults to `indexing::=trace`
- [x] Refactor the spans in the search to offer finer profiling granularity
- [x] Add search workloads  
- [x] Updates documentation in `BENCHMARKS.md`


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-03 08:39:29 +00:00
427861b323 Update documentation in BENCHMARKS.md 2024-07-02 16:13:54 +02:00
d29cb75061 Add search workloads 2024-07-02 16:13:54 +02:00
128e6c7502 Search: spans with a finer granularity 2024-07-02 16:13:53 +02:00
3129f96603 xtask bench: Add support for overriding the profiling target 2024-07-02 16:12:50 +02:00
c701d89fdc Merge #4754
4754: bring back v1.9.0 changes to main r=irevoire a=ManyTheFish



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-07-02 13:30:50 +00:00
3d9befd64f fix warning 2024-07-02 15:30:16 +02:00
ee14d5196c fix the tests 2024-07-02 15:18:30 +02:00
d96372b9c4 Merge branch 'main' into tmp-release-v1.9.0 2024-07-02 14:48:50 +02:00
ea67816a21 Merge #4758
4758: Bump docker/build-push-action from 5 to 6 r=curquiza a=dependabot[bot]

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 5 to 6.
Release notes highlights for docker/build-push-action:

- v6.0.0: Export build record and generate a build summary (docker/build-push-action#1120); bump `@docker/actions-toolkit` from 0.24.0 to 0.26.0; bump braces from 3.0.2 to 3.0.3. Note: this major release adds support for generating a build summary and exporting the build record; it can be disabled by setting the `DOCKER_BUILD_NO_SUMMARY: true` environment variable in your workflow.
- v5.4.0: Show builder information before building; handle attestations correctly with provenance and sbom inputs; bump `@docker/actions-toolkit` from 0.19.0 to 0.24.0; bump undici from 5.28.3 to 5.28.4.
- v5.3.0: Bump `@docker/actions-toolkit` from 0.18.0 to 0.19.0.
- v5.2.0: Disable quotes detection for the `outputs` input; warn about ignored inputs; bump `@docker/actions-toolkit` from 0.14.0 to 0.18.0; bump undici from 5.26.3 to 5.28.3.
- v5.1.0: Add `annotations` and `secret-envs` inputs; bump `@babel/traverse` from 7.17.3 to 7.23.2; bump `@docker/actions-toolkit` from 0.13.0-rc.1 to 0.14.0.

Full changelog: https://github.com/docker/build-push-action/compare/v5...v6

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-02 12:36:19 +00:00
c885fcebcc Bump docker/build-push-action from 5 to 6
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 5 to 6.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](https://github.com/docker/build-push-action/compare/v5...v6)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-02 12:28:28 +00:00
b6e1a1f2f5 Merge #4761
4761: Add vX Docker tag when publishing Docker image r=Kerollmops a=curquiza

Following this: https://github.com/meilisearch/meilisearch/discussions/4759

Co-authored-by: Clémentine <clementine@meilisearch.com>
2024-07-02 11:11:39 +00:00
277f4883f6 Add vX Docker tag when publishing Docker image 2024-07-02 12:11:44 +02:00
015d90a962 merge main 2024-07-01 11:50:36 +02:00
0df84bbba7 Merge #4746
4746: Fix hybrid search limit offset r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4745

## What does this PR do?
- Apply offset and limit to the keyword search results when they are returned early (a minimal sketch follows this list).
- Add a test that is initially failing, and then passes

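A minimal sketch of the fix described above: when the keyword results are returned early, the same pagination must still be applied. `Hit` and the helper are illustrative, not the actual Meilisearch types.

```rust
// Hedged sketch: apply offset and limit to early-returned keyword hits.
#[derive(Debug)]
struct Hit {
    id: u32,
}

fn paginate(hits: Vec<Hit>, offset: usize, limit: usize) -> Vec<Hit> {
    hits.into_iter().skip(offset).take(limit).collect()
}

fn main() {
    let keyword_hits: Vec<Hit> = (0u32..10).map(|id| Hit { id }).collect();
    // offset = 2 and limit = 3 keep the hits with ids 2, 3 and 4.
    let page = paginate(keyword_hits, 2, 3);
    assert_eq!(page.iter().map(|h| h.id).collect::<Vec<_>>(), vec![2, 3, 4]);
}
```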

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-27 12:47:08 +00:00
e53de15b8e Fix behavior of limit and offset for hybrid search when keyword results are returned early
The test is fixed
2024-06-27 14:25:33 +02:00
8c4921b9dd Add failing test on limit+offset for hybrid search 2024-06-27 14:21:34 +02:00
f6a00f4a90 Merge #4740
4740: Make `embeddings` optional and improve error message for `regenerate` r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4741

## What does this PR do?
- Make the `embeddings` parameter optional when manually specifying embeddings for an embedder
- Adds a lot of tests around malformed `_vectors.embedder` objects
- Use `deserr` to deserialize the `_vectors.embedder` field, improving error messages (a payload sketch follows this list)

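A hedged sketch of the `_vectors.<embedder>` object discussed above: `regenerate` stays mandatory while `embeddings` may be omitted. The embedder name `default` is a placeholder.

```rust
// Hedged sketch of documents carrying the _vectors field (serde_json for readability).
use serde_json::json;

fn main() {
    let with_embeddings = json!({
        "id": 1,
        "_vectors": {
            "default": { "regenerate": false, "embeddings": [[0.1, 0.2, 0.3]] }
        }
    });
    let without_embeddings = json!({
        "id": 2,
        // `embeddings` is omitted: the embedder will generate the vectors.
        "_vectors": { "default": { "regenerate": true } }
    });
    println!("{with_embeddings}\n{without_embeddings}");
}
```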

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-27 10:06:28 +00:00
ce08dc509b add more tests and improve the location of the error 2024-06-27 11:51:45 +02:00
1daaed163a Make _vectors.:embedding.regenerate mandatory + tests + error messages 2024-06-27 11:04:58 +02:00
809e742253 Merge #4731
4731: Fix the missing geo distance when one or both of the lat / lng are string r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4193

## What does this PR do?
- Properly extract the lat / lng when one or both of them are strings (a sketch follows this list)
- Add a test 

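A self-contained sketch of the behavior described above: accept `_geo` coordinates whether they arrive as JSON numbers or as strings, instead of dropping the geo distance. This illustrates the rule only; it is not the actual extraction code.

```rust
// Hedged sketch: extract a coordinate from either a number or a stringified number.
use serde_json::Value;

fn extract_coord(value: &Value) -> Option<f64> {
    match value {
        Value::Number(n) => n.as_f64(),
        // A stringified coordinate such as "48.8566" is parsed instead of being rejected.
        Value::String(s) => s.trim().parse::<f64>().ok(),
        _ => None,
    }
}

fn main() {
    let geo = serde_json::json!({ "lat": "48.8566", "lng": 2.3522 });
    assert_eq!(extract_coord(&geo["lat"]), Some(48.8566));
    assert_eq!(extract_coord(&geo["lng"]), Some(2.3522));
}
```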

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-27 07:33:22 +00:00
decdfe03bc Merge #4724
4724: Improve tenant token error messages r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes  #4727

## What does this PR do?
- Introduce a bunch of new error messages around tenant tokens
- Ignore the error messages in most tests that were doing for loop over multiple kinds of errors
- Introduce new tests that specifically test these error messages


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-27 06:47:40 +00:00
aae5c324d7 Merge #4703
4703: Update yaup r=ManyTheFish a=irevoire

There was a bug in `yaup` where serializing a structure with an array would give you a wrong query parameter.

Now, yaup is also in charge of sending the initial `?` before the query parameters.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-27 06:10:15 +00:00
a108d8f6f3 update yaup 2024-06-26 16:03:51 +02:00
34cf576339 Merge #4706
4706: specify the rust toolchain r=irevoire a=irevoire

The action we were using was not working with the `rust-toolchain.toml` file.
But the repository is not maintained anymore.
While looking for a solution, I found out that [helix](https://github.com/helix-editor/rust-toolchain) solved the issue on their side by forking the repo and adding a few fixes. That's what I use currently, but I don't know if it's a sustainable solution in the long term.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-26 12:56:18 +00:00
eb292a7a62 Fix the missing geo distance when one or both of the lat / lng are string 2024-06-26 14:50:15 +02:00
e28332a904 set the rust toolchain to the v1.75.0 2024-06-26 14:01:28 +02:00
a1dcde6b9a Update meilisearch/src/extractors/authentication/mod.rs
Co-authored-by: Many the fish <many@meilisearch.com>
2024-06-26 14:00:21 +02:00
544e98ca99 use the current version for clippy 2024-06-26 13:58:25 +02:00
1e4699b82c Merge #4716
4716: Fix bad http status and error message on wrong payload  r=irevoire a=Karribalu

# Pull Request

## Related issue
Fixes #4698

## What does this PR do?
- Fixes bad http status when bad payload with gzip Content-Encoding

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: karribalu <karri.balu123456@gmail.com>
2024-06-26 08:00:51 +00:00
2c09c324f7 Merge #4730
4730: fix a possibly flaky test r=irevoire a=irevoire

On slow CI, it was possible for a document addition to _not_ be processed and then get autobatched with an index deletion, which changed their task summary details in the end.
Now, I wait for the task to finish, and the result will always be the same.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-26 07:32:51 +00:00
3d6b61d8d2 fix flakyness for real 2024-06-26 09:24:09 +02:00
1374b661d1 fix a possibly flaky test 2024-06-26 09:14:59 +02:00
7e3c306c54 Merge #4725
4725: Store primary key as String when Number exceeds i64 range r=irevoire a=JWSong

# Pull Request

## Related issue
Fixes #4696 

## What does this PR do?
- When a Number value exceeding the range of i64 is received as a primary key, it will be stored as a String (a sketch of the rule follows below).

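A sketch of the rule described above: a numeric primary key that fits in i64 keeps its integer form, while a larger number falls back to a string instead of overflowing. Names and types are illustrative, not the actual Meilisearch code.

```rust
// Hedged sketch of the i64-overflow fallback for numeric primary keys.
use serde_json::{json, Value};

#[derive(Debug, PartialEq)]
enum PrimaryKey {
    Integer(i64),
    Str(String),
}

fn primary_key_repr(value: &Value) -> Option<PrimaryKey> {
    match value {
        Value::Number(n) => Some(match n.as_i64() {
            Some(i) => PrimaryKey::Integer(i),      // fits in i64: keep it numeric
            None => PrimaryKey::Str(n.to_string()), // exceeds i64: store it as a string
        }),
        Value::String(s) => Some(PrimaryKey::Str(s.clone())),
        _ => None,
    }
}

fn main() {
    assert_eq!(primary_key_repr(&json!(42)), Some(PrimaryKey::Integer(42)));
    // u64::MAX does not fit in an i64, so it becomes a string value.
    assert_eq!(
        primary_key_repr(&json!(u64::MAX)),
        Some(PrimaryKey::Str("18446744073709551615".to_string()))
    );
}
```
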
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: JWSong <thdwjddn123@gmail.com>
2024-06-26 07:06:04 +00:00
2608a596a0 Update error message and add tests for incomplete compressed document 2024-06-25 18:36:29 +01:00
e16edb2c35 use the helix action since the official one doesn't support the rust-toolchain file 2024-06-25 17:00:50 +02:00
5c758438fc Update the CI to take the rust-toolchain file into account 2024-06-25 16:59:23 +02:00
ab6cac2321 specify the rust toolchain 2024-06-25 16:59:23 +02:00
6fb36ed30e get rid of the redundant info in document_addition_with_huge_int_primary_key 2024-06-25 23:54:27 +09:00
dcdc83946f accept large number as string 2024-06-25 21:41:47 +09:00
3c4c46377b Merge #4665
4665: Add missing Korean support r=ManyTheFish a=junhochoi

Some configurations are missing the `korean` feature; this PR adds them and a test case in `milli/src/search/mod.rs`.

# Pull Request

## Related issue

#3443 #3882 

## What does this PR do?
- Improvement on enabling Korean support

Inspired by the work in #3882, I tried to enable Korean features but found some missing configurations.
This PR adds those missing configs (mostly in Cargo.toml) and one test case.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Junho Choi <jh.choi@catenoid.net>
2024-06-25 11:51:21 +00:00
7da21bb601 introduce as many custom error message as possible 2024-06-25 12:40:51 +02:00
13161fd7d0 Merge #4722
4722: Grow by 1TB instead of 1MB r=dureuill a=dureuill

When an index reaches 1TB, increase its size by 1TB rather than 1MB (a sketch of the policy follows below)

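A sketch of the resize policy under the stated threshold; illustrative only, not the actual map-size code.

```rust
// Hedged sketch: below 1 TiB the index grows by 1 MiB, from 1 TiB onwards it grows by 1 TiB.
const MIB: u64 = 1024 * 1024;
const TIB: u64 = 1024 * 1024 * 1024 * 1024;

fn next_map_size(current: u64) -> u64 {
    if current >= TIB {
        current + TIB // large index: avoid constant, tiny resizes
    } else {
        current + MIB // small index: keep the previous growth step
    }
}

fn main() {
    assert_eq!(next_map_size(10 * MIB), 11 * MIB);
    assert_eq!(next_map_size(TIB), 2 * TIB);
}
```
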
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-25 10:17:58 +00:00
b81e2951a9 Merge #4723
4723: Fixes for Rust v1.79 r=ManyTheFish a=dureuill

cherry-picked from the `release-v1.9.0` branch

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-25 09:21:29 +00:00
d75e0098c7 Fixes for Rust v1.79 2024-06-25 11:16:06 +02:00
27496354e2 Grow by 1TB instead of 1MB 2024-06-25 09:01:11 +02:00
2e0ff56f3f Add missing Korean support
Some configurations are missing the `korean` feature; this adds
them and a test case in `milli/src/search/mod.rs`.
2024-06-25 12:45:21 +09:00
a74fb87d1e start introducing new error messages 2024-06-24 19:00:53 +02:00
558b66e535 makes most tests works with variable error messages 2024-06-24 19:00:44 +02:00
cade18bd47 Update README.md (#4721) 2024-06-24 15:47:10 +02:00
298c7b0c93 Merge #4715
4715: Build all arroy indexes that need to be built r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4588

## What does this PR do?
- Update arroy
- Ensure we always rebuild the arroy indexes that need to be built


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-24 09:32:04 +00:00
606e108420 fix all the flaky snapshots 2024-06-24 11:13:45 +02:00
7be17b7e4c add the missing snapshots 2024-06-24 10:52:57 +02:00
1693332cab Update arroy and always build the tree that need to be built 2024-06-24 10:14:03 +02:00
ddd564665b Merge #4713
4713: Speed up facet distribution r=ManyTheFish a=Kerollmops

This PR is akin to #4682, but this time the same logic is applied to the facets. Bitmaps are not decoded, and we do the intersection on the bytes with the search candidates instead of materializing a RoaringBitmap only to destroy it right after the operation.

A prospect reported some slow requests when performing facet searches, and I found out that this on-disk intersection optimization wasn't applied to the facets.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-06-24 05:23:46 +00:00
2a38f5c757 Run Rustfmt 2024-06-21 00:14:26 +01:00
133d33d72c Merge remote-tracking branch 'origin/main' 2024-06-20 23:55:17 +01:00
fb683fe88b Fix bad http status and error message on wrong payload 2024-06-20 23:55:09 +01:00
4ae11bfd31 Merge #4710
4710: Only spawn thread pool once (v1.9) r=irevoire a=dureuill

# Pull Request

See #4707 

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-20 11:45:32 +00:00
9736e16a88 Make clippy happy 2024-06-20 13:02:44 +02:00
6fa4da8ae7 Improve facet distribution speed in count mode 2024-06-20 12:58:51 +02:00
19d7cdc20d Improve facet distribution speed in lexico mode 2024-06-20 12:57:08 +02:00
c229200820 Merge #4712
4712: Update mini-dashboard 2.14 r=irevoire a=curquiza

Fixes #4668

Co-authored-by: curquiza <clementine@meilisearch.com>
2024-06-20 08:47:22 +00:00
bad28cc9e2 Update mini-dashboard 2.14 2024-06-20 10:01:36 +02:00
534f696b29 Update the README to link more demos (#4711)
This Pull Request adds two new interesting demos to a brand new list, which replaces the short _Try it_ text just below the Where2Watch showcase image, in the hope that people will notice them.
2024-06-20 09:53:06 +02:00
a04041c8f2 Only spawn the pool once 2024-06-19 16:25:33 +02:00
b347b66619 Revert "Add june 11th webinar banner" (#4705) 2024-06-18 18:45:50 +02:00
e580d6b98f Merge #4693
4693: Introduce distinct attributes at search time r=irevoire a=Kerollmops

This PR fixes #4611.

### To Do
- [x] Remove the `distinguishableAttributes` settings (not even a commit about that).
- [x] Use the `filterableAttributes` to be able to use the `distinct` parameter at search time (a usage sketch follows this list).
- [x] Work on the errors and make tests.

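A hedged sketch of the search-time distinct named above: `distinct` refers to an attribute that must be declared in `filterableAttributes`, and it takes precedence over any distinct attribute set in the settings. The payload shape is an assumption drawn from the to-do list, not a confirmed API excerpt.

```rust
// Hedged sketch of settings plus a search request using a search-time distinct.
use serde_json::json;

fn main() {
    // The attribute used for distinct must be filterable.
    let settings = json!({ "filterableAttributes": ["product_id"] });

    // At search time, deduplicate the hits on `product_id`.
    let search = json!({ "q": "sneakers", "distinct": "product_id" });

    println!("{settings}\n{search}");
}
```
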
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-18 07:45:03 +00:00
8ba65e333b add snapshot files 2024-06-17 16:50:26 +02:00
43875e6758 fix bug around nested fields 2024-06-17 15:59:30 +02:00
d7844a6e45 add a bunch of tests on the errors of the distinct at search time 2024-06-17 15:37:32 +02:00
e9bf4c43a4 Merge #4649
4649: Don't store the vectors in the documents database r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4607

## What does this PR do?
- Ensure that anything falling under `_vectors` is NOT searchable, filterable or sortable
- [x] per embedder, add a roaring bitmap of documents that provide "userProvided" embeddings
- [x] in the indexing process in extract_vector_points, set the bit corresponding to the document depending on the "userProvided" subfield in the _vectors field.
- [x] in the document DB in typed chunks, when writing the _vectors field, remove all keys corresponding to an embedder

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-17 12:32:03 +00:00
a8a0854421 Update meilisearch/src/analytics/segment_analytics.rs 2024-06-17 14:30:50 +02:00
0a8f50695e Fixes for Rust v1.79 2024-06-13 17:47:44 +02:00
09d9b63e1c - test case where all vectors were generated
- update tests following changes in behavior from previous commit
2024-06-13 17:16:41 +02:00
b9b938c902 Change retrieveVectors behavior:
- when the feature is disabled, documents are never modified
- when the feature is enabled and `retrieveVectors` is disabled, `_vectors` is removed from documents
- when the feature is enabled and `retrieveVectors` is enabled, vectors from the vectors DB are merged with `_vectors` in documents

Additionally `_vectors` is never displayed when the `displayedAttributes` list does not contain either `*` or `_vectors`

- fixed an issue where `_vectors` was not injected when all vectors in the dataset were always generated
2024-06-13 17:13:36 +02:00
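For illustration, a hit returned with `retrieveVectors` enabled could look like the following. This is a made-up document, and the per-embedder field names (`embeddings`, `regenerate`) are an assumption based on the `user_provided => regenerate` rename further down, not the authoritative response shape.

```json
{
  "id": 0,
  "name": "kefir",
  "_vectors": {
    "default": {
      "embeddings": [[0.1, 0.2, 0.3]],
      "regenerate": true
    }
  }
}
```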
6bf07d969e add failing test 2024-06-13 15:49:42 +02:00
e35ef31738 Small changes following review 2024-06-13 14:20:48 +02:00
3f212a8202 Update tests 2024-06-12 18:13:34 +02:00
bc547dad6f Update dump file 2024-06-12 18:12:56 +02:00
3bc8f81abc user_provided => regenerate 2024-06-12 18:12:20 +02:00
a89eea233b Fix vectors injection 2024-06-12 17:10:19 +02:00
34fabed214 Add test for vector writeback 2024-06-12 17:09:34 +02:00
fca9fe39b3 Update test snapshots 2024-06-12 14:50:55 +02:00
f5cf01e7d1 Rework extraction to use EmbedderAction 2024-06-12 14:50:55 +02:00
d1dd7e5d09 In transform for removed embedders, write back their user provided vectors in documents, and clear the writers 2024-06-12 14:50:55 +02:00
d18c1f77d7 Update embedder configs with a finer granularity
- no longer clear vector DB between any two embedder changes
2024-06-12 14:50:55 +02:00
d0b05ae691 Add EmbedderAction to settings 2024-06-12 14:50:54 +02:00
e9bf4eb100 Reformulate ParsedVectorsDiff in terms of VectorState 2024-06-12 14:11:44 +02:00
b368105272 Add EmbedderConfigs::into_inner 2024-06-12 14:11:44 +02:00
e0eff08095 Merge #4685
4685: Fix ci tests r=dureuill a=ManyTheFish

# Pull Request
Make all the following CI jobs succeed:
https://github.com/meilisearch/meilisearch/actions/runs/9477183091

## Related issue
Fixes #4629

## What does this PR do?
- Change the test behavior for `swedish-recomposition` feature flag
- Remove the `-v` parameter from grep

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2024-06-12 07:58:33 +00:00
304a9df52d Remove -v parameter 2024-06-12 07:22:24 +02:00
39f60abd7d Add and modify distinct tests 2024-06-11 17:53:53 -04:00
1991bd03da Distinct at search erases the distinct in the settings 2024-06-11 17:02:39 -04:00
ee39309aae Improve errors and introduce a new InvalidSearchDistinct error code 2024-06-11 16:03:39 -04:00
0d31be1494 Make the distinct work at search 2024-06-11 11:39:35 -04:00
3493093c4f add a batch of tests 2024-06-11 16:03:54 +02:00
7cef2299cf Fix behavior when removing a document 2024-06-11 09:45:08 +02:00
a838f39fce Merge #4682
4682: Speed Up Filter ANDs operations r=Kerollmops a=Kerollmops

This PR fixes #4659 and improves the way we do AND operations by using the latest [RoaringBitmap feature to do intersections with serialized bitmaps](https://github.com/RoaringBitmap/roaring-rs/pull/281). Doing so drastically reduces the time spent reading and copying bytes in memory just to use and keep a subset of the bitmap's containers (a conceptual sketch follows this entry).

### Some Example Results

With a 45M-document dataset running on a good NVMe drive, this example filter used to take 77ms and now takes only 13ms with this PR (6x speedup):

```sql
artist = 'The Beatles' AND (duration 150 TO 500 OR duration NOT EXISTS) AND genres IN [Rock, 'Rock and Roll'] AND rating > 4 AND released_year 1960 TO 1990
```

By reordering the filter AND clauses we can reach a constant 8ms execution time, but note that this reordering is a manual operation. On the other hand, the previous filter pipeline is still at a constant 45ms execution time with this reordered filter (roughly a 6x speedup):

```sql
artist = 'The Beatles' AND genres IN [Rock, 'Rock and Roll'] AND released_year 1960 TO 1990 AND (duration 150 TO 500 OR duration NOT EXISTS)
```

### To Do
- [x] Rebase on `release-v1.9.0`.
- [ ] ~Skip branches of the facet/filter tree when nothing is in common with the universe~ slower this way.
- [x] When the universe is required use the universe given in parameter if possible.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-06-11 02:51:17 +00:00
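As a conceptual sketch of the idea above, assuming a hypothetical `intersect_with_serialized` helper: the real roaring-rs feature intersects container by container on the raw bytes, while this stub simply deserializes so the example stays self-contained and runnable.

```rust
use std::io;

use roaring::RoaringBitmap;

/// Hypothetical stand-in for the roaring-rs "intersection with serialized
/// bitmaps" feature: the real code walks the serialized containers directly,
/// this stub deserializes the whole bitmap for clarity.
fn intersect_with_serialized(universe: &RoaringBitmap, bytes: &[u8]) -> io::Result<RoaringBitmap> {
    let posting_list = RoaringBitmap::deserialize_from(bytes)?;
    Ok(universe & &posting_list)
}

fn main() -> io::Result<()> {
    // Documents matching the query so far (the "universe" of candidates).
    let universe: RoaringBitmap = (0..1_000).collect();

    // A filter posting list as stored on disk, still in serialized form.
    let stored: RoaringBitmap = (500..2_000).collect();
    let mut bytes = Vec::new();
    stored.serialize_into(&mut bytes)?;

    // AND clause: keep only the candidates present in the serialized bitmap,
    // without keeping the materialized posting list around afterwards.
    let candidates = intersect_with_serialized(&universe, &bytes)?;
    assert_eq!(candidates.len(), 500);
    Ok(())
}
```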
600e97d9dc gate the retrieveVectors parameter behind the vectors feature flag 2024-06-10 18:26:12 +02:00
d1962b2b0f Merge #4691
4691: Add june 11th webinar banner r=curquiza a=Strift

# Pull Request

This PR adds a banner in the README to promote tomorrow's webinar event.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Strift <laurent@meilisearch.com>
2024-06-10 16:17:21 +00:00
8b450b84f8 Add june 11th webinar banner 2024-06-10 17:45:14 +02:00
7add7d053c Merge #4689
4689: Bring back changes from v1.8.2 into v1.9.0 r=curquiza a=dureuill



Co-authored-by: dureuill <dureuill@users.noreply.github.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
2024-06-10 14:03:55 +00:00
7559dfc814 Merge tag 'v1.8.2' into release-v1.9.0 2024-06-10 15:07:34 +02:00
6c6c4732a1 Merge #4681
4681: Fix concurrency issue r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4654 

## What does this PR do?
- Asynchronously drop permits


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-10 09:36:08 +00:00
0502b17501 log the state of the index-scheduler in all failed tests 2024-06-10 10:52:49 +02:00
3976fe660e Merge #4688
4688: Update version for the next release (v1.8.2) in Cargo.toml r=dureuill a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: dureuill <dureuill@users.noreply.github.com>
2024-06-10 08:28:34 +00:00
50f8218a5d Asynchronously drop permits 2024-06-10 10:19:57 +02:00
19585f1a4f Update version for the next release (v1.8.2) in Cargo.toml 2024-06-10 07:59:36 +00:00
8ec6e175e5 Replace roaring patch to the v0.10.5 2024-06-07 22:11:26 -04:00
57d066595b fix the `Tests almost all features` CI job 2024-06-06 17:24:50 +02:00
75b2e02cd2 Log more stuff around filtering 2024-06-06 11:00:07 -04:00
40f05fe156 Bump roaring to the latest commit 2024-06-06 10:59:55 -04:00
734d1c53ad fix a panic in yaup 2024-06-06 16:31:07 +02:00
52d0d35b39 Revert "Reduce the universe while exploring the facet tree" because it's slower this way
This reverts commit 14026115f21409535772ede0ee4273f37848dd61.
2024-06-06 09:17:51 -04:00
5432776132 Reduce the universe while exploring the facet tree 2024-06-06 09:17:51 -04:00
66470b27e6 Use the MultiOps trait for IN operations 2024-06-06 09:17:51 -04:00
0a9bd398c7 Improve the NOT operator to use the universe when possible 2024-06-06 09:17:51 -04:00
7967e93c16 Skip evaluating when a universe is empty, nothing can be found 2024-06-06 09:17:51 -04:00
a6f3a01c6a Expose the universe to do efficient intersections on deserialization 2024-06-06 09:17:51 -04:00
4ca4a3f954 Make the CboRoaringBitmapCodec support intersection on deserialization 2024-06-06 09:17:51 -04:00
e4a69c5ac3 Introduce the FacetGroupLazyValue type 2024-06-06 09:17:50 -04:00
ff2e498267 Patch roaring to use the version supporting intersection on deserialization 2024-06-06 09:17:50 -04:00
531e3d7d6a MultiOps trait for OR operations 2024-06-06 09:17:50 -04:00
63dded3961 implements the new analytics for the get documents routes 2024-06-06 11:39:29 +02:00
2cdcb703d9 fix the deletion of vectors and add a test 2024-06-06 11:39:29 +02:00
6607875f49 add the retrieveVectors parameter to the get and fetch documents route 2024-06-06 11:39:29 +02:00
ea61e5cbec makes clippy happy x2 2024-06-06 11:39:29 +02:00
31a793d226 fix the regeneration of the embeddings in the search 2024-06-06 11:39:29 +02:00
d85ab23b82 rename all occurrences of user_defined to user_provided for consistency 2024-06-06 11:39:29 +02:00
b7349910d9 implements more review comments 2024-06-06 11:39:29 +02:00
49fa41ce65 apply first round of review comments 2024-06-06 11:39:29 +02:00
400cf3eb92 add api error test on the new retrieveVectors parameter 2024-06-06 11:39:29 +02:00
376b3a19a7 makes clippy and fmt happy 2024-06-06 11:39:29 +02:00
d92c173fdc update the new similar tests 2024-06-06 11:39:29 +02:00
b867829ef1 remove useless dbg 2024-06-06 11:39:29 +02:00
6b29676e7e update snapshots 2024-06-06 11:39:29 +02:00
caad40964a implements the analytics 2024-06-06 11:39:29 +02:00
cc5dca8321 fix two bug and add a dump test 2024-06-06 11:39:29 +02:00
5d50850e12 always push the user defined vectors in arroy 2024-06-06 11:39:29 +02:00
a73ccc78a6 forward the embedding config to the extractors 2024-06-06 11:39:28 +02:00
9eb6f522ea wraps the index embedding config in a struct 2024-06-06 11:37:30 +02:00
04f6523f3c expose a new parameter to retrieve the embedders at search time 2024-06-06 11:36:11 +02:00
30d66abf8d fix the test 2024-06-06 11:36:11 +02:00
84e498299b Remove the vectors from the documents database 2024-06-06 11:36:11 +02:00
7a84697570 never store the _vectors as searchable or faceted fields 2024-06-06 11:36:11 +02:00
4148fbbe85 provide a method to get all the nested fields ids from a name 2024-06-06 11:36:11 +02:00
cb765ad249 Merge #4684
4684: Update Charabia v0.8.11 r=irevoire a=ManyTheFish

# Update Charabia v0.8.11

### Adds a new normalizer to normalize œ to oe and æ to ae
Now, words containing `œ` or `æ` can be retrieved by searching with `oe` or `ae`, e.g. `Daemon` <=> `Dæmon`

### Fix: make `chinese-normalization-pinyin` feature flag compile
Fixes #4629



Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-06-06 08:59:49 +00:00
2e50c6ec81 Update Charabia 2024-06-06 10:18:43 +02:00
40b2345394 Merge #4680
4680: Speedup additional searchables r=Kerollmops a=ManyTheFish

Fixes #4492.

## To Do
 - [x] Do not call the `InnerSettingsDiff::only_additional_fields` function too many times

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-06-05 15:39:28 +00:00
30293883e0 Fix condition mistake 2024-06-05 17:30:07 +02:00
b833be46b9 Avoid running proximity when only the exact attributes changes 2024-06-05 17:30:07 +02:00
0a4118329e Put only_additional_fields to None if the difference gives an empty result. 2024-06-05 17:30:07 +02:00
261e92d7e6 Skip iterating over documents when the faceted field list doesn't change 2024-06-05 17:30:07 +02:00
5cd08979b1 iterate over the faceted fields instead of over the whole document 2024-06-05 17:30:07 +02:00
2af7e4dbe9 Rename the embeddings workloads 2024-06-05 17:30:07 +02:00
a998b881f6 Cache a lot of operations to know if a field must be indexed 2024-06-05 17:30:07 +02:00
b81953a65d Add a span for the prepare_for_documents_reindexing 2024-06-05 17:30:07 +02:00
091bb157f1 Add a span for the settings diff creation 2024-06-05 17:30:07 +02:00
1b639ce44b Reduce the number of complex calls to settings diff functions 2024-06-05 17:30:07 +02:00
87cf8a3c94 Introduce a new way to determine the operations to perform on the fields 2024-06-05 17:30:07 +02:00
0f578348f1 Introduce a dedicated function to write proximity entries in database 2024-06-05 17:30:07 +02:00
fad4675abe Give the settings diff to the write_typed_chunk_into_index function 2024-06-05 17:30:07 +02:00
1ab03c4ede Fix an issue with settings diff and * in the searchable attributes 2024-06-05 17:30:07 +02:00
0c6e4b2f00 Introducing a new into_del_add_obkv_conditional_operation function 2024-06-05 17:30:07 +02:00
42b3f52ef9 Introduce the SettingDiff only_additional_fields method 2024-06-05 17:30:07 +02:00
93f5defedc Merge #4656
4656: Adding a new `searchableAttribute` no longer re-index all the attributes r=ManyTheFish a=Kerollmops

Fixes #4492.

## To Do
 - [x] Do not call the `InnerSettingsDiff::only_additional_fields` function too many times
 - [ ] Add tests

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-06-05 14:51:14 +00:00
33241a6b12 Fix condition mistake 2024-06-05 16:00:24 +02:00
ff87b4db26 Avoid running proximity when only the exact attributes changes 2024-06-05 12:48:44 +02:00
ba9fadc8f1 Put only_additional_fields to None if the difference gives an empty result. 2024-06-05 10:51:16 +02:00
98e062a714 Merge #4675
4675: Update actix-web 4.5.1 -> 4.6.0 r=dureuill a=dureuill

# Pull Request

- actix-web 4.5.1 -> 4.6.0
- actix-http 3.6.0 -> 3.7.0
- actix-web-static-files (commit 2d3b6160) -> 4.0.1
- tracing-actix-web 0.7.9 -> 0.7.10
- brotli 3.4.0 -> 6.0.0

## Related issue
Fixes #4625 


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-05 07:40:35 +00:00
d29d4f88da Skip iterating over documents when the faceted field list doesn't change 2024-06-04 15:31:24 +02:00
17c5ceeb9d iterate over the faceted fields instead of over the whole document 2024-06-04 14:04:20 +02:00
8412665957 Update actix-web 4.5.1 -> 4.6.0 2024-06-04 09:54:30 +02:00
fc584f1db3 Merge #4666
4666: Add a score threshold search parameter r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4609

## What does this PR do?
- See [usage](https://meilisearch.notion.site/Filter-by-score-usage-224a183ce7b24ca99b6a9a8da755668a?pvs=25#95b76ded400342ba9ab3d67c734836f0) and [the known limitation](https://meilisearch.notion.site/Filter-by-score-usage-224a183ce7b24ca99b6a9a8da755668a?pvs=25#e4e32195bf0e4195b5daecdbb7a97a17)


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-03 08:42:44 +00:00
2b6db6541e Changes after review 2024-06-03 10:30:00 +02:00
d6bd88ce4f Merge #4667
4667: Frequency matching strategy r=Kerollmops a=ManyTheFish

# Pull Request

## Related issue
Fixes #3773

## What does this PR do?
- add test for matching strategy
- implement frequency matching strategy

See the [PRD for more details](https://www.notion.so/meilisearch/Frequency-Matching-Strategy-0f3ba08833a442a39590a53a1505ab00).

[Public API](https://www.notion.so/meilisearch/frequency-matching-strategy-89868fb7fc584026bc56e378eb854a7f).


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-05-30 14:53:31 +00:00
c32d746069 Rename the embeddings workloads 2024-05-30 16:46:57 +02:00
b9a0ff0dd6 Cache a lot of operations to know if a field must be indexed 2024-05-30 16:18:23 +02:00
75496af985 Add a span for the prepare_for_documents_reindexing 2024-05-30 12:14:22 +02:00
0e9eb9eedb Add a span for the settings diff creation 2024-05-30 12:08:27 +02:00
c2fb7afe59 fmt 2024-05-30 12:06:46 +02:00
3f1a510069 Add tests and fix matching strategy 2024-05-30 12:02:42 +02:00
3a78e988da Reduce the number of complex calls to settings diff functions 2024-05-30 11:23:07 +02:00
d9e5074189 Introduce a new way to determine the operations to perform on the fields 2024-05-30 11:23:07 +02:00
bc210bdc00 Introduce a dedicated function to write proximity entries in database 2024-05-30 11:23:06 +02:00
4bf83f701c Give the settings diff to the write_typed_chunk_into_index function 2024-05-30 11:23:06 +02:00
db3887929f Fix an issue with settings diff and * in the searchable attributes 2024-05-30 11:22:50 +02:00
9af103a88e Introducing a new into_del_add_obkv_conditional_operation function 2024-05-30 11:22:49 +02:00
99211eb375 Introduce the SettingDiff only_additional_fields method 2024-05-30 11:22:49 +02:00
41976b82b1 Tests for ranking_score_threshold 2024-05-30 11:22:26 +02:00
c36410fcbf Analytics for ranking score threshold 2024-05-30 11:22:12 +02:00
7ce2691374 Add ranking score threshold to similar API 2024-05-30 11:21:31 +02:00
4f03b0cf5b Add ranking score threshold to similar 2024-05-30 11:20:50 +02:00
c26db7878c Expose rankingScoreThreshold in API 2024-05-30 10:32:35 +02:00
06a9803544 Merge #4664
4664: Update README.md r=curquiza a=tpayet

Add hybrid & semantic as a feature

# Pull Request

## Related issue
Fixes #<issue_number>

## What does this PR do?
- ...

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Thomas Payet <thomas@meilisearch.com>
2024-05-29 16:55:20 +00:00
b2588d8101 Update README.md
Add hybrid & semantic as a feature
2024-05-29 17:48:48 +02:00
62d27172f4 Merge #4663
4663: Bring back release v1.8.1 into main r=ManyTheFish a=ManyTheFish



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: ManyTheFish <ManyTheFish@users.noreply.github.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2024-05-29 14:47:38 +00:00
1ab88e10b9 Merge branch 'main' into merge-release-v1.8.1-in-main 2024-05-29 16:24:00 +02:00
6a4b2516aa WIP 2024-05-29 16:21:24 +02:00
aac1d769a7 Add ranking_score_threshold to milli 2024-05-29 14:17:09 +02:00
abdc4afcca Implement Frequency matching strategy 2024-05-29 13:59:08 +02:00
75d5c0ae1f Merge #4647
4647: Feature: get similar documents r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4610 

## What does this PR do?
[Usage](https://meilisearch.notion.site/Get-similar-documents-usage-540919ca755c4da0b7cdee273db3f290)

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-05-29 11:42:23 +00:00
a88554216a Merge #4657
4657: Update version for the next release (v1.9.0) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2024-05-29 11:14:19 +00:00
2cf3e1c80a Temporarily ignore perform snapshot test under Windows 2024-05-29 12:42:47 +02:00
e1fbfde6c4 Merge branch 'main' into merge-release-v1.8.1-in-main 2024-05-29 11:31:03 +02:00
27b75ec648 merge main into v1.8.1 2024-05-29 11:26:07 +02:00
07fdb081a4 Update version for the next release (v1.9.0) in Cargo.toml 2024-05-28 14:19:40 +00:00
ca006e38ec Basic tests 2024-05-28 15:28:19 +02:00
e26bd87780 Error tests for similar routes 2024-05-28 15:28:19 +02:00
c01e498a63 Test server can call similar 2024-05-28 15:28:19 +02:00
ca6cc4654b Add similar route 2024-05-28 15:28:19 +02:00
3bd9d2478c Add error codes 2024-05-28 15:27:43 +02:00
54b15059a0 Analytics changes 2024-05-28 15:27:43 +02:00
d35278320e Add support functions for accessing arroy writers and readers 2024-05-28 15:27:43 +02:00
e172e938e7 add search rules: directly take the filter rather than the search query 2024-05-28 15:22:25 +02:00
02b3d82c60 filtered_universe accepts index and txn instead of SearchContext 2024-05-28 15:22:12 +02:00
fd2c95999d Change validate_document_id to public and remove extra layer of result 2024-05-28 15:21:19 +02:00
e248d2a1e6 Merge #4655
4655: Remove `exportPuffinReport` experimental feature r=Kerollmops a=Kerollmops

This PR fixes #4605 by removing every trace of Puffin. Puffin is a great tool, but we use a better approach to measuring performance.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-05-28 07:01:16 +00:00
487431a035 Fix tests 2024-05-27 16:12:20 +02:00
b6d450d484 Remove puffin experimental feature 2024-05-27 15:59:28 +02:00
dc949ab46a Remove puffin usage 2024-05-27 15:59:14 +02:00
7f3e51349e Remove puffin for the dependencies 2024-05-27 15:53:06 +02:00
19acc65ad2 Merge #4646
4646: Reduce `Transform`'s disk usage r=Kerollmops a=Kerollmops

This PR implements what is described in #4485. It reduces the number of disk writes and disk usage.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-05-23 16:06:50 +00:00
3a3ab17714 Merge #4651
4651: Allow to comment with the results of benchmark invocation r=Kerollmops a=dureuill



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-05-23 15:32:09 +00:00
eaf57056ca comment with the results of benchmarks 2024-05-23 15:34:39 +02:00
e340705634 Change benchmark outputs
- logs to stderr instead of stdout
- prints links to the dashboard when there is a dashboard
2024-05-23 15:29:06 +02:00
fe17c0f52e Construct the minimal OBKVs according to the settings diff 2024-05-23 11:23:57 +02:00
14bc80e3df Merge #4633
4633: Allow to mark vectors as "userProvided" r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #4606 

## What does this PR do?

[See usage in PRD](https://meilisearch.notion.site/v1-9-AI-search-changes-e90d6803eca8417aa70a1ac5d0225697#deb96fb0595947bda7d4a371100326eb)

- Extends the shape of the special `_vectors` field in documents.
    - previously, the `_vectors` field had to be an object, with each field the name of a configured embedder, and each value either `null`, an embedding (array of numbers), or an array of embeddings.
    - In this PR, the value of an embedder in the `_vectors` field can additionally be an object. The object has two fields:
      1. `embeddings`: `null`, an embedding (array of numbers), or an array of embeddings.
      2. `userProvided`: a boolean indicating if the vector was provided by the user.
    - The previous form `embedder_or_array_of_embedders` is semantically equivalent to:
    ```json
    {
        "embeddings": embedder_or_array_of_embedders,
        "userProvided": true
    }
    ```
- During the indexing step, the subfields and values of the `_vectors` field that have `userProvided` set to **false** are added in the vector DB, but not in the documents DB: that means that future modifications of the documents will trigger a regeneration of that particular vector using the document template.
- This allows **importing** embeddings as a one-shot process, while still retaining the ability to regenerate embeddings on document change.
- The dump process now uses this ability: it enriches the `_vectors` fields of documents with the embeddings that were autogenerated, marking them as not `userProvided`. This allows importing the vectors from a dump without regenerating them.

### Tests

This PR adds the following tests

- Long-needed hybrid search tests of a simple hf embedder
- Dump test that imports vectors. Due to the difficulty of actually importing a dump in tests, we just read the dump and check it contains the expected content.
- Tests in the index-scheduler: this tests that documents containing the same kind of instructions as in the dump indexes as expected


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-05-23 08:17:54 +00:00
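To make the new shape concrete, a made-up document mixing both forms could look like this (names and values are purely illustrative):

```json
{
  "id": 42,
  "title": "Shazam!",
  "_vectors": {
    "default": {
      "embeddings": [[0.12, 0.58, 0.33]],
      "userProvided": false
    },
    "manual": [0.9, 0.1, 0.2]
  }
}
```

Here the `default` embedder uses the new object form, so its vector lives only in the vector DB and is regenerated when the document changes, while `manual` keeps the legacy form, which is equivalent to `userProvided: true`.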
bc5663e673 FieldIdsMap no longer useful thanks to #4631 2024-05-22 16:06:15 +02:00
8a941c0241 Smaller review changes 2024-05-22 14:44:42 +02:00
3412e7fbcf "[]" is deserialized as 0 embeddings rather than 1 embedding of dim 0 2024-05-22 12:25:21 +02:00
16037e2169 Don't remove embedders that are not in the config from the document DB 2024-05-22 12:24:51 +02:00
8f7c8ca7f0 Remove now unused error variant 2024-05-22 12:23:43 +02:00
ba75d23bfe Merge #4648
4648: Update version for the next release (v1.8.1) in Cargo.toml r=ManyTheFish a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: ManyTheFish <ManyTheFish@users.noreply.github.com>
2024-05-21 16:38:36 +00:00
7fbb3bf8e8 Update version for the next release (v1.8.1) in Cargo.toml 2024-05-21 15:13:03 +00:00
500ddc76b5 Make the flattened sorter optional 2024-05-21 16:16:36 +02:00
9066a446a3 Merge #4642
4642: Index the _geo fields when changing the setting while there is already documents in the DB r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4640
Fixes https://github.com/meilisearch/meilisearch/issues/4628

## What does this PR do?
- Add an integration test that first indexes the document and then changes the settings
- Fix `extract_geo_point` by detecting if the `_geo` field has been faceted in this setting change and index all documents

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-05-21 13:16:11 +00:00
eccbcf5130 Increase index-scheduler test timeouts 2024-05-21 14:59:08 +02:00
943f8dba0c Make clippy happy 2024-05-21 14:58:41 +02:00
1aa8ed9ef7 Make the original sorter optional 2024-05-21 14:53:26 +02:00
f762307838 Fix clippy 2024-05-21 13:44:20 +02:00
3e94a90722 Fixes 2024-05-21 13:39:46 +02:00
abe29772db Merge #4644
4644: Revert "Stream documents" and keep heed+arroy to the latest verion r=Kerollmops a=irevoire

Reverts meilisearch/meilisearch#4544

Fixes https://github.com/meilisearch/meilisearch/issues/4641

I didn’t realize that some HTTP clients were not handling chunked HTTP requests like you would expect (if you ask for the body, it gives you the body), which made the previous PR a breaking change.

There is no way to provide a good fix for the issue we initially wanted to fix without breaking Meilisearch, and that’s not planned for now.

Co-authored-by: Tamo <irevoire@protonmail.ch>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-05-21 10:21:47 +00:00
c9ac7f2e7e update heed to latest version 2024-05-20 15:19:00 +02:00
7e251b43d4 Revert "Stream documents" 2024-05-20 15:09:45 +02:00
9969f7a638 Add test on index-scheduler 2024-05-20 14:44:10 +02:00
b17cb56dee Test array of vectors 2024-05-20 14:44:10 +02:00
afcd7b9f0c Test hybrid search with hf embedder 2024-05-20 14:44:10 +02:00
fc7e817221 Index geo points based on the settings differences 2024-05-20 12:27:26 +02:00
0f78703b85 add a test reproducing the bug 2024-05-20 10:58:08 +02:00
30cf972987 Add test with a dump 2024-05-20 10:36:18 +02:00
d05d49ffd8 Fix tests 2024-05-20 10:36:18 +02:00
0462ebbe58 Don't write an empty _vectors field 2024-05-20 10:36:18 +02:00
2f7a8a4efb Don't write vectors that weren't autogenerated in document DB 2024-05-20 10:36:18 +02:00
02714ef5ed Add vectors from vector DB in dump 2024-05-20 10:36:18 +02:00
52d9cb6e5a Refactor vector indexing
- use the parsed_vectors module
- only parse `_vectors` once per document, instead of once per embedder per document
2024-05-20 10:36:17 +02:00
261de888b7 Add function to get the embeddings of a document in an index 2024-05-20 10:36:17 +02:00
98c811247e Add parsed vectors module 2024-05-20 10:25:59 +02:00
59ecf1cea7 Merge #4544
4544: Stream documents r=curquiza a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4383


### Perf
On 2M Hacker News documents:

| | Time to retrieve | RAM consumption |
|---|---|---|
| main | 7s | 2+ GiB |
| stream | 4.7s | too small |

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-05-17 14:49:08 +00:00
273c6e8c5c uses the latest version of heed to get rid of unsafe code 2024-05-16 18:31:32 +02:00
897d25780e update milli to latest version 2024-05-16 18:31:32 +02:00
c85d1752dd keep the same rtxn to compute the filters on the documents and to stream the documents later on 2024-05-16 18:31:32 +02:00
8e6ffbfc6f stream documents 2024-05-16 18:31:32 +02:00
7c19c072fa Merge #4631
4631: Split the field id map from the weight of each fields r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4484

## What does this PR do?
- Make the (internal) searchable fields database always contain the searchable fields (instead of None when the user-defined searchable fields were not defined)
- Introduce a new « fieldids_weights_map » that does the mapping between a fieldId and its Weight
- Ensure that when two searchable fields are swapped, the field ID map doesn't change anymore (and thus, doesn't re-index)
- Uses the weight instead of the order of the searchable fields in the attribute ranking rule at search time
- When no searchable attributes are defined, make all their weights equal to zero
- When a field is declared as searchable and contains nested fields, all its subfields share the same weight

## Impact on relevancy

### When no searchable attributes are declared

When no searchable attributes are declared, all the fields have the same importance, instead of randomly giving more importance to the fields encountered earliest in the life of the index.

This means that, before this PR, sending the following JSON:
```json
[
  { "id": 0, "name": "kefir", "color": "white" },
  { "id": 1, "name": "white", "last name": "spirit" }
]
```

would make the field `name` more important than the fields `color` or `last name`.
This means that searching for `white` would automatically rank document `1` higher than document `0`.

After this PR, all the fields have the same weight, and none are considered more important than others.

### When a nested field is made searchable

The second behavior change that happened with this PR is in the case you're sending this document, for example:

```json
{
  "id": 0,
  "name": "tamo",
  "doggo": {
    "name": "kefir",
    "surname": "le kef"
  },
  "catto": "gromez"
}
```

Previously, defining the searchable attributes as: `["tamo", "doggo", "catto"]` was actually defining the « real » searchable attributes in the engine as: `["tamo", "doggo", "catto", "doggo.name", "doggo.surname"]`, which means that `doggo.name` and `doggo.surname` were _NOT_ where the user expected them and had completely different weights than `doggo`.
In this PR all the weights have been unified, and the « real » searchable fields look like this:
```
[ "tamo", "doggo", "doggo.name", "doggo.surname", "catto"]
   ^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^
Weight 0                 Weight 1                  Weight 2
```

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-05-16 09:59:24 +00:00
673b6e1dc0 fix a flaky test 2024-05-16 11:28:14 +02:00
f2d0a59f1d when no searchable attributes are defined, make all the weights equal to zero 2024-05-16 01:06:33 +02:00
c78a2fa4f5 rename methods and variables around the attributes-to-search-on feature 2024-05-15 18:04:42 +02:00
5542f1d9f1 get back to what we were doing before in the DB cache and with the restricted field id 2024-05-15 18:00:39 +02:00
ad4d8502b3 stop storing the whole fieldids weights map when no searchable attributes are defined 2024-05-15 17:16:10 +02:00
7ec4e2a3fb apply all style review comments 2024-05-15 15:02:26 +02:00
9fffb8e83d make clippy happy 2024-05-14 17:36:32 +02:00
caa6a7149a make the attribute ranking rule use the weights and fix the tests 2024-05-14 17:36:32 +02:00
a0082c4df9 add a failing test on the attribute ranking rule 2024-05-14 17:00:02 +02:00
b0afe0972e stop updating the fields ids map when fields are only swapped 2024-05-14 17:00:02 +02:00
9ecde41853 add a test on the current behaviour 2024-05-14 17:00:02 +02:00
685f452fb2 Fix the indexing of the searchable 2024-05-14 17:00:02 +02:00
4e4a1ddff7 gate a test behind the required feature 2024-05-14 17:00:02 +02:00
c22460045c Stops returning an option in the internal searchable fields 2024-05-14 17:00:02 +02:00
76bb6d565c Merge #4624
4624: Add "precommands" to benchmark r=dureuill a=dureuill

# Pull Request

## Related issue
Helps for https://github.com/meilisearch/meilisearch/issues/4493

## What does this PR do?
- Add support for precommands for cargo xtask bench
- update benchmark docs
- update workload files


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-05-13 08:27:56 +00:00
9d3ff11b21 Modify existing workload files to use precommands 2024-05-07 14:03:14 +02:00
43763eb98a Document precommands 2024-05-07 12:26:22 +02:00
2a0ece814c Add precommands to workloads 2024-05-07 12:23:36 +02:00
95fcd17373 Merge #4622
4622: Bump Rustls to non-vulnerable versions r=Kerollmops a=Kerollmops

This PR Fixes #4599 by bumping the Rustls dependency to v0.21.12 and [ureq to v2.9.7](https://github.com/algesten/ureq/blob/main/CHANGELOG.md#297) (which bump rustls to v0.22.4).

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-05-07 09:47:30 +00:00
ac4bc143c4 Bump ureq to v2.9.7 2024-05-07 10:39:38 +02:00
f33a1282f8 Bump Rustls to v0.21.12 2024-05-07 10:31:39 +02:00
4d5971f343 Merge #4621
4621: Bring back changes from v1.8.0 into main r=curquiza a=curquiza



Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-05-06 13:46:39 +00:00
ecb5c506b3 Merge #4619
4619: Use http path pattern instead of full path in metrics r=irevoire a=gh2k

# Pull Request

## Related issue

Fixes #3983 

## What does this PR do?

- This records only the HTTP pattern in metrics instead of the full path

An alternative solution was proposed in #4145, but this doesn't really fix the root cause of the issue. The problem I'm experiencing at my end is that by using the full path, the number of labels is far too high to be useful. It is normal practice to use the path with variable placeholders, instead of the fully-expanded path.

The example given in the ticket was endpoints under `/tasks`, but this can also be a very significant problem under `/indexes/{index-uid}/documents`. e.g.:
<img width="1510" alt="Screenshot 2024-05-03 at 12 14 36" src="https://github.com/meilisearch/meilisearch/assets/6530014/1df2ec19-5f69-4164-90d2-f65c59f9b544">

This patch replaces the fully-expanded path with the matched pattern.

The linked PR also mentions paths under other routes, e.g. `/static`, but this feels like a separate concern and these can be stripped out at the Prometheus end by filters if they are unwanted. The most important thing is to make the paths usable so that we can still get stats on e.g. the number of document deletes we see.

## PR checklist

Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Simon Detheridge <s@sd.ai>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-05-06 09:37:32 +00:00
3698aef66b fix warning 2024-05-06 11:36:37 +02:00
7f5ab3cef5 Use http path pattern instead of full path in metrics 2024-05-03 12:29:31 +01:00
c668043c4f Merge #4617
4617: Destructure `EmbedderOptions` so we don't miss some options r=dureuill a=dureuill

# Pull Request

## Related issue
#4595 was caused by the code not destructuring the embedder options.


## What does this PR do?
This PR adds the missing `url` parameter for ollama, and makes sure similar issue cannot happen in the future



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-05-02 14:55:32 +00:00
5a305bfdea Remove unused struct 2024-05-02 16:14:37 +02:00
f4dd73ec8c Destructure EmbedderOptions so we don't miss some options 2024-05-02 15:39:36 +02:00
66dce4600d Merge #4603
4603: Update charabia v0.8.10 r=Kerollmops a=ManyTheFish

- Update Charabia v0.8.10
- Add `swedish-recomposition` as an optional feature flag

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-04-30 13:04:02 +00:00
fe51ceca6d Update lock file 2024-04-30 14:33:37 +02:00
88174b8ae4 Update charabia v0.8.10 2024-04-30 14:30:23 +02:00
ebca29f3de Merge #4597
4597: Fix embeddings settings update r=ManyTheFish a=ManyTheFish

# Pull Request
- add some conditions reducing the work done when changing the settings
- add some benchmarks on embedders

## Related issue
Fixes #4585


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-04-25 16:37:28 +00:00
c793b6ef6d Merge #4600
4600: Fix embedders api r=ManyTheFish a=ManyTheFish

# Pull Request

## Related issue
Fixes #4594
Fixes #4595


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-04-25 13:16:33 +00:00
cbbfff3594 Remove debuging prints 2024-04-25 10:37:18 +02:00
dbcf50589b Fix clippy 2024-04-25 10:36:10 +02:00
3e5cd027a5 Merge #4593
4593: Stop crashing when panic occurs in thread pool r=ManyTheFish a=Kerollmops

This PR fixes #4362 by introducing a new boolean to catch panics in the rayon thread pool. The boolean is read after performing the operations in rayon and, if a panic occurred, the indexation process is stopped. This first version doesn't expose the panic message; it only marks the task as failed.

The current implementation exposes a `ThreadPoolNoAbort` wrapper. The `rayon::ThreadPool` has been wrapped to check that nothing went wrong after running the `ThreadPool::install` function. An atomic boolean and some `store/load` logic make the system work efficiently.

Before, Meilisearch was completely crashing...

<img width="1563" alt="Capture d’écran 2024-04-22 à 15 49 02" src="https://github.com/meilisearch/meilisearch/assets/3610253/ce114917-a881-4fbb-85df-c195fcf0c7cb">

Now, it handles the panics correctly and marks the task as failed.

<img width="1558" alt="Capture d’écran 2024-04-22 à 15 42 14" src="https://github.com/meilisearch/meilisearch/assets/3610253/8bd031ef-5e8f-4a12-a91e-c823597a2344">


Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-04-24 16:27:08 +00:00
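A minimal sketch of the wrapper idea described in #4593 above, assuming rayon's `panic_handler` hook; it is illustrative only and not the actual `ThreadPoolNoAbort` implementation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

use rayon::{ThreadPool, ThreadPoolBuilder};

/// Illustrative wrapper: panics inside the pool set a flag instead of
/// aborting the process; callers check the flag after `install`.
struct NoAbortPool {
    pool: ThreadPool,
    panicked: Arc<AtomicBool>,
}

impl NoAbortPool {
    fn new(threads: usize) -> Result<Self, rayon::ThreadPoolBuildError> {
        let panicked = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&panicked);
        let pool = ThreadPoolBuilder::new()
            .num_threads(threads)
            // Record the panic instead of letting it abort the process.
            .panic_handler(move |_payload| flag.store(true, Ordering::SeqCst))
            .build()?;
        Ok(Self { pool, panicked })
    }

    /// Runs `op` on the pool and reports whether any task panicked.
    fn install<R: Send>(&self, op: impl FnOnce() -> R + Send) -> Result<R, &'static str> {
        let result = self.pool.install(op);
        if self.panicked.swap(false, Ordering::SeqCst) {
            Err("a thread panicked during the operation")
        } else {
            Ok(result)
        }
    }
}
```

Callers can then turn the error into a failed task instead of letting the panic take down the whole process.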
7468c1cf8d Introduce WildcardSetting that are serialized as wildcards by default 2024-04-24 18:15:03 +02:00
d4aeff92d0 Introduce the ThreadPoolNoAbort wrapper 2024-04-24 16:40:12 +02:00
e87cb373de Avoid intermediate serializing when displaying settings 2024-04-24 12:33:07 +02:00
9b76501875 Display set API key for Ollama embedder 2024-04-24 12:33:07 +02:00
6247e95dc3 Add benchmark for embeddings 2024-04-23 17:42:20 +02:00
b3173d0423 Remove useless dots in the error messages 2024-04-22 18:09:33 +02:00
96cc5319c8 Introduce a new internal error type to categorize panics 2024-04-22 18:09:33 +02:00
0c7003c5df Introduce an atomic to catch panics in thread pools 2024-04-22 18:09:33 +02:00
a1aa999026 Add conditions reducing work 2024-04-22 14:18:35 +02:00
aa0bbbb246 Merge #4578
4578: Remove useless analytics r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes #4577

## What does this PR do?
Remove the following analytics:
- `Health Seen`
- `Stats Seen`
- `Task Seen`
- `Version Seen`


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-04-18 13:30:42 +00:00
a04012c33e Merge #4583
4583: Update charabia v0.8.9 r=irevoire a=ManyTheFish

# Pull Request
- Update Charabia v0.8.9
- Add the optional feature flag activating pinyin normalization

## Related issue
Fixes  #4574


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-04-18 09:42:42 +00:00
c71b5d09ff Update charabia v0.8.9 2024-04-18 11:38:26 +02:00
248e22005a Merge #4582
4582: Fix some typos in comments r=curquiza a=writegr

# Pull Request

## Related issue

No

## What does this PR do?

 fix some typos in comments

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: writegr <wellweek@outlook.com>
2024-04-18 07:07:33 +00:00
ab43a8a949 chore: fix some typos in comments
Signed-off-by: writegr <wellweek@outlook.com>
2024-04-18 14:12:52 +08:00
4a8459b799 Merge #4576
4576: increase the default search time budget from 150ms to 1.5s r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes #4575

## What does this PR do?
- increase the default search time budget from 150ms to 1.5s


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-04-17 16:04:47 +00:00
442de982a9 Merge #4581
4581: Always show facet numbers in alpha order in the facet distribution r=ManyTheFish a=Kerollmops

This PR fixes #4559 by making sure that the number facets (facets that come from numbers in the documents) are always displayed in alpha order, even when there are only a few values to display.

The issue was due to some algorithms executed when the number of facet values to display was small. As we can see below, facet values are now always displayed correctly.

```json
"facetDistribution": {
    "release_year": {
        "2010": 1,
        "2011": 1,
        "2012": 1,
        "2013": 1,
        "2014": 1,
        "2015": 1,
        "2016": 1,
        "2017": 1,
        "2018": 1,
        "2019": 19,
        "2020": 1,
        "2021": 1,
        "2022": 1,
        "2023": 1,
        "2024": 1,
        "2025": 1
    }
}
```

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-04-17 15:18:58 +00:00
c923adf222 Fix facet distribution for alpha on facet numbers 2024-04-17 16:31:16 +02:00
2dfee2fad5 Merge #4580
4580: Update the search logs r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4579

## What does this PR do?
- Update the debug implementation of the search query and search results so it’s way smaller and doesn’t display useless information


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-04-17 14:25:43 +00:00
4a68e9f6ae reorganize the debug implementation of the search results and only display the meaningful information 2024-04-17 13:42:10 +02:00
206887c7a2 update the SearchQuery Debug implementation so it’s smaller and gives the most important information first 2024-04-17 12:57:19 +02:00
2f170fe2d5 Merge #4504
4504: Avoid clearing db in transform r=ManyTheFish a=ManyTheFish

# Pull Request

## Related issue
Fixes #4478



Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-04-17 10:41:00 +00:00
df29ba709a Make some cleaning in Arcs 2024-04-17 12:33:25 +02:00
2dd9dd6d0a remove the Health Seen analytic 2024-04-17 11:43:40 +02:00
3acfab2eb7 Fix PR comments 2024-04-17 10:55:51 +02:00
e1f27de51a remove the Stats Seen analytic 2024-04-16 18:49:41 +02:00
abae31aee0 remove the Task Seen analytic 2024-04-16 18:48:10 +02:00
70ce0095ea remove the Version Seen analytic 2024-04-16 18:48:03 +02:00
19137be0ea increase the default search time budget from 150ms to 1.5s 2024-04-16 18:09:49 +02:00
a1ea224da9 Fix tests 2024-04-16 17:29:34 +02:00
87a93ba47d fix clippy 2024-04-16 14:39:30 +02:00
eaf113ef34 Fix word pair proximity error when nothing has to be extracted 2024-04-16 14:39:30 +02:00
5ab901dd30 Fix tests 2024-04-16 14:39:30 +02:00
e5ae337aae Come back to sorters in extract_word_docids:
using buffers and merging the keys manually is less efficient
2024-04-16 14:39:30 +02:00
bad46f88d6 Fix embedder test 2024-04-16 14:39:30 +02:00
a489b406b4 fix test 2024-04-16 14:39:06 +02:00
02c3d6b265 finish work 2024-04-16 14:39:06 +02:00
b5e4a55af6 refactor faceted and searchable pipeline 2024-04-16 14:39:06 +02:00
a7e368aaa6 Create InnerIndexSettingsDiffs struct and populate it 2024-04-16 14:39:06 +02:00
893200ab87 Avoid clearing documents in transform 2024-04-16 14:39:06 +02:00
aabce52b1b Fix test 2024-04-16 14:39:06 +02:00
64079fc894 Do more iterations on the settings benchmarks 2024-04-16 14:39:06 +02:00
8fff5fc281 update tests 2024-04-16 14:39:06 +02:00
4089dd04a5 Merge #4568
4568: Fix some typos in comments r=curquiza a=yudrywet

# Pull Request

## Related issue
No

## What does this PR do?
fix some typos in comments

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: yudrywet <yudeyao@yeah.net>
2024-04-15 08:12:43 +00:00
cf864a1c2e chore: fix some typos in comments
Signed-off-by: yudrywet <yudeyao@yeah.net>
2024-04-14 20:11:34 +08:00
0661c86f16 Merge #4566
4566: Bring back changes from v1.7.6 to main r=irevoire a=dureuill



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: dureuill <dureuill@users.noreply.github.com>
2024-04-11 19:32:29 +00:00
a6c02f7684 Update version for the next release (v1.7.6) in Cargo.toml 2024-04-11 21:08:57 +02:00
89e72fab32 Update grenad to fix rare DB corruption 2024-04-11 21:06:59 +02:00
171b41be24 Merge #4560
4560: Bring back change from v1.7.5 to main r=curquiza a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: irevoire <irevoire@users.noreply.github.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
2024-04-09 16:58:30 +00:00
c26d356a35 Merge branch 'main' into release-v1.7.5-tmp 2024-04-09 14:46:15 +02:00
d6b6cd322c Update sprint_issue.md (#4556) 2024-04-05 18:40:28 +02:00
217fbc777f Merge #4554
4554: Update version for the next release (v1.7.5) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: irevoire <irevoire@users.noreply.github.com>
2024-04-04 18:03:04 +00:00
c2c73c1f25 Merge #4553
4553: update h2 r=curquiza a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4551


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-04-04 17:23:00 +00:00
7a49a056fa Update version for the next release (v1.7.5) in Cargo.toml 2024-04-04 16:33:45 +00:00
fd4be26718 update h2 2024-04-04 18:27:16 +02:00
b1844b0c27 Merge #4548
4548: v1.8 hybrid search changes r=dureuill a=dureuill

Implements the search changes from the [usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#40f24df3da694428a39cc8043c9cfc64)

### ⚠️ Breaking changes in an experimental feature:

- Removed the `_semanticScore`. Use the `_rankingScore` instead.
- Removed `vector` in the response of the search (output was too big).
- Removed all the vectors from the `vectorSort` ranking score details
  - target vector appearing in the name of the rule
  - matched vector appearing in the details of the rule

### Other user-facing changes

- Added `semanticHitCount`, indicating how many hits were returned from the semantic search. This is especially useful in the hybrid search.
- Embed lazily: Meilisearch no longer generates an embedding when the keyword results are "good enough".
- Graceful embedding failure in hybrid search: when doing hybrid search (`semanticRatio in ]0.0, 1.0[`), an embedding failure no longer causes the search request to fail. Instead, only the keyword search is performed. When doing a full vector search (`semanticRatio==1.0`), a failure to embed will still result in failing that search.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-04-04 16:00:20 +00:00
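For reference, a hybrid search request mixing keyword and semantic results looks roughly like the snippet below; the embedder name `default` is an assumption for illustration.

```json
{
  "q": "fantasy movie with dragons",
  "hybrid": {
    "semanticRatio": 0.5,
    "embedder": "default"
  }
}
```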
a9013ed683 Fix comment mistake
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-04-04 17:21:47 +02:00
ca499a0302 Fix test after rebase 2024-04-04 16:04:07 +02:00
355e5282b2 Remove _semanticScore 2024-04-04 16:04:07 +02:00
7c27417a5d Add tests 2024-04-04 16:04:07 +02:00
1ff2a2d6fb Add semanticHitCount 2024-04-04 16:04:06 +02:00
3c6e9851a4 Correct error formatting 2024-04-04 15:58:19 +02:00
4564a38ae7 Bail earlier when the experimental feature is not enabled 2024-04-04 15:58:19 +02:00
466d718a05 Fix test 2024-04-04 15:58:19 +02:00
6ebb6b55a6 Lazily embed, don't fail hybrid search on embedding failure 2024-04-04 15:58:17 +02:00
fabc9cf14a milli: add Embedder::embed_one 2024-04-04 15:57:29 +02:00
00c4ed3bc2 milli: refactor getting embedder and embedder name 2024-04-04 15:57:29 +02:00
190933f6e1 Breaking: Remove vector from SearchResult 2024-04-04 15:57:29 +02:00
928e6e4c05 Breaking change: remove vector for score details 2024-04-04 15:57:29 +02:00
339a5e3431 Merge #4549
4549: Hugging Face embedder improvements r=dureuill a=dureuill

Architectural changes/Internal improvements

### 1. Prefer safetensors weights over pytorch weights when available

safetensors weights are memory mapped, which reduces memory usage of supported models.

### 2. Update candle

Updates candle to `0.4.1`, now targeting crates.io and the tokenizers to `v0.15.2` (still on github).

This might fix https://github.com/meilisearch/meilisearch/issues/4399 thanks to the now included https://github.com/huggingface/candle/issues/1454

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-04-04 13:47:18 +00:00
5509bafff8 Merge #4535
4535: Support Negative Keywords r=ManyTheFish a=Kerollmops

This PR fixes #4422 by supporting `-` before any word in the query.

The minus symbol `-`, from the ASCII table, is not the only character that can be considered the negative operator. You can see the two other matching characters under the `Based on "-" (U+002D)` section on [this unicode reference website](https://www.compart.com/en/unicode/U+002D).

It's important to notice the strange behavior when a query both includes and excludes the same word; only the derivatives (synonyms and splits) will be kept:
 - If you input `progamer -progamer`, the engine will still search for `pro gamer`.
 - If you have the synonym `like = love` and you input `like -like`, it will still search for `love`.

## TODO
 - [x] Add analytics
 - [x] Add support to the `-` operator
 - [x] Make sure to support spaces around `-` well
 - [x] Support phrase negation
 - [x] Add tests


Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-04-04 13:10:27 +00:00
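As a small made-up illustration of the operator, combining a negated word and a negated phrase:

```json
{
  "q": "animated movie -sequel -\"live action\""
}
```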
90e812fc0b Add some tests 2024-04-04 15:08:37 +02:00
58cafcc824 Update candle 2024-04-03 13:11:56 +02:00
56bf8503db Merge #4537
4537: Expose distribution shift in settings r=ManyTheFish a=dureuill

See [usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#d652adc0890445658aaf36352dbc8802)

# Changes

- Distribution shift added to all embedders.
- Exposed in settings
- Changed the reindexing logic to not trigger a reindex operation when only the distribution shift or API key change

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-04-03 09:08:58 +00:00
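A hypothetical embedder setting carrying a distribution shift could look like the snippet below; the `mean`/`sigma` field names and values are assumptions for illustration.

```json
{
  "embedders": {
    "default": {
      "source": "huggingFace",
      "distribution": {
        "mean": 0.7,
        "sigma": 0.3
      }
    }
  }
}
```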
a1eccc762a Prefer safetensors to pytorch when both are available 2024-04-03 11:05:59 +02:00
75f81a0bab Merge #4547
4547: Fix milli/Cargo.toml for usage as dependency via git r=dureuill a=Toromyx

# Pull Request

## Related issues/discussions
This enables the usage of `milli` [via git repository](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories) as mentioned in <https://github.com/meilisearch/meilisearch/issues/3367#issuecomment-1422613815>, <https://github.com/meilisearch/meilisearch/discussions/1523#discussioncomment-1039338>, and <https://github.com/meilisearch/meilisearch/discussions/1981#discussioncomment-1771568>

## What does this PR do?
Trying to depend on `milli` like

```
[dependencies.milli]
git = "https://github.com/meilisearch/meilisearch.git"
tag = "v1.7.4"
```

leads to the following error:

```
error: failed to select a version for the requirement `candle-core = "^0.3.1"`
candidate versions found which didn't match: 0.4.2
location searched: Git repository https://github.com/huggingface/candle.git
required by package `milli v1.7.4 (https://github.com/meilisearch/meilisearch.git?tag=v1.7.4#0259ad60)`
```

because the default branch of <https://github.com/huggingface/candle> does not contain the correct version.

To fix this, I added a `rev="..."` entry in the relevant dependencies, specifying the commit already present in the `Cargo.lock` file.
I also updated the version to the one in the `Cargo.lock`. This also updated the `candle-kernels` sub-dependency from 0.3.1 to 0.3.3, which is probably correct?

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Thomas Gauges <thomas.gauges@gmail.com>
2024-04-03 07:31:36 +00:00
d55d496250 Fix milli/Cargo.toml for usage as dependency via git 2024-04-02 15:19:30 +02:00
5080bef0d6 Merge #4546
4546: Fix some typos in comments r=curquiza a=redistay

# Pull Request



## What does this PR do?
- fix some typos in comments

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: redistay <wujunjing@outlook.com>
2024-04-02 12:07:09 +00:00
182cb42953 chore: fix some typos in comments
Signed-off-by: redistay <wujunjing@outlook.com>
2024-04-02 19:37:55 +08:00
92a049c2dd Merge #4543
4543: Bring back changes from v1.7.4 into main r=Kerollmops a=dureuill



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: dureuill <dureuill@users.noreply.github.com>
2024-03-28 16:53:51 +00:00
78668584cd Merge #4533
4533: Hide api key in settings and task queue r=dureuill a=dureuill

# Pull Request

See [Usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#117f5ff7b19f4d95bb3ae0005f6c6633)

## Motivation

See [slack discussion (internal link)](https://meilisearch.slack.com/archives/C06GQP7FQ6P/p1709804022298749)


## Changes

- The value of the `apiKey` parameter is now hidden in the settings and the details of the task queue.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-28 16:02:53 +00:00
fa9748cc99 Merge #4536
4536: Limit concurrent search requests r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4489

## What does this PR do?
- Adds a « search queue » that limits the number of search requests we can process at the same time and stores search requests to be processed
- Process only one search request per core/thread (we use available_parallelism)
- When the search queue is full, new search requests replace old ones **randomly**. The reason is that:
  - If we serve the oldest one first, like Typesense, we give the worst performances to everyone
  - If we serve the latest one, it gets too easy to DoS us (you just need to fill the queue with as many search requests as we can process simultaneously to ensure no other request will ever be processed)
  - By picking the search request to evict randomly, we give recent search requests a chance to be processed while ensuring that we can't be overwhelmed unless the queue is filled entirely, at which point we start returning 5xx errors
- Adds an experimental parameter to control the size of the queue
- Adds a bunch of tests to ensure the search queue works correctly
- In the health route, ensure the loop consuming the search queue is running, and crash if it’s not the case

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-28 15:01:52 +00:00
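A toy sketch of the random-replacement policy described in #4536 above (not the actual search queue, which is asynchronous and sized from `available_parallelism`):

```rust
use rand::Rng;

/// Toy bounded queue: when full, an incoming request randomly evicts an
/// already-queued one, so neither the oldest nor the newest is favored.
struct SearchQueue<T> {
    capacity: usize,
    pending: Vec<T>,
}

impl<T> SearchQueue<T> {
    fn new(capacity: usize) -> Self {
        Self { capacity, pending: Vec::with_capacity(capacity) }
    }

    /// Returns the evicted request, if any.
    fn push(&mut self, request: T) -> Option<T> {
        if self.pending.len() < self.capacity {
            self.pending.push(request);
            None
        } else {
            let slot = rand::thread_rng().gen_range(0..self.pending.len());
            Some(std::mem::replace(&mut self.pending[slot], request))
        }
    }
}

fn main() {
    let mut queue = SearchQueue::new(2);
    assert!(queue.push("q1").is_none());
    assert!(queue.push("q2").is_none());
    // The queue is full: "q3" replaces one of the queued requests at random.
    assert!(queue.push("q3").is_some());
}
```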
877f4b1045 Support negative phrases 2024-03-28 15:51:43 +01:00
781e2d7750 Merge #4532
4532: Add `url` and `api_key` to ollama r=ManyTheFish a=dureuill

See [Usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#5c77ef49e78e43388c1d3d5429151357)

### Motivation

- Before this PR, the url for ollama is only read from the environment. This is a needless restriction that will be troublesome in settings where passing an environment variable is complex or impossible (e.g., the Cloud)
- Before this PR, ollama did not support an api_key. While ollama does not natively support API keys, [a common practice](https://github.com/ollama/ollama/issues/849) is to put a publicly accessible ollama server behind a proxy to support authentication.

### Skip changelog

ollama embedder was added to v1.8

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-28 12:35:19 +00:00
796213af9a Merge branch 'main' into tmp-release-v1.7.4 2024-03-28 10:51:49 +01:00
69f8b2730d Fix the tests 2024-03-28 10:47:04 +01:00
7385067c42 Merge #4542
4542: fixes typos r=irevoire a=brunoocasali

Just fix a typo 😬 

Co-authored-by: Bruno Casali <brunoocasali@gmail.com>
2024-03-27 18:21:48 +00:00
d1021c0f0d Merge #4520
4520: Add automation to create openAPI issue r=dureuill a=curquiza

Automatically create an issue to remind us to update the open-api file when opening a milestone

Co-authored-by: curquiza <clementine@meilisearch.com>
2024-03-27 17:33:22 +00:00
8f2606d79d fixes typos 2024-03-27 14:26:47 -03:00
0259ad6082 Merge #4541
4541: Update version for the next release (v1.7.4) in Cargo.toml r=Kerollmops a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: dureuill <dureuill@users.noreply.github.com>
2024-03-27 16:49:40 +00:00
06a11b5b21 Improve error message 2024-03-27 17:34:49 +01:00
b50f518764 Update version for the next release (v1.7.4) in Cargo.toml 2024-03-27 16:12:54 +00:00
94b7afcc55 Merge #4539
4539: Don't optimize reindexing when fields contain dots r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4525

## What does this PR do?
- Don't try to optimize the amount of reindexing work when nested fields are used anywhere in:
    - the field distribution (e.g. a key actually contains a `.`)
    - the old faceted fields
    - the new faceted fields

This is because the facet distribution does not report on existing nested fields (a minimal sketch of the check follows this entry).



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-27 16:07:49 +00:00
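Hedged sketch of the check implied by the list above (names are illustrative, not the actual milli code):

```rust
/// Sketch: detect whether nested fields (keys containing a `.`) appear in any
/// of the inspected sets; if so, the reindexing optimization is skipped,
/// because the facet distribution does not report on nested fields.
fn nested_fields_used(
    field_distribution_keys: &[String],
    old_faceted_fields: &[String],
    new_faceted_fields: &[String],
) -> bool {
    let has_dot = |s: &String| s.contains('.');
    field_distribution_keys.iter().any(has_dot)
        || old_faceted_fields.iter().any(has_dot)
        || new_faceted_fields.iter().any(has_dot)
}
```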
ee8cbea810 Don't optimize reindexing when fields contain dots 2024-03-27 17:04:45 +01:00
b7c582e4f3 connect the search queue with the health route 2024-03-27 15:49:43 +01:00
03c886ac1b adds a bit of documentation 2024-03-27 15:38:36 +01:00
cde7ce4f44 Add test 2024-03-27 14:02:09 +01:00
92224f109a Fix tests 2024-03-27 12:19:10 +01:00
0d27d50740 Merge #4516
4516: Update sprint_issue.md r=Kerollmops a=curquiza

Following decision made about specification

Also
- removed useless parts of the template
- added automatic labels -> it is better to forget to remove them than to forget to add them (some mistakes happened in the past)

Co-authored-by: Clémentine U. - curqui <clementine@meilisearch.com>
2024-03-27 11:04:06 +00:00
572fb3a51d Finer granularity for embedder needs reindex 2024-03-27 12:01:34 +01:00
4ff0255783 remove unused function 2024-03-27 11:51:14 +01:00
a25456120d Expose distribution in settings 2024-03-27 11:51:04 +01:00
168ded3b9d Deserr for distribution 2024-03-27 11:50:33 +01:00
afd1da5642 Add distribution to all embedders 2024-03-27 11:50:22 +01:00
087a96d22e fix flaky test 2024-03-27 11:05:37 +01:00
34dfea72cc Merge #4509
4509: Rest embedder r=ManyTheFish a=dureuill

Fixes #4531 

See [Usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42?pvs=25#e6f58c3b742c4effb4ddc625ce12ee16)

### Implementation changes

- Remove tokio, futures, reqwests
- Add a new `milli::vector::rest::Embedder` embedder
- Update OpenAI and Ollama embedders to use the REST embedder internally
- Make Embedder::embed a sync method
- Add the new embedder source as described in the usage


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-27 09:27:46 +00:00
3a1f458139 fix a flaky test 2024-03-26 21:06:55 +01:00
55df9daaa0 adds a comment about the safety of an operation 2024-03-26 19:34:55 +01:00
2e36f069c2 fmt imports 2024-03-26 19:23:55 +01:00
8f5d9f501a update the discussion link 2024-03-26 19:18:32 +01:00
8127c9a115 handle the case of a queue of zero elements 2024-03-26 19:04:39 +01:00
e7704f1fc1 add a test to ensure we effectively returns a retry-after when the search queue is full 2024-03-26 18:08:59 +01:00
34262c7a0d Add analytics for the negative operator 2024-03-26 18:01:27 +01:00
e2a1bbae37 simplify and improve the http error 2024-03-26 17:53:37 +01:00
1da9e0f246 Better support space around the negative operator (-) 2024-03-26 17:47:13 +01:00
e4a3e603b3 Expose a first working version of the negative keyword 2024-03-26 17:47:13 +01:00
e433fd53e6 rename the method to get a permit and use it in all search requests 2024-03-26 17:28:03 +01:00
3f23fbb46d create the experimental CLI argument 2024-03-26 16:43:40 +01:00
c41e1274dc push and test the search queue datastructure 2024-03-26 15:56:43 +01:00
9a95ed619d Add tests 2024-03-26 10:36:56 +01:00
f82d056072 Hide secrets in settings and task queue 2024-03-26 10:36:24 +01:00
5ea017b922 Merge #4530
4530: fix: set the histogram bucket boundaries to follow the otel spec r=curquiza a=rohankmr414

# Pull Request

## What does this PR do?
- Fixes the http request duration histogram bucket boundaries to follow the OpenTelemetry spec; currently the bucket boundaries are too granular and only track latencies below 1s.

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Rohan Kumar <rohankmr414@gmail.com>
2024-03-25 12:23:31 +00:00
817ccc089a also allow api_key 2024-03-25 11:50:00 +01:00
2ddd872ce6 Merge #4373
4373: feat: add status code label to prometheus http request counter r=irevoire a=rohankmr414

# Pull Request

## What does this PR do?
- This PR adds the `status` label (the value is http status code) to the `meilisearch_http_requests_total` metric.

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Rohan Kumar <rohankmr414@gmail.com>
2024-03-25 10:40:50 +00:00
4136630ea5 Use constants instead of raw strings in set_*set() 2024-03-25 11:39:33 +01:00
58972f35cb Allow url parameter for ollama embedder 2024-03-25 11:32:55 +01:00
dfa5e41ea6 Check validity of the URL setting 2024-03-25 11:23:16 +01:00
a1db342f01 Expose REST embedder to the API 2024-03-25 11:23:15 +01:00
f87747f4d3 Remove unwraps 2024-03-25 11:23:04 +01:00
b6b4b6bab7 Remove the tokio and the reqwests 2024-03-25 11:23:03 +01:00
f649f58013 embed no longer async 2024-03-25 11:23:03 +01:00
ac52c857e8 Update ollama and openai impls to use the rest embedder internally 2024-03-25 11:23:03 +01:00
8708cbef25 Add RestEmbedder 2024-03-25 11:23:03 +01:00
c3d02f092d OpenAI sync 2024-03-25 11:23:03 +01:00
bc58e8a310 Documentation for the vector module 2024-03-25 11:23:03 +01:00
ec81c2bf1a Merge #4511
4511: Bump charabia to 0.8.8 r=ManyTheFish a=6543

... and update lock file

this will add the fix (https://github.com/meilisearch/charabia/pull/275) to support markdown formatted codeblocks

Co-authored-by: 6543 <6543@obermui.de>
2024-03-25 09:26:11 +00:00
13a84ae557 fix: set the histogram bucket boundaries to follow the otel spec 2024-03-25 11:20:30 +05:30
325435ad43 feat: add request rate and error rate panels to grafana dashboard 2024-03-25 10:49:40 +05:30
5833070358 feat: add status code label to prometheus http request counter 2024-03-25 10:49:40 +05:30
ae3c31a82c Merge #4526
4526: chore: remove repetitive word r=curquiza a=availhang

# Pull Request

## Related issue
Fixes #<issue_number>

## What does this PR do?
- ...

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: availhang <mayangang@outlook.com>
2024-03-22 16:06:54 +00:00
9865c58046 chore: remove repetitive words
Signed-off-by: availhang <mayangang@outlook.com>
2024-03-22 15:23:13 +08:00
bf95438ea8 Merge #4522
4522: Brings back change to main r=curquiza a=irevoire

# Pull Request

Bring back changes to main

Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: irevoire <irevoire@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2024-03-21 15:57:50 +00:00
48d012c3e2 Merge branch 'main' into tmp-release-v1.7.3 2024-03-21 16:39:38 +01:00
8394be9484 Add automation to create openAPI issue 2024-03-21 15:52:11 +01:00
414fc14426 Merge #4519
4519: Update version for the next release (v1.7.3) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2024-03-21 11:21:56 +00:00
3b8e8b7f1a Update version for the next release (v1.7.3) in Cargo.toml 2024-03-21 11:20:30 +00:00
c67f04c746 Update sprint_issue.md 2024-03-20 18:45:56 +01:00
fc1c3f4a29 Merge #4466
4466: Implements the search cutoff r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4488

## What does this PR do?
- Adds a cutoff to the bucket sort after 150ms has been spent
- Adds a new setting to customize the default value of 150ms
- When the time is exceeded, we exit early with what we had the time to sort
- If the cutoff has been reached, the search details are updated with a new `Skip` ranking detail for the ranking rules that were skipped (a rough sketch of the cutoff follows this entry)
- Adds analytics to measure the total number of degraded search requests
- Adds the number of degraded search requests to the Prometheus metrics and Grafana dashboard
- The cutoff **must not** skip the filters; otherwise, we would leak documents to people who don’t have the right to see them


Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-20 13:06:53 +00:00
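A rough, illustrative sketch of the cutoff idea from the entry above (the real integration runs inside the ranking rules and, as noted, never skips filters):

```rust
use std::time::{Duration, Instant};

/// Sketch of a time-budgeted bucket sort: ranking runs until the deadline is
/// hit, then we return whatever has been sorted so far and flag the search as
/// degraded. The default budget mentioned above is 150ms.
fn bucket_sort_with_cutoff(mut remaining: Vec<u32>, budget: Duration) -> (Vec<u32>, bool) {
    let deadline = Instant::now() + budget;
    let mut sorted = Vec::new();

    while !remaining.is_empty() {
        if Instant::now() >= deadline {
            // Degraded result: append the rest in their current order.
            sorted.append(&mut remaining);
            return (sorted, true);
        }
        // Placeholder for applying the next ranking rule to `remaining`.
        sorted.push(remaining.remove(0));
    }
    (sorted, false)
}
```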
f2f1367ec3 add a timeout to the webhook 2024-03-20 13:59:43 +01:00
18f17ed728 Update version for the next release (v1.7.2) in Cargo.toml 2024-03-20 13:59:42 +01:00
4628b7b7bd bump charabia to 0.8.8
and update lock file
2024-03-20 13:39:00 +01:00
d49250358d Merge #4513
4513: Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1" r=Kerollmops a=irevoire

This reverts commit bd74cce86a, reversing changes made to d2f77e88bd.

This commit wasn’t supposed to be merged on the `release-v1.7.1` branch


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-20 09:57:24 +00:00
5046ffdf54 Merge #4512
4512: Revert "Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1"" r=Kerollmops a=irevoire

Reverts meilisearch/meilisearch#4510

This PR was supposed to be merged on `release-v1.7.1` not main 🤦 

Co-authored-by: Tamo <irevoire@protonmail.ch>
2024-03-20 09:14:43 +00:00
c5322df519 Revert "Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1"" 2024-03-20 10:08:28 +01:00
6079141ea6 snapshot the scores side by side with the score details 2024-03-19 18:30:14 +01:00
2c3af8e513 query the detailed score detail in the test 2024-03-19 18:09:02 +01:00
098ab594eb A score of 0.0 is now lesser than a sort result
handles the niche case 🐩 in the hybrid search where:
1. a sort ranking rule is the first rule.
2. the keyword search is skipped at the first rule.
3. the semantic search is not skipped at the first rule.

Previously, we would have the skipped search winning, whereas we want the non-skipped one winning.
2024-03-19 17:32:32 +01:00
c495c8eb33 Merge #4510
4510: Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1" r=Kerollmops a=irevoire

In https://github.com/meilisearch/meilisearch/pull/4502 we merged main into release-v1.7.1 instead of a temporary branch, so we now need to revert this merge commit.

This reverts commit bd74cce86a, reversing changes made to d2f77e88bd.


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-19 16:02:24 +00:00
567194b925 Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1"
This reverts commit bd74cce86a, reversing
changes made to d2f77e88bd.
2024-03-19 16:56:21 +01:00
d8fe4fe49d return the order in the score details 2024-03-19 15:45:04 +01:00
7b9e0d2944 forward the degraded parameter to the hybrid search 2024-03-19 15:11:21 +01:00
0ae39644f7 fix the facet search 2024-03-19 15:07:06 +01:00
bfec9468d4 Update milli/src/search/mod.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-19 14:49:15 +01:00
5233534dc0 Merge #4477
4477: Add documentation for benchmarks r=dureuill a=dureuill

See [CONTRIBUTING.md](https://github.com/meilisearch/meilisearch/blob/benchmark-docs/CONTRIBUTING.md#logging)

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-19 13:23:48 +00:00
fced2ff9ab Merge #4502
4502: Release v1.7.1 r=dureuill a=Kerollmops

Bring the v1.7.1 changes back to main.

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
2024-03-19 12:41:28 +00:00
bd74cce86a Merge remote-tracking branch 'origin/main' into release-v1.7.1 2024-03-19 13:39:17 +01:00
f85c80d059 Merge #4503
4503: Add settings diff indexing benchmarks r=dureuill a=ManyTheFish

Add several benchmarks targeting settings diff-indexing enhancements

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-03-19 10:35:46 +00:00
2a92c04100 Adding new assets 2024-03-19 11:31:32 +01:00
4369e9e97c add an error code test on the setting 2024-03-19 11:14:28 +01:00
e8516f00c4 move settings workload in root workload directory 2024-03-19 10:41:30 +01:00
7bd881b9bc adds the degraded searches to the prometheus dashboard 2024-03-19 10:35:47 +01:00
6a0c399c2f rename the search_cutoff parameter to search_cutoff_ms 2024-03-19 10:35:47 +01:00
038c26c118 stop returning the degraded boolean when a search was cutoff 2024-03-19 10:35:47 +01:00
ad9192fbbf reduce the size of an integration test 2024-03-19 10:35:47 +01:00
b8cda6c300 fix the search cutoff and add a test 2024-03-19 10:35:47 +01:00
b72495eb58 fix the settings tests 2024-03-19 10:28:23 +01:00
d1db495119 add a settings for the search cutoff 2024-03-19 10:28:23 +01:00
4a467739cd implements a first version of the cutoff without settings 2024-03-19 10:28:21 +01:00
29e71eedc7 Add benchmarks 2024-03-18 18:31:28 +01:00
10d053cd2f Merge #4500
4500: Don't display dimensions as 0 when it is not set r=ManyTheFish a=dureuill

Fixes a regression in embedders where `dimensions: 0` was displayed when it hadn't been set for the `openAi` source.

Was breaking a PHP SDK integration test: cbaecb8c55/tests/Settings/EmbeddersTest.php (L28)

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-18 15:21:24 +00:00
a302e258bd Don't display dimensions as 0 when it is not set 2024-03-18 16:10:12 +01:00
29840473b4 Merge #4499
4499: Fix milli link in contributing doc r=curquiza a=mohsen-alizadeh

# Pull Request

## Related issue
Fixes #4498

## What does this PR do?
The milli link in CONTRIBUTING.md targeted the archived milli repository. It has to be changed to target the milli crate in the main repo.

## PR checklist
Please check if your PR fulfills the following requirements:
- [X] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [X] Have you read the contributing guidelines?
- [X] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Mohsen Alizadeh <mohsen@alizadeh.us>
Co-authored-by: Clémentine U. - curqui <clementine@meilisearch.com>
2024-03-18 14:39:26 +00:00
f4037c1a95 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2024-03-18 15:39:01 +01:00
13cc62728b Fix milli link in contributing doc 2024-03-17 19:29:42 -07:00
f84bcb09e1 Merge #4491
4491: chore: remove repetitive words r=curquiza a=shuangcui

# Pull Request

## Related issue
Fixes #<issue_number>

## What does this PR do?
- ...

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: shuangcui <fliter@qq.com>
2024-03-14 17:44:01 +00:00
5c95b5c933 chore: remove repetitive words
Signed-off-by: shuangcui <fliter@qq.com>
2024-03-14 21:28:55 +08:00
0b7bebeeb6 Merge #4483
4483: Workflows: Fix reason param when benches are triggered from a comment. r=irevoire a=dureuill



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-13 17:05:30 +00:00
d2f77e88bd Merge #4479
4479: Skip reindexing when modifying unknown faceted fields r=dureuill a=Kerollmops

This PR improves Meilisearch's decision to reindex when a faceted field is added to the settings but not a single document contains this field. It is effectively a waste of time to reindex documents just because the engine needs to know about a field that no document contains.

This is related to a conversation [we have with our biggest customer (internal link)](https://discord.com/channels/1006923006964154428/1101213808627830794/1217112918857089187). They have 170 million documents, so reindexing this amount would be problematic.

---

The image is available by using the following Docker command. You can see the advancement of the image's build [on the GitHub CI page](https://github.com/meilisearch/meilisearch/actions/runs/8251688778).

```
docker pull getmeili/meilisearch:prototype-no-reindex-unknown-fields-0
```

Here is the hand-made test that shows that when modifying unknown filterable attributes, here `lol`, it doesn't reindex. However, when modifying the known `genre` field, it does reindex. You can see all that by looking at the time spent processing the update.

```json
{
  "uid": 3,
  "indexUid": "movies",
  "status": "succeeded",
  "type": "settingsUpdate",
  "canceledBy": null,
  "details": {
    "filterableAttributes": [
      "genres"
    ]
  },
  "error": null,
  "duration": "PT9.237703S",
  "enqueuedAt": "2024-03-12T15:34:26.836083Z",
  "startedAt": "2024-03-12T15:34:26.836374Z",
  "finishedAt": "2024-03-12T15:34:36.074077Z"
},
{
  "uid": 2,
  "indexUid": "movies",
  "status": "succeeded",
  "type": "settingsUpdate",
  "canceledBy": null,
  "details": {
    "filterableAttributes": [
      "lol"
    ]
  },
  "error": null,
  "duration": "PT0.000751S",
  "enqueuedAt": "2024-03-12T15:33:53.563923Z",
  "startedAt": "2024-03-12T15:33:53.565259Z",
  "finishedAt": "2024-03-12T15:33:53.56601Z"
},
{
  "uid": 0,
  "indexUid": "movies",
  "status": "succeeded",
  "type": "documentAdditionOrUpdate",
  "canceledBy": null,
  "details": {
    "receivedDocuments": 31944,
    "indexedDocuments": 31944
  },
  "error": null,
  "duration": "PT3.120723S",
  "enqueuedAt": "2024-02-17T10:35:55.042864Z",
  "startedAt": "2024-02-17T10:35:55.043505Z",
  "finishedAt": "2024-02-17T10:35:58.164228Z"
}
```

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-03-13 16:23:32 +00:00
1d8c13f595 Merge #4487
4487: Update version for the next release (v1.7.1) in Cargo.toml r=Kerollmops a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>
2024-03-13 15:41:10 +00:00
7f3c495f5c Update version for the next release (v1.7.1) in Cargo.toml 2024-03-13 14:49:21 +00:00
abd954755d Merge #4476
4476: Make the `/facet-search` route use the `sortFacetValuesBy` setting r=irevoire a=Kerollmops

This PR fixes #4423 by ensuring that the `/facet-search` route uses the `sortFacetValuesBy` setting.

Note for the documentation team (to be moved in the tracking issue): Using the new `sortFacetValuesBy` setting can slow down the facet-search requests as Meilisearch iterates over the whole list of facet values and computes the count of documents on every entry. That is hard or even impossible to optimize correctly.

### TODO
 - [x] Create a custom HashMap wrapper for the facet `OrderBy` settings.
        This wrapper will return the `OrderBy` setting of the facet; if not defined, it will use the default `*` one, and if that's not there either (strange), it will fall back on the lexicographic one (see the sketch after this entry).
- [x] Create a `ValuesCollection` wrapper that implements the logic for the lexicographic and count order by.
  - [x] Use it when there is no search query.
  - [x] Use it when there is a search query with and without allowed typos.
  - [x] Do not change the original logic, only use a wrapper.
- [x] Add tests

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-03-13 14:36:14 +00:00
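A minimal sketch of the fallback logic described in the first TODO item (the `OrderByMap` name comes from a later commit in this log; the body here is illustrative):

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Default)]
enum OrderBy {
    #[default]
    Lexicographic,
    Count,
}

/// Sketch: per-facet `sortFacetValuesBy` settings with a `*` default and a
/// final lexicographic fallback, mirroring the wrapper described above.
struct OrderByMap(HashMap<String, OrderBy>);

impl OrderByMap {
    fn get(&self, facet: &str) -> OrderBy {
        self.0
            .get(facet)
            .or_else(|| self.0.get("*"))
            .copied()
            .unwrap_or_default()
    }
}
```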
f3fc2bd01f Address some issues with preallocations 2024-03-13 15:22:14 +01:00
6fa3872268 Workflows: Fix reason param when benches are triggered from a comment. 2024-03-13 13:46:43 +01:00
6c9823d7bb Add tests to sortFacetValuesBy count 2024-03-13 11:59:39 +01:00
e0dac5a22f Simplify the algorithm by using the new facet values collection wrapper 2024-03-13 11:31:34 +01:00
b918b55c6b Introduce a new facet value collection wrapper to simplify the usage 2024-03-13 11:31:34 +01:00
07b1d0edaf Merge #4475
4475: Allow running benchmarks without sending results to the dashboard r=irevoire a=dureuill

Adds a `--no-dashboard` option to avoid sending results to the dashboard.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-13 09:59:52 +00:00
306b25ad3a Move the searchForFacetValues struct into a dedicated module 2024-03-13 10:24:21 +01:00
9f7a4fbfeb Return the facets of a placeholder facet-search sorted by count 2024-03-13 10:09:01 +01:00
5ed7b6a0b2 Merge #4456
4456: Add Ollama as an embeddings provider r=dureuill a=jakobklemm

# Pull Request

## Related issue
[Related Discord Thread](https://discord.com/channels/1006923006964154428/1211977150316683305)

## What does this PR do?
- Adds Ollama as a provider of Embeddings besides HuggingFace and OpenAI under the name `ollama`
- Adds the environment variable `MEILI_OLLAMA_URL` to set the embeddings URL of an Ollama instance, with a default value of `http://localhost:11434/api/embeddings` if the variable is not set (see the sketch after this entry)
- Changes some of the structs and functions in `openai.rs` to be public so that they can be shared.
- Added more error variants for Ollama specific errors
- It uses the model `nomic-embed-text` by default, but any string value is allowed; however, it won't automatically check whether the model actually exists or is an embedding model

Tested against Ollama version `v0.1.27` and the `nomic-embed-text` model.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: Jakob Klemm <jakob@jeykey.net>
Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>
2024-03-13 08:48:47 +00:00
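A small sketch of the URL resolution described in the entry above (illustrative only; it assumes the `MEILI_OLLAMA_URL` variable and default value quoted in the description):

```rust
/// Sketch: resolve the Ollama embeddings URL from the environment, falling
/// back to the default mentioned in the PR description.
fn ollama_embeddings_url() -> String {
    std::env::var("MEILI_OLLAMA_URL")
        .unwrap_or_else(|_| "http://localhost:11434/api/embeddings".to_string())
}
```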
ae67d5eef0 Update milli/src/vector/error.rs
Fix Meilisearch capitalization
2024-03-13 09:45:04 +01:00
88bc9556a9 Add Ollama dimension inference and add clearer errors
Instead of the user manually specifying the model dimensions, they are now determined automatically:
just like with hf.rs, the word "test" gets embedded to determine the dimensions of the output (a minimal sketch follows this entry).
Add a dedicated error type for when the model doesn't exist (don't automatically pull it, though) and attribute that error to the user.
2024-03-12 19:59:11 +01:00
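A minimal sketch of the inference described in the commit message above (`embed_one` is a placeholder for the embedder's request call, not an actual API):

```rust
/// Sketch: infer the embedding dimensions by embedding a probe word, the way
/// hf.rs does, and using the length of the returned vector.
fn infer_dimensions(
    embed_one: impl Fn(&str) -> Result<Vec<f32>, String>,
) -> Result<usize, String> {
    let probe = embed_one("test")?;
    Ok(probe.len())
}
```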
ca4876fd10 Do not reindex when modifying unknown faceted field 2024-03-12 16:18:58 +01:00
d3a95ea2f6 Introduce a new OrderByMap struct to simplify the sort by usage 2024-03-12 13:56:56 +01:00
88d27949cd Add documentation for benchmarks 2024-03-12 10:56:16 +01:00
69c118ef76 Extract the facet order before extracting the facets values 2024-03-12 10:35:39 +01:00
d44e20aa89 Merge #4474
4474: Update cargo version r=irevoire a=curquiza

Fixes #4417

Co-authored-by: curquiza <clementine@meilisearch.com>
2024-03-12 09:27:22 +00:00
7b670a4afa Allow dry runs for benchmarks where reports are generated but not sent to the dashboard 2024-03-12 10:26:13 +01:00
fde209b7b6 Update cargo version 2024-03-12 10:20:07 +01:00
904b82a61d Merge #4473
4473: Bring back changes from v1.7.0 to main r=curquiza a=curquiza



Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
2024-03-11 15:02:47 +00:00
8ec3e30d2b Merge branch 'main' into tmp-release-v1.7.0 2024-03-11 15:39:51 +01:00
0a59cb9734 Merge #4463
4463: Add tests when the field limit is reached r=Kerollmops a=irevoire

# Pull Request

## Related issue
Related to https://github.com/meilisearch/meilisearch/discussions/4429#discussioncomment-8689101

This user found out that the error message we’re supposed to return when the maximum number of attributes is reached is _not_ returned in some cases

## What does this PR do?
- This PR adds four tests around the maximum number of attributes:
  1. Add a document with u16::MAX + 1 fields - Meilisearch panics
  2. Add two documents which together add up to u16::MAX + 1 fields - Meilisearch returns the expected error
  3. Add a document with u16::MAX + 1 **nested fields** - No error message but the document isn’t indexed
  4. Add two documents which together add up to u16::MAX + 1 nested fields - Meilisearch doesn’t return any error but doesn’t index the document

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-07 10:36:54 +00:00
f053c280e1 add tests when the field limit is reached 2024-03-06 18:42:41 +01:00
ee3076d5ba Merge #4462
4462: Divide threshold by ten r=dureuill a=ManyTheFish

Change the facet incremental vs bulk indexing threshold to better fit our users' needs; it might be changed in the future if we have more insights


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-03-06 13:05:38 +00:00
ab1224bfa7 Merge #4458
4458: Replace logging timer by spans r=Kerollmops a=dureuill

- Remove the logging timer dependency.
- Replace its last uses in search with spans

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-05 16:43:23 +00:00
eefc1c421e Merge #4459
4459: Put a bound on OpenAI timeout r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4460 

## What does this PR do?
- Makes sure that the timeout of the OpenAI embedder is limited to at most 1 min, rather than the prior 15+ min (a minimal sketch of capping the backoff follows this entry)



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-05 15:18:51 +00:00
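A minimal sketch of capping a backoff-driven timeout, assuming an exponential retry delay (illustrative; the actual cap lives in milli's OpenAI embedder):

```rust
use std::time::Duration;

/// Sketch: cap an exponential backoff delay so a single embedding request can
/// never wait more than one minute, as described above.
fn capped_backoff(attempt: u32) -> Duration {
    let exponential = Duration::from_secs(1u64 << attempt.min(10));
    exponential.min(Duration::from_secs(60))
}
```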
4d42a7af7c Merge #4445
4445: Add subcommand to run benchmarks r=irevoire a=dureuill

# Pull Request

## Related issue
Not user-facing, no issue

## What does this PR do?
- Adds a new `cargo xtask bench` subcommand that can run one or multiple workload files and report the results to a server
- A workload file is a JSON file with a specific schema
- Refactor our use of the `vergen` crate:
  - update to the beta `vergen-git2` crate
  - VERGEN_GIT_SEMVER_LIGHTWEIGHT => VERGEN_GIT_DESCRIBE
  - factor logic in a single `build-info` crate that is used both by meilisearch and xtask (prevents vergen variables from overriding themselves)
  - checked that defining the variables by hand when no git repo is available (docker build case) still works.
- Add CI to run `cargo xtask bench`

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-05 14:03:57 +00:00
7408db2a46 Meilisearch: fix date formatting 2024-03-05 14:56:48 +01:00
663629a9d6 Remove unused build dependency from xtask
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-05 14:45:06 +01:00
15c38dca78 Output RFC 3339 dates where we can
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-05 14:44:48 +01:00
7ee20b0895 Refactor xtask bench 2024-03-05 14:42:06 +01:00
0c216048b5 Cap timeout duration 2024-03-05 12:19:25 +01:00
36d17110d8 openai: Handle BAD_GETAWAY, be more resilient to failure 2024-03-05 12:18:54 +01:00
bdd428c22e Merge #4450
4450: Add the content type in the webhook + improve the test r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4436

## What does this PR do?
- Specify the content type of the webhook
- Ensure it’s the case in the test

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-05 10:36:53 +00:00
b130917933 add the content type in the webhook + improve the test 2024-03-05 11:22:29 +01:00
25f64ce7df Replace logging timer by spans 2024-03-05 11:05:42 +01:00
adcd848809 CI: Add bench workflows 2024-03-05 11:02:05 +01:00
84ae0cd456 Merge #4457
4457: Bump mio from 0.8.9 to 0.8.11 r=Kerollmops a=dependabot[bot]

Bumps [mio](https://github.com/tokio-rs/mio) from 0.8.9 to 0.8.11.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/tokio-rs/mio/blob/master/CHANGELOG.md">mio's changelog</a>.</em></p>
<blockquote>
<h1>0.8.11</h1>
<ul>
<li>Fix receiving IOCP events after deregistering a Windows named pipe
(<a href="https://redirect.github.com/tokio-rs/mio/pull/1760">tokio-rs/mio#1760</a>, backport pr:
<a href="https://redirect.github.com/tokio-rs/mio/pull/1761">tokio-rs/mio#1761</a>).</li>
</ul>
<h1>0.8.10</h1>
<h2>Added</h2>
<ul>
<li>Solaris support
(<a href="https://redirect.github.com/tokio-rs/mio/pull/1724">tokio-rs/mio#1724</a>).</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="0328bdef90"><code>0328bde</code></a> Release v0.8.11</li>
<li><a href="7084498512"><code>7084498</code></a> Fix warnings</li>
<li><a href="90d4fe00df"><code>90d4fe0</code></a> named-pipes: fix receiving IOCP events after deregister</li>
<li><a href="c710a307f8"><code>c710a30</code></a> Add v0.8.x to the CI</li>
<li><a href="c29e21c244"><code>c29e21c</code></a> Release v0.8.10</li>
<li><a href="f6a20da1c8"><code>f6a20da</code></a> Add Solaris operating system support (<a href="https://redirect.github.com/tokio-rs/mio/issues/1724">#1724</a>)</li>
<li>See full diff in <a href="https://github.com/tokio-rs/mio/compare/v0.8.9...v0.8.11">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=mio&package-manager=cargo&previous-version=0.8.9&new-version=0.8.11)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/meilisearch/meilisearch/network/alerts).

</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-05 09:35:17 +00:00
eee46b7537 Add first workloads 2024-03-05 10:13:11 +01:00
55f60a3638 Update .gitignore
- Ignore `/bench` directory for git purposes
- Ignore benchmark DB
2024-03-05 10:12:52 +01:00
c608b3f9b5 Factor vergen stuff to a build-info crate 2024-03-05 10:11:43 +01:00
86ce843f3d Add cargo xtask bench 2024-03-05 10:11:43 +01:00
b11df7ec34 Meilisearch: fix some wrong spans 2024-03-05 10:11:43 +01:00
6862caef64 Span Stats compute self-time 2024-03-05 10:11:43 +01:00
f75c7ac979 Compile xtask in --release 2024-03-05 10:11:43 +01:00
f07069094b Bump mio from 0.8.9 to 0.8.11
Bumps [mio](https://github.com/tokio-rs/mio) from 0.8.9 to 0.8.11.
- [Release notes](https://github.com/tokio-rs/mio/releases)
- [Changelog](https://github.com/tokio-rs/mio/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/mio/compare/v0.8.9...v0.8.11)

---
updated-dependencies:
- dependency-name: mio
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-04 22:03:25 +00:00
eada6de261 Divide threshold by ten 2024-03-04 18:02:54 +01:00
d3004d8040 Implemented Ollama as an embeddings provider
Initial prototype of Ollama embeddings actually working; error handling / retries still missing.

Allow model to be any String and require dimensions parameter

Fixed rustfmt formatting issues

There were some formatting issues in the initial PR and this should now make the changes comply with the Rust style guidelines

Because I accidentally didn't follow the style guide for commits in my commit messages, I squashed them into one to comply
2024-03-04 15:09:43 +01:00
f4a6261dea Merge #4453
4453: Don't test on nightly r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4441 better 😅 

## What does this PR do?
- No longer run tests on nightly

The motivation for this change is that we are now updating Rust at fixed points in time, and so no longer need nightly runs to ensure that a change won't get into stable and break our build at the worst possible moment.


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-29 14:41:59 +00:00
9806a3e5f6 Don't test on nightly 2024-02-29 14:24:50 +01:00
a96b45dda7 Merge #4451
4451: Fix nightly build r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4441 

## What does this PR do?
- Change imports following https://github.com/rust-lang/rust/pull/117772

## Note

This one is going to be a bit annoying until the lint stabilizes:

- We only get the warning on nightly, so we will discover them when it runs in the CI that uses the nightly compiler (not on regular PRs)
- There's the case of `TryInto`/`TryFrom` traits. They have been added to the prelude in Rust edition 2021, so it means that `use`ing them is a warning on nightly for 2021 edition crates (most crates), but not `use`ing them is an error anywhere for 2018 Rust edition crates, such as `milli`

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-29 07:20:22 +00:00
452a343a2b Fix imports 2024-02-28 18:09:40 +01:00
b87485e80d Merge #4433
4433: Enhance facet incremental r=Kerollmops a=ManyTheFish

# Pull Request

## Related issue
Fixes #4367
Fixes #4409

## What does this PR do?

- Add a test reproducing #4409
- Fix #4409 by removing a document from a level only if it is no longer present in any of the linked sub-level nodes
- Optimize facet Incremental indexing by creating or deleting a complete level once per field id instead of for each facet value
- Optimize facet Incremental indexing by doing the additions and the deletions in the same process instead of doing them separately


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-02-28 15:28:46 +00:00
147a67dc82 Merge #4446
4446: Do not omit vectors when importing a dump r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4447 

## What does this PR do?
- Correctly populate the maps of embedders before starting the indexing operations, while importing a dump


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-27 09:11:00 +00:00
716ffc07ee Build the embedders when importing a dump 2024-02-26 22:15:57 +01:00
b005eb3289 Merge #4435
4435: Make update file deletion atomic r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4432
Fixes https://github.com/meilisearch/meilisearch/issues/4438 by adding the logs the user asked for

## What does this PR do?
- Adds a bunch of logs to help debug this kind of issue in the future
- Delete the update files AFTER committing the update in the `index-scheduler` (thus, if a restart happens, we are able to re-process the batch successfully)
- Multi-thread the deletion of all update files.


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-26 17:54:40 +00:00
9e664d87eb Merge #4443
4443: Add GPU analytics r=dureuill a=dureuill

# Pull Request

## Related issue

Adds analytics indicating whether Meilisearch  was compiled with the `milli/cuda` feature.

Cc `@macraig` 

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-26 17:13:45 +00:00
6dcb5219a0 Merge #4442
4442: Send custom task r=ManyTheFish a=irevoire

This PR has already been merged on main but was supposed to be merged on `release-v1.7.0`, so we need to merge it a second time; sorry 😓 

### This PR implements the necessary parameters for the High Availability

Introduce a new CLI flag called `--experimental-replication-parameters` that changes a few behaviors in the engine:
- [The auto-deletion of tasks is disabled](https://specs.meilisearch.com/specifications/text/0060-tasks-api.html#_2-technical-details)
- Upon registering a task, you can choose its task ID by sending a new header: `TaskId: 456645`. It must be a valid number, which must be greater than the last task id ever seen (a sketch of reading these headers follows this entry).
- Add the ability to « dry-register » a task. That means Meilisearch will answer you with a valid task ID as if everything went well, but won't actually write anything in the database. To do that, you need to use the `DryRun: true` header.
- Specification’s here: https://github.com/meilisearch/specifications/pull/266

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-26 15:20:16 +00:00
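A hedged sketch of how the two headers described above could be read in an actix-web handler (names and error handling are illustrative, not the actual Meilisearch extractors):

```rust
use actix_web::HttpRequest;

/// Sketch: extract the optional `TaskId` and `DryRun` headers described above.
/// Returns `(custom_task_id, dry_run)` or an error message.
fn replication_params(
    req: &HttpRequest,
    last_seen_task_id: u64,
) -> Result<(Option<u64>, bool), String> {
    let task_id = match req.headers().get("TaskId") {
        Some(value) => {
            let id: u64 = value
                .to_str()
                .map_err(|_| "TaskId must be valid UTF-8".to_string())?
                .parse()
                .map_err(|_| "TaskId must be a valid number".to_string())?;
            if id <= last_seen_task_id {
                return Err("TaskId must be greater than the last task id ever seen".into());
            }
            Some(id)
        }
        None => None,
    };
    let dry_run = req
        .headers()
        .get("DryRun")
        .and_then(|v| v.to_str().ok())
        .map(|v| v.eq_ignore_ascii_case("true"))
        .unwrap_or(false);
    Ok((task_id, dry_run))
}
```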
5e83bac448 Fix PR comments 2024-02-26 15:40:15 +01:00
0562818c2a fix and remove the file-store hack of /dev/null 2024-02-26 13:59:41 +01:00
a478392b7a create a test with the dry-run parameter enabled 2024-02-26 13:59:41 +01:00
bbf3fb88ca rename the cli parameter 2024-02-26 13:59:40 +01:00
60510e037b update the discussion link 2024-02-26 13:58:04 +01:00
36c27a18a1 implement the dry run ha parameter 2024-02-26 13:58:04 +01:00
1eb1c043b5 disable the auto deletion of tasks when the ha mode is enabled 2024-02-26 13:58:04 +01:00
507739bd98 add an experimental cli parameter to allow specifying your task id 2024-02-26 13:58:03 +01:00
eb25b07390 let you specify your task id 2024-02-26 13:56:31 +01:00
938149f814 Merge #4042
4042: Implements the new replication parameters r=ManyTheFish a=irevoire

### This PR implements the necessary parameters for the High Availability

- [ ] Update the spec

Introduce a new CLI flag called `--experimental-replication-parameters` that changes a few behaviors in the engine:
- [The auto-deletion of tasks is disabled](https://specs.meilisearch.com/specifications/text/0060-tasks-api.html#_2-technical-details)
- Upon registering a task, you can choose its task ID by sending a new header: `TaskId: 456645`. It must be a valid number, which must be greater than the last task id ever seen.
- Add the ability to « dry-register » a task. That means Meilisearch will answer you with a valid task ID as if everything went well, but won't actually write anything in the database. To do that, you need to use the `DryRun: true` header.

----

Old prototype `prototype-custom-task-id-0`:
-  Adds the capability to specify your own task ID via the `TaskId` http header
- Make the task IDs a u64 instead of a u32


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-26 11:37:34 +00:00
066a7a3cde takes only one read transaction per thread 2024-02-26 10:43:04 +01:00
55796406c5 Add GPU analytics 2024-02-26 10:41:47 +01:00
eb90f0b4fb fix and remove the file-store hack of /dev/null 2024-02-26 10:19:07 +01:00
c2e2003a80 create a test with the dry-run parameter enabled 2024-02-22 15:51:47 +01:00
91cdd502f8 When processing tasks, make the update file deletion atomic 2024-02-22 14:56:22 +01:00
a493a50825 Fix clippy 2024-02-22 14:53:33 +01:00
9d1f489a37 Fix facet incremental indexing 2024-02-21 18:42:16 +01:00
693ba8dd15 rename the cli parameter 2024-02-21 14:33:40 +01:00
e1a3eed1eb update the discussion link 2024-02-21 12:30:28 +01:00
05ae291989 implement the dry run ha parameter 2024-02-21 11:21:26 +01:00
6ba9994916 disable the auto deletion of tasks when the ha mode is enabled 2024-02-20 12:23:39 +01:00
01ae46dd80 add an experimental cli parameter to allow specifying your task id 2024-02-20 11:24:44 +01:00
12f5389ba7 Merge #4416
4416: Create automation when creating Milestone to create update-version issue r=curquiza a=curquiza

Following our discussion `@irevoire` -> we are missing a reminder to update the cargo version BEFORE rc0

Issue template [here](https://github.com/meilisearch/engine-team/blob/main/issue-templates/update-version-issue.md)

Co-authored-by: curquiza <clementine@meilisearch.com>
2024-02-20 08:47:29 +00:00
9ee4f55e6c let you specify your task id 2024-02-19 14:29:33 +01:00
865b415b3f Add test reproducing bug 2024-02-15 16:00:48 +01:00
5ee6aaddc4 Merge #4418
4418: Output logs to stderr r=dureuill a=irevoire

Output the logs to `stderr` instead of `stdout`. Logging to `stdout` was introduced in `v1.7.0-rc.0` and is a bug; logs should always be output to `stderr`.

Fix #4419

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-15 14:31:37 +00:00
4148d391b8 move logs to stderr 2024-02-15 15:24:16 +01:00
88c6165e20 Merge #4410
4410: Implement the experimental log mode cli flag and log level updates at runtime r=dureuill a=irevoire

# Pull Request
This PR fixes two issues at once because they’re highly correlated in the codebase.

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4415
Fixes https://github.com/meilisearch/meilisearch/issues/4413

## What does this PR do?
- It makes the fmt logger configurable to output json or human-readable logs (like we already do today)
- It moves the fmt logger under a `reload` layer so we can update its targets at runtime
- Add the possibility to stream logs in the json mode
- Adds an analytics for the new CLI flag

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-15 10:01:06 +00:00
d097431113 Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-15 10:58:43 +01:00
1f8af81ba9 update the log mode discussion link 2024-02-15 10:32:48 +01:00
5d3bad4120 Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-15 10:31:23 +01:00
d34692e30b Merge #4365
4365: Update charabia r=dureuill a=ManyTheFish

Update Charabia v0.8.7,

- Add Vietnamese Normalization (Ð and Đ into d)

Fixes #4357

Charabia versions:
- https://github.com/meilisearch/charabia/releases/tag/v0.8.6
- https://github.com/meilisearch/charabia/releases/tag/v0.8.7

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-02-14 16:57:25 +00:00
024de0dcf8 Create automation when creating Milestone to create update-version issue 2024-02-14 17:36:47 +01:00
a081da0d90 add support for the json format in the stream route 2024-02-14 15:34:39 +01:00
78e04520fc Update charabia version 2024-02-14 15:16:16 +01:00
72c1674a31 Merge #4350
4350: Make several indexing optimizations r=Kerollmops a=ManyTheFish

# Summary

Implement several enhancements to reduce the indexing time.

# Steps

- Compute the indexing chunk size dynamically based on the available threads and the data size
- Remove the merging step before the writing step and merge at the writing time
- Remove append function
- Make Facet search indexing incremental

# Running Indexing process

## `main`
Each type of data is written after a merging phase:
![Capture d’écran 2024-01-23 à 10 18 08](https://github.com/meilisearch/meilisearch/assets/6482087/6203c3ce-407c-46b4-8b83-04282da1bb16)

> Highlighted parts are the writings

## `remove-merging-phase-from-indexing`
When the extraction of a chunk is finished, the data is written:
![Capture d’écran 2024-01-23 à 10 18 18](https://github.com/meilisearch/meilisearch/assets/6482087/ab1307b4-d0a9-42ac-abbb-fdeb27ddf0d4)

> Highlighted parts are the writings

## Related

This PR removes the appending writes on several indexing parts, which may fix https://github.com/meilisearch/meilisearch/issues/4300. However, not all of the appending writes are removed. There are 2 remaining calls that could still trigger this bug:
- When [putting embedders in the settings](b6fc181993/milli/src/update/settings.rs (L996))
- when [bulk indexing the facets](b6fc181993/milli/src/update/facet/bulk.rs (L150))


Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2024-02-14 14:12:48 +00:00
03bb6372af Change is_batchable_with by mergeable_with 2024-02-14 11:50:22 +01:00
3beda8833d Fix and add logs 2024-02-14 11:46:30 +01:00
3b6544db6d Implement the experimental log mode cli flag 2024-02-13 18:09:15 +01:00
55e942cd45 buggy 2024-02-13 15:26:30 +01:00
48026aa75c fix PR comments 2024-02-13 15:19:01 +01:00
e5e811e2c9 Update milli/src/update/index_documents/extract/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:21 +01:00
55de96f74e Update milli/src/update/facet/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:10 +01:00
82b43e9a7f Merge #4400
4400: Upgrade rustls to 0.21.10 and ring to 0.17 r=curquiza a=hack3ric

# Pull Request

## What does this PR do?
- Upgrade dependencies that uses ring 0.16 so that they rely on ring 0.17 instead
- Use rustls 0.21 for actix-{http,tls}, since newer versions of rustls uses ring 0.17
- Fix some trivial breaking API changes caused by above

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Eric Long <i@hack3r.moe>
2024-02-12 13:17:40 +00:00
15dafde21d Merge #4401
4401: Update version for the next release (v1.7.0) in Cargo.toml r=irevoire a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: irevoire <irevoire@users.noreply.github.com>
2024-02-12 10:17:10 +00:00
290f6d15e7 Update version for the next release (v1.7.0) in Cargo.toml 2024-02-12 10:15:00 +00:00
39c83cb3d9 fix clippy 2024-02-12 09:12:54 +01:00
7efb1cae11 yield in loop when the channel is not disconnected 2024-02-12 09:12:54 +01:00
7877788510 fix logs 2024-02-12 09:12:54 +01:00
c02d585f5b Upgrade rustls to 0.21.10 and ring to 0.17 2024-02-12 14:32:29 +08:00
be1b054b05 Compute chunk size based on the input data size and the number of indexing threads 2024-02-08 17:28:37 +01:00
023c2d755f Merge #4391
4391: Tracing r=dureuill a=irevoire

# Pull Request

- [ ] Hide the parameters of the process batch
- [x] Make actix-web trace every call on every route
- [x] Remove all `env_logger`/`logs` dependencies
- [x] Be able to enable or disable the memory measurement using the `/logs` route parameters

See the following product discussion: https://github.com/orgs/meilisearch/discussions/721

Supersedes https://github.com/meilisearch/meilisearch/pull/4338

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4317

## What does this PR do?

Update the format of the logs from:
```
[2024-02-06T14:54:11Z INFO  actix_server::builder] starting 10 workers
```

to

```
2024-02-06T13:58:14.710803Z  INFO actix_server::builder: 200: starting 10 workers
```

First, run meilisearch with the route enabled via the feature flag:
- `cargo run --experimental-enable-logs-route`
- Or at runtime by sending the following payload:
```
curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json'  \
--data-binary '{
    "logsRoute": true
  }'
```

Then gather data from meilisearch by calling for example:
```
curl \
	-X POST http://localhost:7700/logs \
	-H 'Content-Type: application/json' \
	--data-binary '{
	    "mode": "fmt",
            "target": "milli=trace"
    }'
```

Once your operation is over, tell meilisearch to stop the route:
```
curl \
	-X DELETE http://localhost:7700/logs
```

----

If you're profiling code, you will be interested in the next command, which converts the output of the route to a format that the Firefox profiler can understand.

```bash
cargo run --release --bin trace-to-firefox -- 2024-01-17_17:07:55-indexing-trace.json
```

Then go to https://profiler.firefox.com and load it.
Note that we can also share the profiles using the https://share.firefox.dev website.


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-08 14:16:56 +00:00
407ad753ed rust fmt 2024-02-08 15:11:42 +01:00
285aa15d2f make the mode camelCase instead of lowercase 2024-02-08 15:04:06 +01:00
bf43a3f60a fix typo 2024-02-08 15:04:06 +01:00
2c88131bb1 rename the fmt mode to human 2024-02-08 15:04:06 +01:00
35aa9d5904 fix an error message 2024-02-08 15:04:06 +01:00
cfb3e6b51f update the actix-web trace 2024-02-08 15:04:06 +01:00
1502382316 use debug instead of debug_span 2024-02-08 15:04:06 +01:00
ef994d84d0 Change error messages and fix tests 2024-02-08 15:04:06 +01:00
1b74010e9e Remove "with_line_numbers" 2024-02-08 15:04:06 +01:00
08af0e690c Structures a bunch of logs 2024-02-08 15:04:06 +01:00
d71b77f18b Add panic hook to log panics 2024-02-08 15:04:06 +01:00
c443ed7e3f delete inner .gitignore 2024-02-08 15:04:06 +01:00
db722d201a Write entries into database downgraded to trace level 2024-02-08 15:04:05 +01:00
91eb67e981 logs route: make memory profiling toggling usable 2024-02-08 15:04:05 +01:00
902d700a24 Tracing trace: toggle the profiling of memory at runtime 2024-02-08 15:04:05 +01:00
f70a615ed9 update the github discussion links 2024-02-08 15:04:05 +01:00
7ff722b72e get rids of the log dependencies everywhere 2024-02-08 15:04:05 +01:00
bcf7909bba add a profile_memory parameter disabled by default 2024-02-08 15:04:05 +01:00
ceb211c515 move the /logs route to the /logs/stream route 2024-02-08 15:04:05 +01:00
f3c34d5b8c Simplify MemoryStats fetching 2024-02-08 15:04:05 +01:00
4de2db6786 add back the actix-web logs 2024-02-08 15:04:05 +01:00
661baa716b logs route profile mode: don't barf bytes if the buffer is not empty 2024-02-08 15:04:05 +01:00
02dcaf07db Replace the procfs by libproc 2024-02-08 15:04:05 +01:00
d78ada07b5 spanstats: change field names 2024-02-08 15:04:05 +01:00
bc097d90cb tracing-trace: Spanstats deserializable + public fields 2024-02-08 15:04:05 +01:00
b393823f36 Replace stats_alloc with procfs 2024-02-08 15:04:05 +01:00
e773dfa9ba get rids of log in milli and add logs for the bucket sort 2024-02-08 15:04:05 +01:00
f158e96fe7 fix the auth 2024-02-08 15:04:05 +01:00
e23ec4886d fix the tests and add tests on the experimental features 2024-02-08 15:04:03 +01:00
7793ba67a4 hide the route logs behind a feature flag 2024-02-08 15:03:33 +01:00
80774148fd handle and tests errors 2024-02-08 15:03:33 +01:00
bf5cea8b10 add a test 2024-02-08 15:03:33 +01:00
38e1c40f38 meilisearch: logs route disconnects in profile mode 2024-02-08 15:03:33 +01:00
afc0585c1c meilisearch: don't spawn a report everytime Meilisearch starts 2024-02-08 15:03:33 +01:00
0e7a411d4d tracing-trace: introduce TraceWriter, trace now only exposes the channel 2024-02-08 15:03:33 +01:00
0f327f2821 tracing-trace: implement Error on Error 2024-02-08 15:03:33 +01:00
77254765e8 get rids of env logger and fix the tests 2024-02-08 15:03:33 +01:00
ce6e6ec2c5 stops profiling in a file by default 2024-02-08 15:03:32 +01:00
91a8f74763 Add cancel log route 2024-02-08 15:03:32 +01:00
abaa72e2bf start handling reloads with profiling 2024-02-08 15:03:32 +01:00
3c3a258a22 start exposing the profiling layer 2024-02-08 15:03:32 +01:00
73e66d5a97 Add dummy log when calling tasks 2024-02-08 15:03:32 +01:00
b8da117b9c Simplify stream implementation 2024-02-08 15:03:32 +01:00
5e52107474 better than before??? 2024-02-08 15:03:32 +01:00
bcf1c4dae5 make it compile and runtime error 2024-02-08 15:03:32 +01:00
50f84d43f5 init commit 2024-02-08 15:03:32 +01:00
f76cc0806e WIP: first draft at introducing a new log route 2024-02-08 15:03:32 +01:00
2f1abd2c03 nelson is not used anymore 2024-02-08 15:03:32 +01:00
dedc91e2cf use json lines 2024-02-08 15:03:32 +01:00
a61d8c59ff Add span stats processor 2024-02-08 15:03:32 +01:00
6e23040464 Use with tokio channel in Meilisearch 2024-02-08 15:03:32 +01:00
8febbf64ce Switch to tokio channel 2024-02-08 15:03:32 +01:00
b141c82a04 Support Events in trace layer 2024-02-08 15:03:32 +01:00
cc79cd0b04 Switch to a single view indicating current usage 2024-02-08 15:03:32 +01:00
256538ccb9 Refactor memory handling and add markers 2024-02-08 15:03:31 +01:00
ca8990394e Remove the stats_alloc from the default features 2024-02-08 15:03:31 +01:00
83fb2949c3 Give the allocator to the tracer when necessary 2024-02-08 15:03:31 +01:00
6cf703387d Format the bytes as human readable bytes
Uses the same `byte_unit` version as `meilisearch`
2024-02-08 15:03:31 +01:00
771861599b Logging the memory usage over time 2024-02-08 15:03:31 +01:00
7e47cea0c4 Add tracing to Meilisearch 2024-02-08 15:03:31 +01:00
5d7061682e Add tracing to milli 2024-02-08 15:03:31 +01:00
02e6c8a440 Add tracing to index-scheduler 2024-02-08 15:03:31 +01:00
89401d097b Add tracing-trace 2024-02-08 15:03:30 +01:00
72ebac1fbb Merge #4388
4388: Cap the maximum memory of the grenad sorters r=curquiza a=Kerollmops

This PR clamps the memory usage of the grenad sorters to a reasonable maximum. Grenad sorters are opened on multiple threads at a time. This can result in higher memory usage than expected, even though the process shouldn't consume more than the available memory (a tiny sketch of the clamping idea follows this entry).

Fixes #4152.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-08 13:19:28 +00:00
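A tiny sketch of the clamping idea (the constant and helper are hypothetical; the 500 MiB figure echoes the "Try with 500MiB" commit further down this log):

```rust
/// Sketch: split the indexing memory budget across the sorters opened on each
/// thread and clamp the per-sorter share to a maximum, so that many concurrent
/// grenad sorters cannot exceed the available memory.
fn sorter_memory_budget(total_budget: usize, threads: usize) -> usize {
    const MAX_SORTER_MEMORY: usize = 500 * 1024 * 1024; // hypothetical 500 MiB cap
    (total_budget / threads.max(1)).min(MAX_SORTER_MEMORY)
}
```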
a616a1d37b Merge #4389
4389: Stabilize scoreDetails r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4359

## What does this PR do?

### User standpoint

- Users no longer need to enable the `scoreDetails` experimental feature to use `showRankingScoreDetails` in search queries.
- ⚠️ **Breaking change**: sending an object containing the key `"scoreDetails"` to the `/experimental-features` route is now an error. However, importing a dump of a database where that feature was enabled completes successfully.

### Implementation standpoint
- remove `scoreDetails` from the experimental features
- remove check on the experimental feature `scoreDetails` before accepting `showRankingScoreDetails`
- remove `scoreDetails` from the accepted fields in the `/experimental-features` route
- fix tests accordingly

## Manual tests

1. exported a dump with the `scoreDetails` feature enabled on `main`
    - tried to import the dump after the changes in this PR
    - the dump imported successfully
2. tried to make a search with `showRankingScoreDetails: true`
    - the ranking score details are displayed
    - an automated test case also exists and passes
3. tried to enable the `scoreDetails` in `/experimental-features`
    - got the error message:
      ```
       Unknown field `scoreDetails`: expected one of `vectorStore`, `metrics`, `exportPuffinReports`
      ```

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-08 10:40:00 +00:00
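To make the user-facing change above concrete, here is a minimal sketch (plain `serde_json`, not an official client snippet) of a search payload that now works without enabling any experimental feature, and of the `/experimental-features` payload that is now rejected:

```rust
use serde_json::json;

fn main() {
    // Search request: `showRankingScoreDetails` no longer requires the
    // `scoreDetails` experimental feature to be enabled first.
    let search_request = json!({
        "q": "batman",
        "showRankingScoreDetails": true,
    });
    println!("{search_request}");

    // Sending `scoreDetails` to `PATCH /experimental-features` now fails with
    // the "Unknown field `scoreDetails`" error quoted in the manual tests above.
    let rejected_payload = json!({ "scoreDetails": true });
    println!("{rejected_payload}");
}
```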
3e120619fa Merge #4375
4375: Feat: add new OpenAI models and ability to override dimensions r=dureuill a=Gosti

# Pull Request

Fixes #4394 

## Related discussion
https://github.com/orgs/meilisearch/discussions/677#discussioncomment-8306384

## What does this PR do?
- Add text-embedding-3-small
- Add text-embedding-3-large
- Add optional dimensions parameter for both new models


## Note
As the dimensions option is not available for text-embedding-ada-002, I've added a manual check to prevent it from being set, but I feel it could be implemented in more idiomatic Rust.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Gosti <gostitsog@gmail.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-07 16:20:15 +00:00
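A minimal sketch of the validation rules described in this PR and the follow-up commits below (struct and function names, and the default dimensions, are assumptions for illustration, not the actual embedder settings code):

```rust
struct OpenAiModel {
    name: &'static str,
    default_dimensions: usize,
    supports_override: bool, // e.g. text-embedding-ada-002 does not support `dimensions`
}

fn validate_dimensions(model: &OpenAiModel, dimensions: Option<usize>) -> Result<usize, String> {
    match dimensions {
        // No override: fall back to the model's default dimensions.
        None => Ok(model.default_dimensions),
        // Dimensions of 0 are never accepted.
        Some(0) => Err(format!("`dimensions` cannot be 0 for `{}`", model.name)),
        // Models that don't support the option only accept their default value.
        Some(d) if !model.supports_override && d != model.default_dimensions => Err(format!(
            "`{}` does not support overriding `dimensions`",
            model.name
        )),
        // Overrides larger than the model's default are rejected.
        Some(d) if d > model.default_dimensions => Err(format!(
            "`dimensions` cannot exceed {} for `{}`",
            model.default_dimensions, model.name
        )),
        Some(d) => Ok(d),
    }
}

fn main() {
    let small = OpenAiModel { name: "text-embedding-3-small", default_dimensions: 1536, supports_override: true };
    let ada = OpenAiModel { name: "text-embedding-ada-002", default_dimensions: 1536, supports_override: false };

    assert_eq!(validate_dimensions(&small, Some(512)), Ok(512));
    assert!(validate_dimensions(&small, Some(0)).is_err());
    assert!(validate_dimensions(&ada, Some(512)).is_err());
}
```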
a1caac9bfb Correct distribution shifts for new models 2024-02-07 15:09:16 +01:00
88d03c56ab Don't accept dimensions of 0 (ever) or dimensions greater than the default dimensions of the model 2024-02-07 11:52:09 +01:00
32ee05ccef Fix default dimensions for models 2024-02-07 11:52:09 +01:00
74c180267e pass dimensions only when defined 2024-02-07 11:52:08 +01:00
517f5332d6 Allow actually passing dimensions for OpenAI source
-> make sure the settings change is rejected or the settings task fails when the specified model doesn't support
overriding `dimensions` and the passed `dimensions` differs from the model's default dimensions.
2024-02-07 11:51:44 +01:00
9ac5750096 Retrieve the overridden dimensions from the configuration when fetching settings 2024-02-07 11:51:44 +01:00
7ae4013478 Make sure the overridden dimensions are always used when embedding 2024-02-07 11:51:44 +01:00
fb705116a6 feat: add new models and ability to override dimensions 2024-02-07 11:51:42 +01:00
053306c0e7 Try with 500MiB 2024-02-07 11:24:43 +01:00
84235a63df Merge #4360
4360: fix readme broken links r=curquiza a=Elliot67

# Pull Request

## What does this PR do?
- fix some links in the readme

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: Elliot Lintz <45725915+Elliot67@users.noreply.github.com>
Co-authored-by: gui machiavelli <hey@guimachiavelli.com>
2024-02-06 16:00:16 +00:00
29f8300ac7 Update README.md 2024-02-06 16:49:29 +01:00
05edd85d75 Stabilize scoreDetails 2024-02-06 11:15:19 +01:00
9eeb75d501 Clamp the max memory of the grenad sorters to a reasonable maximum 2024-02-06 10:47:04 +01:00
4792651462 Merge #4384
4384: Bump peter-evans/repository-dispatch from 2 to 3 r=curquiza a=dependabot[bot]

Bumps [peter-evans/repository-dispatch](https://github.com/peter-evans/repository-dispatch) from 2 to 3.
Release notes (sourced from peter-evans/repository-dispatch's releases):

**Repository Dispatch v3.0.0**
- ⚙️ Updated runtime to Node.js 20.
- The action now requires a minimum Actions runner version of [v2.308.0](https://github.com/actions/runner/releases/tag/v2.308.0). Update self-hosted runners to v2.308.0 or later to ensure compatibility.
- The remaining changes are routine development-dependency bumps (prettier, eslint, eslint-plugin-prettier, eslint-plugin-github, `@types/node`, `@vercel/ncc`, `@actions/core`, `@babel/traverse`, jest-circus, actions/checkout, actions/setup-node, actions/upload-artifact).

Commits: `ff45666` feat: update runtime to node 20 (#305), plus the dependency bump commits listed above; additional commits are viewable in the [compare view](https://github.com/peter-evans/repository-dispatch/compare/v2...v3).
[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=peter-evans/repository-dispatch&package-manager=github_actions&previous-version=2&new-version=3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.


Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-05 14:13:35 +00:00
58c3501b54 Bump peter-evans/repository-dispatch from 2 to 3
Bumps [peter-evans/repository-dispatch](https://github.com/peter-evans/repository-dispatch) from 2 to 3.
- [Release notes](https://github.com/peter-evans/repository-dispatch/releases)
- [Commits](https://github.com/peter-evans/repository-dispatch/compare/v2...v3)

---
updated-dependencies:
- dependency-name: peter-evans/repository-dispatch
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-01 17:05:50 +00:00
ff76d8f21a Merge #4382
4382: Bring back changes from `release-v1.6.1` into `main` r=curquiza a=dureuill

Bring back changes from release-v1.6.1 into main

Supersedes https://github.com/meilisearch/meilisearch/pull/4380 and #4381 

Third time's the charm

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2024-02-01 11:16:31 +00:00
698ea5139d Update Cargo.lock 2024-02-01 10:40:23 +01:00
880e790bff Update Cargo.toml 2024-02-01 10:33:27 +01:00
fbf5f2a392 Don't use a runtime in extract_embedder, use it only for OpenAI 2024-02-01 10:33:27 +01:00
1555870088 Truncate HuggingFace vectors that are too long 2024-02-01 10:33:27 +01:00
9f8f3105d5 make clippy happy 2024-02-01 10:33:27 +01:00
318843aacd add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there are malformed documents in the DB 2024-02-01 10:33:27 +01:00
6d111139b5 Add test 2024-02-01 10:33:27 +01:00
dff2707471 Use MatchingWords from keyword search instead of the one from vector search 2024-02-01 10:33:27 +01:00
c57f7f7379 Update version for the next release (v1.6.1) in Cargo.toml 2024-02-01 10:33:26 +01:00
b968616a99 Merge #4364
4364: Revert "Remove panic on the geosearch" r=curquiza a=irevoire

After more thought, we want to fix this bug in a patch release instead of `main`.
I revert this PR for now, but the fix will still land on `main` once we bring the `v1.6.1` changes back onto `main`.

Reverts meilisearch/meilisearch#4337

Co-authored-by: Tamo <irevoire@protonmail.ch>
2024-01-25 18:01:08 +00:00
c1bf33a112 Revert "Remove panic on the geosearch" 2024-01-25 18:51:19 +01:00
ddc2b7129a fix readme broken links 2024-01-24 22:50:18 +01:00
b6fc181993 Merge #4304
4304: Add CUDA GPU support for Hugging Face embedders r=Kerollmops a=dureuill

Adds a "cuda" feature to `milli`.

Compiling with this feature requires that the CUDA support library be installed (see "with CUDA support" paragraph in https://huggingface.github.io/candle/guide/installation.html), and adds CUDA support to the `huggingFace` embedder.

To enable GPU support, users will need to:

1. Have a compatible NVidia GPU under Linux
2. Follow [the guide](https://huggingface.github.io/candle/guide/installation.html) to install the CUDA dependencies
3. Compile Meilisearch with the `cuda` feature: `cargo build --release --features cuda`

# Impact

Enabling the CUDA feature allows using an available GPU to compute embeddings with a `huggingFace` embedder.
On an AWS Graviton 2, this yields a 3x to 5x improvement in indexing time.

# Technical details

- I had to change the CI so that the cuda feature is not included in the `Tests all features` workflow
- To achieve that, I had to add a binary following the `cargo xtask` design pattern, to list all features except the cuda one.
- I then changed the workflow accordingly (renamed to "Tests almost all features" 😉)
- A test run of the new feature was done on a temporary version of this PR that had it enabled for PRs: [See the results here](https://github.com/meilisearch/meilisearch/actions/runs/7461331929/job/20301216732)

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-01-22 13:55:04 +00:00
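The compile-time switch described above works like an ordinary Cargo feature gate; a minimal sketch of the pattern (not the actual milli code, and the function name is hypothetical):

```rust
// With `--features cuda`, the GPU-capable code path is compiled in; without
// it, the crate falls back to the CPU-only path and needs no CUDA toolkit.
#[cfg(feature = "cuda")]
fn embedding_backend() -> &'static str {
    "cuda" // the embedder would be handed a CUDA device here
}

#[cfg(not(feature = "cuda"))]
fn embedding_backend() -> &'static str {
    "cpu"
}

fn main() {
    println!("embeddings computed on: {}", embedding_backend());
}
```

Building with `cargo build --release --features cuda`, as in the PR description, then selects the GPU path.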
388fce9e46 Merge #4345
4345: Bump h2 from 0.3.20 to 0.3.24 r=curquiza a=dependabot[bot]

Bumps [h2](https://github.com/hyperium/h2) from 0.3.20 to 0.3.24.
Release notes (sourced from h2's releases):

**v0.3.24**
- Fixed: limit error resets for misbehaving connections.

**v0.3.23**
- Cherry-pick fix: streams awaiting capacity lockout (hyperium/h2#734).

**v0.3.22**
- Add `header_table_size(usize)` option to client and server builders.
- Improve throughput when vectored IO is not available.
- Update indexmap to 2.
- New contributors: @tottoto (hyperium/h2#714), @xiaoyawei (#712), @Protryon (#719), @4JX (#638), @vuittont60 (#724).

**v0.3.21**
- Fix opening of new streams over peer's max concurrent limit.
- Fix `RecvStream` to return data even if it has received a `CANCEL` stream error.
- Update MSRV to 1.63.
- New contributors: @DDtKey (hyperium/h2#703), @jwilm (#707).

The changelog (h2's CHANGELOG.md) repeats these points and dates the releases as 0.3.24 (January 17, 2024), 0.3.23 (January 10, 2024, backporting the 0.4.1 fix for stream capacity assignment), 0.3.22 (November 15, 2023), and 0.3.21 (August 21, 2023).

Commits: `7243ab5` Prepare v0.3.24, `d919cd6` streams: limit error resets for misbehaving connections, `b668c7f` fix: streams awaiting capacity lockout (#730) (#734), `ef743ec` Add a setter for header_table_size (#638), `4aa7b16` Fix documentation for max_send_buffer_size (#718), and more; the full list is viewable in the [compare view](https://github.com/hyperium/h2/compare/v0.3.20...v0.3.24).
[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=h2&package-manager=cargo&previous-version=0.3.20&new-version=0.3.24)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.


Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-22 11:53:51 +00:00
d35fe43fd5 Update lock file 2024-01-22 10:49:17 +01:00
f692021bfc Implement PR comments 2024-01-22 10:25:56 +01:00
1b90778bf5 Change CI 2024-01-22 10:25:56 +01:00
66ae81a909 Make it so binary can be used with cargo xtask 2024-01-22 10:25:56 +01:00
4aa4a15dc9 Add to Cargo.lock 2024-01-22 10:25:54 +01:00
4b4e8ea2a4 Add binary to list features 2024-01-22 10:25:16 +01:00
84f49d76cd Add cuda feature 2024-01-22 10:25:16 +01:00
afb0e8eab9 Merge #4325
4325: Add Setting API reminder in issue template r=ManyTheFish a=ManyTheFish

When adding a new setting, several important points can be easily forgotten.
This PR adds a small reminder list of some of these points in the issue template.


Co-authored-by: Many the fish <many@meilisearch.com>
2024-01-22 09:02:27 +00:00
b5b2333a05 Bump h2 from 0.3.20 to 0.3.24
Bumps [h2](https://github.com/hyperium/h2) from 0.3.20 to 0.3.24.
- [Release notes](https://github.com/hyperium/h2/releases)
- [Changelog](https://github.com/hyperium/h2/blob/v0.3.24/CHANGELOG.md)
- [Commits](https://github.com/hyperium/h2/compare/v0.3.20...v0.3.24)

---
updated-dependencies:
- dependency-name: h2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-19 16:20:22 +00:00
40fa0b4df6 Update .github/ISSUE_TEMPLATE/sprint_issue.md 2024-01-18 11:17:29 +01:00
ab4d614599 Update .github/ISSUE_TEMPLATE/sprint_issue.md
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-01-18 10:28:30 +01:00
262b20fdba Merge #4330
4330: Add job variable to grafana dashboard r=irevoire a=capJavert

# Pull Request

## Related issue
Fixes https://github.com/orgs/meilisearch/discussions/625#discussioncomment-8143282

## What does this PR do?

"meilisearch" as [job_name](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#job_name) was hardcoded in the dashboard config so if user sets anything but "meilisearch" as job_name on prometheus side the dashboard does not work.

With this change dasboard will auto load the values from data source (much like instance variable) and show the correct data. This now also adds support for multiple meilisearch jobs in single dashboard. 

See: https://github.com/orgs/meilisearch/discussions/625#discussioncomment-8143282

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: capJavert <ante@kickass.website>
2024-01-17 15:48:24 +00:00
9020606c45 Merge #4337
4337: Remove panic on the geosearch r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes  #4333

## What does this PR do?
- Add tests for the enrich pipeline on malformed documents with a `null` value
- Reproduce the issue when updating the settings while there are malformed documents in the DB
- Fix the bug


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-01-17 15:09:46 +00:00
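A minimal sketch of the behaviour this PR aims for, assuming the `serde_json` crate and hypothetical names (this is not the actual enrich pipeline code): a `null` or malformed `_geo` field surfaces as an error instead of a panic.

```rust
use serde_json::{json, Value};

fn extract_geo(document: &Value) -> Result<Option<(f64, f64)>, String> {
    match document.get("_geo") {
        None => Ok(None),
        Some(Value::Null) => {
            Err("the `_geo` field is `null`; expected an object with `lat` and `lng`".to_string())
        }
        Some(Value::Object(geo)) => match (
            geo.get("lat").and_then(Value::as_f64),
            geo.get("lng").and_then(Value::as_f64),
        ) {
            (Some(lat), Some(lng)) => Ok(Some((lat, lng))),
            _ => Err("`_geo` must contain numeric `lat` and `lng` fields".to_string()),
        },
        Some(other) => Err(format!("malformed `_geo` field: {other}")),
    }
}

fn main() {
    // A malformed document produces an error that can be reported to the user,
    // rather than panicking when `_geo` is made filterable or sortable.
    let malformed = json!({ "id": 1, "_geo": null });
    assert!(extract_geo(&malformed).is_err());

    let valid = json!({ "id": 2, "_geo": { "lat": 48.8, "lng": 2.3 } });
    assert_eq!(extract_geo(&valid), Ok(Some((48.8, 2.3))));
}
```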
0887186ecf make clippy happy 2024-01-17 16:07:10 +01:00
7d190d8078 add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there are malformed documents in the DB 2024-01-17 15:51:52 +01:00
3b8a9597e2 Merge #4332
4332: Update the dependencies r=irevoire a=Kerollmops

This PR upgrades the dependencies and fixes #4287.

 - ~We keep arroy at the current commit. We will release and use the latest version published when possible~
 - We also updated arroy to 0.2.0.
 - I rolled back the version of rustls because it has too many breaking changes.
 - I had to keep http at 0.2.11 due to actix-cors.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-01-17 13:42:02 +00:00
f275554982 Make sure we override the default Rust version 2024-01-16 18:10:30 +01:00
d997ea1f01 Make Clippy happy 2024-01-16 17:10:48 +01:00
50e1d34c66 Rollback http to 0.2.11 2024-01-16 16:57:33 +01:00
406531c991 Fix sysinfo 2024-01-16 16:49:51 +01:00
01e2c3d6bb Bump arroy to v0.2.0 2024-01-16 16:45:55 +01:00
cfaa522d68 Bump the Rust version to 1.75.0 2024-01-16 16:36:54 +01:00
0c8d1644a6 Rollback rustls to 0.20.9 2024-01-16 15:55:16 +01:00
5e0268d40e Fix the sysinfo errors 2024-01-16 15:43:03 +01:00
9f9ad4cc05 Fix Clippy warnings 2024-01-16 15:27:24 +01:00
3ee7682fa7 Fix some integer comparisons 2024-01-16 15:22:23 +01:00
7f125bfb12 Update incompatible dependencies 2024-01-16 15:15:54 +01:00
5869ca7716 Upgrade all compatible dependencies 2024-01-16 15:05:03 +01:00
7a89abd2a0 Merge #4263
4263: Bump rustls-webpki from 0.101.3 to 0.101.7 r=irevoire a=dependabot[bot]

Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.101.3 to 0.101.7.
Release notes (sourced from rustls-webpki's releases):

**0.101.7**
- Upgrades `*ring*` to 0.17 and `untrusted` to 0.9. Note: since `untrusted` appears in the `Error` API, this may be a breaking change for applications using two `untrusted` versions.
- Simplified tests for DER errors, bumped MSRV to 1.61, and upgraded to rcgen 0.11.3 (rustls/webpki#189, #193, #195, #199).
- Full changelog: https://github.com/rustls/webpki/compare/v/0.101.6...v/0.101.7

**0.101.6**
- The `CertificateRevocationList` trait's `verify_signature` `Budget` argument was removed. This was a semver-incompatible change mistakenly introduced in v0.101.5 (rustls/webpki#187).
- Full changelog: https://github.com/rustls/webpki/compare/v/0.101.5...v/0.101.6

**0.101.5**
- Path building complexity is now limited to a maximum budget of path finding operations, avoiding exponential processing time when encountering certificate chains containing many certificates with the same subject/issuer distinguished name but different subject public key information.
- Name constraints evaluation is now limited to a maximum number of comparison operations, avoiding exponential processing time when encountering certificate chains containing many name constraints and subject alternate names.
- Subject common names are no longer parsed for name iteration or for applying name constraints. Webpki only uses Subject Alternate Names when validating certificates, and the common name handling was buggy, producing `Error::BadDer` when iterating certificates with printable string subject common names, or omitted common names encoded as an empty sequence.
- The following PRs were backported to the rel-0.101 branch in #170: further limits on expensive path building (#163), budget tweaks (#164), bound name constraint comparisons (#165), remove subject common name parsing (#169, thanks to @hawkw), correct handling of fatal errors (#168). Thanks to all who have contributed, on behalf of the rustls team (@ctz, @cpu and @djc)!

**0.101.4**
- Certificate path building and verification is now capped at 100 signature validation operations to avoid the risk of a CPU usage denial-of-service attack when validating crafted certificate chains producing quadratic runtime. This risk affected both clients and servers that verified client certificates.
- Full changelog: https://github.com/rustls/webpki/compare/v/0.101.3...v/0.101.4

Commits: `ee5aab1` Cargo: v0.101.6 -> v0.101.7, `4f721a9` Upgrade to rcgen 0.11.3, `3be3625` Bump MSRV to 1.61, `bb7c7f4` Upgrade to ring 0.17, untrusted 0.9, `7f8208e` crl: rm `Budget` from `verify_signature` fn, `c255d61` verify_cert: correct handling of fatal errors, and more; the full list is viewable in the [compare view](https://github.com/rustls/webpki/compare/v/0.101.3...v/0.101.7).
[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=rustls-webpki&package-manager=cargo&previous-version=0.101.3&new-version=0.101.7)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.


Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-16 13:55:49 +00:00
d9d0419845 Update the dependencies 2024-01-16 14:38:48 +01:00
5dc8d9e9bf feat: add job variable to dashboard
meilisearch job_name was hardcoded in the dashboard config, so if the user sets anything but meilisearch as the job_name on the prometheus side, the dashboard does not work.

see: https://github.com/orgs/meilisearch/discussions/625#discussioncomment-8143282
2024-01-16 12:44:37 +01:00
9e12a91afb Update .github/ISSUE_TEMPLATE/sprint_issue.md 2024-01-16 11:04:50 +01:00
8e016fbfeb Merge #4319
4319: Update README r=curquiza a=codesmith-emmy

# Pull Request

## Related issue
Fixes #<issue_number>

## What does this PR do?
- ...

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: emmanuel <154705254+codesmith-emmy@users.noreply.github.com>
2024-01-15 18:41:14 +00:00
1ccde9bf0b Merge #4316
4316: Autobatch the task deletions r=curquiza a=irevoire

# Pull Request

## Related issue
Fix part of https://github.com/meilisearch/meilisearch-support/issues/69
Fix #4315 

## What does this PR do?
- Autobatch the task deletions

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-01-15 17:54:50 +00:00
34e814f400 Merge #4327
4327: Bring back changes from `release-v1.6.0` to `main` r=dureuill a=curquiza



Co-authored-by: Paul Sanders <psanders1@gmail.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Louis Dureuil <louis.dureuil@xinra.net>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2024-01-15 16:52:05 +00:00
857cd09285 Add Setting API reminder in issue template
When adding a new setting, there are several important points that can be easily forgotten.
This PR adds a small reminder list of some of these points.
2024-01-15 11:19:13 +01:00
552127021f Update 2024-01-12 16:03:23 +01:00
b4d7d80ad9 autobatch the task deletions 2024-01-11 14:58:07 +01:00
5204c0b60b Merge #4297
4297: Update license for 2024 r=curquiza a=meili-bot

_This PR is auto-generated._


Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com>
2024-01-03 13:54:19 +00:00
e73cd692db Update LICENSE 2024-01-03 14:32:41 +01:00
29b453346b Merge #4293
4293: Update SDK test dependencies r=curquiza a=curquiza

Replace dependabot updates

The changes are really un-impactful for the engine team velocity because it is about a CI
- that does not run during release deployment
- that does not run to merge a PR

It's only a weekly scheduled CI to check the breaking changes we introduced in the integrations.

I updated the dependencies based on what we do on the integration CIs
For example for dart, I looked at what we have in the [Dart CI](63fd758882/.github/workflows/tests.yml (L16-L54)) and I updated our CI in this repo accordingly. I did the same for each repository. This ensures we test the same things.


Co-authored-by: curquiza <clementine@meilisearch.com>
2024-01-03 13:26:50 +00:00
c4bb435374 Merge #4295
4295: fix compilation warnings on main r=curquiza a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4292

## What does this PR do?
- Removed unused imports

#4294 fixes the issue for the release v1.6

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-01-02 15:33:06 +00:00
2bcff2ea46 fix warning 2024-01-02 15:19:00 +01:00
1275e72e0b Update SDK test dependencies 2024-01-02 09:59:46 +01:00
d868131bb7 Bump rustls-webpki from 0.101.3 to 0.101.7
Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.101.3 to 0.101.7.
- [Release notes](https://github.com/rustls/webpki/releases)
- [Commits](https://github.com/rustls/webpki/compare/v/0.101.3...v/0.101.7)

---
updated-dependencies:
- dependency-name: rustls-webpki
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-12-18 14:57:38 +00:00
393 changed files with 57078 additions and 12941 deletions

.cargo/config.toml Normal file

@ -0,0 +1,2 @@
[alias]
xtask = "run --release --package xtask --"

View File

@ -2,14 +2,13 @@
name: New sprint issue
about: ⚠️ Should only be used by the engine team ⚠️
title: ''
labels: ''
labels: 'missing usage in PRD, impacts docs'
assignees: ''
---
Related product team resources: [PRD]() (_internal only_)
Related product discussion:
Related spec: WIP
## Motivation
@ -21,12 +20,25 @@ Related spec: WIP
## TODO
<!---Feel free to adapt this list with more technical/product steps-->
<!---If necessary, create a list with technical/product steps-->
- [ ] Release a prototype
- [ ] If prototype validated, merge changes into `main`
- [ ] Update the spec
### Reminders when modifying the Setting API
<!--- Special steps to remind when adding a new index setting -->
- [ ] Ensure the new setting route is at least tested by the [`test_setting_routes` macro](https://github.com/meilisearch/meilisearch/blob/5204c0b60b384cbc79621b6b2176fca086069e8e/meilisearch/tests/settings/get_settings.rs#L276)
- [ ] Ensure Analytics are fully implemented
- [ ] `/settings/my-new-setting` configured in the [`make_setting_routes` macro](https://github.com/meilisearch/meilisearch/blob/5204c0b60b384cbc79621b6b2176fca086069e8e/meilisearch/src/routes/indexes/settings.rs#L141-L165)
- [ ] global `/settings` route configured in the [`update_all` function](https://github.com/meilisearch/meilisearch/blob/5204c0b60b384cbc79621b6b2176fca086069e8e/meilisearch/src/routes/indexes/settings.rs#L655-L751)
- [ ] Ensure the dump serializing is consistent with the `/settings` route serializing, e.g., enums case can be different (`camelCase` in route and `PascalCase` in the dump)
#### Special cases when adding a setting for an experimental feature
- [ ] ⚠️ API stability: The setting does not appear on the main settings route when the feature has never been enabled (e.g. mark it `Unset` when returned from the index in this situation. See [an example](https://github.com/meilisearch/meilisearch/blob/7a89abd2a025606a42f8b219e539117eb2eb029f/meilisearch-types/src/settings.rs#L608))
- [ ] The setting cannot be set when the feature is disabled, either by the main settings route or the subroute (see [`validate_settings` function](https://github.com/meilisearch/meilisearch/blob/7a89abd2a025606a42f8b219e539117eb2eb029f/meilisearch/src/routes/indexes/settings.rs#L811))
- [ ] If possible, the setting is reset when the feature is disabled (hard if it requires reindexing)
## Impacted teams
<!---Ping the related teams. Ask for the engine manager if any hesitation-->
<!---@meilisearch/docs-team when there is any API change, e.g. settings addition-->

.github/workflows/bench-manual.yml vendored Normal file

@ -0,0 +1,28 @@
name: Bench (manual)
on:
workflow_dispatch:
inputs:
workload:
description: 'The path to the workloads to execute (workloads/...)'
required: true
default: 'workloads/movies.json'
env:
WORKLOAD_NAME: ${{ github.event.inputs.workload }}
jobs:
benchmarks:
name: Run and upload benchmarks
runs-on: benchmarks
timeout-minutes: 180 # 3h
steps:
- uses: actions/checkout@v3
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
- name: Run benchmarks - workload ${WORKLOAD_NAME} - branch ${{ github.ref }} - commit ${{ github.sha }}
run: |
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Manual [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- ${WORKLOAD_NAME}

.github/workflows/bench-pr.yml vendored Normal file

@ -0,0 +1,82 @@
name: Bench (PR)
on:
issue_comment:
types: [created]
permissions:
issues: write
env:
GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
jobs:
run-benchmarks-on-comment:
if: startsWith(github.event.comment.body, '/bench')
name: Run and upload benchmarks
runs-on: benchmarks
timeout-minutes: 180 # 3h
steps:
- name: Check permissions
id: permission
env:
PR_AUTHOR: ${{github.event.issue.user.login }}
COMMENT_AUTHOR: ${{github.event.comment.user.login }}
REPOSITORY: ${{github.repository}}
PR_ID: ${{github.event.issue.number}}
run: |
PR_REPOSITORY=$(gh api /repos/"$REPOSITORY"/pulls/"$PR_ID" --jq .head.repo.full_name)
if $(gh api /repos/"$REPOSITORY"/collaborators/"$PR_AUTHOR"/permission --jq .user.permissions.push)
then
echo "::notice title=Authentication success::PR author authenticated"
else
echo "::error title=Authentication error::PR author doesn't have push permission on this repository"
exit 1
fi
if $(gh api /repos/"$REPOSITORY"/collaborators/"$COMMENT_AUTHOR"/permission --jq .user.permissions.push)
then
echo "::notice title=Authentication success::Comment author authenticated"
else
echo "::error title=Authentication error::Comment author doesn't have push permission on this repository"
exit 1
fi
if [ "$PR_REPOSITORY" = "$REPOSITORY" ]
then
echo "::notice title=Authentication success::PR started from main repository"
else
echo "::error title=Authentication error::PR started from a fork"
exit 1
fi
- name: Check for Command
id: command
uses: xt0rted/slash-command-action@v2
with:
command: bench
reaction-type: "rocket"
repo-token: ${{ env.GH_TOKEN }}
- uses: xt0rted/pull-request-comment-branch@v2
id: comment-branch
with:
repo_token: ${{ env.GH_TOKEN }}
- uses: actions/checkout@v3
if: success()
with:
fetch-depth: 0 # fetch full history to be able to get main commit sha
ref: ${{ steps.comment-branch.outputs.head_ref }}
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
- name: Run benchmarks on PR ${{ github.event.issue.id }}
run: |
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" \
--dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" \
--reason "[Comment](${{ github.event.comment.html_url }}) on [#${{ github.event.issue.number }}](${{ github.event.issue.html_url }})" \
-- ${{ steps.command.outputs.command-arguments }} > benchlinks.txt
- name: Send comment in PR
run: |
gh pr comment ${{github.event.issue.number}} --body-file benchlinks.txt

View File

@ -0,0 +1,23 @@
name: Indexing bench (push)
on:
push:
branches:
- main
jobs:
benchmarks:
name: Run and upload benchmarks
runs-on: benchmarks
timeout-minutes: 180 # 3h
steps:
- uses: actions/checkout@v3
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
# Run benchmarks
- name: Run benchmarks - Dataset ${BENCH_NAME} - Branch main - Commit ${{ github.sha }}
run: |
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Push on `main` [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- workloads/*.json

View File

@ -18,11 +18,9 @@ jobs:
timeout-minutes: 4320 # 72h
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: stable
override: true
# Set variables
- name: Set current branch name

View File

@ -13,11 +13,40 @@ jobs:
runs-on: benchmarks
timeout-minutes: 4320 # 72h
steps:
- uses: actions-rs/toolchain@v1
- name: Check permissions
id: permission
env:
PR_AUTHOR: ${{github.event.issue.user.login }}
COMMENT_AUTHOR: ${{github.event.comment.user.login }}
REPOSITORY: ${{github.repository}}
PR_ID: ${{github.event.issue.number}}
run: |
PR_REPOSITORY=$(gh api /repos/"$REPOSITORY"/pulls/"$PR_ID" --jq .head.repo.full_name)
if $(gh api /repos/"$REPOSITORY"/collaborators/"$PR_AUTHOR"/permission --jq .user.permissions.push)
then
echo "::notice title=Authentication success::PR author authenticated"
else
echo "::error title=Authentication error::PR author doesn't have push permission on this repository"
exit 1
fi
if $(gh api /repos/"$REPOSITORY"/collaborators/"$COMMENT_AUTHOR"/permission --jq .user.permissions.push)
then
echo "::notice title=Authentication success::Comment author authenticated"
else
echo "::error title=Authentication error::Comment author doesn't have push permission on this repository"
exit 1
fi
if [ "$PR_REPOSITORY" = "$REPOSITORY" ]
then
echo "::notice title=Authentication success::PR started from main repository"
else
echo "::error title=Authentication error::PR started from a fork"
exit 1
fi
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: stable
override: true
- name: Check for Command
id: command

View File

@ -16,11 +16,9 @@ jobs:
timeout-minutes: 4320 # 72h
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: stable
override: true
# Set variables
- name: Set current branch name

View File

@ -15,11 +15,9 @@ jobs:
runs-on: benchmarks
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: stable
override: true
# Set variables
- name: Set current branch name

View File

@ -15,11 +15,9 @@ jobs:
runs-on: benchmarks
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: stable
override: true
# Set variables
- name: Set current branch name

View File

@ -15,11 +15,9 @@ jobs:
runs-on: benchmarks
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: stable
override: true
# Set variables
- name: Set current branch name

View File

@ -1,4 +1,5 @@
name: Look for flaky tests
on:
workflow_dispatch:
schedule:
@ -8,18 +9,15 @@ jobs:
flaky:
runs-on: ubuntu-latest
container:
# Use ubuntu-18.04 to compile with glibc 2.27, which are the production expectations
image: ubuntu:18.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- uses: dtolnay/rust-toolchain@1.79
- name: Install cargo-flaky
run: cargo install cargo-flaky
- name: Run cargo flaky in the dumps

View File

@ -12,11 +12,9 @@ jobs:
timeout-minutes: 4320 # 72h
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: stable
override: true
# Run benchmarks
- name: Run the fuzzer

View File

@ -110,6 +110,44 @@ jobs:
--milestone $MILESTONE_VERSION \
--assignee curquiza
create-update-version-issue:
needs: get-release-version
# Create the update-version issue even if the release is a patch release
if: github.event.action == 'created'
runs-on: ubuntu-latest
env:
ISSUE_TEMPLATE: issue-template.md
steps:
- uses: actions/checkout@v3
- name: Download the issue template
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/update-version-issue.md > $ISSUE_TEMPLATE
- name: Create the issue
run: |
gh issue create \
--title "Update version in Cargo.toml for $MILESTONE_VERSION" \
--label 'maintenance' \
--body-file $ISSUE_TEMPLATE \
--milestone $MILESTONE_VERSION
create-update-openapi-issue:
needs: get-release-version
# Create the openAPI issue if the release is not only a patch release
if: github.event.action == 'created' && needs.get-release-version.outputs.is-patch == 'false'
runs-on: ubuntu-latest
env:
ISSUE_TEMPLATE: issue-template.md
steps:
- uses: actions/checkout@v3
- name: Download the issue template
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/update-openapi-issue.md > $ISSUE_TEMPLATE
- name: Create the issue
run: |
gh issue create \
--title "Update Open API file for $MILESTONE_VERSION" \
--label 'maintenance' \
--body-file $ISSUE_TEMPLATE \
--milestone $MILESTONE_VERSION
# ----------------
# MILESTONE CLOSED
# ----------------

View File

@ -18,17 +18,14 @@ jobs:
runs-on: ubuntu-latest
needs: check-version
container:
# Use ubuntu-18.04 to compile with glibc 2.27
image: ubuntu:18.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- uses: dtolnay/rust-toolchain@1.79
- name: Install cargo-deb
run: cargo install cargo-deb
- uses: actions/checkout@v3

View File

@ -37,18 +37,15 @@ jobs:
runs-on: ubuntu-latest
needs: check-version
container:
# Use ubuntu-18.04 to compile with glibc 2.27
image: ubuntu:18.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- uses: dtolnay/rust-toolchain@1.79
- name: Build
run: cargo build --release --locked
# No need to upload binaries for dry run (cron)
@ -78,10 +75,7 @@ jobs:
asset_name: meilisearch-windows-amd64.exe
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- uses: dtolnay/rust-toolchain@1.79
- name: Build
run: cargo build --release --locked
# No need to upload binaries for dry run (cron)
@ -107,12 +101,10 @@ jobs:
- name: Checkout repository
uses: actions/checkout@v3
- name: Installing Rust toolchain
uses: actions-rs/toolchain@v1
uses: dtolnay/rust-toolchain@1.79
with:
toolchain: stable
profile: minimal
target: ${{ matrix.target }}
override: true
- name: Cargo build
uses: actions-rs/cargo@v1
with:
@ -132,9 +124,11 @@ jobs:
name: Publish binary for aarch64
runs-on: ubuntu-latest
needs: check-version
env:
DEBIAN_FRONTEND: noninteractive
container:
# Use ubuntu-18.04 to compile with glibc 2.27
image: ubuntu:18.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
strategy:
matrix:
include:
@ -154,12 +148,10 @@ jobs:
add-apt-repository "deb [arch=$(dpkg --print-architecture)] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
apt-get update -y && apt-get install -y docker-ce
- name: Installing Rust toolchain
uses: actions-rs/toolchain@v1
uses: dtolnay/rust-toolchain@1.79
with:
toolchain: stable
profile: minimal
target: ${{ matrix.target }}
override: true
- name: Configure target aarch64 GNU
## Environment variable is not passed using env:
## LD gold won't work with MUSL
@ -170,6 +162,9 @@ jobs:
echo '[target.aarch64-unknown-linux-gnu]' >> ~/.cargo/config
echo 'linker = "aarch64-linux-gnu-gcc"' >> ~/.cargo/config
echo 'JEMALLOC_SYS_WITH_LG_PAGE=16' >> $GITHUB_ENV
- name: Install a default toolchain that will be used to build cargo cross
run: |
rustup default stable
- name: Cargo build
uses: actions-rs/cargo@v1
with:

View File

@ -80,10 +80,11 @@ jobs:
type=ref,event=tag
type=raw,value=nightly,enable=${{ github.event_name != 'push' }}
type=semver,pattern=v{{major}}.{{minor}},enable=${{ steps.check-tag-format.outputs.stable == 'true' }}
type=semver,pattern=v{{major}},enable=${{ steps.check-tag-format.outputs.stable == 'true' }}
type=raw,value=latest,enable=${{ steps.check-tag-format.outputs.stable == 'true' && steps.check-tag-format.outputs.latest == 'true' }}
- name: Build and push
uses: docker/build-push-action@v5
uses: docker/build-push-action@v6
with:
push: true
platforms: linux/amd64,linux/arm64
@ -97,7 +98,7 @@ jobs:
- name: Send CI information to Cloud team
# Do not send if nightly build (i.e. 'schedule' or 'workflow_dispatch' event)
if: github.event_name == 'push'
uses: peter-evans/repository-dispatch@v2
uses: peter-evans/repository-dispatch@v3
with:
token: ${{ secrets.MEILI_BOT_GH_PAT }}
repository: meilisearch/meilisearch-cloud

View File

@ -22,7 +22,7 @@ jobs:
outputs:
docker-image: ${{ steps.define-image.outputs.docker-image }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Define the Docker image we need to use
id: define-image
run: |
@ -46,11 +46,11 @@ jobs:
MEILISEARCH_VERSION: ${{ needs.define-docker-image.outputs.docker-image }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-dotnet
- name: Setup .NET Core
uses: actions/setup-dotnet@v3
uses: actions/setup-dotnet@v4
with:
dotnet-version: "6.0.x"
- name: Install dependencies
@ -75,12 +75,12 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-dart
- uses: dart-lang/setup-dart@v1
with:
sdk: 3.1.1
sdk: 'latest'
- name: Install dependencies
run: dart pub get
- name: Run integration tests
@ -100,10 +100,10 @@ jobs:
- '7700:7700'
steps:
- name: Set up Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: stable
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-go
- name: Get dependencies
@ -129,11 +129,11 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-java
- name: Set up Java
uses: actions/setup-java@v3
uses: actions/setup-java@v4
with:
java-version: 8
distribution: 'zulu'
@ -156,7 +156,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-js
- name: Setup node
@ -191,7 +191,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-php
- name: Install PHP
@ -220,11 +220,11 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-python
- name: Set up Python
uses: actions/setup-python@v4
uses: actions/setup-python@v5
- name: Install pipenv
uses: dschep/install-pipenv-action@v1
- name: Install dependencies
@ -245,7 +245,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-ruby
- name: Set up Ruby 3
@ -270,7 +270,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-rust
- name: Build
@ -291,7 +291,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-swift
- name: Run tests
@ -314,7 +314,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-js-plugins
- name: Setup node
@ -345,7 +345,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-rails
- name: Set up Ruby 3
@ -369,7 +369,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-symfony
- name: Install PHP

View File

@ -19,11 +19,11 @@ env:
jobs:
test-linux:
name: Tests on ubuntu-18.04
name: Tests on ubuntu-20.04
runs-on: ubuntu-latest
container:
# Use ubuntu-18.04 to compile with glibc 2.27, which are the production expectations
image: ubuntu:18.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies
@ -31,17 +31,7 @@ jobs:
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- name: Setup test with Rust stable
if: github.event_name != 'schedule'
uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- name: Setup test with Rust nightly
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
uses: actions-rs/toolchain@v1
with:
toolchain: nightly
override: true
uses: dtolnay/rust-toolchain@1.79
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.7.1
- name: Run cargo check without any default features
@ -66,6 +56,7 @@ jobs:
- uses: actions/checkout@v3
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.7.1
- uses: dtolnay/rust-toolchain@1.79
- name: Run cargo check without any default features
uses: actions-rs/cargo@v1
with:
@ -78,11 +69,11 @@ jobs:
args: --locked --release --all
test-all-features:
name: Tests all features
name: Tests almost all features
runs-on: ubuntu-latest
container:
# Use ubuntu-18.04 to compile with glibc 2.27, which are the production expectations
image: ubuntu:18.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
steps:
- uses: actions/checkout@v3
@ -90,26 +81,19 @@ jobs:
run: |
apt-get update
apt-get install --assume-yes build-essential curl
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- name: Run cargo build with all features
uses: actions-rs/cargo@v1
with:
command: build
args: --workspace --locked --release --all-features
- name: Run cargo test with all features
uses: actions-rs/cargo@v1
with:
command: test
args: --workspace --locked --release --all-features
- uses: dtolnay/rust-toolchain@1.79
- name: Run cargo build with almost all features
run: |
cargo build --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda)"
- name: Run cargo test with almost all features
run: |
cargo test --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda)"
test-disabled-tokenization:
name: Test disabled tokenization
runs-on: ubuntu-latest
container:
image: ubuntu:18.04
image: ubuntu:20.04
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
steps:
- uses: actions/checkout@v3
@ -117,13 +101,10 @@ jobs:
run: |
apt-get update
apt-get install --assume-yes build-essential curl
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- uses: dtolnay/rust-toolchain@1.79
- name: Run cargo tree without default features and check lindera is not present
run: |
if cargo tree -f '{p} {f}' -e normal --no-default-features | grep -vqz lindera; then
if cargo tree -f '{p} {f}' -e normal --no-default-features | grep -qz lindera; then
echo "lindera has been found in the sources and it shouldn't"
exit 1
fi
@ -136,18 +117,15 @@ jobs:
name: Run tests in debug
runs-on: ubuntu-latest
container:
# Use ubuntu-18.04 to compile with glibc 2.27, which are the production expectations
image: ubuntu:18.04
# Use ubuntu-20.04 to compile with glibc 2.28
image: ubuntu:20.04
steps:
- uses: actions/checkout@v3
- name: Install needed dependencies
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- uses: dtolnay/rust-toolchain@1.79
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.7.1
- name: Run tests in debug
@ -161,11 +139,9 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: 1.71.1
override: true
components: clippy
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.7.1
@ -180,10 +156,10 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: nightly
toolchain: nightly-2024-07-09
override: true
components: rustfmt
- name: Cache dependencies

View File

@ -18,11 +18,9 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@1.79
with:
profile: minimal
toolchain: stable
override: true
- name: Install sd
run: cargo install sd
- name: Update Cargo.toml file

2
.gitignore vendored
View File

@ -9,6 +9,8 @@
/data.ms
/snapshots
/dumps
/bench
/_xtask_benchmark.ms
# Snapshots
## ... large

371
BENCHMARKS.md Normal file
View File

@ -0,0 +1,371 @@
# Benchmarks
Currently this repository hosts two kinds of benchmarks:
1. The older "milli benchmarks", that use [criterion](https://github.com/bheisler/criterion.rs) and live in the "benchmarks" directory.
2. The newer "bench", which are workload-based and so are split between the [`workloads`](./workloads/) directory and the [`xtask::bench`](./xtask/src/bench/) module.
This document describes the newer "bench" benchmarks. For more details on the "milli benchmarks", see [benchmarks/README.md](./benchmarks/README.md).
## Design philosophy for the benchmarks
The newer "bench" benchmarks are **integration** benchmarks, in the sense that they spawn an actual Meilisearch server and measure its performance end-to-end, including HTTP request overhead.
Since this is prone to fluctuation, the benchmarks regain a bit of precision by measuring the runtime of the individual spans using the [logging machinery](./CONTRIBUTING.md#logging) of Meilisearch.
A span roughly translates to a function call. The benchmark runner collects all the spans by name using the [logs route](https://github.com/orgs/meilisearch/discussions/721) and sums their runtime. The processed results are then sent to the [benchmark dashboard](https://bench.meilisearch.dev), which is in charge of storing and presenting the data.
## Running the benchmarks
Benchmarks can run locally or in CI.
### Locally
#### With a local benchmark dashboard
The benchmark dashboard lives in its [own repository](https://github.com/meilisearch/benchboard). We provide binaries for Ubuntu/Debian, but you can build from source for other platforms (macOS should work, as the dashboard was developed on that platform).
Run the `benchboard` binary to create a fresh database of results. By default it will serve the results and the API to gather results on `http://localhost:9001`.
From the Meilisearch repository, you can then run benchmarks with:
```sh
cargo xtask bench -- workloads/my_workload_1.json ..
```
This command will build and run Meilisearch locally on port 7700, so make sure that this port is available.
To run benchmarks on a different commit, just use the usual git commands to check out the desired commit.
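For example (a minimal sketch, reusing the workload file name from the command above):
```sh
git checkout <commit-or-branch>
cargo xtask bench -- workloads/my_workload_1.json
```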
#### Without a local benchmark dashboard
To work with the raw results, you can also skip using a local benchmark dashboard.
Run:
```sh
cargo xtask bench --no-dashboard -- workloads/my_workload_1.json workloads/my_workload_2.json ..
```
For processing the results, look at [Looking at benchmark results/Without dashboard](#without-dashboard).
### In CI
We have dedicated runners to run workloads on CI. Currently, there are three ways of running the CI:
1. Automatically, on every push to `main`.
2. Manually, by clicking the [`Run workflow`](https://github.com/meilisearch/meilisearch/actions/workflows/bench-manual.yml) button and specifying the target reference (tag, commit or branch) as well as one or multiple workloads to run. The workloads must exist in the Meilisearch repository (conventionally, in the [`workloads`](./workloads/) directory) on the target reference. Globbing (e.g., `workloads/*.json`) works.
3. Manually on a PR, by posting a comment containing a `/bench` command, followed by one or multiple workloads to run. Globbing works. The workloads must exist in the Meilisearch repository in the branch of the PR.
```
/bench workloads/movies*.json /hackernews_1M.json
```
## Looking at benchmark results
### On the dashboard
Results are available on the global dashboard used by CI at <https://bench.meilisearch.dev> or on your [local dashboard](#with-a-local-benchmark-dashboard).
The dashboard homepage presents three sections:
1. The latest invocations (a call to `cargo xtask bench`, either local or by CI) with their reason (generally set to some helpful link in CI) and their status.
2. The latest workloads ran on `main`.
3. The latest workloads ran on other references.
By default, the workload shows the total runtime delta with the latest applicable commit on `main`. The latest applicable commit is the latest commit for workload invocations that do not originate on `main`, and the latest previous commit for workload invocations that originate on `main`.
You can explicitly request a detailed comparison by span with the `main` branch, the branch of origin, or any previous commit, by clicking the links at the bottom of the workload invocation.
In the detailed comparison view, the spans are sorted by improvements, regressions, stable (no statistically significant change) and unstable (the span runtime is comparable to its standard deviation).
You can click on the name of any span to get a box plot comparing the target commit with multiple commits of the selected branch.
### Without dashboard
After the workloads are done running, the reports will live in the Meilisearch repository, in the `bench/reports` directory (by default).
You can then convert these reports into other formats.
- To [Firefox profiler](https://profiler.firefox.com) format. Run:
```sh
cd bench/reports
cargo run --release --bin trace-to-firefox -- my_workload_1-0-trace.json
```
You can then upload the resulting `firefox-my_workload_1-0-trace.json` file to the online profiler.
## Designing benchmark workloads
Benchmark workloads conventionally live in the `workloads` directory of the Meilisearch repository.
They are JSON files with the following structure (comments are not actually supported; to make your own workload, remove them or copy some existing workload file):
```jsonc
{
// Name of the workload. Must be unique to the workload, as it will be used to group results on the dashboard.
"name": "hackernews.ndjson_1M,no-threads",
// Number of consecutive runs of the commands that should be performed.
// Each run uses a fresh instance of Meilisearch and a fresh database.
// Each run produces its own report file.
"run_count": 3,
// List of arguments to add to the Meilisearch command line.
"extra_cli_args": ["--max-indexing-threads=1"],
// An expression that can be parsed as a comma-separated list of targets and levels
// as described in [tracing_subscriber's documentation](https://docs.rs/tracing-subscriber/latest/tracing_subscriber/filter/targets/struct.Targets.html#examples).
// The expression is used to filter the spans that are measured for profiling purposes.
// Optional, defaults to "indexing::=trace" (for indexing workloads); another common value is
// "search::=trace".
"target": "indexing::=trace",
// List of named assets that can be used in the commands.
"assets": {
// name of the asset.
// Must be unique at the workload level.
// For better results, the same asset (same sha256) should have the same name across workloads.
// Having multiple assets with the same name and distinct hashes is supported across workloads,
// but will lead to superfluous downloads.
//
// Assets are stored in the `bench/assets/` directory by default.
"hackernews-100_000.ndjson": {
// If the asset exists in the local filesystem (in the Meilisearch repository, or for your local workloads),
// its file path can be specified here.
// `null` if the asset should be downloaded from a remote location.
"local_location": null,
// URL of the remote location where the asset can be downloaded.
// Use the `--assets-key` of the runner to pass an API key in the `Authorization: Bearer` header of the download requests.
// `null` if the asset should be imported from a local location.
// if both local and remote locations are specified, then the local one is tried first, then the remote one
// if the file is locally missing or its hash differs.
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-100_000.ndjson",
// SHA256 of the asset.
// Optional, the `sha256` of the asset will be displayed during a run of the workload if it is missing.
// If present, the hash of the asset in the `bench/assets/` directory will be compared against this hash before
// running the workload. If the hashes differ, the asset will be downloaded anew.
"sha256": "60ecd23485d560edbd90d9ca31f0e6dba1455422f2a44e402600fbb5f7f1b213",
// Optional, one of "Auto", "Json", "NdJson" or "Raw".
// If missing, assumed to be "Auto".
// If "Auto", the format will be determined from the extension in the asset name.
"format": "NdJson"
},
"hackernews-200_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-200_000.ndjson",
"sha256": "785b0271fdb47cba574fab617d5d332276b835c05dd86e4a95251cf7892a1685"
},
"hackernews-300_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-300_000.ndjson",
"sha256": "de73c7154652eddfaf69cdc3b2f824d5c452f095f40a20a1c97bb1b5c4d80ab2"
},
"hackernews-400_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-400_000.ndjson",
"sha256": "c1b00a24689110f366447e434c201c086d6f456d54ed1c4995894102794d8fe7"
},
"hackernews-500_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-500_000.ndjson",
"sha256": "ae98f9dbef8193d750e3e2dbb6a91648941a1edca5f6e82c143e7996f4840083"
},
"hackernews-600_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-600_000.ndjson",
"sha256": "b495fdc72c4a944801f786400f22076ab99186bee9699f67cbab2f21f5b74dbe"
},
"hackernews-700_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-700_000.ndjson",
"sha256": "4b2c63974f3dabaa4954e3d4598b48324d03c522321ac05b0d583f36cb78a28b"
},
"hackernews-800_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-800_000.ndjson",
"sha256": "cb7b6afe0e6caa1be111be256821bc63b0771b2a0e1fad95af7aaeeffd7ba546"
},
"hackernews-900_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-900_000.ndjson",
"sha256": "e1154ddcd398f1c867758a93db5bcb21a07b9e55530c188a2917fdef332d3ba9"
},
"hackernews-1_000_000.ndjson": {
"local_location": null,
"remote_location": "https://milli-benchmarks.fra1.digitaloceanspaces.com/bench/datasets/hackernews/hackernews-1_000_000.ndjson",
"sha256": "27e25efd0b68b159b8b21350d9af76938710cb29ce0393fa71b41c4f3c630ffe"
}
},
// Core of the workload.
// A list of commands to run sequentially.
// Optional: A precommand is a request to the Meilisearch instance that is executed before the profiling runs.
"precommands": [
{
// Meilisearch route to call. `http://localhost:7700/` will be prepended.
"route": "indexes/movies/settings",
// HTTP method to call.
"method": "PATCH",
// If applicable, body of the request.
// Optional, if missing, the body will be empty.
"body": {
// One of "empty", "inline" or "asset".
// If using "empty", you can skip the entire "body" key.
"inline": {
// when "inline" is used, the body is the JSON object that is the value of the `"inline"` key.
"displayedAttributes": [
"title",
"by",
"score",
"time"
],
"searchableAttributes": [
"title"
],
"filterableAttributes": [
"by"
],
"sortableAttributes": [
"score",
"time"
]
}
},
// Whether to wait before running the next request.
// One of:
// - DontWait: run the next command without waiting for the response to this one.
// - WaitForResponse: run the next command as soon as the response from the server is received.
// - WaitForTask: run the next command once **all** the Meilisearch tasks created up to now have finished processing.
"synchronous": "WaitForTask"
}
],
// A command is a request to the Meilisearch instance that is executed while the profiling runs.
"commands": [
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
// When using "asset", use the name of an asset as value to use the content of that asset as body.
// The content type is derived from the format of the asset:
// "NdJson" => "application/x-ndjson"
// "Json" => "application/json"
// "Raw" => "application/octet-stream"
// See [AssetFormat::to_content_type](https://github.com/meilisearch/meilisearch/blob/7b670a4afadb132ac4a01b6403108700501a391d/xtask/src/bench/assets.rs#L30)
// for details and up-to-date list.
"asset": "hackernews-100_000.ndjson"
},
"synchronous": "WaitForTask"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-200_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-300_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-400_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-500_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-600_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-700_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-800_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-900_000.ndjson"
},
"synchronous": "WaitForResponse"
},
{
"route": "indexes/movies/documents",
"method": "POST",
"body": {
"asset": "hackernews-1_000_000.ndjson"
},
"synchronous": "WaitForTask"
}
]
}
```
### Adding new assets
Assets reside in our DigitalOcean S3 space. Assuming you have team access to it:
1. go to <https://cloud.digitalocean.com/spaces/milli-benchmarks?i=d1c552&path=bench%2Fdatasets%2F>
2. upload your dataset:
1. if your dataset is a single file, upload that single file using the "upload" button,
2. otherwise, create a folder using the "create folder" button, then inside that folder upload your individual files.
## Upgrading `https://bench.meilisearch.dev`
The URL of the server is in our password manager (look for "benchboard").
1. Make the needed modifications on the [benchboard repository](https://github.com/meilisearch/benchboard) and merge them to main.
2. Publish a new release to produce the Ubuntu/Debian binary.
3. Download the binary locally, send it to the server:
```
scp -6 ~/Downloads/benchboard root@\[<ipv6-address>\]:/bench/new-benchboard
```
Note that the IPv6 address must be between escaped square brackets for SCP.
4. SSH to the server:
```
ssh root@<ipv6-address>
```
Note that the IPv6 address must **NOT** be between escaped square brackets for SSH 🥲
5. On the server, set the correct permissions for the new binary:
```
chown bench:bench /bench/new-benchboard
chmod 700 /bench/new-benchboard
```
6. On the server, move the new binary to the location of the running binary (if unsure, start by making a backup of the running binary):
```
mv /bench/{new-,}benchboard
```
7. Restart the benchboard service.
```
systemctl restart benchboard
```
8. Check that the service runs correctly.
```
systemctl status benchboard
```
9. Check the availability of the service by going to <https://bench.meilisearch.dev> on your browser.

View File

@ -4,7 +4,7 @@ First, thank you for contributing to Meilisearch! The goal of this document is t
Remember that there are many ways to contribute other than writing code: writing [tutorials or blog posts](https://github.com/meilisearch/awesome-meilisearch), improving [the documentation](https://github.com/meilisearch/documentation), submitting [bug reports](https://github.com/meilisearch/meilisearch/issues/new?assignees=&labels=&template=bug_report.md&title=) and [feature requests](https://github.com/meilisearch/product/discussions/categories/feedback-feature-proposal)...
The code in this repository is only concerned with managing multiple indexes, handling the update store, and exposing an HTTP API. Search and indexation are the domain of our core engine, [`milli`](https://github.com/meilisearch/milli), while tokenization is handled by [our `charabia` library](https://github.com/meilisearch/charabia/).
Meilisearch can manage multiple indexes, handle the update store, and expose an HTTP API. Search and indexation are the domain of our core engine, [`milli`](https://github.com/meilisearch/meilisearch/tree/main/milli), while tokenization is handled by [our `charabia` library](https://github.com/meilisearch/charabia/).
If Meilisearch does not offer optimized support for your language, please consider contributing to `charabia` by following the [CONTRIBUTING.md file](https://github.com/meilisearch/charabia/blob/main/CONTRIBUTING.md) and integrating your intended normalizer/segmenter.
@ -52,6 +52,20 @@ cargo test
This command will be triggered to each PR as a requirement for merging it.
#### Faster build
You can set the `LINDERA_CACHE` environment variable to speed up your successive builds by up to 2 minutes.
It'll store some built artifacts in the directory of your choice.
We recommend using the standard `$HOME/.cache/lindera` directory:
```sh
export LINDERA_CACHE=$HOME/.cache/lindera
```
Furthermore, you can improve incremental compilation by setting the `MEILI_NO_VERGEN` environment variable.
Setting this variable will prevent the Meilisearch binary from being rebuilt each time the directory that hosts the Meilisearch repository changes.
Do not enable this environment variable for production builds (as it will break the `version` route, among other things).
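For instance (a minimal sketch; per `build-info/build.rs`, any value other than `false` or `0` disables the git metadata):
```sh
# Skip the vergen git metadata to speed up incremental rebuilds.
# Do not do this for production builds: it breaks the `version` route, among other things.
export MEILI_NO_VERGEN=true
cargo build
```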
#### Snapshot-based tests
We are using [insta](https://insta.rs) to perform snapshot-based testing.
@ -63,7 +77,7 @@ Furthermore, we provide some macros on top of insta, notably a way to use snapsh
To effectively debug snapshot-based hashes, we recommend you export the `MEILI_TEST_FULL_SNAPS` environment variable so that snapshots are fully created locally:
```
```sh
export MEILI_TEST_FULL_SNAPS=true # add this to your .bashrc, .zshrc, ...
```
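For reference, a plain [insta](https://insta.rs) snapshot test looks roughly like the sketch below (illustrative only; the in-tree tests rely on Meilisearch's own helper macros on top of insta):
```rust
// Hypothetical example, not taken from the Meilisearch test suite.
#[test]
fn greeting_snapshot() {
    let greeting = format!("Hello, {}!", "Meilisearch");
    // The first run records the value as a snapshot; later runs compare against it.
    insta::assert_snapshot!(greeting);
}
```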
@ -75,6 +89,36 @@ If you get a "Too many open files" error you might want to increase the open fil
ulimit -Sn 3000
```
#### Build tools
Meilisearch follows the [cargo xtask](https://github.com/matklad/cargo-xtask) workflow to provide some build tools.
Run `cargo xtask --help` from the root of the repository to find out what is available.
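For example (the workload file name is illustrative; see [BENCHMARKS.md](./BENCHMARKS.md) for the real ones):
```sh
# List the available build tools.
cargo xtask --help
# Run the bench tool described in BENCHMARKS.md on a workload file.
cargo xtask bench -- workloads/my_workload_1.json
```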
### Logging
Meilisearch uses [`tracing`](https://lib.rs/crates/tracing) for logging purposes. Tracing logs are structured and can be displayed as JSON to the end user, so prefer passing arguments as fields rather than interpolating them in the message.
Refer to the [documentation](https://docs.rs/tracing/0.1.40/tracing/index.html#using-the-macros) for the syntax of the spans and events.
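For instance, a sketch with illustrative names (not code from the repository):
```rust
use tracing::info;

fn on_index_created(index_uid: &str, primary_key: Option<&str>) {
    // Preferred: pass arguments as structured fields so they stay structured in JSON output.
    info!(index_uid, ?primary_key, "index created");
    // Avoid interpolating them into the message string, e.g.:
    // info!("index created {index_uid} with primary key {primary_key:?}");
}
```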
Logging spans are used for 3 distinct purposes:
1. Regular logging
2. Profiling
3. Benchmarking
As a result, the spans should follow some rules:
- They should not be put on functions that are called too often. That is because opening and closing a span causes some overhead. For regular logging, avoid putting spans on functions that take less than a few hundred nanoseconds. For profiling or benchmarking, avoid putting spans on functions that take less than a few microseconds.
- For profiling and benchmarking, use the `TRACE` level.
- For profiling and benchmarking, use the following `target` prefixes (see the sketch after this list):
  - `indexing::` for spans used when profiling the indexing operations.
  - `search::` for spans used when profiling the search operations.
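A minimal sketch of such a span (the function and field names are illustrative):
```rust
use tracing::trace_span;

fn extract_documents(batch_size: usize) {
    // TRACE-level span with an `indexing::` target prefix so profiling and
    // benchmarking tooling can pick it up.
    let span = trace_span!(target: "indexing::extract", "extract_documents", batch_size);
    let _entered = span.enter();
    // ... the actual extraction work would happen here ...
}
```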
### Benchmarking
See [BENCHMARKS.md](./BENCHMARKS.md)
## Git Guidelines
### Git Branches

3931
Cargo.lock generated

File diff suppressed because it is too large.

View File

@ -16,11 +16,17 @@ members = [
"json-depth-checker",
"benchmarks",
"fuzzers",
"tracing-trace",
"xtask",
"build-info",
]
[workspace.package]
version = "1.6.0"
authors = ["Quentin de Quelen <quentin@dequelen.me>", "Clément Renault <clement@meilisearch.com>"]
version = "1.11.1"
authors = [
"Quentin de Quelen <quentin@dequelen.me>",
"Clément Renault <clement@meilisearch.com>",
]
description = "Meilisearch HTTP server"
homepage = "https://meilisearch.com"
readme = "README.md"

View File

@ -1,14 +1,14 @@
# Compile
FROM rust:1.71.1-alpine3.18 AS compiler
FROM rust:1.79.0-alpine3.20 AS compiler
RUN apk add -q --update-cache --no-cache build-base openssl-dev
RUN apk add -q --no-cache build-base openssl-dev
WORKDIR /
ARG COMMIT_SHA
ARG COMMIT_DATE
ARG GIT_TAG
ENV VERGEN_GIT_SHA=${COMMIT_SHA} VERGEN_GIT_COMMIT_TIMESTAMP=${COMMIT_DATE} VERGEN_GIT_SEMVER_LIGHTWEIGHT=${GIT_TAG}
ENV VERGEN_GIT_SHA=${COMMIT_SHA} VERGEN_GIT_COMMIT_TIMESTAMP=${COMMIT_DATE} VERGEN_GIT_DESCRIBE=${GIT_TAG}
ENV RUSTFLAGS="-C target-feature=-crt-static"
COPY . .
@ -20,13 +20,12 @@ RUN set -eux; \
cargo build --release -p meilisearch -p meilitool
# Run
FROM alpine:3.16
FROM alpine:3.20
ENV MEILI_HTTP_ADDR 0.0.0.0:7700
ENV MEILI_SERVER_PROVIDER docker
RUN apk update --quiet \
&& apk add -q --no-cache libgcc tini curl
RUN apk add -q --no-cache libgcc tini curl
# add meilisearch and meilitool to the `/bin` so you can run it from anywhere
# and it's easy to find.

View File

@ -1,6 +1,6 @@
MIT License
Copyright (c) 2019-2022 Meili SAS
Copyright (c) 2019-2024 Meili SAS
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

View File

@ -1,9 +1,6 @@
<p align="center">
<a href="https://www.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=logo#gh-light-mode-only" target="_blank">
<img src="assets/meilisearch-logo-light.svg?sanitize=true#gh-light-mode-only">
</a>
<a href="https://www.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=logo#gh-dark-mode-only" target="_blank">
<img src="assets/meilisearch-logo-dark.svg?sanitize=true#gh-dark-mode-only">
<a href="https://www.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=logo" target="_blank">
<img src="assets/meilisearch-logo-kawaii.png">
</a>
</p>
@ -25,7 +22,7 @@
<p align="center">⚡ A lightning-fast search engine that fits effortlessly into your apps, websites, and workflow 🔍</p>
Meilisearch helps you shape a delightful search experience in a snap, offering features that work out-of-the-box to speed up your workflow.
[Meilisearch](https://www.meilisearch.com?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=intro) helps you shape a delightful search experience in a snap, offering features that work out of the box to speed up your workflow.
<p align="center" name="demo">
<a href="https://where2watch.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demo-gif#gh-light-mode-only" target="_blank">
@ -36,36 +33,41 @@ Meilisearch helps you shape a delightful search experience in a snap, offering f
</a>
</p>
🔥 [**Try it!**](https://where2watch.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demo-link) 🔥
## 🖥 Examples
- [**Movies**](https://where2watch.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=organization) — An application to help you find streaming platforms to watch movies using [hybrid search](https://www.meilisearch.com/solutions/hybrid-search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos).
- [**Ecommerce**](https://ecommerce.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Ecommerce website using disjunctive [facets](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos), range and rating filtering, and pagination.
- [**Songs**](https://music.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search through 47 million songs.
- [**SaaS**](https://saas.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) — Search for contacts, deals, and companies in this [multi-tenant](https://www.meilisearch.com/docs/learn/security/multitenancy_tenant_tokens?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=demos) CRM application.
See the list of all our example apps in our [demos repository](https://github.com/meilisearch/demos).
## ✨ Features
- **Search-as-you-type:** find search results in less than 50 milliseconds
- **[Typo tolerance](https://www.meilisearch.com/docs/learn/getting_started/customizing_relevancy?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features#typo-tolerance):** get relevant matches even when queries contain typos and misspellings
- **[Filtering](https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features) and [faceted search](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** enhance your user's search experience with custom filters and build a faceted search interface in a few lines of code
- **Hybrid search:** Combine the best of both [semantic](https://www.meilisearch.com/docs/learn/experimental/vector_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features) & full-text search to get the most relevant results
- **Search-as-you-type:** Find & display results in less than 50 milliseconds to provide an intuitive experience
- **[Typo tolerance](https://www.meilisearch.com/docs/learn/relevancy/typo_tolerance_settings?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** get relevant matches even when queries contain typos and misspellings
- **[Filtering](https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features) and [faceted search](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** enhance your users' search experience with custom filters and build a faceted search interface in a few lines of code
- **[Sorting](https://www.meilisearch.com/docs/learn/fine_tuning_results/sorting?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** sort results based on price, date, or pretty much anything else your users need
- **[Synonym support](https://www.meilisearch.com/docs/learn/getting_started/customizing_relevancy?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features#synonyms):** configure synonyms to include more relevant content in your search results
- **[Synonym support](https://www.meilisearch.com/docs/learn/relevancy/synonyms?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** configure synonyms to include more relevant content in your search results
- **[Geosearch](https://www.meilisearch.com/docs/learn/fine_tuning_results/geosearch?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** filter and sort documents based on geographic data
- **[Extensive language support](https://www.meilisearch.com/docs/learn/what_is_meilisearch/language?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** search datasets in any language, with optimized support for Chinese, Japanese, Hebrew, and languages using the Latin alphabet
- **[Security management](https://www.meilisearch.com/docs/learn/security/master_api_keys?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** control which users can access what data with API keys that allow fine-grained permissions handling
- **[Multi-Tenancy](https://www.meilisearch.com/docs/learn/security/tenant_tokens?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** personalize search results for any number of application tenants
- **[Multi-Tenancy](https://www.meilisearch.com/docs/learn/security/multitenancy_tenant_tokens?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** personalize search results for any number of application tenants
- **Highly Customizable:** customize Meilisearch to your specific needs or use our out-of-the-box and hassle-free presets
- **[RESTful API](https://www.meilisearch.com/docs/reference/api/overview?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** integrate Meilisearch in your technical stack with our plugins and SDKs
- **Easy to install, deploy, and maintain**
## 📖 Documentation
You can consult Meilisearch's documentation at [https://www.meilisearch.com/docs](https://www.meilisearch.com/docs/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=docs).
You can consult Meilisearch's documentation at [meilisearch.com/docs](https://www.meilisearch.com/docs/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=docs).
## 🚀 Getting started
For basic instructions on how to set up Meilisearch, add documents to an index, and search for documents, take a look at our [Quick Start](https://www.meilisearch.com/docs/learn/getting_started/quick_start?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=get-started) guide.
For basic instructions on how to set up Meilisearch, add documents to an index, and search for documents, take a look at our [documentation](https://www.meilisearch.com/docs?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=get-started) guide.
You may also want to check out [Meilisearch 101](https://www.meilisearch.com/docs/learn/getting_started/filtering_and_sorting?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=get-started) for an introduction to some of Meilisearch's most popular features.
## 🌍 Supercharge your Meilisearch experience
## ⚡ Supercharge your Meilisearch experience
Say goodbye to server deployment and manual updates with [Meilisearch Cloud](https://www.meilisearch.com/cloud?utm_campaign=oss&utm_source=github&utm_medium=meilisearch). No credit card required.
Say goodbye to server deployment and manual updates with [Meilisearch Cloud](https://www.meilisearch.com/cloud?utm_campaign=oss&utm_source=github&utm_medium=meilisearch). Additional features include analytics & monitoring in many regions around the world. No credit card is required.
## 🧰 SDKs & integration tools
@ -85,15 +87,15 @@ Finally, for more in-depth information, refer to our articles explaining fundame
## 📊 Telemetry
Meilisearch collects **anonymized** data from users to help us improve our product. You can [deactivate this](https://www.meilisearch.com/docs/learn/what_is_meilisearch/telemetry?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=telemetry#how-to-disable-data-collection) whenever you want.
Meilisearch collects **anonymized** user data to help us improve our product. You can [deactivate this](https://www.meilisearch.com/docs/learn/what_is_meilisearch/telemetry?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=telemetry#how-to-disable-data-collection) whenever you want.
To request deletion of collected data, please write to us at [privacy@meilisearch.com](mailto:privacy@meilisearch.com). Don't forget to include your `Instance UID` in the message, as this helps us quickly find and delete your data.
To request deletion of collected data, please write to us at [privacy@meilisearch.com](mailto:privacy@meilisearch.com). Remember to include your `Instance UID` in the message, as this helps us quickly find and delete your data.
If you want to know more about the kind of data we collect and what we use it for, check the [telemetry section](https://www.meilisearch.com/docs/learn/what_is_meilisearch/telemetry?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=telemetry#how-to-disable-data-collection) of our documentation.
## 📫 Get in touch!
Meilisearch is a search engine created by [Meili](https://www.welcometothejungle.com/en/companies/meilisearch), a software development company based in France and with team members all over the world. Want to know more about us? [Check out our blog!](https://blog.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=contact)
Meilisearch is a search engine created by [Meili](https://www.meilisearch.com/careers), a software development company headquartered in France and with team members all over the world. Want to know more about us? [Check out our blog!](https://blog.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=contact)
🗞 [Subscribe to our newsletter](https://meilisearch.us2.list-manage.com/subscribe?u=27870f7b71c908a8b359599fb&id=79582d828e) if you don't want to miss any updates! We promise we won't clutter your mailbox: we only send one edition every two months.
@ -101,17 +103,17 @@ Meilisearch is a search engine created by [Meili](https://www.welcometothejungle
- For feature requests, please visit our [product repository](https://github.com/meilisearch/product/discussions)
- Found a bug? Open an [issue](https://github.com/meilisearch/meilisearch/issues)!
- Want to be part of our Discord community? [Join us!](https://discord.gg/meilisearch)
- Want to be part of our Discord community? [Join us!](https://discord.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=contact)
Thank you for your support!
## 👩‍💻 Contributing
Meilisearch is, and will always be, open-source! If you want to contribute to the project, please take a look at [our contribution guidelines](CONTRIBUTING.md).
Meilisearch is, and will always be, open-source! If you want to contribute to the project, please look at [our contribution guidelines](CONTRIBUTING.md).
## 📦 Versioning
Meilisearch releases and their associated binaries are available [in this GitHub page](https://github.com/meilisearch/meilisearch/releases).
Meilisearch releases and their associated binaries are available on the project's [releases page](https://github.com/meilisearch/meilisearch/releases).
The binaries are versioned following [SemVer conventions](https://semver.org/). To know more, read our [versioning policy](https://github.com/meilisearch/engine-team/blob/main/resources/versioning-policy.md).

File diff suppressed because it is too large.

Binary image file not shown (98 KiB after change).

View File

@ -11,24 +11,24 @@ edition.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.70"
csv = "1.2.1"
anyhow = "1.0.86"
csv = "1.3.0"
milli = { path = "../milli" }
mimalloc = { version = "0.1.37", default-features = false }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
mimalloc = { version = "0.1.43", default-features = false }
serde_json = { version = "1.0.120", features = ["preserve_order"] }
[dev-dependencies]
criterion = { version = "0.5.1", features = ["html_reports"] }
rand = "0.8.5"
rand_chacha = "0.3.1"
roaring = "0.10.1"
roaring = "0.10.6"
[build-dependencies]
anyhow = "1.0.70"
bytes = "1.4.0"
anyhow = "1.0.86"
bytes = "1.6.0"
convert_case = "0.6.0"
flate2 = "1.0.25"
reqwest = { version = "0.11.16", features = ["blocking", "rustls-tls"], default-features = false }
flate2 = "1.0.30"
reqwest = { version = "0.12.5", features = ["blocking", "rustls-tls"], default-features = false }
[features]
default = ["milli/all-tokenizations"]

View File

@ -1,5 +1,5 @@
status = [
'Tests on ubuntu-18.04',
'Tests on ubuntu-20.04',
'Tests on macos-12',
'Tests on windows-2022',
'Run Clippy',

18
build-info/Cargo.toml Normal file
View File

@ -0,0 +1,18 @@
[package]
name = "build-info"
version.workspace = true
authors.workspace = true
description.workspace = true
homepage.workspace = true
readme.workspace = true
edition.workspace = true
license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
time = { version = "0.3.36", features = ["parsing"] }
[build-dependencies]
anyhow = "1.0.86"
vergen-git2 = "1.0.0"

29
build-info/build.rs Normal file
View File

@ -0,0 +1,29 @@
fn main() {
if let Err(err) = emit_git_variables() {
println!("cargo:warning=vergen: {}", err);
}
}
fn emit_git_variables() -> anyhow::Result<()> {
println!("cargo::rerun-if-env-changed=MEILI_NO_VERGEN");
let has_vergen =
!matches!(std::env::var_os("MEILI_NO_VERGEN"), Some(x) if x != "false" && x != "0");
anyhow::ensure!(has_vergen, "disabled via `MEILI_NO_VERGEN`");
// Note: any code that needs VERGEN_ environment variables should take care to define them manually in the Dockerfile and pass them
// in the corresponding GitHub workflow (publish_docker.yml).
// This is due to the Dockerfile building the binary outside of the git directory.
let mut builder = vergen_git2::Git2Builder::default();
builder.branch(true);
builder.commit_timestamp(true);
builder.commit_message(true);
builder.describe(true, true, None);
builder.sha(false);
let git2 = builder.build()?;
vergen_git2::Emitter::default().fail_on_error().add_instructions(&git2)?.emit()
}

203
build-info/src/lib.rs Normal file
View File

@ -0,0 +1,203 @@
use time::format_description::well_known::Iso8601;
#[derive(Debug, Clone)]
pub struct BuildInfo {
pub branch: Option<&'static str>,
pub describe: Option<DescribeResult>,
pub commit_sha1: Option<&'static str>,
pub commit_msg: Option<&'static str>,
pub commit_timestamp: Option<time::OffsetDateTime>,
}
impl BuildInfo {
pub fn from_build() -> Self {
let branch: Option<&'static str> = option_env!("VERGEN_GIT_BRANCH");
let describe = DescribeResult::from_build();
let commit_sha1 = option_env!("VERGEN_GIT_SHA");
let commit_msg = option_env!("VERGEN_GIT_COMMIT_MESSAGE");
let commit_timestamp = option_env!("VERGEN_GIT_COMMIT_TIMESTAMP");
let commit_timestamp = commit_timestamp.and_then(|commit_timestamp| {
time::OffsetDateTime::parse(commit_timestamp, &Iso8601::DEFAULT).ok()
});
Self { branch, describe, commit_sha1, commit_msg, commit_timestamp }
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum DescribeResult {
Prototype { name: &'static str },
Release { version: &'static str, major: u64, minor: u64, patch: u64 },
Prerelease { version: &'static str, major: u64, minor: u64, patch: u64, rc: u64 },
NotATag { describe: &'static str },
}
impl DescribeResult {
pub fn new(describe: &'static str) -> Self {
if let Some(name) = prototype_name(describe) {
Self::Prototype { name }
} else if let Some(release) = release_version(describe) {
release
} else if let Some(prerelease) = prerelease_version(describe) {
prerelease
} else {
Self::NotATag { describe }
}
}
pub fn from_build() -> Option<Self> {
let describe: &'static str = option_env!("VERGEN_GIT_DESCRIBE")?;
Some(Self::new(describe))
}
pub fn as_tag(&self) -> Option<&'static str> {
match self {
DescribeResult::Prototype { name } => Some(name),
DescribeResult::Release { version, .. } => Some(version),
DescribeResult::Prerelease { version, .. } => Some(version),
DescribeResult::NotATag { describe: _ } => None,
}
}
pub fn as_prototype(&self) -> Option<&'static str> {
match self {
DescribeResult::Prototype { name } => Some(name),
DescribeResult::Release { .. }
| DescribeResult::Prerelease { .. }
| DescribeResult::NotATag { .. } => None,
}
}
}
/// Parses the input as a prototype name.
///
/// Returns `Some(prototype_name)` if the following conditions are met on this value:
///
/// 1. starts with `prototype-`,
/// 2. ends with `-<some_number>`,
/// 3. does not end with `<some_number>-<some_number>`.
///
/// Otherwise, returns `None`.
fn prototype_name(describe: &'static str) -> Option<&'static str> {
if !describe.starts_with("prototype-") {
return None;
}
let mut rsplit_prototype = describe.rsplit('-');
// last component MUST be a number
rsplit_prototype.next()?.parse::<u64>().ok()?;
// the component before the last SHALL NOT be a number
rsplit_prototype.next()?.parse::<u64>().err()?;
Some(describe)
}
fn release_version(describe: &'static str) -> Option<DescribeResult> {
if !describe.starts_with('v') {
return None;
}
// a full release version doesn't contain a `-`
if describe.contains('-') {
return None;
}
// a full release version parses as vX.Y.Z, with X, Y, Z being numbers.
let mut dots = describe[1..].split('.');
let major: u64 = dots.next()?.parse().ok()?;
let minor: u64 = dots.next()?.parse().ok()?;
let patch: u64 = dots.next()?.parse().ok()?;
if dots.next().is_some() {
return None;
}
Some(DescribeResult::Release { version: describe, major, minor, patch })
}
fn prerelease_version(describe: &'static str) -> Option<DescribeResult> {
// prerelease version is in the shape vM.N.P-rc.C
let mut hyphen = describe.rsplit('-');
let prerelease = hyphen.next()?;
if !prerelease.starts_with("rc.") {
return None;
}
let rc: u64 = prerelease[3..].parse().ok()?;
let release = hyphen.next()?;
let DescribeResult::Release { version: _, major, minor, patch } = release_version(release)?
else {
return None;
};
Some(DescribeResult::Prerelease { version: describe, major, minor, patch, rc })
}
#[cfg(test)]
mod test {
use super::DescribeResult;
fn assert_not_a_tag(describe: &'static str) {
assert_eq!(DescribeResult::NotATag { describe }, DescribeResult::new(describe))
}
fn assert_proto(describe: &'static str) {
assert_eq!(DescribeResult::Prototype { name: describe }, DescribeResult::new(describe))
}
fn assert_release(describe: &'static str, major: u64, minor: u64, patch: u64) {
assert_eq!(
DescribeResult::Release { version: describe, major, minor, patch },
DescribeResult::new(describe)
)
}
fn assert_prerelease(describe: &'static str, major: u64, minor: u64, patch: u64, rc: u64) {
assert_eq!(
DescribeResult::Prerelease { version: describe, major, minor, patch, rc },
DescribeResult::new(describe)
)
}
#[test]
fn not_a_tag() {
assert_not_a_tag("whatever-fuzzy");
assert_not_a_tag("whatever-fuzzy-5-ggg-dirty");
assert_not_a_tag("whatever-fuzzy-120-ggg-dirty");
// technically a tag, but neither a proto nor a version, so not parsed as a tag
assert_not_a_tag("whatever");
// dirty version
assert_not_a_tag("v1.7.0-1-ggga-dirty");
assert_not_a_tag("v1.7.0-rc.1-1-ggga-dirty");
// after version
assert_not_a_tag("v1.7.0-1-ggga");
assert_not_a_tag("v1.7.0-rc.1-1-ggga");
// after proto
assert_not_a_tag("protoype-tag-0-1-ggga");
assert_not_a_tag("protoype-tag-0-1-ggga-dirty");
}
#[test]
fn prototype() {
assert_proto("prototype-tag-0");
assert_proto("prototype-tag-10");
assert_proto("prototype-long-name-tag-10");
}
#[test]
fn release() {
assert_release("v1.7.2", 1, 7, 2);
}
#[test]
fn prerelease() {
assert_prerelease("v1.7.2-rc.3", 1, 7, 2, 3);
}
}
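For orientation, here is a minimal sketch of how the new crate's API might be consumed; the `version_label` helper is hypothetical and only uses the types shown in the listing above:

use build_info::{BuildInfo, DescribeResult};

// Hypothetical helper (illustration only): turn the build metadata into a short label.
fn version_label(info: &BuildInfo) -> String {
    match info.describe {
        Some(DescribeResult::Release { version, .. }) => format!("release {version}"),
        Some(DescribeResult::Prerelease { version, rc, .. }) => {
            format!("{version} (release candidate {rc})")
        }
        Some(DescribeResult::Prototype { name }) => format!("prototype {name}"),
        Some(DescribeResult::NotATag { describe }) => format!("untagged build {describe}"),
        None => format!("commit {}", info.commit_sha1.unwrap_or("unknown")),
    }
}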


@ -11,22 +11,21 @@ readme.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.70"
flate2 = "1.0.25"
http = "0.2.9"
log = "0.4.17"
meilisearch-auth = { path = "../meilisearch-auth" }
anyhow = "1.0.86"
flate2 = "1.0.30"
http = "1.1.0"
meilisearch-types = { path = "../meilisearch-types" }
once_cell = "1.17.1"
regex = "1.7.3"
roaring = { version = "0.10.1", features = ["serde"] }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
tar = "0.4.38"
tempfile = "3.5.0"
thiserror = "1.0.40"
time = { version = "0.3.20", features = ["serde-well-known", "formatting", "parsing", "macros"] }
uuid = { version = "1.3.1", features = ["serde", "v4"] }
once_cell = "1.19.0"
regex = "1.10.5"
roaring = { version = "0.10.6", features = ["serde"] }
serde = { version = "1.0.204", features = ["derive"] }
serde_json = { version = "1.0.120", features = ["preserve_order"] }
tar = "0.4.41"
tempfile = "3.10.1"
thiserror = "1.0.61"
time = { version = "0.3.36", features = ["serde-well-known", "formatting", "parsing", "macros"] }
tracing = "0.1.40"
uuid = { version = "1.10.0", features = ["serde", "v4"] }
[dev-dependencies]
big_s = "1.0.2"


@ -104,6 +104,11 @@ pub enum KindDump {
DocumentDeletionByFilter {
filter: serde_json::Value,
},
DocumentEdition {
filter: Option<serde_json::Value>,
context: Option<serde_json::Map<String, serde_json::Value>>,
function: String,
},
Settings {
settings: Box<meilisearch_types::settings::Settings<Unchecked>>,
is_deletion: bool,
@ -172,6 +177,9 @@ impl From<KindWithContent> for KindDump {
KindWithContent::DocumentDeletionByFilter { filter_expr, .. } => {
KindDump::DocumentDeletionByFilter { filter: filter_expr }
}
KindWithContent::DocumentEdition { filter_expr, context, function, .. } => {
KindDump::DocumentEdition { filter: filter_expr, context, function }
}
KindWithContent::DocumentClear { .. } => KindDump::DocumentClear,
KindWithContent::SettingsUpdate {
new_settings,
@ -256,8 +264,8 @@ pub(crate) mod test {
pub fn create_test_settings() -> Settings<Checked> {
let settings = Settings {
displayed_attributes: Setting::Set(vec![S("race"), S("name")]),
searchable_attributes: Setting::Set(vec![S("name"), S("race")]),
displayed_attributes: Setting::Set(vec![S("race"), S("name")]).into(),
searchable_attributes: Setting::Set(vec![S("name"), S("race")]).into(),
filterable_attributes: Setting::Set(btreeset! { S("race"), S("age") }),
sortable_attributes: Setting::Set(btreeset! { S("age") }),
ranking_rules: Setting::NotSet,
@ -277,6 +285,8 @@ pub(crate) mod test {
}),
pagination: Setting::NotSet,
embedders: Setting::NotSet,
search_cutoff_ms: Setting::NotSet,
localized_attributes: Setting::NotSet,
_kind: std::marker::PhantomData,
};
settings.check()
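As a rough illustration of the new variant added above (field names come from the diff, values are invented), a document-edition task is now carried in dumps with its filter, optional context, and edition function:

// Illustration only: constructing the new KindDump variant with made-up values.
let document_edition = KindDump::DocumentEdition {
    filter: Some(serde_json::json!("doomed = true")),
    context: None,
    function: String::from("doc.edited = true"),
};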


@ -120,7 +120,7 @@ impl From<v1::settings::Settings> for v2::Settings<v2::Unchecked> {
criterion.as_ref().map(ToString::to_string)
}
Err(()) => {
log::warn!(
tracing::warn!(
"Could not import the following ranking rule: `{}`.",
ranking_rule
);
@ -152,11 +152,11 @@ impl From<v1::update::UpdateStatus> for Option<v2::updates::UpdateStatus> {
use v2::updates::UpdateStatus as UpdateStatusV2;
Some(match source {
UpdateStatusV1::Enqueued { content } => {
log::warn!(
tracing::warn!(
"Cannot import task {} (importing enqueued tasks from v1 dumps is unsupported)",
content.update_id
);
log::warn!("Task will be skipped in the queue of imported tasks.");
tracing::warn!("Task will be skipped in the queue of imported tasks.");
return None;
}
@ -229,7 +229,7 @@ impl From<v1::update::UpdateType> for Option<v2::updates::UpdateMeta> {
Some(match source {
v1::update::UpdateType::ClearAll => v2::updates::UpdateMeta::ClearDocuments,
v1::update::UpdateType::Customs => {
log::warn!("Ignoring task with type 'Customs' that is no longer supported");
tracing::warn!("Ignoring task with type 'Customs' that is no longer supported");
return None;
}
v1::update::UpdateType::DocumentsAddition { .. } => {
@ -296,7 +296,7 @@ impl From<v1::settings::RankingRule> for Option<v2::settings::Criterion> {
v1::settings::RankingRule::Proximity => Some(v2::settings::Criterion::Proximity),
v1::settings::RankingRule::Attribute => Some(v2::settings::Criterion::Attribute),
v1::settings::RankingRule::WordsPosition => {
log::warn!("Removing the 'WordsPosition' ranking rule that is no longer supported, please check the resulting ranking rules of your indexes");
tracing::warn!("Removing the 'WordsPosition' ranking rule that is no longer supported, please check the resulting ranking rules of your indexes");
None
}
v1::settings::RankingRule::Exactness => Some(v2::settings::Criterion::Exactness),
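These compatibility warnings, here and in the following readers, now go through tracing instead of log. A minimal sketch of surfacing them, assuming the widely used tracing-subscriber crate (Meilisearch's real logging setup lives elsewhere and may differ):

// Sketch only: one way to make the tracing::warn! calls above visible.
fn init_warning_logs() {
    tracing_subscriber::fmt().with_max_level(tracing::Level::WARN).init();
}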


@ -1,4 +1,3 @@
use std::convert::TryInto;
use std::str::FromStr;
use time::OffsetDateTime;
@ -146,8 +145,8 @@ impl From<v2::updates::UpdateStatus> for v3::updates::UpdateStatus {
started_processing_at: processing.started_processing_at,
}),
Err(e) => {
log::warn!("Error with task {}: {}", processing.from.update_id, e);
log::warn!("Task will be marked as `Failed`.");
tracing::warn!("Error with task {}: {}", processing.from.update_id, e);
tracing::warn!("Task will be marked as `Failed`.");
v3::updates::UpdateStatus::Failed(v3::updates::Failed {
from: v3::updates::Processing {
from: v3::updates::Enqueued {
@ -172,8 +171,8 @@ impl From<v2::updates::UpdateStatus> for v3::updates::UpdateStatus {
enqueued_at: enqueued.enqueued_at,
}),
Err(e) => {
log::warn!("Error with task {}: {}", enqueued.update_id, e);
log::warn!("Task will be marked as `Failed`.");
tracing::warn!("Error with task {}: {}", enqueued.update_id, e);
tracing::warn!("Task will be marked as `Failed`.");
v3::updates::UpdateStatus::Failed(v3::updates::Failed {
from: v3::updates::Processing {
from: v3::updates::Enqueued {
@ -353,7 +352,7 @@ impl From<String> for v3::Code {
"malformed_payload" => v3::Code::MalformedPayload,
"missing_payload" => v3::Code::MissingPayload,
other => {
log::warn!("Unknown error code {}", other);
tracing::warn!("Unknown error code {}", other);
v3::Code::UnretrievableErrorCode
}
}
@ -426,7 +425,7 @@ pub(crate) mod test {
let mut dump = v2::V2Reader::open(dir).unwrap().to_v3();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();


@ -76,20 +76,20 @@ impl CompatV3ToV4 {
let index_uid = match index_uid {
Some(uid) => uid,
None => {
log::warn!(
tracing::warn!(
"Error while importing the update {}.",
task.update.id()
);
log::warn!(
tracing::warn!(
"The index associated to the uuid `{}` could not be retrieved.",
task.uuid.to_string()
);
if task.update.is_finished() {
// we're messing with its history but not its data, that's ok-ish.
log::warn!("The index-uuid will be set as `unknown`.");
tracing::warn!("The index-uuid will be set as `unknown`.");
String::from("unknown")
} else {
log::warn!("The task will be ignored.");
tracing::warn!("The task will be ignored.");
return None;
}
}
@ -358,7 +358,7 @@ pub(crate) mod test {
let mut dump = v3::V3Reader::open(dir).unwrap().to_v4();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();


@ -305,7 +305,7 @@ impl From<v4::ResponseError> for v5::ResponseError {
"invalid_api_key_expires_at" => v5::Code::InvalidApiKeyExpiresAt,
"invalid_api_key_description" => v5::Code::InvalidApiKeyDescription,
other => {
log::warn!("Unknown error code {}", other);
tracing::warn!("Unknown error code {}", other);
v5::Code::UnretrievableErrorCode
}
};
@ -394,8 +394,8 @@ pub(crate) mod test {
let mut dump = v4::V4Reader::open(dir).unwrap().to_v5();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_display_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();


@ -304,7 +304,7 @@ impl From<v5::ResponseError> for v6::ResponseError {
"immutable_field" => v6::Code::BadRequest,
"api_key_already_exists" => v6::Code::ApiKeyAlreadyExists,
other => {
log::warn!("Unknown error code {}", other);
tracing::warn!("Unknown error code {}", other);
v6::Code::UnretrievableErrorCode
}
};
@ -315,8 +315,8 @@ impl From<v5::ResponseError> for v6::ResponseError {
impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
fn from(settings: v5::Settings<T>) -> Self {
v6::Settings {
displayed_attributes: settings.displayed_attributes.into(),
searchable_attributes: settings.searchable_attributes.into(),
displayed_attributes: v6::Setting::from(settings.displayed_attributes).into(),
searchable_attributes: v6::Setting::from(settings.searchable_attributes).into(),
filterable_attributes: settings.filterable_attributes.into(),
sortable_attributes: settings.sortable_attributes.into(),
ranking_rules: {
@ -329,7 +329,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
new_ranking_rules.push(new_rule);
}
Err(_) => {
log::warn!("Error while importing settings. The ranking rule `{rule}` does not exist anymore.")
tracing::warn!("Error while importing settings. The ranking rule `{rule}` does not exist anymore.")
}
}
}
@ -379,6 +379,8 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
v5::Setting::NotSet => v6::Setting::NotSet,
},
embedders: v6::Setting::NotSet,
localized_attributes: v6::Setting::NotSet,
search_cutoff_ms: v6::Setting::NotSet,
_kind: std::marker::PhantomData,
}
}
@ -441,8 +443,8 @@ pub(crate) mod test {
let mut dump = v5::V5Reader::open(dir).unwrap().to_v6();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
insta::assert_display_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();


@ -197,13 +197,149 @@ pub(crate) mod test {
use super::*;
use crate::reader::v6::RuntimeTogglableFeatures;
#[test]
fn import_dump_v6_with_vectors() {
// dump containing two indexes
//
// "vector", configured with an embedder
// contains:
// - one document with an overridden vector,
// - one document with a natural vector
// - one document with a _vectors map containing one additional embedder name and a natural vector
// - one document with a _vectors map containing one additional embedder name and an overridden vector
//
// "novector", no embedder
// contains:
// - a document without a vector
// - a document with a random _vectors field
let dump = File::open("tests/assets/v6-with-vectors.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2024-05-16 15:51:34.151044 +00:00:00");
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
let (tasks, update_files): (Vec<_>, Vec<_>) = tasks.into_iter().unzip();
meili_snap::snapshot_hash!(meili_snap::json_string!(tasks), @"278f63325ef06ca04d01df98d8207b94");
assert_eq!(update_files.len(), 10);
assert!(update_files[0].is_none()); // the dump creation
assert!(update_files[1].is_none());
assert!(update_files[2].is_none());
assert!(update_files[3].is_none());
assert!(update_files[4].is_none());
assert!(update_files[5].is_none());
assert!(update_files[6].is_none());
assert!(update_files[7].is_none());
assert!(update_files[8].is_none());
assert!(update_files[9].is_none());
// indexes
let mut indexes = dump.indexes().unwrap().collect::<Result<Vec<_>>>().unwrap();
// the indexes are not ordered in any way by default
indexes.sort_by_key(|index| index.metadata().uid.to_string());
let mut vector_index = indexes.pop().unwrap();
let mut novector_index = indexes.pop().unwrap();
assert!(indexes.is_empty());
// vector
insta::assert_json_snapshot!(vector_index.metadata(), @r###"
{
"uid": "vector",
"primaryKey": "id",
"createdAt": "2024-05-16T15:33:17.240962Z",
"updatedAt": "2024-05-16T15:40:55.723052Z"
}
"###);
insta::assert_json_snapshot!(vector_index.settings().unwrap());
{
let documents: Result<Vec<_>> = vector_index.documents().unwrap().collect();
let mut documents = documents.unwrap();
assert_eq!(documents.len(), 4);
documents.sort_by_key(|doc| doc.get("id").unwrap().to_string());
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document);
}
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document);
}
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document);
}
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document);
}
}
// novector
insta::assert_json_snapshot!(novector_index.metadata(), @r###"
{
"uid": "novector",
"primaryKey": "id",
"createdAt": "2024-05-16T15:33:03.568055Z",
"updatedAt": "2024-05-16T15:33:07.530217Z"
}
"###);
insta::assert_json_snapshot!(novector_index.settings().unwrap().embedders, @"null");
{
let documents: Result<Vec<_>> = novector_index.documents().unwrap().collect();
let mut documents = documents.unwrap();
assert_eq!(documents.len(), 2);
documents.sort_by_key(|doc| doc.get("id").unwrap().to_string());
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document, @r###"
{
"id": "e1",
"other": "random1",
"_vectors": "toto"
}
"###);
}
{
let document = documents.pop().unwrap();
insta::assert_json_snapshot!(document, @r###"
{
"id": "e0",
"other": "random0"
}
"###);
}
}
assert_eq!(
dump.features().unwrap().unwrap(),
RuntimeTogglableFeatures { vector_store: true, ..Default::default() }
);
}
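// Editor's sketch, not part of this file: the two `_vectors` shapes exercised by the test
// above, with invented values. An "overridden" vector is a plain float array, while a
// "natural" (generated) vector carries its embeddings plus a `regenerate` flag, as the
// snapshot files further below show in full.
#[allow(dead_code)]
fn vectors_payload_shapes() -> (serde_json::Value, serde_json::Value) {
    let overridden = serde_json::json!({ "default": [0.1, 0.1, 0.1] });
    let natural = serde_json::json!({
        "default": { "embeddings": [[0.1, 0.2, 0.3]], "regenerate": true }
    });
    (overridden, natural)
}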
#[test]
fn import_dump_v6_experimental() {
let dump = File::open("tests/assets/v6-with-experimental.dump").unwrap();
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2023-07-06 7:10:27.21958 +00:00:00");
insta::assert_snapshot!(dump.date().unwrap(), @"2023-07-06 7:10:27.21958 +00:00:00");
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @"None");
// tasks
@ -249,8 +385,8 @@ pub(crate) mod test {
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
insta::assert_display_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
@ -329,8 +465,8 @@ pub(crate) mod test {
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_display_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().unwrap().collect::<Result<Vec<_>>>().unwrap();
@ -406,7 +542,7 @@ pub(crate) mod test {
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
assert_eq!(dump.instance_uid().unwrap(), None);
// tasks
@ -499,7 +635,7 @@ pub(crate) mod test {
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
assert_eq!(dump.instance_uid().unwrap(), None);
// tasks
@ -592,7 +728,7 @@ pub(crate) mod test {
let mut dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2023-01-30 16:26:09.247261 +00:00:00");
insta::assert_snapshot!(dump.date().unwrap(), @"2023-01-30 16:26:09.247261 +00:00:00");
assert_eq!(dump.instance_uid().unwrap(), None);
// tasks


@ -0,0 +1,56 @@
---
source: dump/src/reader/mod.rs
expression: vector_index.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
],
"stopWords": [],
"nonSeparatorTokens": [],
"separatorTokens": [],
"dictionary": [],
"synonyms": {},
"distinctAttribute": null,
"proximityPrecision": "byWord",
"typoTolerance": {
"enabled": true,
"minWordSizeForTypos": {
"oneTypo": 5,
"twoTypos": 9
},
"disableOnWords": [],
"disableOnAttributes": []
},
"faceting": {
"maxValuesPerFacet": 100,
"sortFacetValuesBy": {
"*": "alpha"
}
},
"pagination": {
"maxTotalHits": 1000
},
"embedders": {
"default": {
"source": "huggingFace",
"model": "BAAI/bge-base-en-v1.5",
"revision": "617ca489d9e86b49b8167676d8220688b99db36e",
"documentTemplate": "{% for field in fields %} {{ field.name }}: {{ field.value }}\n{% endfor %}"
}
},
"searchCutoffMs": null
}
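The documentTemplate above is a Liquid template evaluated once per document, iterating over the document's fields; for a document such as { "id": "e1", "desc": "natural vector" } it would expand to roughly " id: e1" and " desc: natural vector", one field per line, before being sent to the embedder (an illustration of the template's behaviour, not output captured from this dump).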


@ -0,0 +1,783 @@
---
source: dump/src/reader/mod.rs
expression: document
---
{
"id": "e3",
"desc": "overriden vector + map",
"_vectors": {
"default": [
0.2,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1
],
"toto": [
0.1
]
}
}


@ -0,0 +1,786 @@
---
source: dump/src/reader/mod.rs
expression: document
---
{
"id": "e2",
"desc": "natural vector + map",
"_vectors": {
"toto": [],
"default": {
"embeddings": [
[
-0.05189208313822746,
-0.9273212552070618,
0.1443813145160675,
0.0932632014155388,
0.2665371894836426,
0.36266782879829407,
0.6402910947799683,
0.32014018297195435,
0.030915971845388412,
-0.9312191605567932,
-0.3718109726905823,
-0.2700554132461548,
-1.1014580726623535,
0.9154956936836244,
-0.3406888246536255,
1.0077725648880005,
0.6577560901641846,
-0.3955195546150207,
-0.4148270785808563,
0.1855088472366333,
0.5062315464019775,
-0.3632686734199524,
-0.2277890294790268,
0.2560805082321167,
-0.3853609561920166,
-0.1604762226343155,
-0.13947471976280212,
-0.20147813856601715,
-0.4466346800327301,
-0.3761846721172333,
0.1443382054567337,
0.18205296993255615,
0.49359792470932007,
-0.22538000345230105,
-0.4996317625045776,
-0.22734887897968292,
-0.6034309267997742,
-0.7857939600944519,
-0.34923747181892395,
-0.3466345965862274,
0.21176661550998688,
-0.5101462006568909,
-0.3403083384037018,
0.000315118464641273,
0.236465722322464,
-0.10246097296476364,
-1.3013339042663574,
0.3419138789176941,
-0.32963496446609497,
-0.0901619717478752,
-0.5426247119903564,
0.22656650841236117,
-0.44758284091949463,
0.14151698350906372,
-0.1089438870549202,
0.5500766634941101,
-0.670711100101471,
-0.6227269768714905,
0.3894464075565338,
-0.27609574794769287,
0.7028202414512634,
-0.19697771966457367,
0.328511506319046,
0.5063360929489136,
0.4065195322036743,
0.2614171802997589,
-0.30274391174316406,
1.0393824577331543,
-0.7742937207221985,
-0.7874112129211426,
-0.6749666929244995,
0.5190866589546204,
0.004123548045754433,
-0.28312963247299194,
-0.038731709122657776,
-1.0142987966537476,
-0.09519586712121964,
0.8755272626876831,
0.4876938760280609,
0.7811151742935181,
0.85174959897995,
0.11826585978269576,
0.5373436808586121,
0.3649002015590668,
0.19064077734947205,
-0.00287026260048151,
-0.7305403351783752,
-0.015206154435873032,
-0.7899249196052551,
0.19407285749912265,
0.08596625179052353,
-0.28976231813430786,
-0.1525907665491104,
0.3798313438892365,
0.050306469202041626,
-0.5697937607765198,
0.4219021201133728,
0.276252806186676,
0.1559903472661972,
0.10030482709407806,
-0.4043720066547394,
-0.1969818025827408,
0.5739826560020447,
0.2116064727306366,
-1.4620544910430908,
-0.7802462577819824,
-0.24739810824394223,
-0.09791352599859238,
-0.4413802027702331,
0.21549351513385773,
-0.9520436525344848,
-0.08762510865926743,
0.08154498040676117,
-0.6154940724372864,
-1.01079523563385,
0.885427713394165,
0.6967288851737976,
0.27186504006385803,
-0.43194177746772766,
-0.11248451471328735,
0.7576630711555481,
0.4998855590820313,
0.0264343973249197,
0.9872855544090272,
0.5634694695472717,
0.053698331117630005,
0.19410227239131927,
0.3570743501186371,
-0.23670297861099243,
-0.9114483594894408,
0.07884842902421951,
0.7318344116210938,
0.44630110263824463,
0.08745364099740982,
-0.347101628780365,
-0.4314247667789459,
-0.5060274004936218,
0.003706763498485088,
0.44320008158683777,
-0.00788921769708395,
-0.1368623524904251,
-0.17391923069953918,
0.14473655819892883,
0.10927865654230118,
0.6974599361419678,
0.005052129738032818,
-0.016953065991401672,
-0.1256176233291626,
-0.036742497235536575,
0.5591985583305359,
-0.37619709968566895,
0.22429119050502777,
0.5403043031692505,
-0.8603790998458862,
-0.3456307053565979,
0.9292937517166138,
0.5074859261512756,
0.6310645937919617,
-0.3091641068458557,
0.46902573108673096,
0.7891915440559387,
0.4499550759792328,
0.2744995653629303,
0.2712305784225464,
-0.04349074140191078,
-0.3638863265514374,
0.7839881777763367,
0.7352104783058167,
-0.19457511603832245,
-0.5957832932472229,
-0.43704694509506226,
-1.084769368171692,
0.4904985725879669,
0.5385226011276245,
0.1891629993915558,
0.12338479608297348,
0.8315675258636475,
-0.07830192148685455,
1.0916285514831543,
-0.28066861629486084,
-1.3585069179534912,
0.5203898549079895,
0.08678033947944641,
-0.2566044330596924,
0.09484415501356123,
-0.0180208683013916,
1.0264745950698853,
-0.023572135716676712,
0.5864979028701782,
0.7625196576118469,
-0.2543414533138275,
-0.8877770900726318,
0.7611982822418213,
-0.06220436468720436,
0.937336564064026,
0.2704363465309143,
-0.37733694911003113,
0.5076137781143188,
-0.30641937255859375,
0.6252772808074951,
-0.0823579877614975,
-0.03736555948853493,
0.4131673276424408,
-0.6514252424240112,
0.12918265163898468,
-0.4483584463596344,
0.6750786304473877,
-0.37008383870124817,
-0.02324833907186985,
0.38027650117874146,
-0.26374951004981995,
0.4346931278705597,
0.42882832884788513,
-0.48798441886901855,
1.1882442235946655,
0.5132288336753845,
0.5284568667411804,
-0.03538886830210686,
0.29620853066444397,
-1.0683696269989014,
0.25936177372932434,
0.10404160618782043,
-0.25796034932136536,
0.027896970510482788,
-0.09225251525640488,
1.4811025857925415,
0.641173779964447,
-0.13838383555412292,
-0.3437179923057556,
0.5667019486427307,
-0.5400741696357727,
0.31090837717056274,
0.6470608115196228,
-0.3747067153453827,
-0.7364534735679626,
-0.07431528717279434,
0.5173454880714417,
-0.6578747034072876,
0.7107478976249695,
-0.7918999791145325,
-0.0648345872759819,
0.609937846660614,
-0.7329513430595398,
0.9741371870040894,
0.17912346124649048,
-0.02658769302070141,
0.5162150859832764,
-0.3978803157806397,
-0.7833885550498962,
-0.6497276425361633,
-0.3898126780986786,
-0.0952848568558693,
0.2663288116455078,
-0.1604052186012268,
0.373076468706131,
-0.8357769250869751,
-0.05217683315277099,
-0.2680160701274872,
0.8389158248901367,
0.6833611130714417,
-0.6712407469749451,
0.7406917214393616,
-0.44522786140441895,
-0.34645363688468933,
-0.27384576201438904,
-0.9878405928611756,
-0.8166060447692871,
0.06268279999494553,
0.38567957282066345,
-0.3274703919887543,
0.5296315550804138,
-0.11810623109340668,
0.23029841482639313,
0.08616159111261368,
-0.2195747196674347,
0.09430307894945145,
0.4057176411151886,
0.4892159104347229,
-0.1636916548013687,
-0.6071445345878601,
0.41256585717201233,
0.622254490852356,
-0.41223976016044617,
-0.6686707139015198,
-0.7474371790885925,
-0.8509522080421448,
-0.16754287481307983,
-0.9078601002693176,
-0.29653599858283997,
-0.5020652413368225,
0.4692700505256653,
0.01281109917908907,
-0.16071580350399017,
0.03388889133930206,
-0.020511148497462273,
0.5027827024459839,
-0.20729811489582065,
0.48107290267944336,
0.33669769763946533,
-0.5275911688804626,
0.48271527886390686,
0.2738940715789795,
-0.033152539283037186,
-0.13629786670207977,
-0.05965912342071533,
-0.26200807094573975,
0.04002794995903969,
-0.34095603227615356,
-3.986898899078369,
-0.46819332242012024,
-0.422744482755661,
-0.169097900390625,
0.6008929014205933,
0.058016058057546616,
-0.11401277780532836,
-0.3077819049358368,
-0.09595538675785063,
0.6723822355270386,
0.19367831945419312,
0.28304359316825867,
0.1609862744808197,
0.7567598819732666,
0.6889985799789429,
0.06907720118761063,
-0.04188092052936554,
-0.7434936165809631,
0.13321782648563385,
0.8456063270568848,
-0.10364038497209548,
-0.45084846019744873,
-0.4758241474628449,
0.43882066011428833,
-0.6432598829269409,
0.7217311859130859,
-0.24189773201942444,
0.12737572193145752,
-1.1008601188659668,
-0.3305315673351288,
0.14614742994308472,
-0.7819333076477051,
0.5287120342254639,
-0.055538054555654526,
0.1877404749393463,
-0.6907662153244019,
0.5616975426673889,
-0.4611121714115143,
-0.26109233498573303,
-0.12898315489292145,
-0.3724522292613983,
-0.7191406488418579,
-0.4425233602523804,
-0.644108235836029,
0.8424481153488159,
0.17532426118850708,
-0.5121750235557556,
-0.6467239260673523,
-0.0008507720194756985,
0.7866212129592896,
-0.02644744887948036,
-0.005045140627771616,
0.015782782807946205,
0.16334445774555206,
-0.1913367658853531,
-0.13697923719882965,
-0.6684983372688293,
0.18346354365348816,
-0.341105580329895,
0.5427411198616028,
0.3779832422733307,
-0.6778115034103394,
-0.2931850254535675,
-0.8805161714553833,
-0.4212774932384491,
-0.5368952751159668,
-1.3937891721725464,
-1.225494146347046,
0.4276703894138336,
1.1205668449401855,
-0.6005299687385559,
0.15732505917549133,
-0.3914784789085388,
-1.357046604156494,
-0.4707142114639282,
-0.1497287154197693,
-0.25035548210144043,
-0.34328439831733704,
0.39083412289619446,
0.1623048633337021,
-0.9275814294815063,
-0.6430015563964844,
0.2973862886428833,
0.5580436587333679,
-0.6232585310935974,
-0.6611042022705078,
0.4015969038009643,
-1.0232892036437988,
-0.2585645020008087,
-0.5431421399116516,
0.5021264553070068,
-0.48601630330085754,
-0.010242084041237833,
0.5862035155296326,
0.7316920161247253,
0.4036808013916016,
0.4269520044326782,
-0.705938458442688,
0.7747307419776917,
0.10164368897676468,
0.7887958884239197,
-0.9612497091293336,
0.12755516171455383,
0.06812842190265656,
-0.022603651508688927,
0.14722754061222076,
-0.5588505268096924,
-0.20689940452575684,
0.3557641804218292,
-0.6812759637832642,
0.2860803008079529,
-0.38954633474349976,
0.1759403496980667,
-0.5678874850273132,
-0.1692986786365509,
-0.14578519761562347,
0.5711379051208496,
1.0208125114440918,
0.7759483456611633,
-0.372348427772522,
-0.5460885763168335,
0.7190321683883667,
-0.6914990544319153,
0.13365162909030914,
-0.4854792356491089,
0.4054908752441406,
0.4502798914909363,
-0.3041122555732727,
-0.06726965308189392,
-0.05570871382951737,
-0.0455719493329525,
0.4785125255584717,
0.8867972493171692,
0.4107886850833893,
0.6121342182159424,
-0.20477132499217987,
-0.5598517656326294,
-0.6443566679954529,
-0.5905212759971619,
-0.5571200251579285,
0.17573799192905426,
-0.28621870279312134,
0.1685224026441574,
0.09719007462263109,
-0.04223639518022537,
-0.28623101115226746,
-0.1449810117483139,
-0.3789580464363098,
-0.5227636098861694,
-0.049728814512491226,
0.7849089503288269,
0.16792525351047516,
0.9849340915679932,
-0.6559549570083618,
0.35723909735679626,
-0.6822739243507385,
1.2873116731643677,
0.19993330538272855,
0.03512010723352432,
-0.6972134113311768,
0.18453484773635864,
-0.2437680810689926,
0.2156416028738022,
0.5230382680892944,
0.22020135819911957,
0.8314080238342285,
0.15627102553844452,
-0.7330264449119568,
0.3888184726238251,
-0.22034703195095065,
0.5457669496536255,
-0.48084837198257446,
-0.45576658844947815,
-0.09287727624177931,
-0.06968110054731369,
0.35125672817230225,
-0.4278119504451752,
0.2038476765155792,
0.11392722278833388,
0.9433983564376832,
-0.4097744226455689,
0.035297419875860214,
-0.4274404048919678,
-0.25100165605545044,
1.0943366289138794,
-0.07634022831916809,
-0.2925529479980469,
-0.7512530088424683,
0.2649727463722229,
-0.4078235328197479,
-0.3372223973274231,
0.05190162733197212,
0.005654910113662481,
-0.0001571219472680241,
-0.35445958375930786,
-0.7837416529655457,
0.1500556766986847,
0.4383024573326111,
0.6099548935890198,
0.05951934307813645,
-0.21325334906578064,
0.0199207104742527,
-0.22704418003559113,
-0.6481077671051025,
0.37442275881767273,
-1.015955924987793,
0.38637226819992065,
-0.06489371508359909,
-0.494120329618454,
0.3469836115837097,
0.15402406454086304,
-0.7660972476005554,
-0.7053225040435791,
-0.25964751839637756,
0.014004424214363098,
-0.2860170006752014,
-0.17565494775772095,
-0.45117494463920593,
-0.0031954257283359766,
0.09676837921142578,
-0.514464259147644,
0.41698193550109863,
-0.21642713248729703,
-0.5398141145706177,
-0.3647628426551819,
0.37005379796028137,
0.239425927400589,
-0.08833975344896317,
0.934946596622467,
-0.48340797424316406,
0.6241437792778015,
-0.7253676652908325,
-0.04303571209311485,
1.1125205755233765,
-0.15692919492721558,
-0.2914651036262512,
-0.5117168426513672,
0.21365483105182648,
0.4924402534961701,
0.5269662141799927,
0.0352792888879776,
-0.149167999625206,
-0.6019760370254517,
0.08245442807674408,
0.4900692105293274,
0.518824577331543,
-0.00005570516441366635,
-0.553304135799408,
0.22217543423175812,
0.5047767758369446,
0.135724738240242,
1.1511540412902832,
-0.3541218340396881,
-0.9712511897087096,
0.8353699445724487,
-0.39227569103240967,
-0.9117669463157654,
-0.26349931955337524,
0.05597023293375969,
0.20695461332798004,
0.3178807199001312,
1.0663238763809204,
0.5062212347984314,
0.7288597822189331,
0.09899299591779707,
0.553720235824585,
0.675009548664093,
-0.20067055523395536,
0.3138423264026642,
-0.6886593103408813,
-0.2910398542881012,
-1.3186300992965698,
-0.4684459865093231,
-0.095743365585804,
-0.1257995069026947,
-0.4858281314373016,
-0.4935407340526581,
-0.3266896903514862,
-0.3928797245025635,
-0.40803104639053345,
-0.9975396394729614,
0.4229583740234375,
0.37309643626213074,
0.4431034922599793,
0.30364808440208435,
-0.3765178918838501,
0.5616499185562134,
0.16904796659946442,
-0.7343707084655762,
0.2560209631919861,
0.6166825294494629,
0.3200829327106476,
-0.4483652710914612,
0.16224201023578644,
-0.31495288014411926,
-0.42713335156440735,
0.7270734906196594,
0.7049484848976135,
-0.0571461021900177,
0.04477125033736229,
-0.6647796034812927,
1.183672308921814,
0.36199676990509033,
0.046881116926670074,
0.4515796303749085,
0.9278061985969543,
0.31471705436706543,
-0.7073333859443665,
-0.3443860113620758,
0.5440067052841187,
-0.15020819008350372,
-0.541202962398529,
0.5203295946121216,
1.2192286252975464,
-0.9983593225479126,
-0.18758884072303772,
0.2758221924304962,
-0.6511523723602295,
-0.1584404855966568,
-0.236241415143013,
0.2692437767982483,
-0.4941152036190033,
0.4987454116344452,
-0.3331359028816223,
0.3163745701313019,
0.745529294013977,
-0.2905873656272888,
0.13602906465530396,
0.4679684340953827,
1.0555986166000366,
1.075700044631958,
0.5368486046791077,
-0.5118206739425659,
0.8668332099914551,
-0.5726966857910156,
-0.7811751961708069,
0.1938626915216446,
-0.1929349899291992,
0.1757766306400299,
0.6384295225143433,
0.26462844014167786,
0.9542630314826964,
0.19313029944896695,
1.264248013496399,
-0.6304428577423096,
0.0487106591463089,
-0.16211535036563873,
-0.7894763350486755,
0.3582514822483063,
-0.04153040423989296,
0.635784387588501,
0.6554391980171204,
-0.47010496258735657,
-0.8302040696144104,
-0.1350124627351761,
0.2568812072277069,
0.13614831864833832,
-0.2563649117946625,
-1.0434694290161133,
0.3232482671737671,
0.47882452607154846,
0.4298652410507202,
1.0563770532608032,
-0.28917592763900757,
-0.8533256649971008,
0.10648339986801147,
0.6376127004623413,
-0.20832888782024384,
0.2370245456695557,
0.0018312990432605147,
-0.2034837007522583,
0.01051164511591196,
-1.105310082435608,
0.29724350571632385,
0.15604574978351593,
0.1973688006401062,
0.44394731521606445,
0.3974513411521912,
-0.13625948131084442,
0.9571986198425292,
0.2257384955883026,
0.2323588728904724,
-0.5583669543266296,
-0.7854922413825989,
0.1647188365459442,
-1.6098142862319946,
0.318587988615036,
-0.13399995863437653,
-0.2172701060771942,
-0.767514705657959,
-0.5813586711883545,
-0.3195130527019501,
-0.04894036799669266,
0.2929930090904236,
-0.8213384747505188,
0.07181350141763687,
0.7469993829727173,
0.6407455801963806,
0.16365697979927063,
0.7870153188705444,
0.6524736881256104,
0.6399973630905151,
-0.04992736503481865,
-0.03959266096353531,
-0.2512352466583252,
0.8448855876922607,
-0.1422702670097351,
0.1216789186000824,
-1.2647287845611572,
0.5931149125099182,
0.7186052203178406,
-0.06118432432413101,
-1.1942816972732544,
-0.17677085101604462,
0.31543800234794617,
-0.32252824306488037,
0.8255583047866821,
-0.14529970288276672,
-0.2695446312427521,
-0.33378756046295166,
-0.1653425395488739,
0.1454019844532013,
-0.3920115828514099,
0.912214994430542,
-0.7279734015464783,
0.7374742031097412,
0.933980405330658,
0.13429680466651917,
-0.514870285987854,
0.3989711999893189,
-0.11613689363002776,
0.4022413492202759,
-0.9990655779838562,
-0.33749932050704956,
-0.4334589838981629,
-1.376373291015625,
-0.2993924915790558,
-0.09454808384180068,
-0.01314175222069025,
-0.001090060803107917,
0.2137461006641388,
0.2938512861728668,
0.17508235573768616,
0.8260607123374939,
-0.7218498587608337,
0.2414487451314926,
-0.47296759486198425,
-0.3002610504627228,
-1.238540768623352,
0.08663805574178696,
0.6805586218833923,
0.5909030437469482,
-0.42807504534721375,
-0.22887496650218964,
0.47537800669670105,
-1.0474627017974854,
0.6338009238243103,
0.06548397243022919,
0.4971011281013489,
1.3484878540039063
]
],
"regenerate": true
}
}
}


@ -0,0 +1,785 @@
---
source: dump/src/reader/mod.rs
expression: document
---
{
"id": "e1",
"desc": "natural vector",
"_vectors": {
"default": {
"embeddings": [
[
-0.2979458272457123,
-0.5288640856742859,
-0.019957859069108963,
-0.18495318293571472,
0.7429973483085632,
0.5238497257232666,
0.432366281747818,
0.32744166254997253,
0.0020762972999364138,
-0.9507834911346436,
-0.35097137093544006,
0.08469701558351517,
-1.4176613092422483,
0.4647577106952667,
-0.69340580701828,
1.0372896194458008,
0.3716741800308227,
0.06031008064746857,
-0.6152024269104004,
0.007914665155112743,
0.7954924702644348,
-0.20773003995418549,
0.09376765787601472,
0.04508133605122566,
-0.2084471583366394,
-0.1518009901046753,
0.018195509910583496,
-0.07044368237257004,
-0.18119366466999057,
-0.4480230510234833,
0.3822529911994934,
0.1911812424659729,
0.4674372375011444,
0.06963984668254852,
-0.09341949224472046,
0.005675444379448891,
-0.6774799227714539,
-0.7066726684570313,
-0.39256376028060913,
0.04005039855837822,
0.2084812968969345,
-0.7872875928878784,
-0.8205880522727966,
0.2919981777667999,
-0.06004738807678223,
-0.4907574355602264,
-1.5937862396240234,
0.24249385297298431,
-0.14709846675395966,
-0.11860740929841997,
-0.8299489617347717,
0.472964346408844,
-0.497518390417099,
-0.22205302119255063,
-0.4196169078350067,
0.32697558403015137,
-0.360930860042572,
-0.9789686799049376,
0.1887447088956833,
-0.403737336397171,
0.18524253368377688,
0.3768732249736786,
0.3666233420372009,
0.3511938452720642,
0.6985810995101929,
0.41721710562705994,
0.09754953533411026,
0.6204307079315186,
-1.0762996673583984,
-0.06263761967420578,
-0.7376511693000793,
0.6849768161773682,
-0.1745152473449707,
-0.40449759364128113,
0.20757411420345304,
-0.8424443006515503,
0.330015629529953,
0.3489064872264862,
1.0954371690750122,
0.8487558960914612,
1.1076823472976685,
0.61430823802948,
0.4155903458595276,
0.4111340939998626,
0.05753209814429283,
-0.06429877132177353,
-0.765606164932251,
-0.41703930497169495,
-0.508820652961731,
0.19859947264194489,
-0.16607828438282013,
-0.28112146258354187,
0.11032675206661224,
0.38809511065483093,
-0.36498191952705383,
-0.48671194911003113,
0.6755134463310242,
0.03958442434668541,
0.4478721618652344,
-0.10335399955511092,
-0.9546685814857484,
-0.6087718605995178,
0.17498846352100372,
0.08320838958024979,
-1.4478336572647097,
-0.605027437210083,
-0.5867993235588074,
-0.14711688458919525,
-0.5447602272033691,
-0.026259321719408035,
-0.6997418403625488,
-0.07349082082509995,
0.10638900846242905,
-0.7133527398109436,
-0.9396815299987792,
1.087092399597168,
1.1885089874267578,
0.4011896848678589,
-0.4089202582836151,
-0.10938972979784012,
0.6726722121238708,
0.24576938152313232,
-0.24247920513153076,
1.1499971151351929,
0.47813335061073303,
-0.05331678315997124,
0.32338133454322815,
0.4870913326740265,
-0.23144258558750153,
-1.2023426294326782,
0.2349330335855484,
1.080536961555481,
0.29334118962287903,
0.391574501991272,
-0.15818795561790466,
-0.2948290705680847,
-0.024689948186278343,
0.06602869182825089,
0.5937030911445618,
-0.047901444137096405,
-0.512734591960907,
-0.35780075192451477,
0.28751692175865173,
0.4298716187477112,
0.9242428541183472,
-0.17208744585514069,
0.11515070497989656,
-0.0335976779460907,
-0.3422986567020416,
0.5344581604003906,
0.19895796477794647,
0.33001241087913513,
0.6390730142593384,
-0.6074934005737305,
-0.2553696632385254,
0.9644920229911804,
0.2699219584465027,
0.6403993368148804,
-0.6380003690719604,
-0.027310986071825027,
0.638815701007843,
0.27719101309776306,
-0.13553589582443237,
0.750195324420929,
0.1224869191646576,
-0.20613941550254825,
0.8444448709487915,
0.16200250387191772,
-0.24750925600528717,
-0.739950954914093,
-0.28443849086761475,
-1.176282525062561,
0.516107976436615,
0.3774825632572174,
0.10906043648719788,
0.07962015271186829,
0.7384604215621948,
-0.051241904497146606,
1.1730090379714966,
-0.4828610122203827,
-1.404372215270996,
0.8811132311820984,
-0.3839482367038727,
0.022516896948218346,
-0.0491158664226532,
-0.43027013540267944,
1.2049334049224854,
-0.27309560775756836,
0.6883630752563477,
0.8264574408531189,
-0.5020735263824463,
-0.4874092042446137,
0.6007202863693237,
-0.4965405762195587,
1.1302915811538696,
0.032572727650403976,
-0.3731859028339386,
0.658271849155426,
-0.9023059010505676,
0.7400162220001221,
0.014550759457051754,
-0.19699542224407196,
0.2319706380367279,
-0.789058268070221,
-0.14905710518360138,
-0.5826214551925659,
0.207652747631073,
-0.4507439732551574,
-0.3163885474205017,
0.3604124188423157,
-0.45119962096214294,
0.3428427278995514,
0.3005594313144684,
-0.36026081442832947,
1.1014249324798584,
0.40884315967559814,
0.34991952776908875,
-0.1806638240814209,
0.27440476417541504,
-0.7118373513221741,
0.4645499587059021,
0.214790478348732,
-0.2343102991580963,
0.10500429570674896,
-0.28034430742263794,
1.2267805337905884,
1.0561333894729614,
-0.497364342212677,
-0.6143305897712708,
0.24963727593421936,
-0.33136463165283203,
-0.01473914459347725,
0.495918869972229,
-0.6985538005828857,
-1.0033197402954102,
0.35937801003456116,
0.6325868368148804,
-0.6808838844299316,
1.0354058742523191,
-0.7214401960372925,
-0.33318862318992615,
0.874398410320282,
-0.6594992280006409,
0.6830640435218811,
-0.18534131348133087,
0.024834271520376205,
0.19901277124881744,
-0.5992477536201477,
-1.2126628160476685,
-0.9245557188987732,
-0.3898217976093292,
-0.1286519467830658,
0.4217943847179413,
-0.1143646091222763,
0.5630772709846497,
-0.5240639448165894,
0.21152715384960177,
-0.3792001008987427,
0.8266305327415466,
1.170984387397766,
-0.8072142004966736,
0.11382893472909927,
-0.17953898012638092,
-0.1789460331201553,
-0.15078622102737427,
-1.2082908153533936,
-0.7812382578849792,
-0.10903695970773696,
0.7303897142410278,
-0.39054441452026367,
0.19511254131793976,
-0.09121843427419662,
0.22400228679180145,
0.30143046379089355,
0.1141919493675232,
0.48112115263938904,
0.7307931780815125,
0.09701362252235413,
-0.2795647978782654,
-0.3997688889503479,
0.5540812611579895,
0.564578115940094,
-0.40065160393714905,
-0.3629159033298493,
-0.3789091110229492,
-0.7298538088798523,
-0.6996853351593018,
-0.4477842152118683,
-0.289089560508728,
-0.6430277824401855,
0.2344944179058075,
0.3742927014827728,
-0.5079357028007507,
0.28841453790664673,
0.06515737622976303,
0.707315981388092,
0.09498685598373412,
0.8365515470504761,
0.10002726316452026,
-0.7695478200912476,
0.6264724135398865,
0.7562043070793152,
-0.23112858831882477,
-0.2871039807796478,
-0.25010058283805847,
0.2783474028110504,
-0.03224996477365494,
-0.9119359850883484,
-3.6940200328826904,
-0.5099936127662659,
-0.1604711413383484,
0.17453284561634064,
0.41759559512138367,
0.1419190913438797,
-0.11362407356500626,
-0.33312007784843445,
0.11511333286762238,
0.4667884409427643,
-0.0031647447030991316,
0.15879854559898376,
0.3042248487472534,
0.5404849052429199,
0.8515422344207764,
0.06286454200744629,
0.43790125846862793,
-0.8682025074958801,
-0.06363756954669952,
0.5547921657562256,
-0.01483887154608965,
-0.07361344993114471,
-0.929947018623352,
0.3502565622329712,
-0.5080993175506592,
1.0380364656448364,
-0.2017953395843506,
0.21319580078125,
-1.0763001441955566,
-0.556368887424469,
0.1949922740459442,
-0.6445739269256592,
0.6791343688964844,
0.21188358962535855,
0.3736183941364288,
-0.21800459921360016,
0.7597446441650391,
-0.3732394874095917,
-0.4710160195827484,
0.025146087631583217,
0.05341297015547752,
-0.9522109627723694,
-0.6000866889953613,
-0.08469046652317047,
0.5966026186943054,
0.3444081246852875,
-0.461188405752182,
-0.5279349088668823,
0.10296865552663804,
0.5175143480300903,
-0.20671147108078003,
0.13392412662506104,
0.4812754988670349,
0.2993808686733246,
-0.3005635440349579,
0.5141698122024536,
-0.6239235401153564,
0.2877119481563568,
-0.4452739953994751,
0.5621107816696167,
0.5047508478164673,
-0.4226335883140564,
-0.18578553199768064,
-1.1967322826385498,
0.28178197145462036,
-0.8692031502723694,
-1.1812998056411743,
-1.4526212215423584,
0.4645712077617645,
0.9327932000160216,
-0.6560136675834656,
0.461549699306488,
-0.5621527433395386,
-1.328449010848999,
-0.08676894754171371,
0.00021918353741057217,
-0.18864136934280396,
0.1259666532278061,
0.18240638077259064,
-0.14919660985469818,
-0.8965857625007629,
-0.7539900541305542,
0.013973715715110302,
0.504276692867279,
-0.704748272895813,
-0.6428424119949341,
0.6303996443748474,
-0.5404738187789917,
-0.31176653504371643,
-0.21262824535369873,
0.18736739456653595,
-0.7998970746994019,
0.039946746081113815,
0.7390344738960266,
0.4283199906349182,
0.3795057237148285,
0.07204607129096985,
-0.9230587482452391,
0.9440426230430604,
0.26272690296173096,
0.5598306655883789,
-1.0520871877670288,
-0.2677186131477356,
-0.1888762265443802,
0.30426350235939026,
0.4746131896972656,
-0.5746733546257019,
-0.4197768568992615,
0.8565112948417664,
-0.6767723560333252,
0.23448683321475983,
-0.2010004222393036,
0.4112907350063324,
-0.6497949957847595,
-0.418667733669281,
-0.4950824975967407,
0.44438859820365906,
1.026281714439392,
0.482397586107254,
-0.26220494508743286,
-0.3640787005424499,
0.5907743573188782,
-0.8771642446517944,
0.09708411991596222,
-0.3671700060367584,
0.4331349730491638,
0.619417667388916,
-0.2684665620326996,
-0.5123821496963501,
-0.1502324342727661,
-0.012190685607492924,
0.3580845892429352,
0.8617186546325684,
0.3493645489215851,
1.0270192623138428,
0.18297909200191495,
-0.5881339311599731,
-0.1733516901731491,
-0.5040576457977295,
-0.340370237827301,
-0.26767754554748535,
-0.28570041060447693,
-0.032928116619586945,
0.6029254794120789,
0.17397655546665192,
0.09346921741962431,
0.27815181016921997,
-0.46699589490890503,
-0.8148876428604126,
-0.3964351713657379,
0.3812595009803772,
0.13547226786613464,
0.7126688361167908,
-0.3473474085330963,
-0.06573959439992905,
-0.6483767032623291,
1.4808889627456665,
0.30924928188323975,
-0.5085946917533875,
-0.8613000512123108,
0.3048902451992035,
-0.4241599142551422,
0.15909206867218018,
0.5764641761779785,
-0.07879110425710678,
1.015336513519287,
0.07599356025457382,
-0.7025855779647827,
0.30047643184661865,
-0.35094937682151794,
0.2522146999835968,
-0.2338722199201584,
-0.8326804637908936,
-0.13695412874221802,
-0.03452421352267265,
0.47974953055381775,
-0.18385636806488037,
0.32438594102859497,
0.1797013282775879,
0.787494957447052,
-0.12579888105392456,
-0.07507286965847015,
-0.4389670491218567,
0.2720070779323578,
0.8138866424560547,
0.01974171027541161,
-0.3057698905467987,
-0.6709924936294556,
0.0885881632566452,
-0.2862754464149475,
0.03475658595561981,
-0.1285519152879715,
0.3838353455066681,
-0.2944154739379883,
-0.4204859137535095,
-0.4416137933731079,
0.13426260650157928,
0.36733248829841614,
0.573428750038147,
-0.14928072690963745,
-0.026076916605234143,
0.33286052942276,
-0.5340145826339722,
-0.17279052734375,
-0.01154550164937973,
-0.6620771884918213,
0.18390542268753052,
-0.08265615254640579,
-0.2489682286977768,
0.2429984211921692,
-0.044153645634651184,
-0.986578404903412,
-0.33574509620666504,
-0.5387663841247559,
0.19767941534519196,
0.12540718913078308,
-0.3403128981590271,
-0.4154576361179352,
0.17275673151016235,
0.09407442808151244,
-0.5414086580276489,
0.4393929839134216,
0.1725579798221588,
-0.4998118281364441,
-0.6926208138465881,
0.16552448272705078,
0.6659538149833679,
-0.10949844866991044,
0.986426830291748,
0.01748848147690296,
0.4003709554672241,
-0.5430638194084167,
0.35347291827201843,
0.6887399554252625,
0.08274628221988678,
0.13407137989997864,
-0.591465950012207,
0.3446292281150818,
0.6069018244743347,
0.1935492902994156,
-0.0989871397614479,
0.07008486241102219,
-0.8503749370574951,
-0.09507356584072112,
0.6259510517120361,
0.13934025168418884,
0.06392545253038406,
-0.4112265408039093,
-0.08475656062364578,
0.4974113404750824,
-0.30606114864349365,
1.111435890197754,
-0.018766529858112335,
-0.8422622680664063,
0.4325508773326874,
-0.2832120656967163,
-0.4859798848628998,
-0.41498348116874695,
0.015977520495653152,
0.5292825698852539,
0.4538311660289765,
1.1328668594360352,
0.22632671892642975,
0.7918671369552612,
0.33401933312416077,
0.7306135296821594,
0.3548600673675537,
0.12506209313869476,
0.8573207855224609,
-0.5818327069282532,
-0.6953738927841187,
-1.6171947717666626,
-0.1699674427509308,
0.6318262815475464,
-0.05671752244234085,
-0.28145185112953186,
-0.3976689279079437,
-0.2041076272726059,
-0.5495951175689697,
-0.5152917504310608,
-0.9309796094894408,
0.101932130753994,
0.1367802917957306,
0.1490798443555832,
0.5304336547851563,
-0.5082434415817261,
0.06688683480024338,
0.14657628536224365,
-0.782435953617096,
0.2962816655635834,
0.6965363621711731,
0.8496337532997131,
-0.3042965829372406,
0.04343798756599426,
0.0330701619386673,
-0.5662598013877869,
1.1086925268173218,
0.756072998046875,
-0.204134538769722,
0.2404300570487976,
-0.47848284244537354,
1.3659011125564575,
0.5645433068275452,
-0.15836156904697418,
0.43395575881004333,
0.5944653749465942,
1.0043466091156006,
-0.49446743726730347,
-0.5954391360282898,
0.5341240763664246,
0.020598189905285835,
-0.4036853015422821,
0.4473709762096405,
1.1998231410980225,
-0.9317775368690492,
-0.23321466147899628,
0.2052552700042725,
-0.7423108816146851,
-0.19917210936546328,
-0.1722569614648819,
-0.034072667360305786,
-0.00671181408688426,
0.46396249532699585,
-0.1372445821762085,
0.053376372903585434,
0.7392690777778625,
-0.38447609543800354,
0.07497968524694443,
0.5197252631187439,
1.3746477365493774,
0.9060075879096984,
0.20000585913658145,
-0.4053704142570496,
0.7497360110282898,
-0.34087055921554565,
-1.101803183555603,
0.273650586605072,
-0.5125769376754761,
0.22472351789474487,
0.480757474899292,
-0.19845178723335263,
0.8857700824737549,
0.30752456188201904,
1.1109285354614258,
-0.6768012642860413,
0.524367094039917,
-0.22495046257972717,
-0.4224412739276886,
0.40753406286239624,
-0.23133376240730288,
0.3297771215438843,
0.4905449151992798,
-0.6813114285469055,
-0.7543983459472656,
-0.5599071383476257,
0.14351597428321838,
-0.029278717935085297,
-0.3970443606376648,
-0.303079217672348,
0.24161772429943085,
0.008353390730917454,
-0.0062365154735744,
1.0824860334396362,
-0.3704061508178711,
-1.0337258577346802,
0.04638749733567238,
1.163011074066162,
-0.31737643480300903,
0.013986887410283089,
0.19223114848136905,
-0.2260770797729492,
-0.210910826921463,
-1.0191949605941772,
0.22356095910072327,
0.09353553503751756,
0.18096882104873657,
0.14867214858531952,
0.43408671021461487,
-0.33312076330184937,
0.8173948526382446,
0.6428242921829224,
0.20215003192424777,
-0.6634518504142761,
-0.4132290482521057,
0.29815030097961426,
-1.579406976699829,
-0.0981958732008934,
-0.03941014781594277,
0.1709178239107132,
-0.5481140613555908,
-0.5338194966316223,
-0.3528362512588501,
-0.11561278253793716,
-0.21793591976165771,
-1.1570470333099363,
0.2157980799674988,
0.42083489894866943,
0.9639263153076172,
0.09747201204299928,
0.15671424567699432,
0.4034591615200043,
0.6728067994117737,
-0.5216875672340393,
0.09657668322324751,
-0.2416689097881317,
0.747975766658783,
0.1021689772605896,
0.11652665585279463,
-1.0484966039657593,
0.8489304780960083,
0.7169828414916992,
-0.09012343734502792,
-1.3173753023147583,
0.057890523225069046,
-0.006231260951608419,
-0.1018214002251625,
0.936040461063385,
-0.0502331368625164,
-0.4284322261810303,
-0.38209280371665955,
-0.22668412327766416,
0.0782942995429039,
-0.4881664514541626,
0.9268959760665894,
0.001867273123934865,
0.42261114716529846,
0.8283362984657288,
0.4256294071674347,
-0.7965338826179504,
0.4840078353881836,
-0.19861412048339844,
0.33977967500686646,
-0.4604192078113556,
-0.3107339143753052,
-0.2839638590812683,
-1.5734281539916992,
0.005220232997089624,
0.09239906817674635,
-0.7828494906425476,
-0.1397123783826828,
0.2576255202293396,
0.21372435986995697,
-0.23169949650764465,
0.4016408920288086,
-0.462497353553772,
-0.2186472862958908,
-0.5617868900299072,
-0.3649831712245941,
-1.1585862636566162,
-0.08222806453704834,
0.931126832962036,
0.4327389597892761,
-0.46451422572135925,
-0.5430706143379211,
-0.27434298396110535,
-0.9479129314422609,
0.1845661848783493,
0.3972720205783844,
0.4883299469947815,
1.04031240940094
]
],
"regenerate": true
}
}
}


@ -0,0 +1,780 @@
---
source: dump/src/reader/mod.rs
expression: document
---
{
"id": "e0",
"desc": "overriden vector",
"_vectors": {
"default": [
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1,
0.1
]
}
}

View File

@ -252,7 +252,7 @@ pub(crate) mod test {
let mut dump = V2Reader::open(dir).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-09 20:27:59.904096267 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
@ -349,7 +349,7 @@ pub(crate) mod test {
let mut dump = V2Reader::open(dir).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2023-01-30 16:26:09.247261 +00:00:00");
insta::assert_snapshot!(dump.date().unwrap(), @"2023-01-30 16:26:09.247261 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();
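The recurring change in these dump and filter-parser tests is a mechanical rename from `insta::assert_display_snapshot!` to `insta::assert_snapshot!`: in the newer insta releases this workspace moves to, `assert_snapshot!` accepts any `Display` value directly and the dedicated macro is being phased out, so the inline snapshots themselves stay untouched. A minimal, illustrative sketch of the new call, reusing a uid string from the snapshots below:

#[test]
fn snapshot_of_display_value() {
    // Any value implementing `Display` can be passed straight to assert_snapshot!,
    // so call sites only need the macro name changed.
    let instance_uid = "9e15e977-f2ae-4761-943f-1eaf75fd736d";
    insta::assert_snapshot!(instance_uid, @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
}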

View File

@ -61,7 +61,7 @@ pub enum IndexDocumentsMethod {
#[cfg_attr(test, derive(serde::Serialize))]
#[non_exhaustive]
pub enum UpdateFormat {
/// The given update is a real **comma seperated** CSV with headers on the first line.
/// The given update is a real **comma separated** CSV with headers on the first line.
Csv,
/// The given update is a JSON array with documents inside.
Json,

View File

@ -1,5 +1,6 @@
use serde::{Deserialize, Serialize};
#[allow(clippy::enum_variant_names)]
#[derive(Serialize, Deserialize, Debug, Clone, Copy)]
pub enum Code {
// index related error

View File

@ -267,7 +267,7 @@ pub(crate) mod test {
let mut dump = V3Reader::open(dir).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-07 11:39:03.709153554 +00:00:00");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();

View File

@ -152,6 +152,7 @@ impl Settings<Unchecked> {
}
#[derive(Debug, Clone, Deserialize)]
#[allow(dead_code)] // otherwise rustc complains that the fields go unused
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]

View File

@ -95,6 +95,7 @@ impl fmt::Display for ErrorType {
}
}
#[allow(clippy::enum_variant_names)]
#[derive(Serialize, Deserialize, Debug, Clone, Copy)]
pub enum Code {
// index related error

View File

@ -262,8 +262,8 @@ pub(crate) mod test {
let mut dump = V4Reader::open(dir).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_display_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-06 12:53:49.131989609 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();

View File

@ -182,6 +182,7 @@ impl Settings<Unchecked> {
}
}
#[allow(dead_code)] // otherwise rustc complains that the fields go unused
#[derive(Debug, Clone, Deserialize)]
#[cfg_attr(test, derive(serde::Serialize))]
#[serde(deny_unknown_fields)]

View File

@ -31,6 +31,7 @@ impl ResponseError {
}
}
#[allow(clippy::enum_variant_names)]
#[derive(Deserialize, Debug, Clone, Copy)]
#[cfg_attr(test, derive(serde::Serialize))]
pub enum Code {

View File

@ -299,8 +299,8 @@ pub(crate) mod test {
let mut dump = V5Reader::open(dir).unwrap();
// top level infos
insta::assert_display_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
insta::assert_display_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
insta::assert_snapshot!(dump.date().unwrap(), @"2022-10-04 15:55:10.344982459 +00:00:00");
insta::assert_snapshot!(dump.instance_uid().unwrap().unwrap(), @"9e15e977-f2ae-4761-943f-1eaf75fd736d");
// tasks
let tasks = dump.tasks().collect::<Result<Vec<_>>>().unwrap();

View File

@ -200,6 +200,7 @@ impl std::ops::Deref for IndexUid {
}
}
#[allow(dead_code)] // otherwise rustc complains that the fields go unused
#[derive(Debug)]
#[cfg_attr(test, derive(serde::Serialize))]
#[cfg_attr(test, serde(rename_all = "camelCase"))]

View File

@ -2,10 +2,10 @@ use std::fs::{self, File};
use std::io::{BufRead, BufReader, ErrorKind};
use std::path::Path;
use log::debug;
pub use meilisearch_types::milli;
use tempfile::TempDir;
use time::OffsetDateTime;
use tracing::debug;
use uuid::Uuid;
use super::Document;

View File

@ -219,7 +219,7 @@ pub(crate) mod test {
fn _create_directory_hierarchy(dir: &Path, depth: usize) -> String {
let mut ret = String::new();
// the entries are not guarenteed to be returned in the same order thus we need to sort them.
// the entries are not guaranteed to be returned in the same order thus we need to sort them.
let mut entries =
fs::read_dir(dir).unwrap().collect::<std::result::Result<Vec<_>, _>>().unwrap();
@ -281,7 +281,7 @@ pub(crate) mod test {
let dump_path = dump.path();
// ==== checking global file hierarchy (we want to be sure there aren't too many files or too few)
insta::assert_display_snapshot!(create_directory_hierarchy(dump_path), @r###"
insta::assert_snapshot!(create_directory_hierarchy(dump_path), @r###"
.
├---- indexes/
│ └---- doggos/

Binary file not shown.

View File

@ -11,9 +11,7 @@ edition.workspace = true
license.workspace = true
[dependencies]
tempfile = "3.5.0"
thiserror = "1.0.40"
uuid = { version = "1.3.1", features = ["serde", "v4"] }
[dev-dependencies]
faux = "0.1.9"
tempfile = "3.10.1"
thiserror = "1.0.61"
tracing = "0.1.40"
uuid = { version = "1.10.0", features = ["serde", "v4"] }

View File

@ -1,5 +1,5 @@
use std::fs::File as StdFile;
use std::ops::{Deref, DerefMut};
use std::io::Write;
use std::path::{Path, PathBuf};
use std::str::FromStr;
@ -22,20 +22,6 @@ pub enum Error {
pub type Result<T> = std::result::Result<T, Error>;
impl Deref for File {
type Target = NamedTempFile;
fn deref(&self) -> &Self::Target {
&self.file
}
}
impl DerefMut for File {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.file
}
}
#[derive(Clone, Debug)]
pub struct FileStore {
path: PathBuf,
@ -56,7 +42,7 @@ impl FileStore {
let file = NamedTempFile::new_in(&self.path)?;
let uuid = Uuid::new_v4();
let path = self.path.join(uuid.to_string());
let update_file = File { file, path };
let update_file = File { file: Some(file), path };
Ok((uuid, update_file))
}
@ -67,7 +53,7 @@ impl FileStore {
let file = NamedTempFile::new_in(&self.path)?;
let uuid = Uuid::from_u128(uuid);
let path = self.path.join(uuid.to_string());
let update_file = File { file, path };
let update_file = File { file: Some(file), path };
Ok((uuid, update_file))
}
@ -75,7 +61,13 @@ impl FileStore {
/// Returns the file corresponding to the requested uuid.
pub fn get_update(&self, uuid: Uuid) -> Result<StdFile> {
let path = self.get_update_path(uuid);
let file = StdFile::open(path)?;
let file = match StdFile::open(path) {
Ok(file) => file,
Err(e) => {
tracing::error!("Can't access update file {uuid}: {e}");
return Err(e.into());
}
};
Ok(file)
}
@ -110,8 +102,12 @@ impl FileStore {
pub fn delete(&self, uuid: Uuid) -> Result<()> {
let path = self.path.join(uuid.to_string());
std::fs::remove_file(path)?;
Ok(())
if let Err(e) = std::fs::remove_file(path) {
tracing::error!("Can't delete file {uuid}: {e}");
Err(e.into())
} else {
Ok(())
}
}
/// List the Uuids of the files in the FileStore
@ -136,16 +132,40 @@ impl FileStore {
pub struct File {
path: PathBuf,
file: NamedTempFile,
file: Option<NamedTempFile>,
}
impl File {
pub fn dry_file() -> Result<Self> {
Ok(Self { path: PathBuf::new(), file: None })
}
pub fn persist(self) -> Result<()> {
self.file.persist(&self.path)?;
if let Some(file) = self.file {
file.persist(&self.path)?;
}
Ok(())
}
}
impl Write for File {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
if let Some(file) = self.file.as_mut() {
file.write(buf)
} else {
Ok(buf.len())
}
}
fn flush(&mut self) -> std::io::Result<()> {
if let Some(file) = self.file.as_mut() {
file.flush()
} else {
Ok(())
}
}
}
#[cfg(test)]
mod test {
use std::io::Write;
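The change above drops the `Deref`/`DerefMut` passthrough and makes the inner temp file optional, so a dry-run `File` can accept writes without keeping anything on disk. A minimal sketch of that pattern, using an illustrative `MaybeFile` type rather than the crate's actual `File` struct:

use std::io::Write;
use tempfile::NamedTempFile;

// Illustrative stand-in for the file-store `File`: writes either land in a
// temp file or are silently discarded in the "dry" case.
struct MaybeFile {
    file: Option<NamedTempFile>,
}

impl MaybeFile {
    fn dry() -> Self {
        MaybeFile { file: None }
    }
}

impl Write for MaybeFile {
    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        match self.file.as_mut() {
            Some(file) => file.write(buf),
            // Report the whole buffer as written so callers behave identically.
            None => Ok(buf.len()),
        }
    }

    fn flush(&mut self) -> std::io::Result<()> {
        match self.file.as_mut() {
            Some(file) => file.flush(),
            None => Ok(()),
        }
    }
}

fn main() -> std::io::Result<()> {
    let mut file = MaybeFile::dry();
    file.write_all(b"discarded")?; // succeeds, nothing is stored
    Ok(())
}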

View File

@ -13,8 +13,8 @@ license.workspace = true
[dependencies]
nom = "7.1.3"
nom_locate = "4.1.0"
unescaper = "0.1.2"
nom_locate = "4.2.0"
unescaper = "0.1.5"
[dev-dependencies]
insta = "1.29.0"
insta = "1.39.0"

View File

@ -26,6 +26,8 @@ pub enum Condition<'a> {
LowerThan(Token<'a>),
LowerThanOrEqual(Token<'a>),
Between { from: Token<'a>, to: Token<'a> },
Contains { keyword: Token<'a>, word: Token<'a> },
StartsWith { keyword: Token<'a>, word: Token<'a> },
}
/// condition = value ("==" | ">" ...) value
@ -92,6 +94,62 @@ pub fn parse_not_exists(input: Span) -> IResult<FilterCondition> {
Ok((input, FilterCondition::Not(Box::new(FilterCondition::Condition { fid: key, op: Exists }))))
}
/// contains = value "CONTAINS" value
pub fn parse_contains(input: Span) -> IResult<FilterCondition> {
let (input, (fid, contains, value)) =
tuple((parse_value, tag("CONTAINS"), cut(parse_value)))(input)?;
Ok((
input,
FilterCondition::Condition {
fid,
op: Contains { keyword: Token { span: contains, value: None }, word: value },
},
))
}
/// contains = value "NOT" WS+ "CONTAINS" value
pub fn parse_not_contains(input: Span) -> IResult<FilterCondition> {
let keyword = tuple((tag("NOT"), multispace1, tag("CONTAINS")));
let (input, (fid, (_not, _spaces, contains), value)) =
tuple((parse_value, keyword, cut(parse_value)))(input)?;
Ok((
input,
FilterCondition::Not(Box::new(FilterCondition::Condition {
fid,
op: Contains { keyword: Token { span: contains, value: None }, word: value },
})),
))
}
/// starts with = value "CONTAINS" value
pub fn parse_starts_with(input: Span) -> IResult<FilterCondition> {
let (input, (fid, starts_with, value)) =
tuple((parse_value, tag("STARTS WITH"), cut(parse_value)))(input)?;
Ok((
input,
FilterCondition::Condition {
fid,
op: StartsWith { keyword: Token { span: starts_with, value: None }, word: value },
},
))
}
/// starts with = value "NOT" WS+ "CONTAINS" value
pub fn parse_not_starts_with(input: Span) -> IResult<FilterCondition> {
let keyword = tuple((tag("NOT"), multispace1, tag("STARTS WITH")));
let (input, (fid, (_not, _spaces, starts_with), value)) =
tuple((parse_value, keyword, cut(parse_value)))(input)?;
Ok((
input,
FilterCondition::Not(Box::new(FilterCondition::Condition {
fid,
op: StartsWith { keyword: Token { span: starts_with, value: None }, word: value },
})),
))
}
/// to = value value "TO" WS+ value
pub fn parse_to(input: Span) -> IResult<FilterCondition> {
let (input, (key, from, _, _, to)) =
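With `parse_contains`, `parse_not_contains`, `parse_starts_with` and `parse_not_starts_with` in place, the two new operators parse like any other condition, and `NOT` wraps them the usual way. A hedged sketch of how they read through the public entry point; the crate path `filter_parser` is assumed from the workspace layout, and the commented `Display` output mirrors the test snapshots further down:

use filter_parser::FilterCondition;

fn main() {
    // `parse` yields an Option (None for an empty filter), hence the double unwrap.
    let contains = FilterCondition::parse("title CONTAINS 'spider'").unwrap().unwrap();
    let starts = FilterCondition::parse("title NOT STARTS WITH 'the'").unwrap().unwrap();

    println!("{contains}"); // {title} CONTAINS {spider}
    println!("{starts}");   // NOT ({title} STARTS WITH {the})
}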

View File

@ -146,7 +146,7 @@ impl<'a> Display for Error<'a> {
}
ErrorKind::InvalidPrimary => {
let text = if input.trim().is_empty() { "but instead got nothing.".to_string() } else { format!("at `{}`.", escaped_input) };
writeln!(f, "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` {}", text)?
writeln!(f, "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` {}", text)?
}
ErrorKind::InvalidEscapedNumber => {
writeln!(f, "Found an invalid escaped sequence number: `{}`.", escaped_input)?

View File

@ -48,8 +48,8 @@ use std::fmt::Debug;
pub use condition::{parse_condition, parse_to, Condition};
use condition::{
parse_exists, parse_is_empty, parse_is_not_empty, parse_is_not_null, parse_is_null,
parse_not_exists,
parse_contains, parse_exists, parse_is_empty, parse_is_not_empty, parse_is_not_null,
parse_is_null, parse_not_contains, parse_not_exists, parse_not_starts_with, parse_starts_with,
};
use error::{cut_with_err, ExpectedValueKind, NomErrorExt};
pub use error::{Error, ErrorKind};
@ -147,7 +147,38 @@ pub enum FilterCondition<'a> {
GeoBoundingBox { top_right_point: [Token<'a>; 2], bottom_left_point: [Token<'a>; 2] },
}
pub enum TraversedElement<'a> {
FilterCondition(&'a FilterCondition<'a>),
Condition(&'a Condition<'a>),
}
impl<'a> FilterCondition<'a> {
pub fn use_contains_operator(&self) -> Option<&Token> {
match self {
FilterCondition::Condition { fid: _, op } => match op {
Condition::GreaterThan(_)
| Condition::GreaterThanOrEqual(_)
| Condition::Equal(_)
| Condition::NotEqual(_)
| Condition::Null
| Condition::Empty
| Condition::Exists
| Condition::LowerThan(_)
| Condition::LowerThanOrEqual(_)
| Condition::Between { .. } => None,
Condition::Contains { keyword, word: _ }
| Condition::StartsWith { keyword, word: _ } => Some(keyword),
},
FilterCondition::Not(this) => this.use_contains_operator(),
FilterCondition::Or(seq) | FilterCondition::And(seq) => {
seq.iter().find_map(|filter| filter.use_contains_operator())
}
FilterCondition::GeoLowerThan { .. }
| FilterCondition::GeoBoundingBox { .. }
| FilterCondition::In { .. } => None,
}
}
/// Returns the first token found at the specified depth, `None` if no token at this depth.
pub fn token_at_depth(&self, depth: usize) -> Option<&Token> {
match self {
@ -452,6 +483,10 @@ fn parse_primary(input: Span, depth: usize) -> IResult<FilterCondition> {
parse_exists,
parse_not_exists,
parse_to,
parse_contains,
parse_not_contains,
parse_starts_with,
parse_not_starts_with,
// the next lines are only for error handling and are written at the end to have the least possible performance impact
parse_geo,
parse_geo_distance,
@ -534,6 +569,8 @@ impl<'a> std::fmt::Display for Condition<'a> {
Condition::LowerThan(token) => write!(f, "< {token}"),
Condition::LowerThanOrEqual(token) => write!(f, "<= {token}"),
Condition::Between { from, to } => write!(f, "{from} TO {to}"),
Condition::Contains { word, keyword: _ } => write!(f, "CONTAINS {word}"),
Condition::StartsWith { word, keyword: _ } => write!(f, "STARTS WITH {word}"),
}
}
}
@ -558,127 +595,142 @@ pub mod tests {
unsafe { Span::new_from_raw_offset(offset, lines as u32, value, "") }.into()
}
#[track_caller]
fn p(s: &str) -> impl std::fmt::Display + '_ {
Fc::parse(s).unwrap().unwrap()
}
#[test]
fn parse_escaped() {
insta::assert_display_snapshot!(p(r"title = 'foo\\'"), @r#"{title} = {foo\}"#);
insta::assert_display_snapshot!(p(r"title = 'foo\\\\'"), @r#"{title} = {foo\\}"#);
insta::assert_display_snapshot!(p(r"title = 'foo\\\\\\'"), @r#"{title} = {foo\\\}"#);
insta::assert_display_snapshot!(p(r"title = 'foo\\\\\\\\'"), @r#"{title} = {foo\\\\}"#);
// but it also works with other sequencies
insta::assert_display_snapshot!(p(r#"title = 'foo\x20\n\t\"\'"'"#), @"{title} = {foo \n\t\"\'\"}");
insta::assert_snapshot!(p(r"title = 'foo\\'"), @r#"{title} = {foo\}"#);
insta::assert_snapshot!(p(r"title = 'foo\\\\'"), @r#"{title} = {foo\\}"#);
insta::assert_snapshot!(p(r"title = 'foo\\\\\\'"), @r#"{title} = {foo\\\}"#);
insta::assert_snapshot!(p(r"title = 'foo\\\\\\\\'"), @r#"{title} = {foo\\\\}"#);
// but it also works with other sequences
insta::assert_snapshot!(p(r#"title = 'foo\x20\n\t\"\'"'"#), @"{title} = {foo \n\t\"\'\"}");
}
#[test]
fn parse() {
// Test equal
insta::assert_display_snapshot!(p("channel = Ponce"), @"{channel} = {Ponce}");
insta::assert_display_snapshot!(p("subscribers = 12"), @"{subscribers} = {12}");
insta::assert_display_snapshot!(p("channel = 'Mister Mv'"), @"{channel} = {Mister Mv}");
insta::assert_display_snapshot!(p("channel = \"Mister Mv\""), @"{channel} = {Mister Mv}");
insta::assert_display_snapshot!(p("'dog race' = Borzoi"), @"{dog race} = {Borzoi}");
insta::assert_display_snapshot!(p("\"dog race\" = Chusky"), @"{dog race} = {Chusky}");
insta::assert_display_snapshot!(p("\"dog race\" = \"Bernese Mountain\""), @"{dog race} = {Bernese Mountain}");
insta::assert_display_snapshot!(p("'dog race' = 'Bernese Mountain'"), @"{dog race} = {Bernese Mountain}");
insta::assert_display_snapshot!(p("\"dog race\" = 'Bernese Mountain'"), @"{dog race} = {Bernese Mountain}");
insta::assert_snapshot!(p("channel = Ponce"), @"{channel} = {Ponce}");
insta::assert_snapshot!(p("subscribers = 12"), @"{subscribers} = {12}");
insta::assert_snapshot!(p("channel = 'Mister Mv'"), @"{channel} = {Mister Mv}");
insta::assert_snapshot!(p("channel = \"Mister Mv\""), @"{channel} = {Mister Mv}");
insta::assert_snapshot!(p("'dog race' = Borzoi"), @"{dog race} = {Borzoi}");
insta::assert_snapshot!(p("\"dog race\" = Chusky"), @"{dog race} = {Chusky}");
insta::assert_snapshot!(p("\"dog race\" = \"Bernese Mountain\""), @"{dog race} = {Bernese Mountain}");
insta::assert_snapshot!(p("'dog race' = 'Bernese Mountain'"), @"{dog race} = {Bernese Mountain}");
insta::assert_snapshot!(p("\"dog race\" = 'Bernese Mountain'"), @"{dog race} = {Bernese Mountain}");
// Test IN
insta::assert_display_snapshot!(p("colour IN[]"), @"{colour} IN[]");
insta::assert_display_snapshot!(p("colour IN[green]"), @"{colour} IN[{green}, ]");
insta::assert_display_snapshot!(p("colour IN[green,]"), @"{colour} IN[{green}, ]");
insta::assert_display_snapshot!(p("colour NOT IN[green,blue]"), @"NOT ({colour} IN[{green}, {blue}, ])");
insta::assert_display_snapshot!(p(" colour IN [ green , blue , ]"), @"{colour} IN[{green}, {blue}, ]");
insta::assert_snapshot!(p("colour IN[]"), @"{colour} IN[]");
insta::assert_snapshot!(p("colour IN[green]"), @"{colour} IN[{green}, ]");
insta::assert_snapshot!(p("colour IN[green,]"), @"{colour} IN[{green}, ]");
insta::assert_snapshot!(p("colour NOT IN[green,blue]"), @"NOT ({colour} IN[{green}, {blue}, ])");
insta::assert_snapshot!(p(" colour IN [ green , blue , ]"), @"{colour} IN[{green}, {blue}, ]");
// Test IN + OR/AND/()
insta::assert_display_snapshot!(p(" colour IN [green, blue] AND color = green "), @"AND[{colour} IN[{green}, {blue}, ], {color} = {green}, ]");
insta::assert_display_snapshot!(p("NOT (colour IN [green, blue]) AND color = green "), @"AND[NOT ({colour} IN[{green}, {blue}, ]), {color} = {green}, ]");
insta::assert_display_snapshot!(p("x = 1 OR NOT (colour IN [green, blue] OR color = green) "), @"OR[{x} = {1}, NOT (OR[{colour} IN[{green}, {blue}, ], {color} = {green}, ]), ]");
insta::assert_snapshot!(p(" colour IN [green, blue] AND color = green "), @"AND[{colour} IN[{green}, {blue}, ], {color} = {green}, ]");
insta::assert_snapshot!(p("NOT (colour IN [green, blue]) AND color = green "), @"AND[NOT ({colour} IN[{green}, {blue}, ]), {color} = {green}, ]");
insta::assert_snapshot!(p("x = 1 OR NOT (colour IN [green, blue] OR color = green) "), @"OR[{x} = {1}, NOT (OR[{colour} IN[{green}, {blue}, ], {color} = {green}, ]), ]");
// Test whitespace start/end
insta::assert_display_snapshot!(p(" colour = green "), @"{colour} = {green}");
insta::assert_display_snapshot!(p(" (colour = green OR colour = red) "), @"OR[{colour} = {green}, {colour} = {red}, ]");
insta::assert_display_snapshot!(p(" colour IN [green, blue] AND color = green "), @"AND[{colour} IN[{green}, {blue}, ], {color} = {green}, ]");
insta::assert_display_snapshot!(p(" colour NOT IN [green, blue] "), @"NOT ({colour} IN[{green}, {blue}, ])");
insta::assert_display_snapshot!(p(" colour IN [green, blue] "), @"{colour} IN[{green}, {blue}, ]");
insta::assert_snapshot!(p(" colour = green "), @"{colour} = {green}");
insta::assert_snapshot!(p(" (colour = green OR colour = red) "), @"OR[{colour} = {green}, {colour} = {red}, ]");
insta::assert_snapshot!(p(" colour IN [green, blue] AND color = green "), @"AND[{colour} IN[{green}, {blue}, ], {color} = {green}, ]");
insta::assert_snapshot!(p(" colour NOT IN [green, blue] "), @"NOT ({colour} IN[{green}, {blue}, ])");
insta::assert_snapshot!(p(" colour IN [green, blue] "), @"{colour} IN[{green}, {blue}, ]");
// Test conditions
insta::assert_display_snapshot!(p("channel != ponce"), @"{channel} != {ponce}");
insta::assert_display_snapshot!(p("NOT channel = ponce"), @"NOT ({channel} = {ponce})");
insta::assert_display_snapshot!(p("subscribers < 1000"), @"{subscribers} < {1000}");
insta::assert_display_snapshot!(p("subscribers > 1000"), @"{subscribers} > {1000}");
insta::assert_display_snapshot!(p("subscribers <= 1000"), @"{subscribers} <= {1000}");
insta::assert_display_snapshot!(p("subscribers >= 1000"), @"{subscribers} >= {1000}");
insta::assert_display_snapshot!(p("subscribers <= 1000"), @"{subscribers} <= {1000}");
insta::assert_display_snapshot!(p("subscribers 100 TO 1000"), @"{subscribers} {100} TO {1000}");
insta::assert_snapshot!(p("channel != ponce"), @"{channel} != {ponce}");
insta::assert_snapshot!(p("NOT channel = ponce"), @"NOT ({channel} = {ponce})");
insta::assert_snapshot!(p("subscribers < 1000"), @"{subscribers} < {1000}");
insta::assert_snapshot!(p("subscribers > 1000"), @"{subscribers} > {1000}");
insta::assert_snapshot!(p("subscribers <= 1000"), @"{subscribers} <= {1000}");
insta::assert_snapshot!(p("subscribers >= 1000"), @"{subscribers} >= {1000}");
insta::assert_snapshot!(p("subscribers <= 1000"), @"{subscribers} <= {1000}");
insta::assert_snapshot!(p("subscribers 100 TO 1000"), @"{subscribers} {100} TO {1000}");
// Test NOT
insta::assert_display_snapshot!(p("NOT subscribers < 1000"), @"NOT ({subscribers} < {1000})");
insta::assert_display_snapshot!(p("NOT subscribers 100 TO 1000"), @"NOT ({subscribers} {100} TO {1000})");
insta::assert_snapshot!(p("NOT subscribers < 1000"), @"NOT ({subscribers} < {1000})");
insta::assert_snapshot!(p("NOT subscribers 100 TO 1000"), @"NOT ({subscribers} {100} TO {1000})");
// Test NULL + NOT NULL
insta::assert_display_snapshot!(p("subscribers IS NULL"), @"{subscribers} IS NULL");
insta::assert_display_snapshot!(p("NOT subscribers IS NULL"), @"NOT ({subscribers} IS NULL)");
insta::assert_display_snapshot!(p("subscribers IS NOT NULL"), @"NOT ({subscribers} IS NULL)");
insta::assert_display_snapshot!(p("NOT subscribers IS NOT NULL"), @"{subscribers} IS NULL");
insta::assert_display_snapshot!(p("subscribers IS NOT NULL"), @"NOT ({subscribers} IS NULL)");
insta::assert_snapshot!(p("subscribers IS NULL"), @"{subscribers} IS NULL");
insta::assert_snapshot!(p("NOT subscribers IS NULL"), @"NOT ({subscribers} IS NULL)");
insta::assert_snapshot!(p("subscribers IS NOT NULL"), @"NOT ({subscribers} IS NULL)");
insta::assert_snapshot!(p("NOT subscribers IS NOT NULL"), @"{subscribers} IS NULL");
insta::assert_snapshot!(p("subscribers IS NOT NULL"), @"NOT ({subscribers} IS NULL)");
// Test EMPTY + NOT EMPTY
insta::assert_display_snapshot!(p("subscribers IS EMPTY"), @"{subscribers} IS EMPTY");
insta::assert_display_snapshot!(p("NOT subscribers IS EMPTY"), @"NOT ({subscribers} IS EMPTY)");
insta::assert_display_snapshot!(p("subscribers IS NOT EMPTY"), @"NOT ({subscribers} IS EMPTY)");
insta::assert_display_snapshot!(p("NOT subscribers IS NOT EMPTY"), @"{subscribers} IS EMPTY");
insta::assert_display_snapshot!(p("subscribers IS NOT EMPTY"), @"NOT ({subscribers} IS EMPTY)");
insta::assert_snapshot!(p("subscribers IS EMPTY"), @"{subscribers} IS EMPTY");
insta::assert_snapshot!(p("NOT subscribers IS EMPTY"), @"NOT ({subscribers} IS EMPTY)");
insta::assert_snapshot!(p("subscribers IS NOT EMPTY"), @"NOT ({subscribers} IS EMPTY)");
insta::assert_snapshot!(p("NOT subscribers IS NOT EMPTY"), @"{subscribers} IS EMPTY");
insta::assert_snapshot!(p("subscribers IS NOT EMPTY"), @"NOT ({subscribers} IS EMPTY)");
// Test EXISTS + NOT EXISTS
insta::assert_display_snapshot!(p("subscribers EXISTS"), @"{subscribers} EXISTS");
insta::assert_display_snapshot!(p("NOT subscribers EXISTS"), @"NOT ({subscribers} EXISTS)");
insta::assert_display_snapshot!(p("subscribers NOT EXISTS"), @"NOT ({subscribers} EXISTS)");
insta::assert_display_snapshot!(p("NOT subscribers NOT EXISTS"), @"{subscribers} EXISTS");
insta::assert_display_snapshot!(p("subscribers NOT EXISTS"), @"NOT ({subscribers} EXISTS)");
insta::assert_snapshot!(p("subscribers EXISTS"), @"{subscribers} EXISTS");
insta::assert_snapshot!(p("NOT subscribers EXISTS"), @"NOT ({subscribers} EXISTS)");
insta::assert_snapshot!(p("subscribers NOT EXISTS"), @"NOT ({subscribers} EXISTS)");
insta::assert_snapshot!(p("NOT subscribers NOT EXISTS"), @"{subscribers} EXISTS");
insta::assert_snapshot!(p("subscribers NOT EXISTS"), @"NOT ({subscribers} EXISTS)");
// Test CONTAINS + NOT CONTAINS
insta::assert_snapshot!(p("subscribers CONTAINS 'hello'"), @"{subscribers} CONTAINS {hello}");
insta::assert_snapshot!(p("NOT subscribers CONTAINS 'hello'"), @"NOT ({subscribers} CONTAINS {hello})");
insta::assert_snapshot!(p("subscribers NOT CONTAINS hello"), @"NOT ({subscribers} CONTAINS {hello})");
insta::assert_snapshot!(p("NOT subscribers NOT CONTAINS 'hello'"), @"{subscribers} CONTAINS {hello}");
insta::assert_snapshot!(p("subscribers NOT CONTAINS 'hello'"), @"NOT ({subscribers} CONTAINS {hello})");
// Test STARTS WITH + NOT STARTS WITH
insta::assert_snapshot!(p("subscribers STARTS WITH 'hel'"), @"{subscribers} STARTS WITH {hel}");
insta::assert_snapshot!(p("NOT subscribers STARTS WITH 'hel'"), @"NOT ({subscribers} STARTS WITH {hel})");
insta::assert_snapshot!(p("subscribers NOT STARTS WITH hel"), @"NOT ({subscribers} STARTS WITH {hel})");
insta::assert_snapshot!(p("NOT subscribers NOT STARTS WITH 'hel'"), @"{subscribers} STARTS WITH {hel}");
insta::assert_snapshot!(p("subscribers NOT STARTS WITH 'hel'"), @"NOT ({subscribers} STARTS WITH {hel})");
// Test nested NOT
insta::assert_display_snapshot!(p("NOT NOT NOT NOT x = 5"), @"{x} = {5}");
insta::assert_display_snapshot!(p("NOT NOT (NOT NOT x = 5)"), @"{x} = {5}");
insta::assert_snapshot!(p("NOT NOT NOT NOT x = 5"), @"{x} = {5}");
insta::assert_snapshot!(p("NOT NOT (NOT NOT x = 5)"), @"{x} = {5}");
// Test geo radius
insta::assert_display_snapshot!(p("_geoRadius(12, 13, 14)"), @"_geoRadius({12}, {13}, {14})");
insta::assert_display_snapshot!(p("NOT _geoRadius(12, 13, 14)"), @"NOT (_geoRadius({12}, {13}, {14}))");
insta::assert_display_snapshot!(p("_geoRadius(12,13,14)"), @"_geoRadius({12}, {13}, {14})");
insta::assert_snapshot!(p("_geoRadius(12, 13, 14)"), @"_geoRadius({12}, {13}, {14})");
insta::assert_snapshot!(p("NOT _geoRadius(12, 13, 14)"), @"NOT (_geoRadius({12}, {13}, {14}))");
insta::assert_snapshot!(p("_geoRadius(12,13,14)"), @"_geoRadius({12}, {13}, {14})");
// Test geo bounding box
insta::assert_display_snapshot!(p("_geoBoundingBox([12, 13], [14, 15])"), @"_geoBoundingBox([{12}, {13}], [{14}, {15}])");
insta::assert_display_snapshot!(p("NOT _geoBoundingBox([12, 13], [14, 15])"), @"NOT (_geoBoundingBox([{12}, {13}], [{14}, {15}]))");
insta::assert_display_snapshot!(p("_geoBoundingBox([12,13],[14,15])"), @"_geoBoundingBox([{12}, {13}], [{14}, {15}])");
insta::assert_snapshot!(p("_geoBoundingBox([12, 13], [14, 15])"), @"_geoBoundingBox([{12}, {13}], [{14}, {15}])");
insta::assert_snapshot!(p("NOT _geoBoundingBox([12, 13], [14, 15])"), @"NOT (_geoBoundingBox([{12}, {13}], [{14}, {15}]))");
insta::assert_snapshot!(p("_geoBoundingBox([12,13],[14,15])"), @"_geoBoundingBox([{12}, {13}], [{14}, {15}])");
// Test OR + AND
insta::assert_display_snapshot!(p("channel = ponce AND 'dog race' != 'bernese mountain'"), @"AND[{channel} = {ponce}, {dog race} != {bernese mountain}, ]");
insta::assert_display_snapshot!(p("channel = ponce OR 'dog race' != 'bernese mountain'"), @"OR[{channel} = {ponce}, {dog race} != {bernese mountain}, ]");
insta::assert_display_snapshot!(p("channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000"), @"OR[AND[{channel} = {ponce}, {dog race} != {bernese mountain}, ], {subscribers} > {1000}, ]");
insta::assert_display_snapshot!(
insta::assert_snapshot!(p("channel = ponce AND 'dog race' != 'bernese mountain'"), @"AND[{channel} = {ponce}, {dog race} != {bernese mountain}, ]");
insta::assert_snapshot!(p("channel = ponce OR 'dog race' != 'bernese mountain'"), @"OR[{channel} = {ponce}, {dog race} != {bernese mountain}, ]");
insta::assert_snapshot!(p("channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000"), @"OR[AND[{channel} = {ponce}, {dog race} != {bernese mountain}, ], {subscribers} > {1000}, ]");
insta::assert_snapshot!(
p("channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000 OR colour = red OR colour = blue AND size = 7"),
@"OR[AND[{channel} = {ponce}, {dog race} != {bernese mountain}, ], {subscribers} > {1000}, {colour} = {red}, AND[{colour} = {blue}, {size} = {7}, ], ]"
);
// Test parentheses
insta::assert_display_snapshot!(p("channel = ponce AND ( 'dog race' != 'bernese mountain' OR subscribers > 1000 )"), @"AND[{channel} = {ponce}, OR[{dog race} != {bernese mountain}, {subscribers} > {1000}, ], ]");
insta::assert_display_snapshot!(p("(channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000) AND _geoRadius(12, 13, 14)"), @"AND[OR[AND[{channel} = {ponce}, {dog race} != {bernese mountain}, ], {subscribers} > {1000}, ], _geoRadius({12}, {13}, {14}), ]");
insta::assert_snapshot!(p("channel = ponce AND ( 'dog race' != 'bernese mountain' OR subscribers > 1000 )"), @"AND[{channel} = {ponce}, OR[{dog race} != {bernese mountain}, {subscribers} > {1000}, ], ]");
insta::assert_snapshot!(p("(channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000) AND _geoRadius(12, 13, 14)"), @"AND[OR[AND[{channel} = {ponce}, {dog race} != {bernese mountain}, ], {subscribers} > {1000}, ], _geoRadius({12}, {13}, {14}), ]");
// Test recursion
// This is the most that is allowed
insta::assert_display_snapshot!(
insta::assert_snapshot!(
p("(((((((((((((((((((((((((((((((((((((((((((((((((x = 1)))))))))))))))))))))))))))))))))))))))))))))))))"),
@"{x} = {1}"
);
insta::assert_display_snapshot!(
insta::assert_snapshot!(
p("NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT x = 1"),
@"NOT ({x} = {1})"
);
// Confusing keywords
insta::assert_display_snapshot!(p(r#"NOT "OR" EXISTS AND "EXISTS" NOT EXISTS"#), @"AND[NOT ({OR} EXISTS), NOT ({EXISTS} EXISTS), ]");
insta::assert_snapshot!(p(r#"NOT "OR" EXISTS AND "EXISTS" NOT EXISTS"#), @"AND[NOT ({OR} EXISTS), NOT ({EXISTS} EXISTS), ]");
}
#[test]
@ -689,182 +741,182 @@ pub mod tests {
Fc::parse(s).unwrap_err().to_string()
}
insta::assert_display_snapshot!(p("channel = Ponce = 12"), @r###"
insta::assert_snapshot!(p("channel = Ponce = 12"), @r###"
Found unexpected characters at the end of the filter: `= 12`. You probably forgot an `OR` or an `AND` rule.
17:21 channel = Ponce = 12
"###);
insta::assert_display_snapshot!(p("channel = "), @r###"
insta::assert_snapshot!(p("channel = "), @r###"
Was expecting a value but instead got nothing.
14:14 channel =
"###);
insta::assert_display_snapshot!(p("channel = 🐻"), @r###"
insta::assert_snapshot!(p("channel = 🐻"), @r###"
Was expecting a value but instead got `🐻`.
11:12 channel = 🐻
"###);
insta::assert_display_snapshot!(p("channel = 🐻 AND followers < 100"), @r###"
insta::assert_snapshot!(p("channel = 🐻 AND followers < 100"), @r###"
Was expecting a value but instead got `🐻`.
11:12 channel = 🐻 AND followers < 100
"###);
insta::assert_display_snapshot!(p("'OR'"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `\'OR\'`.
insta::assert_snapshot!(p("'OR'"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `\'OR\'`.
1:5 'OR'
"###);
insta::assert_display_snapshot!(p("OR"), @r###"
insta::assert_snapshot!(p("OR"), @r###"
Was expecting a value but instead got `OR`, which is a reserved keyword. To use `OR` as a field name or a value, surround it by quotes.
1:3 OR
"###);
insta::assert_display_snapshot!(p("channel Ponce"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `channel Ponce`.
insta::assert_snapshot!(p("channel Ponce"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `channel Ponce`.
1:14 channel Ponce
"###);
insta::assert_display_snapshot!(p("channel = Ponce OR"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` but instead got nothing.
insta::assert_snapshot!(p("channel = Ponce OR"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` but instead got nothing.
19:19 channel = Ponce OR
"###);
insta::assert_display_snapshot!(p("_geoRadius"), @r###"
insta::assert_snapshot!(p("_geoRadius"), @r###"
The `_geoRadius` filter expects three arguments: `_geoRadius(latitude, longitude, radius)`.
1:11 _geoRadius
"###);
insta::assert_display_snapshot!(p("_geoRadius = 12"), @r###"
insta::assert_snapshot!(p("_geoRadius = 12"), @r###"
The `_geoRadius` filter expects three arguments: `_geoRadius(latitude, longitude, radius)`.
1:16 _geoRadius = 12
"###);
insta::assert_display_snapshot!(p("_geoBoundingBox"), @r###"
insta::assert_snapshot!(p("_geoBoundingBox"), @r###"
The `_geoBoundingBox` filter expects two pairs of arguments: `_geoBoundingBox([latitude, longitude], [latitude, longitude])`.
1:16 _geoBoundingBox
"###);
insta::assert_display_snapshot!(p("_geoBoundingBox = 12"), @r###"
insta::assert_snapshot!(p("_geoBoundingBox = 12"), @r###"
The `_geoBoundingBox` filter expects two pairs of arguments: `_geoBoundingBox([latitude, longitude], [latitude, longitude])`.
1:21 _geoBoundingBox = 12
"###);
insta::assert_display_snapshot!(p("_geoBoundingBox(1.0, 1.0)"), @r###"
insta::assert_snapshot!(p("_geoBoundingBox(1.0, 1.0)"), @r###"
The `_geoBoundingBox` filter expects two pairs of arguments: `_geoBoundingBox([latitude, longitude], [latitude, longitude])`.
1:26 _geoBoundingBox(1.0, 1.0)
"###);
insta::assert_display_snapshot!(p("_geoPoint(12, 13, 14)"), @r###"
insta::assert_snapshot!(p("_geoPoint(12, 13, 14)"), @r###"
`_geoPoint` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` coordinates.
1:22 _geoPoint(12, 13, 14)
"###);
insta::assert_display_snapshot!(p("position <= _geoPoint(12, 13, 14)"), @r###"
insta::assert_snapshot!(p("position <= _geoPoint(12, 13, 14)"), @r###"
`_geoPoint` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` coordinates.
13:34 position <= _geoPoint(12, 13, 14)
"###);
insta::assert_display_snapshot!(p("_geoDistance(12, 13, 14)"), @r###"
insta::assert_snapshot!(p("_geoDistance(12, 13, 14)"), @r###"
`_geoDistance` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` coordinates.
1:25 _geoDistance(12, 13, 14)
"###);
insta::assert_display_snapshot!(p("position <= _geoDistance(12, 13, 14)"), @r###"
insta::assert_snapshot!(p("position <= _geoDistance(12, 13, 14)"), @r###"
`_geoDistance` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` coordinates.
13:37 position <= _geoDistance(12, 13, 14)
"###);
insta::assert_display_snapshot!(p("_geo(12, 13, 14)"), @r###"
insta::assert_snapshot!(p("_geo(12, 13, 14)"), @r###"
`_geo` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` coordinates.
1:17 _geo(12, 13, 14)
"###);
insta::assert_display_snapshot!(p("position <= _geo(12, 13, 14)"), @r###"
insta::assert_snapshot!(p("position <= _geo(12, 13, 14)"), @r###"
`_geo` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` coordinates.
13:29 position <= _geo(12, 13, 14)
"###);
insta::assert_display_snapshot!(p("position <= _geoRadius(12, 13, 14)"), @r###"
insta::assert_snapshot!(p("position <= _geoRadius(12, 13, 14)"), @r###"
The `_geoRadius` filter is an operation and can't be used as a value.
13:35 position <= _geoRadius(12, 13, 14)
"###);
insta::assert_display_snapshot!(p("channel = 'ponce"), @r###"
insta::assert_snapshot!(p("channel = 'ponce"), @r###"
Expression `\'ponce` is missing the following closing delimiter: `'`.
11:17 channel = 'ponce
"###);
insta::assert_display_snapshot!(p("channel = \"ponce"), @r###"
insta::assert_snapshot!(p("channel = \"ponce"), @r###"
Expression `\"ponce` is missing the following closing delimiter: `"`.
11:17 channel = "ponce
"###);
insta::assert_display_snapshot!(p("channel = mv OR (followers >= 1000"), @r###"
insta::assert_snapshot!(p("channel = mv OR (followers >= 1000"), @r###"
Expression `(followers >= 1000` is missing the following closing delimiter: `)`.
17:35 channel = mv OR (followers >= 1000
"###);
insta::assert_display_snapshot!(p("channel = mv OR followers >= 1000)"), @r###"
insta::assert_snapshot!(p("channel = mv OR followers >= 1000)"), @r###"
Found unexpected characters at the end of the filter: `)`. You probably forgot an `OR` or an `AND` rule.
34:35 channel = mv OR followers >= 1000)
"###);
insta::assert_display_snapshot!(p("colour NOT EXIST"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `colour NOT EXIST`.
insta::assert_snapshot!(p("colour NOT EXIST"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `colour NOT EXIST`.
1:17 colour NOT EXIST
"###);
insta::assert_display_snapshot!(p("subscribers 100 TO1000"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `subscribers 100 TO1000`.
insta::assert_snapshot!(p("subscribers 100 TO1000"), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `subscribers 100 TO1000`.
1:23 subscribers 100 TO1000
"###);
insta::assert_display_snapshot!(p("channel = ponce ORdog != 'bernese mountain'"), @r###"
insta::assert_snapshot!(p("channel = ponce ORdog != 'bernese mountain'"), @r###"
Found unexpected characters at the end of the filter: `ORdog != \'bernese mountain\'`. You probably forgot an `OR` or an `AND` rule.
17:44 channel = ponce ORdog != 'bernese mountain'
"###);
insta::assert_display_snapshot!(p("colour IN blue, green]"), @r###"
insta::assert_snapshot!(p("colour IN blue, green]"), @r###"
Expected `[` after `IN` keyword.
11:23 colour IN blue, green]
"###);
insta::assert_display_snapshot!(p("colour IN [blue, green, 'blue' > 2]"), @r###"
insta::assert_snapshot!(p("colour IN [blue, green, 'blue' > 2]"), @r###"
Expected only comma-separated field names inside `IN[..]` but instead found `> 2]`.
32:36 colour IN [blue, green, 'blue' > 2]
"###);
insta::assert_display_snapshot!(p("colour IN [blue, green, AND]"), @r###"
insta::assert_snapshot!(p("colour IN [blue, green, AND]"), @r###"
Expected only comma-separated field names inside `IN[..]` but instead found `AND]`.
25:29 colour IN [blue, green, AND]
"###);
insta::assert_display_snapshot!(p("colour IN [blue, green"), @r###"
insta::assert_snapshot!(p("colour IN [blue, green"), @r###"
Expected matching `]` after the list of field names given to `IN[`
23:23 colour IN [blue, green
"###);
insta::assert_display_snapshot!(p("colour IN ['blue, green"), @r###"
insta::assert_snapshot!(p("colour IN ['blue, green"), @r###"
Expression `\'blue, green` is missing the following closing delimiter: `'`.
12:24 colour IN ['blue, green
"###);
insta::assert_display_snapshot!(p("x = EXISTS"), @r###"
insta::assert_snapshot!(p("x = EXISTS"), @r###"
Was expecting a value but instead got `EXISTS`, which is a reserved keyword. To use `EXISTS` as a field name or a value, surround it by quotes.
5:11 x = EXISTS
"###);
insta::assert_display_snapshot!(p("AND = 8"), @r###"
insta::assert_snapshot!(p("AND = 8"), @r###"
Was expecting a value but instead got `AND`, which is a reserved keyword. To use `AND` as a field name or a value, surround it by quotes.
1:4 AND = 8
"###);
insta::assert_display_snapshot!(p("((((((((((((((((((((((((((((((((((((((((((((((((((x = 1))))))))))))))))))))))))))))))))))))))))))))))))))"), @r###"
insta::assert_snapshot!(p("((((((((((((((((((((((((((((((((((((((((((((((((((x = 1))))))))))))))))))))))))))))))))))))))))))))))))))"), @r###"
The filter exceeded the maximum depth limit. Try rewriting the filter so that it contains fewer nested conditions.
51:106 ((((((((((((((((((((((((((((((((((((((((((((((((((x = 1))))))))))))))))))))))))))))))))))))))))))))))))))
"###);
insta::assert_display_snapshot!(
insta::assert_snapshot!(
p("NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT NOT x = 1"),
@r###"
The filter exceeded the maximum depth limit. Try rewriting the filter so that it contains fewer nested conditions.
@ -872,41 +924,41 @@ pub mod tests {
"###
);
insta::assert_display_snapshot!(p(r#"NOT OR EXISTS AND EXISTS NOT EXISTS"#), @r###"
insta::assert_snapshot!(p(r#"NOT OR EXISTS AND EXISTS NOT EXISTS"#), @r###"
Was expecting a value but instead got `OR`, which is a reserved keyword. To use `OR` as a field name or a value, surround it by quotes.
5:7 NOT OR EXISTS AND EXISTS NOT EXISTS
"###);
insta::assert_display_snapshot!(p(r#"value NULL"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `value NULL`.
insta::assert_snapshot!(p(r#"value NULL"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value NULL`.
1:11 value NULL
"###);
insta::assert_display_snapshot!(p(r#"value NOT NULL"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `value NOT NULL`.
insta::assert_snapshot!(p(r#"value NOT NULL"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value NOT NULL`.
1:15 value NOT NULL
"###);
insta::assert_display_snapshot!(p(r#"value EMPTY"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `value EMPTY`.
insta::assert_snapshot!(p(r#"value EMPTY"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value EMPTY`.
1:12 value EMPTY
"###);
insta::assert_display_snapshot!(p(r#"value NOT EMPTY"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `value NOT EMPTY`.
insta::assert_snapshot!(p(r#"value NOT EMPTY"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value NOT EMPTY`.
1:16 value NOT EMPTY
"###);
insta::assert_display_snapshot!(p(r#"value IS"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `value IS`.
insta::assert_snapshot!(p(r#"value IS"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value IS`.
1:9 value IS
"###);
insta::assert_display_snapshot!(p(r#"value IS NOT"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `value IS NOT`.
insta::assert_snapshot!(p(r#"value IS NOT"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value IS NOT`.
1:13 value IS NOT
"###);
insta::assert_display_snapshot!(p(r#"value IS EXISTS"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `value IS EXISTS`.
insta::assert_snapshot!(p(r#"value IS EXISTS"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value IS EXISTS`.
1:16 value IS EXISTS
"###);
insta::assert_display_snapshot!(p(r#"value IS NOT EXISTS"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `_geoRadius`, or `_geoBoundingBox` at `value IS NOT EXISTS`.
insta::assert_snapshot!(p(r#"value IS NOT EXISTS"#), @r###"
Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `IN`, `NOT IN`, `TO`, `EXISTS`, `NOT EXISTS`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `CONTAINS`, `NOT CONTAINS`, `STARTS WITH`, `NOT STARTS WITH`, `_geoRadius`, or `_geoBoundingBox` at `value IS NOT EXISTS`.
1:20 value IS NOT EXISTS
"###);
}
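`use_contains_operator` recurses through `Not`, `Or` and `And` nodes and surfaces the first `CONTAINS`/`STARTS WITH` keyword token it finds, giving callers one place to gate the new operators behind a feature check. A hedged sketch of such a gate — the function name, flag and error message are illustrative, not Meilisearch's actual wiring:

use filter_parser::FilterCondition;

fn validate_filter(filter: &str, contains_allowed: bool) -> Result<(), String> {
    let parsed = FilterCondition::parse(filter).map_err(|e| e.to_string())?;
    if let Some(condition) = parsed {
        // The returned Token implements Display, so it can be quoted back to the user.
        if let Some(keyword) = condition.use_contains_operator() {
            if !contains_allowed {
                return Err(format!(
                    "the `{keyword}` operator requires the experimental feature to be enabled"
                ));
            }
        }
    }
    Ok(())
}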

View File

@ -42,7 +42,7 @@ fn quoted_by(quote: char, input: Span) -> IResult<Token> {
)));
}
}
// if it was preceeded by a `\` or if it was anything else we can continue to advance
// if it was preceded by a `\` or if it was anything else we can continue to advance
}
Ok((
@ -211,6 +211,9 @@ fn is_keyword(s: &str) -> bool {
| "IS"
| "NULL"
| "EMPTY"
| "CONTAINS"
| "STARTS"
| "WITH"
| "_geoRadius"
| "_geoBoundingBox"
)
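Adding `CONTAINS`, `STARTS` and `WITH` to `is_keyword` means those words now have to be quoted when used as plain field names or values, mirroring how `OR` and `EXISTS` behave in the tests above. An illustrative check, assuming the same `filter_parser` entry point as before:

use filter_parser::FilterCondition;

fn main() {
    // Unquoted, CONTAINS is read as the operator keyword...
    assert!(FilterCondition::parse("title CONTAINS spider").unwrap().is_some());
    // ...quoted, it is an ordinary field name on the left of a comparison.
    assert!(FilterCondition::parse("\"CONTAINS\" = yes").unwrap().is_some());
}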

View File

@ -11,10 +11,10 @@ edition.workspace = true
license.workspace = true
[dependencies]
arbitrary = { version = "1.3.0", features = ["derive"] }
clap = { version = "4.3.0", features = ["derive"] }
fastrand = "2.0.0"
arbitrary = { version = "1.3.2", features = ["derive"] }
clap = { version = "4.5.9", features = ["derive"] }
fastrand = "2.1.0"
milli = { path = "../milli" }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
tempfile = "3.5.0"
serde = { version = "1.0.204", features = ["derive"] }
serde_json = { version = "1.0.120", features = ["preserve_order"] }
tempfile = "3.10.1"

View File

@ -11,32 +11,38 @@ edition.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.70"
anyhow = "1.0.86"
bincode = "1.3.3"
csv = "1.2.1"
derive_builder = "0.12.0"
csv = "1.3.0"
derive_builder = "0.20.0"
dump = { path = "../dump" }
enum-iterator = "1.4.0"
enum-iterator = "2.1.0"
file-store = { path = "../file-store" }
flate2 = "1.0.28"
log = "0.4.17"
flate2 = "1.0.30"
meilisearch-auth = { path = "../meilisearch-auth" }
meilisearch-types = { path = "../meilisearch-types" }
page_size = "0.5.0"
puffin = { version = "0.16.0", features = ["serialization"] }
roaring = { version = "0.10.1", features = ["serde"] }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
page_size = "0.6.0"
rayon = "1.10.0"
roaring = { version = "0.10.6", features = ["serde"] }
serde = { version = "1.0.204", features = ["derive"] }
serde_json = { version = "1.0.120", features = ["preserve_order"] }
synchronoise = "1.0.1"
tempfile = "3.5.0"
thiserror = "1.0.40"
time = { version = "0.3.20", features = ["serde-well-known", "formatting", "parsing", "macros"] }
ureq = "2.9.1"
uuid = { version = "1.3.1", features = ["serde", "v4"] }
tempfile = "3.10.1"
thiserror = "1.0.61"
time = { version = "0.3.36", features = [
"serde-well-known",
"formatting",
"parsing",
"macros",
] }
tracing = "0.1.40"
ureq = "2.10.0"
uuid = { version = "1.10.0", features = ["serde", "v4"] }
[dev-dependencies]
arroy = "0.5.0"
big_s = "1.0.2"
crossbeam = "0.8.2"
insta = { version = "1.29.0", features = ["json", "redactions"] }
crossbeam = "0.8.4"
insta = { version = "1.39.0", features = ["json", "redactions"] }
maplit = "1.0.2"
meili-snap = { path = "../meili-snap" }
nelson = { git = "https://github.com/meilisearch/nelson.git", rev = "675f13885548fb415ead8fbb447e9e6d9314000a"}

View File

@ -24,8 +24,10 @@ enum AutobatchKind {
allow_index_creation: bool,
primary_key: Option<String>,
},
DocumentDeletion,
DocumentDeletionByFilter,
DocumentEdition,
DocumentDeletion {
by_filter: bool,
},
DocumentClear,
Settings {
allow_index_creation: bool,
@ -63,10 +65,13 @@ impl From<KindWithContent> for AutobatchKind {
primary_key,
..
} => AutobatchKind::DocumentImport { method, allow_index_creation, primary_key },
KindWithContent::DocumentDeletion { .. } => AutobatchKind::DocumentDeletion,
KindWithContent::DocumentEdition { .. } => AutobatchKind::DocumentEdition,
KindWithContent::DocumentDeletion { .. } => {
AutobatchKind::DocumentDeletion { by_filter: false }
}
KindWithContent::DocumentClear { .. } => AutobatchKind::DocumentClear,
KindWithContent::DocumentDeletionByFilter { .. } => {
AutobatchKind::DocumentDeletionByFilter
AutobatchKind::DocumentDeletion { by_filter: true }
}
KindWithContent::SettingsUpdate { allow_index_creation, is_deletion, .. } => {
AutobatchKind::Settings {
@ -98,11 +103,12 @@ pub enum BatchKind {
primary_key: Option<String>,
operation_ids: Vec<TaskId>,
},
DocumentEdition {
id: TaskId,
},
DocumentDeletion {
deletion_ids: Vec<TaskId>,
},
DocumentDeletionByFilter {
id: TaskId,
includes_by_filter: bool,
},
ClearAndSettings {
other: Vec<TaskId>,
@ -199,12 +205,14 @@ impl BatchKind {
}),
allow_index_creation,
),
K::DocumentDeletion => {
(Continue(BatchKind::DocumentDeletion { deletion_ids: vec![task_id] }), false)
}
K::DocumentDeletionByFilter => {
(Break(BatchKind::DocumentDeletionByFilter { id: task_id }), false)
}
K::DocumentEdition => (Break(BatchKind::DocumentEdition { id: task_id }), false),
K::DocumentDeletion { by_filter: includes_by_filter } => (
Continue(BatchKind::DocumentDeletion {
deletion_ids: vec![task_id],
includes_by_filter,
}),
false,
),
K::Settings { allow_index_creation } => (
Continue(BatchKind::Settings { allow_index_creation, settings_ids: vec![task_id] }),
allow_index_creation,
@ -222,7 +230,7 @@ impl BatchKind {
match (self, kind) {
// We don't batch any of these operations
(this, K::IndexCreation | K::IndexUpdate | K::IndexSwap | K::DocumentDeletionByFilter) => Break(this),
(this, K::IndexCreation | K::IndexUpdate | K::IndexSwap | K::DocumentEdition) => Break(this),
// We must not batch tasks that don't have the same index creation rights if the index doesn't already exist.
(this, kind) if !index_already_exists && this.allow_index_creation() == Some(false) && kind.allow_index_creation() == Some(true) => {
Break(this)
@ -258,7 +266,7 @@ impl BatchKind {
// The index deletion can batch with everything but must stop after
(
BatchKind::DocumentClear { mut ids }
| BatchKind::DocumentDeletion { deletion_ids: mut ids }
| BatchKind::DocumentDeletion { deletion_ids: mut ids, includes_by_filter: _ }
| BatchKind::DocumentOperation { method: _, allow_index_creation: _, primary_key: _, operation_ids: mut ids }
| BatchKind::Settings { allow_index_creation: _, settings_ids: mut ids },
K::IndexDeletion,
@ -278,7 +286,7 @@ impl BatchKind {
(
BatchKind::DocumentClear { mut ids },
K::DocumentClear | K::DocumentDeletion,
K::DocumentClear | K::DocumentDeletion { by_filter: _ },
) => {
ids.push(id);
Continue(BatchKind::DocumentClear { ids })
@ -322,7 +330,7 @@ impl BatchKind {
}
(
BatchKind::DocumentOperation { method, allow_index_creation, primary_key, mut operation_ids },
K::DocumentDeletion,
K::DocumentDeletion { by_filter: false },
) => {
operation_ids.push(id);
@ -333,6 +341,13 @@ impl BatchKind {
operation_ids,
})
}
// We can't batch a document operation with a delete by filter
(
this @ BatchKind::DocumentOperation { .. },
K::DocumentDeletion { by_filter: true },
) => {
Break(this)
}
// but we can't autobatch documents if it's not the same kind
// this match branch MUST be AFTER the previous one
(
@ -351,13 +366,18 @@ impl BatchKind {
operation_ids,
}),
(BatchKind::DocumentDeletion { mut deletion_ids }, K::DocumentClear) => {
(BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter: _ }, K::DocumentClear) => {
deletion_ids.push(id);
Continue(BatchKind::DocumentClear { ids: deletion_ids })
}
// we can't autobatch the deletion and import if the document deletion contained a filter
(
this @ BatchKind::DocumentDeletion { deletion_ids: _, includes_by_filter: true },
K::DocumentImport { .. }
) => Break(this),
// we can autobatch the deletion and import if the index already exists
(
BatchKind::DocumentDeletion { mut deletion_ids },
BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter: false },
K::DocumentImport { method, allow_index_creation, primary_key }
) if index_already_exists => {
deletion_ids.push(id);
@ -371,7 +391,7 @@ impl BatchKind {
}
// we can autobatch the deletion and import if both can't create an index
(
BatchKind::DocumentDeletion { mut deletion_ids },
BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter: false },
K::DocumentImport { method, allow_index_creation, primary_key }
) if !allow_index_creation => {
deletion_ids.push(id);
@ -390,9 +410,9 @@ impl BatchKind {
) => {
Break(this)
}
(BatchKind::DocumentDeletion { mut deletion_ids }, K::DocumentDeletion) => {
(BatchKind::DocumentDeletion { mut deletion_ids, includes_by_filter }, K::DocumentDeletion { by_filter }) => {
deletion_ids.push(id);
Continue(BatchKind::DocumentDeletion { deletion_ids })
Continue(BatchKind::DocumentDeletion { deletion_ids, includes_by_filter: includes_by_filter | by_filter })
}
(this @ BatchKind::DocumentDeletion { .. }, K::Settings { .. }) => Break(this),
@ -406,7 +426,7 @@ impl BatchKind {
}),
(
this @ BatchKind::Settings { .. },
K::DocumentImport { .. } | K::DocumentDeletion,
K::DocumentImport { .. } | K::DocumentDeletion { .. },
) => Break(this),
(
BatchKind::Settings { mut settings_ids, allow_index_creation },
@ -437,7 +457,7 @@ impl BatchKind {
settings_ids,
allow_index_creation,
},
K::DocumentDeletion,
K::DocumentDeletion { .. },
) => {
other.push(id);
Continue(BatchKind::ClearAndSettings {
@ -499,7 +519,7 @@ impl BatchKind {
// this MUST be AFTER the two previous branch
(
this @ BatchKind::SettingsAndDocumentOperation { .. },
K::DocumentDeletion | K::DocumentImport { .. },
K::DocumentDeletion { .. } | K::DocumentImport { .. },
) => Break(this),
(
BatchKind::SettingsAndDocumentOperation { mut settings_ids, method, allow_index_creation,primary_key, operation_ids },
@ -519,7 +539,7 @@ impl BatchKind {
| BatchKind::IndexDeletion { .. }
| BatchKind::IndexUpdate { .. }
| BatchKind::IndexSwap { .. }
| BatchKind::DocumentDeletionByFilter { .. },
| BatchKind::DocumentEdition { .. },
_,
) => {
unreachable!()
@ -609,6 +629,13 @@ mod tests {
}
}
fn doc_del_fil() -> KindWithContent {
KindWithContent::DocumentDeletionByFilter {
index_uid: String::from("doggo"),
filter_expr: serde_json::json!("cuteness > 100"),
}
}
fn doc_clr() -> KindWithContent {
KindWithContent::DocumentClear { index_uid: String::from("doggo") }
}
@ -669,10 +696,16 @@ mod tests {
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1, 2] }, false))");
// we can autobatch one or multiple DocumentDeletion together
debug_snapshot!(autobatch_from(true, None, [doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_del(), doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_del(), doc_del(), doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), doc_del(), doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_del(), doc_del(), doc_del()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: false }, false))");
// we can autobatch one or multiple DocumentDeletionByFilter together
debug_snapshot!(autobatch_from(true, None, [doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_del_fil(), doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_del_fil(), doc_del_fil(), doc_del_fil()]), @"Some((DocumentDeletion { deletion_ids: [0, 1, 2], includes_by_filter: true }, false))");
// we can autobatch one or multiple Settings together
debug_snapshot!(autobatch_from(true, None, [settings(true)]), @"Some((Settings { allow_index_creation: true, settings_ids: [0] }, true))");
@ -715,25 +748,63 @@ mod tests {
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, false, Some("catto"))]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0, 1] }, false))"###);
// But we can't autobatch document addition with document deletion by filter
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, true, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: Some("catto"), operation_ids: [0] }, true))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(ReplaceDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
debug_snapshot!(autobatch_from(false, None, [doc_imp(UpdateDocuments, false, Some("catto")), doc_del_fil()]), @r###"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: Some("catto"), operation_ids: [0] }, false))"###);
// And the other way around
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(ReplaceDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del_fil(), doc_imp(UpdateDocuments, false, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
}
#[test]
fn simple_document_operation_dont_autobatch_with_other() {
// addition, updates and deletion can't batch together
// addition, updates and deletion by filter can't batch together
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), doc_del_fil()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_create()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_create()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_create()]), @"Some((DocumentDeletion { deletion_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_create()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_create()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_update()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_update()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_update()]), @"Some((DocumentDeletion { deletion_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_update()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_update()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, true, None), idx_swap()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: true, primary_key: None, operation_ids: [0] }, true))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_swap()]), @"Some((DocumentDeletion { deletion_ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_swap()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_swap()]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: true }, false))");
}
#[test]
@ -800,6 +871,7 @@ mod tests {
debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_del()]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_del_fil()]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [idx_del(), doc_clr()]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [idx_del(), settings(true)]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(true, None, [idx_del(), settings(false)]), @"Some((IndexDeletion { ids: [0] }, false))");
@ -809,6 +881,7 @@ mod tests {
debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(ReplaceDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_imp(UpdateDocuments, false, None)]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_del()]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_del_fil()]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [idx_del(), doc_clr()]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [idx_del(), settings(true)]), @"Some((IndexDeletion { ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [idx_del(), settings(false)]), @"Some((IndexDeletion { ids: [0] }, false))");
@ -820,6 +893,7 @@ mod tests {
debug_snapshot!(autobatch_from(true, None, [doc_imp(ReplaceDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_imp(UpdateDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_del_fil(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [doc_clr(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(true, None, [settings(true), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(true, None, [settings(false), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
@ -829,6 +903,7 @@ mod tests {
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_del(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_del_fil(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_clr(), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
debug_snapshot!(autobatch_from(false,None, [settings(true), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, true))");
debug_snapshot!(autobatch_from(false,None, [settings(false), idx_del()]), @"Some((IndexDeletion { ids: [0, 1] }, false))");
@ -870,7 +945,7 @@ mod tests {
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), settings(false), idx_del()]), @"Some((IndexDeletion { ids: [0, 2, 1] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments,false, None), settings(false), doc_clr(), idx_del()]), @"Some((IndexDeletion { ids: [1, 3, 0, 2] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), settings(false), doc_clr(), idx_del()]), @"Some((IndexDeletion { ids: [1, 3, 0, 2] }, false))");
// The third and final case is when the first task doesn't create an index but is directly followed by a task creating an index. In this case we can't batch whith what
// The third and final case is when the first task doesn't create an index but is directly followed by a task creating an index. In this case we can't batch with what
// follows because we first need to process the erroneous batch.
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments,false, None), settings(true), idx_del()]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false,None, [doc_imp(UpdateDocuments, false, None), settings(true), idx_del()]), @"Some((DocumentOperation { method: UpdateDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
@ -894,10 +969,10 @@ mod tests {
debug_snapshot!(autobatch_from(false,None, [doc_imp(ReplaceDocuments, false, None), settings(true)]), @"Some((DocumentOperation { method: ReplaceDocuments, allow_index_creation: false, primary_key: None, operation_ids: [0] }, false))");
// batch deletion and addition
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0] }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, true, Some("catto"))]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(ReplaceDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
debug_snapshot!(autobatch_from(false, None, [doc_del(), doc_imp(UpdateDocuments, true, None)]), @"Some((DocumentDeletion { deletion_ids: [0], includes_by_filter: false }, false))");
}
#[test]

View File

@ -13,7 +13,7 @@ We can combine the two tasks in a single batch:
1. import documents X and Y
Processing this batch is functionally equivalent to processing the two
tasks individally, but should be much faster since we are only performing
tasks individually, but should be much faster since we are only performing
one indexing operation.
*/
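To make the equivalence described above concrete, here is a small, self-contained sketch (not part of the diff) of the autobatching idea: consecutive tasks of a compatible kind are folded into one batch so that a single indexing operation covers them all. The `ToyBatch` type and `merge_import` helper are invented for illustration and do not mirror the real `BatchKind` API.

// Invented types: a toy model of folding two import tasks into one batch.
#[derive(Debug, PartialEq)]
enum ToyBatch {
    DocumentImport { task_ids: Vec<u32> },
}

fn merge_import(batch: ToyBatch, task_id: u32) -> ToyBatch {
    // A compatible task is appended to the running batch instead of starting a new one.
    let ToyBatch::DocumentImport { mut task_ids } = batch;
    task_ids.push(task_id);
    ToyBatch::DocumentImport { task_ids }
}

fn main() {
    // Two enqueued import tasks (X and Y)...
    let batch = ToyBatch::DocumentImport { task_ids: vec![0] };
    let batch = merge_import(batch, 1);
    // ...end up in the same batch, i.e. one indexing operation instead of two.
    assert_eq!(batch, ToyBatch::DocumentImport { task_ids: vec![0, 1] });
}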
@ -24,7 +24,6 @@ use std::fs::{self, File};
use std::io::BufWriter;
use dump::IndexMetadata;
use log::{debug, error, info, trace};
use meilisearch_types::error::Code;
use meilisearch_types::heed::{RoTxn, RwTxn};
use meilisearch_types::milli::documents::{obkv_to_object, DocumentsBatchReader};
@ -32,7 +31,10 @@ use meilisearch_types::milli::heed::CompactionOption;
use meilisearch_types::milli::update::{
IndexDocumentsConfig, IndexDocumentsMethod, IndexerConfig, Settings as MilliSettings,
};
use meilisearch_types::milli::{self, Filter};
use meilisearch_types::milli::vector::parsed_vectors::{
ExplicitVectors, VectorOrArrayOfVectors, RESERVED_VECTORS_FIELD_NAME,
};
use meilisearch_types::milli::{self, Filter, Object};
use meilisearch_types::settings::{apply_settings_to_builder, Settings, Unchecked};
use meilisearch_types::tasks::{Details, IndexSwap, Kind, KindWithContent, Status, Task};
use meilisearch_types::{compression, Index, VERSION_FILE_NAME};
@ -60,7 +62,7 @@ pub(crate) enum Batch {
/// The list of tasks that were processing when this task cancelation appeared.
previous_processing_tasks: RoaringBitmap,
},
TaskDeletion(Task),
TaskDeletions(Vec<Task>),
SnapshotCreation(Vec<Task>),
Dump(Task),
IndexOperation {
@ -104,10 +106,14 @@ pub(crate) enum IndexOperation {
operations: Vec<DocumentOperation>,
tasks: Vec<Task>,
},
IndexDocumentDeletionByFilter {
DocumentEdition {
index_uid: String,
task: Task,
},
DocumentDeletion {
index_uid: String,
tasks: Vec<Task>,
},
DocumentClear {
index_uid: String,
tasks: Vec<Task>,
@ -143,23 +149,29 @@ pub(crate) enum IndexOperation {
impl Batch {
/// Return the task ids associated with this batch.
pub fn ids(&self) -> Vec<TaskId> {
pub fn ids(&self) -> RoaringBitmap {
match self {
Batch::TaskCancelation { task, .. }
| Batch::TaskDeletion(task)
| Batch::Dump(task)
| Batch::IndexCreation { task, .. }
| Batch::IndexUpdate { task, .. } => vec![task.uid],
Batch::SnapshotCreation(tasks) | Batch::IndexDeletion { tasks, .. } => {
tasks.iter().map(|task| task.uid).collect()
| Batch::IndexUpdate { task, .. } => {
RoaringBitmap::from_sorted_iter(std::iter::once(task.uid)).unwrap()
}
Batch::SnapshotCreation(tasks)
| Batch::TaskDeletions(tasks)
| Batch::IndexDeletion { tasks, .. } => {
RoaringBitmap::from_iter(tasks.iter().map(|task| task.uid))
}
Batch::IndexOperation { op, .. } => match op {
IndexOperation::DocumentOperation { tasks, .. }
| IndexOperation::Settings { tasks, .. }
| IndexOperation::DocumentDeletion { tasks, .. }
| IndexOperation::DocumentClear { tasks, .. } => {
tasks.iter().map(|task| task.uid).collect()
RoaringBitmap::from_iter(tasks.iter().map(|task| task.uid))
}
IndexOperation::DocumentEdition { task, .. } => {
RoaringBitmap::from_sorted_iter(std::iter::once(task.uid)).unwrap()
}
IndexOperation::IndexDocumentDeletionByFilter { task, .. } => vec![task.uid],
IndexOperation::SettingsAndDocumentOperation {
document_import_tasks: tasks,
settings_tasks: other,
@ -169,9 +181,11 @@ impl Batch {
cleared_tasks: tasks,
settings_tasks: other,
..
} => tasks.iter().chain(other).map(|task| task.uid).collect(),
} => RoaringBitmap::from_iter(tasks.iter().chain(other).map(|task| task.uid)),
},
Batch::IndexSwap { task } => vec![task.uid],
Batch::IndexSwap { task } => {
RoaringBitmap::from_sorted_iter(std::iter::once(task.uid)).unwrap()
}
}
}
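The rewritten `ids()` leans on a few `roaring` constructors and set operations that also reappear in the `TaskDeletions` handling further down. A minimal sketch of those calls, assuming roaring 0.10 as pinned in the Cargo.toml hunk above:

use roaring::RoaringBitmap;

fn main() {
    // `from_sorted_iter` requires strictly increasing values and returns a Result,
    // hence the `.unwrap()` on the single-task case in the diff.
    let single = RoaringBitmap::from_sorted_iter(std::iter::once(42u32)).unwrap();
    assert_eq!(single.len(), 1);

    // `FromIterator` accepts unsorted input and deduplicates it.
    let many: RoaringBitmap = [3u32, 1, 2, 3].into_iter().collect();
    assert_eq!(many.len(), 3);

    // `intersection_len` and `-=` are how the TaskDeletions branch attributes
    // deleted tasks to each deletion task without counting any of them twice.
    let mut deleted = RoaringBitmap::from_sorted_iter(0..10u32).unwrap();
    let matched: RoaringBitmap = (0..5u32).collect();
    assert_eq!(deleted.intersection_len(&matched), 5);
    deleted -= &matched;
    assert_eq!(deleted.len(), 5);
}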
@ -180,7 +194,7 @@ impl Batch {
use Batch::*;
match self {
TaskCancelation { .. }
| TaskDeletion(_)
| TaskDeletions(_)
| SnapshotCreation(_)
| Dump(_)
| IndexSwap { .. } => None,
@ -199,7 +213,7 @@ impl fmt::Display for Batch {
let tasks = self.ids();
match self {
Batch::TaskCancelation { .. } => f.write_str("TaskCancelation")?,
Batch::TaskDeletion(_) => f.write_str("TaskDeletion")?,
Batch::TaskDeletions(_) => f.write_str("TaskDeletion")?,
Batch::SnapshotCreation(_) => f.write_str("SnapshotCreation")?,
Batch::Dump(_) => f.write_str("Dump")?,
Batch::IndexOperation { op, .. } => write!(f, "{op}")?,
@ -219,7 +233,8 @@ impl IndexOperation {
pub fn index_uid(&self) -> &str {
match self {
IndexOperation::DocumentOperation { index_uid, .. }
| IndexOperation::IndexDocumentDeletionByFilter { index_uid, .. }
| IndexOperation::DocumentEdition { index_uid, .. }
| IndexOperation::DocumentDeletion { index_uid, .. }
| IndexOperation::DocumentClear { index_uid, .. }
| IndexOperation::Settings { index_uid, .. }
| IndexOperation::DocumentClearAndSetting { index_uid, .. }
@ -234,8 +249,11 @@ impl fmt::Display for IndexOperation {
IndexOperation::DocumentOperation { .. } => {
f.write_str("IndexOperation::DocumentOperation")
}
IndexOperation::IndexDocumentDeletionByFilter { .. } => {
f.write_str("IndexOperation::IndexDocumentDeletionByFilter")
IndexOperation::DocumentEdition { .. } => {
f.write_str("IndexOperation::DocumentEdition")
}
IndexOperation::DocumentDeletion { .. } => {
f.write_str("IndexOperation::DocumentDeletion")
}
IndexOperation::DocumentClear { .. } => f.write_str("IndexOperation::DocumentClear"),
IndexOperation::Settings { .. } => f.write_str("IndexOperation::Settings"),
@ -271,12 +289,12 @@ impl IndexScheduler {
},
must_create_index,
})),
BatchKind::DocumentDeletionByFilter { id } => {
BatchKind::DocumentEdition { id } => {
let task = self.get_task(rtxn, id)?.ok_or(Error::CorruptedTaskQueue)?;
match &task.kind {
KindWithContent::DocumentDeletionByFilter { index_uid, .. } => {
KindWithContent::DocumentEdition { index_uid, .. } => {
Ok(Some(Batch::IndexOperation {
op: IndexOperation::IndexDocumentDeletionByFilter {
op: IndexOperation::DocumentEdition {
index_uid: index_uid.clone(),
task,
},
@ -333,30 +351,11 @@ impl IndexScheduler {
must_create_index,
}))
}
BatchKind::DocumentDeletion { deletion_ids } => {
BatchKind::DocumentDeletion { deletion_ids, includes_by_filter: _ } => {
let tasks = self.get_existing_tasks(rtxn, deletion_ids)?;
let mut operations = Vec::with_capacity(tasks.len());
let mut documents_counts = Vec::with_capacity(tasks.len());
for task in &tasks {
match task.kind {
KindWithContent::DocumentDeletion { ref documents_ids, .. } => {
operations.push(DocumentOperation::Delete(documents_ids.clone()));
documents_counts.push(documents_ids.len() as u64);
}
_ => unreachable!(),
}
}
Ok(Some(Batch::IndexOperation {
op: IndexOperation::DocumentOperation {
index_uid,
primary_key: None,
method: IndexDocumentsMethod::ReplaceDocuments,
documents_counts,
operations,
tasks,
},
op: IndexOperation::DocumentDeletion { index_uid, tasks },
must_create_index,
}))
}
@ -515,12 +514,11 @@ impl IndexScheduler {
/// 3. We get the *next* snapshot to process.
/// 4. We get the *next* dump to process.
/// 5. We get the *next* tasks to process for a specific index.
#[tracing::instrument(level = "trace", skip(self, rtxn), target = "indexing::scheduler")]
pub(crate) fn create_next_batch(&self, rtxn: &RoTxn) -> Result<Option<Batch>> {
#[cfg(test)]
self.maybe_fail(crate::tests::FailureLocation::InsideCreateBatch)?;
puffin::profile_function!();
let enqueued = &self.get_status(rtxn, Status::Enqueued)?;
let to_cancel = self.get_kind(rtxn, Kind::TaskCancelation)? & enqueued;
@ -539,9 +537,9 @@ impl IndexScheduler {
// 2. we get the next task to delete
let to_delete = self.get_kind(rtxn, Kind::TaskDeletion)? & enqueued;
if let Some(task_id) = to_delete.min() {
let task = self.get_task(rtxn, task_id)?.ok_or(Error::CorruptedTaskQueue)?;
return Ok(Some(Batch::TaskDeletion(task)));
if !to_delete.is_empty() {
let tasks = self.get_existing_tasks(rtxn, to_delete)?;
return Ok(Some(Batch::TaskDeletions(tasks)));
}
// 3. we batch the snapshot.
@ -620,6 +618,7 @@ impl IndexScheduler {
/// The list of tasks that were processed. The metadata of each task in the returned
/// list is updated accordingly, with the exception of its date fields
/// [`finished_at`](meilisearch_types::tasks::Task::finished_at) and [`started_at`](meilisearch_types::tasks::Task::started_at).
#[tracing::instrument(level = "trace", skip(self, batch), target = "indexing::scheduler", fields(batch=batch.to_string()))]
pub(crate) fn process_batch(&self, batch: Batch) -> Result<Vec<Task>> {
#[cfg(test)]
{
@ -628,8 +627,6 @@ impl IndexScheduler {
self.breakpoint(crate::Breakpoint::InsideProcessBatch);
}
puffin::profile_function!(batch.to_string());
match batch {
Batch::TaskCancelation { mut task, previous_started_at, previous_processing_tasks } => {
// 1. Retrieve the tasks that matched the query at enqueue-time.
@ -669,9 +666,10 @@ impl IndexScheduler {
Ok(()) => {
for content_uuid in canceled_tasks_content_uuids {
if let Err(error) = self.delete_update_file(content_uuid) {
error!(
"We failed deleting the content file indentified as {}: {}",
content_uuid, error
tracing::error!(
file_content_uuid = %content_uuid,
%error,
"Failed deleting content file"
)
}
}
@ -681,31 +679,43 @@ impl IndexScheduler {
Ok(vec![task])
}
Batch::TaskDeletion(mut task) => {
Batch::TaskDeletions(mut tasks) => {
// 1. Retrieve the tasks that matched the query at enqueue-time.
let matched_tasks =
let mut matched_tasks = RoaringBitmap::new();
for task in tasks.iter() {
if let KindWithContent::TaskDeletion { tasks, query: _ } = &task.kind {
tasks
matched_tasks |= tasks;
} else {
unreachable!()
}
}
let mut wtxn = self.env.write_txn()?;
let mut deleted_tasks = self.delete_matched_tasks(&mut wtxn, &matched_tasks)?;
wtxn.commit()?;
for task in tasks.iter_mut() {
task.status = Status::Succeeded;
let KindWithContent::TaskDeletion { tasks, query: _ } = &task.kind else {
unreachable!()
};
let mut wtxn = self.env.write_txn()?;
let deleted_tasks_count = self.delete_matched_tasks(&mut wtxn, matched_tasks)?;
let deleted_tasks_count = deleted_tasks.intersection_len(tasks);
deleted_tasks -= tasks;
task.status = Status::Succeeded;
match &mut task.details {
Some(Details::TaskDeletion {
matched_tasks: _,
deleted_tasks,
original_filter: _,
}) => {
*deleted_tasks = Some(deleted_tasks_count);
match &mut task.details {
Some(Details::TaskDeletion {
matched_tasks: _,
deleted_tasks,
original_filter: _,
}) => {
*deleted_tasks = Some(deleted_tasks_count);
}
_ => unreachable!(),
}
_ => unreachable!(),
}
wtxn.commit()?;
Ok(vec![task])
Ok(tasks)
}
Batch::SnapshotCreation(mut tasks) => {
fs::create_dir_all(&self.snapshots_path)?;
@ -764,10 +774,12 @@ impl IndexScheduler {
let dst = temp_snapshot_dir.path().join("auth");
fs::create_dir_all(&dst)?;
// TODO We can't use the open_auth_store_env function here but we should
let auth = milli::heed::EnvOpenOptions::new()
.map_size(1024 * 1024 * 1024) // 1 GiB
.max_dbs(2)
.open(&self.auth_path)?;
let auth = unsafe {
milli::heed::EnvOpenOptions::new()
.map_size(1024 * 1024 * 1024) // 1 GiB
.max_dbs(2)
.open(&self.auth_path)
}?;
auth.copy_to_file(dst.join("data.mdb"), CompactionOption::Enabled)?;
// 5. Copy and tarball the flat snapshot
@ -887,19 +899,76 @@ impl IndexScheduler {
let fields_ids_map = index.fields_ids_map(&rtxn)?;
let all_fields: Vec<_> = fields_ids_map.iter().map(|(id, _)| id).collect();
let embedding_configs = index.embedding_configs(&rtxn)?;
// 3.1. Dump the documents
for ret in index.all_documents(&rtxn)? {
if self.must_stop_processing.get() {
return Err(Error::AbortedTask);
}
let (_id, doc) = ret?;
let document = milli::obkv_to_json(&all_fields, &fields_ids_map, doc)?;
let (id, doc) = ret?;
let mut document = milli::obkv_to_json(&all_fields, &fields_ids_map, doc)?;
'inject_vectors: {
let embeddings = index.embeddings(&rtxn, id)?;
if embeddings.is_empty() {
break 'inject_vectors;
}
let vectors = document
.entry(RESERVED_VECTORS_FIELD_NAME.to_owned())
.or_insert(serde_json::Value::Object(Default::default()));
let serde_json::Value::Object(vectors) = vectors else {
return Err(milli::Error::UserError(
milli::UserError::InvalidVectorsMapType {
document_id: {
if let Ok(Some(Ok(index))) = index
.external_id_of(&rtxn, std::iter::once(id))
.map(|it| it.into_iter().next())
{
index
} else {
format!("internal docid={id}")
}
},
value: vectors.clone(),
},
)
.into());
};
for (embedder_name, embeddings) in embeddings {
let user_provided = embedding_configs
.iter()
.find(|conf| conf.name == embedder_name)
.is_some_and(|conf| conf.user_provided.contains(id));
let embeddings = ExplicitVectors {
embeddings: Some(
VectorOrArrayOfVectors::from_array_of_vectors(embeddings),
),
regenerate: !user_provided,
};
vectors.insert(
embedder_name,
serde_json::to_value(embeddings).unwrap(),
);
}
}
index_dumper.push_document(&document)?;
}
// 3.2. Dump the settings
let settings = meilisearch_types::settings::settings(index, &rtxn)?;
let settings = meilisearch_types::settings::settings(
index,
&rtxn,
meilisearch_types::settings::SecretPolicy::RevealSecrets,
)?;
index_dumper.settings(&settings)?;
Ok(())
})?;
@ -958,7 +1027,10 @@ impl IndexScheduler {
match res {
Ok(_) => (),
Err(e) => error!("Could not write the stats of the index {}", e),
Err(e) => tracing::error!(
error = &e as &dyn std::error::Error,
"Could not write the stats of the index"
),
}
Ok(tasks)
@ -986,7 +1058,7 @@ impl IndexScheduler {
builder.set_primary_key(primary_key);
let must_stop_processing = self.must_stop_processing.clone();
builder.execute(
|indexing_step| debug!("update: {:?}", indexing_step),
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.get(),
)?;
index_wtxn.commit()?;
@ -1013,7 +1085,10 @@ impl IndexScheduler {
match res {
Ok(_) => (),
Err(e) => error!("Could not write the stats of the index {}", e),
Err(e) => tracing::error!(
error = &e as &dyn std::error::Error,
"Could not write the stats of the index"
),
}
Ok(vec![task])
@ -1132,14 +1207,17 @@ impl IndexScheduler {
///
/// ## Return
/// The list of processed tasks.
#[tracing::instrument(
level = "trace",
skip(self, index_wtxn, index),
target = "indexing::scheduler"
)]
fn apply_index_operation<'i>(
&self,
index_wtxn: &mut RwTxn<'i>,
index: &'i Index,
operation: IndexOperation,
) -> Result<Vec<Task>> {
puffin::profile_function!();
match operation {
IndexOperation::DocumentClear { mut tasks, .. } => {
let count = milli::update::ClearDocuments::new(index_wtxn, index).execute()?;
@ -1169,6 +1247,7 @@ impl IndexScheduler {
operations,
mut tasks,
} => {
let started_processing_at = std::time::Instant::now();
let mut primary_key_has_been_set = false;
let must_stop_processing = self.must_stop_processing.clone();
let indexer_config = self.index_mapper.indexer_config();
@ -1192,7 +1271,7 @@ impl IndexScheduler {
milli::update::Settings::new(index_wtxn, index, indexer_config);
builder.set_primary_key(primary_key);
builder.execute(
|indexing_step| debug!("update: {:?}", indexing_step),
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.clone().get(),
)?;
primary_key_has_been_set = true;
@ -1211,7 +1290,7 @@ impl IndexScheduler {
index,
indexer_config,
config,
|indexing_step| trace!("update: {:?}", indexing_step),
|indexing_step| tracing::trace!(?indexing_step, "Update"),
|| must_stop_processing.get(),
)?;
@ -1283,7 +1362,7 @@ impl IndexScheduler {
if !tasks.iter().all(|res| res.error.is_some()) {
let addition = builder.execute()?;
info!("document addition done: {:?}", addition);
tracing::info!(indexing_result = ?addition, processed_in = ?started_processing_at.elapsed(), "document indexing done");
} else if primary_key_has_been_set {
// Everything failed but we've set a primary key.
// We need to remove it.
@ -1291,53 +1370,64 @@ impl IndexScheduler {
milli::update::Settings::new(index_wtxn, index, indexer_config);
builder.reset_primary_key();
builder.execute(
|indexing_step| trace!("update: {:?}", indexing_step),
|indexing_step| tracing::trace!(update = ?indexing_step),
|| must_stop_processing.clone().get(),
)?;
}
Ok(tasks)
}
IndexOperation::IndexDocumentDeletionByFilter { mut task, index_uid: _ } => {
let filter =
if let KindWithContent::DocumentDeletionByFilter { filter_expr, .. } =
&task.kind
IndexOperation::DocumentEdition { mut task, .. } => {
let (filter, context, function) =
if let KindWithContent::DocumentEdition {
filter_expr, context, function, ..
} = &task.kind
{
filter_expr
(filter_expr, context, function)
} else {
unreachable!()
};
let deleted_documents = delete_document_by_filter(
let result_count = edit_documents_by_function(
index_wtxn,
filter,
context.clone(),
function,
self.index_mapper.indexer_config(),
self.must_stop_processing.clone(),
index,
);
let original_filter = if let Some(Details::DocumentDeletionByFilter {
let (original_filter, context, function) = if let Some(Details::DocumentEdition {
original_filter,
deleted_documents: _,
context,
function,
..
}) = task.details
{
original_filter
(original_filter, context, function)
} else {
// In the case of a `documentDeleteByFilter` the details MUST be set
// In the case of a `documentEdition` the details MUST be set
unreachable!();
};
match deleted_documents {
Ok(deleted_documents) => {
match result_count {
Ok((deleted_documents, edited_documents)) => {
task.status = Status::Succeeded;
task.details = Some(Details::DocumentDeletionByFilter {
task.details = Some(Details::DocumentEdition {
original_filter,
context,
function,
deleted_documents: Some(deleted_documents),
edited_documents: Some(edited_documents),
});
}
Err(e) => {
task.status = Status::Failed;
task.details = Some(Details::DocumentDeletionByFilter {
task.details = Some(Details::DocumentEdition {
original_filter,
context,
function,
deleted_documents: Some(0),
edited_documents: Some(0),
});
task.error = Some(e.into());
}
@ -1345,6 +1435,103 @@ impl IndexScheduler {
Ok(vec![task])
}
IndexOperation::DocumentDeletion { mut tasks, index_uid: _ } => {
let mut to_delete = RoaringBitmap::new();
let external_documents_ids = index.external_documents_ids();
for task in tasks.iter_mut() {
let before = to_delete.len();
task.status = Status::Succeeded;
match &task.kind {
KindWithContent::DocumentDeletion { index_uid: _, documents_ids } => {
for id in documents_ids {
if let Some(id) = external_documents_ids.get(index_wtxn, id)? {
to_delete.insert(id);
}
}
let will_be_removed = to_delete.len() - before;
task.details = Some(Details::DocumentDeletion {
provided_ids: documents_ids.len(),
deleted_documents: Some(will_be_removed),
});
}
KindWithContent::DocumentDeletionByFilter { index_uid: _, filter_expr } => {
let before = to_delete.len();
let filter = match Filter::from_json(filter_expr) {
Ok(filter) => filter,
Err(err) => {
// theoretically, this should be caught by deserr before reaching the index-scheduler and cannot happen
task.status = Status::Failed;
task.error = match err {
milli::Error::UserError(
milli::UserError::InvalidFilterExpression { .. },
) => Some(
Error::from(err)
.with_custom_error_code(Code::InvalidDocumentFilter)
.into(),
),
e => Some(e.into()),
};
None
}
};
if let Some(filter) = filter {
let candidates =
filter.evaluate(index_wtxn, index).map_err(|err| match err {
milli::Error::UserError(
milli::UserError::InvalidFilter(_),
) => Error::from(err)
.with_custom_error_code(Code::InvalidDocumentFilter),
e => e.into(),
});
match candidates {
Ok(candidates) => to_delete |= candidates,
Err(err) => {
task.status = Status::Failed;
task.error = Some(err.into());
}
};
}
let will_be_removed = to_delete.len() - before;
if let Some(Details::DocumentDeletionByFilter {
original_filter: _,
deleted_documents,
}) = &mut task.details
{
*deleted_documents = Some(will_be_removed);
} else {
// In the case of a `documentDeleteByFilter` the details MUST be set
unreachable!()
}
}
_ => unreachable!(),
}
}
let config = IndexDocumentsConfig {
update_method: IndexDocumentsMethod::ReplaceDocuments,
..Default::default()
};
let must_stop_processing = self.must_stop_processing.clone();
let mut builder = milli::update::IndexDocuments::new(
index_wtxn,
index,
self.index_mapper.indexer_config(),
config,
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.get(),
)?;
let (new_builder, _count) =
builder.remove_documents_from_db_no_batch(&to_delete)?;
builder = new_builder;
let _ = builder.execute()?;
Ok(tasks)
}
IndexOperation::Settings { index_uid: _, settings, mut tasks } => {
let indexer_config = self.index_mapper.indexer_config();
let mut builder = milli::update::Settings::new(index_wtxn, index, indexer_config);
@ -1361,7 +1548,7 @@ impl IndexScheduler {
let must_stop_processing = self.must_stop_processing.clone();
builder.execute(
|indexing_step| debug!("update: {:?}", indexing_step),
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.get(),
)?;
@ -1435,7 +1622,11 @@ impl IndexScheduler {
/// Delete each given task from all the databases (if it is deletable).
///
/// Return the tasks that were actually deleted.
fn delete_matched_tasks(&self, wtxn: &mut RwTxn, matched_tasks: &RoaringBitmap) -> Result<u64> {
fn delete_matched_tasks(
&self,
wtxn: &mut RwTxn,
matched_tasks: &RoaringBitmap,
) -> Result<RoaringBitmap> {
// 1. Remove from this list the tasks that we are not allowed to delete
let enqueued_tasks = self.get_status(wtxn, Status::Enqueued)?;
let processing_tasks = &self.processing_tasks.read().unwrap().processing.clone();
@ -1500,7 +1691,7 @@ impl IndexScheduler {
}
}
Ok(to_delete_tasks.len())
Ok(to_delete_tasks)
}
/// Cancel each given task from all the databases (if it is cancelable).
@ -1543,42 +1734,43 @@ impl IndexScheduler {
}
}
fn delete_document_by_filter<'a>(
fn edit_documents_by_function<'a>(
wtxn: &mut RwTxn<'a>,
filter: &serde_json::Value,
filter: &Option<serde_json::Value>,
context: Option<Object>,
code: &str,
indexer_config: &IndexerConfig,
must_stop_processing: MustStopProcessing,
index: &'a Index,
) -> Result<u64> {
let filter = Filter::from_json(filter)?;
Ok(if let Some(filter) = filter {
let candidates = filter.evaluate(wtxn, index).map_err(|err| match err {
) -> Result<(u64, u64)> {
let candidates = match filter.as_ref().map(Filter::from_json) {
Some(Ok(Some(filter))) => filter.evaluate(wtxn, index).map_err(|err| match err {
milli::Error::UserError(milli::UserError::InvalidFilter(_)) => {
Error::from(err).with_custom_error_code(Code::InvalidDocumentFilter)
}
e => e.into(),
})?;
})?,
None | Some(Ok(None)) => index.documents_ids(wtxn)?,
Some(Err(e)) => return Err(e.into()),
};
let config = IndexDocumentsConfig {
update_method: IndexDocumentsMethod::ReplaceDocuments,
..Default::default()
};
let config = IndexDocumentsConfig {
update_method: IndexDocumentsMethod::ReplaceDocuments,
..Default::default()
};
let mut builder = milli::update::IndexDocuments::new(
wtxn,
index,
indexer_config,
config,
|indexing_step| debug!("update: {:?}", indexing_step),
|| must_stop_processing.get(),
)?;
let mut builder = milli::update::IndexDocuments::new(
wtxn,
index,
indexer_config,
config,
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.get(),
)?;
let (new_builder, count) = builder.remove_documents_from_db_no_batch(&candidates)?;
builder = new_builder;
let (new_builder, count) = builder.edit_documents(&candidates, context, code)?;
builder = new_builder;
let _ = builder.execute()?;
count
} else {
0
})
let _ = builder.execute()?;
Ok(count.unwrap())
}
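The candidate selection in `edit_documents_by_function` hinges on a fairly dense `match filter.as_ref().map(Filter::from_json)`: no filter, or a filter that parses to `None`, means every document in the index is edited, while a parse error aborts the task. Below is a small, self-contained sketch of that control flow; the `parse` function is an invented stand-in for `Filter::from_json` and the strings stand in for the real candidate sets.

// Invented stand-in: like `Filter::from_json`, `parse` can fail (Err) or yield no filter (Ok(None)).
fn parse(raw: &str) -> Result<Option<String>, String> {
    match raw {
        "" => Ok(None),
        "bad" => Err("invalid filter".to_string()),
        other => Ok(Some(other.to_string())),
    }
}

fn candidates(filter: Option<&str>) -> Result<&'static str, String> {
    match filter.map(parse) {
        // A real, well-formed filter: evaluate it against the index.
        Some(Ok(Some(_filter))) => Ok("evaluate the filter"),
        // No filter provided, or a filter that parses to nothing: edit every document.
        None | Some(Ok(None)) => Ok("take all document ids"),
        // A malformed filter aborts the whole operation.
        Some(Err(e)) => Err(e),
    }
}

fn main() {
    assert_eq!(candidates(Some("doggo = kefir")).unwrap(), "evaluate the filter");
    assert_eq!(candidates(None).unwrap(), "take all document ids");
    assert!(candidates(Some("bad")).is_err());
}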

View File

@ -48,6 +48,8 @@ impl From<DateField> for Code {
pub enum Error {
#[error("{1}")]
WithCustomErrorCode(Code, Box<Self>),
#[error("Received bad task id: {received} should be >= to {expected}.")]
BadTaskId { received: TaskId, expected: TaskId },
#[error("Index `{0}` not found.")]
IndexNotFound(String),
#[error("Index `{0}` already exists.")]
@ -161,6 +163,7 @@ impl Error {
match self {
Error::IndexNotFound(_)
| Error::WithCustomErrorCode(_, _)
| Error::BadTaskId { .. }
| Error::IndexAlreadyExists(_)
| Error::SwapDuplicateIndexFound(_)
| Error::SwapDuplicateIndexesFound(_)
@ -205,6 +208,7 @@ impl ErrorCode for Error {
fn error_code(&self) -> Code {
match self {
Error::WithCustomErrorCode(code, _) => *code,
Error::BadTaskId { .. } => Code::BadRequest,
Error::IndexNotFound(_) => Code::IndexNotFound,
Error::IndexAlreadyExists(_) => Code::IndexAlreadyExists,
Error::SwapDuplicateIndexesFound(_) => Code::InvalidSwapDuplicateIndexFound,

View File

@ -30,19 +30,6 @@ impl RoFeatures {
self.runtime
}
pub fn check_score_details(&self) -> Result<()> {
if self.runtime.score_details {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Computing score details",
feature: "score details",
issue_link: "https://github.com/meilisearch/product/discussions/674",
}
.into())
}
}
pub fn check_metrics(&self) -> Result<()> {
if self.runtime.metrics {
Ok(())
@ -56,6 +43,19 @@ impl RoFeatures {
}
}
pub fn check_logs_route(&self) -> Result<()> {
if self.runtime.logs_route {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Modifying logs through the `/logs/*` routes",
feature: "logs route",
issue_link: "https://github.com/orgs/meilisearch/discussions/721",
}
.into())
}
}
pub fn check_vector(&self, disabled_action: &'static str) -> Result<()> {
if self.runtime.vector_store {
Ok(())
@ -69,14 +69,27 @@ impl RoFeatures {
}
}
pub fn check_puffin(&self) -> Result<()> {
if self.runtime.export_puffin_reports {
pub fn check_edit_documents_by_function(&self, disabled_action: &'static str) -> Result<()> {
if self.runtime.edit_documents_by_function {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Outputting Puffin reports to disk",
feature: "export puffin reports",
issue_link: "https://github.com/meilisearch/product/discussions/693",
disabled_action,
feature: "edit documents by function",
issue_link: "https://github.com/orgs/meilisearch/discussions/762",
}
.into())
}
}
pub fn check_contains_filter(&self) -> Result<()> {
if self.runtime.contains_filter {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Using `CONTAINS` or `STARTS WITH` in a filter",
feature: "contains filter",
issue_link: "https://github.com/orgs/meilisearch/discussions/763",
}
.into())
}
@ -92,8 +105,11 @@ impl FeatureData {
let txn = env.read_txn()?;
let persisted_features: RuntimeTogglableFeatures =
runtime_features_db.get(&txn, EXPERIMENTAL_FEATURES)?.unwrap_or_default();
let InstanceTogglableFeatures { metrics, logs_route, contains_filter } = instance_features;
let runtime = Arc::new(RwLock::new(RuntimeTogglableFeatures {
metrics: instance_features.metrics || persisted_features.metrics,
metrics: metrics || persisted_features.metrics,
logs_route: logs_route || persisted_features.logs_route,
contains_filter: contains_filter || persisted_features.contains_filter,
..persisted_features
}));
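All the `check_*` methods above share the same guard shape: return `Ok(())` when the runtime toggle is on, otherwise a `FeatureNotEnabledError` naming the blocked action and the discussion link. A hedged sketch of how such a guard is typically consumed, using invented `ToyFeatures` types since the calling route handlers are not part of this diff:

// Invented types; the real RoFeatures/FeatureNotEnabledError live in the index-scheduler crate.
#[derive(Debug)]
struct FeatureNotEnabledError {
    disabled_action: &'static str,
    feature: &'static str,
}

#[derive(Default)]
struct ToyFeatures {
    contains_filter: bool,
}

impl ToyFeatures {
    fn check_contains_filter(&self) -> Result<(), FeatureNotEnabledError> {
        if self.contains_filter {
            Ok(())
        } else {
            Err(FeatureNotEnabledError {
                disabled_action: "Using `CONTAINS` or `STARTS WITH` in a filter",
                feature: "contains filter",
            })
        }
    }
}

fn main() {
    // The guard is checked before the gated work starts, so disabled features fail fast.
    assert!(ToyFeatures::default().check_contains_filter().is_err());
    assert!(ToyFeatures { contains_filter: true }.check_contains_filter().is_ok());
}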

View File

@ -3,13 +3,13 @@ use std::sync::{Arc, RwLock};
use std::time::Duration;
use std::{fs, thread};
use log::error;
use meilisearch_types::heed::types::{SerdeJson, Str};
use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
use meilisearch_types::milli::update::IndexerConfig;
use meilisearch_types::milli::{FieldDistribution, Index};
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;
use tracing::error;
use uuid::Uuid;
use self::index_map::IndexMap;
@ -108,8 +108,10 @@ pub struct IndexStats {
/// Association of every field name with the number of times it occurs in the documents.
pub field_distribution: FieldDistribution,
/// Creation date of the index.
#[serde(with = "time::serde::rfc3339")]
pub created_at: OffsetDateTime,
/// Date of the last update of the index.
#[serde(with = "time::serde::rfc3339")]
pub updated_at: OffsetDateTime,
}
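The two new `#[serde(with = "time::serde::rfc3339")]` attributes make the `IndexStats` dates serialize as RFC 3339 strings instead of `time`'s default representation. A minimal sketch, assuming the `time` crate with its `serde-well-known` feature (as enabled in the Cargo.toml hunks earlier) together with `serde`'s `derive` feature and `serde_json`:

use serde::Serialize;
use time::OffsetDateTime;

#[derive(Serialize)]
struct ToyStats {
    // Serialized as an RFC 3339 string rather than time's default representation.
    #[serde(with = "time::serde::rfc3339")]
    created_at: OffsetDateTime,
}

fn main() {
    let stats = ToyStats { created_at: OffsetDateTime::UNIX_EPOCH };
    // Prints something like {"created_at":"1970-01-01T00:00:00Z"}
    println!("{}", serde_json::to_string(&stats).unwrap());
}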

View File

@ -11,10 +11,14 @@ use crate::index_mapper::IndexMapper;
use crate::{IndexScheduler, Kind, Status, BEI128};
pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
// Since we'll snapshot the index right afterward, we don't need to ensure it's internally consistent on every run.
// We only do it for release runs, where the function is much faster.
#[cfg(not(debug_assertions))]
scheduler.assert_internally_consistent();
let IndexScheduler {
autobatching_enabled,
cleanup_enabled: _,
must_stop_processing: _,
processing_tasks,
file_store,
@ -31,7 +35,6 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
features: _,
max_number_of_tasks: _,
max_number_of_batched_tasks: _,
puffin_frame: _,
wake_up: _,
dumps_path: _,
snapshots_path: _,
@ -177,6 +180,17 @@ fn snapshot_details(d: &Details) -> String {
} => {
format!("{{ received_documents: {received_documents}, indexed_documents: {indexed_documents:?} }}")
}
Details::DocumentEdition {
deleted_documents,
edited_documents,
original_filter,
context,
function,
} => {
format!(
"{{ deleted_documents: {deleted_documents:?}, edited_documents: {edited_documents:?}, context: {context:?}, function: {function:?}, original_filter: {original_filter:?} }}"
)
}
Details::SettingsUpdate { settings } => {
format!("{{ settings: {settings:?} }}")
}

File diff suppressed because it is too large

View File

@ -0,0 +1,15 @@
---
source: index-scheduler/src/lib.rs
expression: doc
---
{
"doggo": "Intel",
"breed": "beagle",
"_vectors": {
"noise": [
0.1,
0.2,
0.3
]
}
}

View File

@ -0,0 +1,22 @@
---
source: index-scheduler/src/lib.rs
expression: task.details
---
{
"embedders": {
"A_fakerest": {
"source": "rest",
"apiKey": "MyXXXX...",
"dimensions": 384,
"url": "http://localhost:7777",
"request": "{{text}}",
"response": "{{embedding}}"
},
"B_small_hf": {
"source": "huggingFace",
"model": "sentence-transformers/all-MiniLM-L6-v2",
"revision": "e4ce9877abf3edfe10b0d82785e83bdcb973e22e",
"documentTemplate": "{{doc.doggo}} the {{doc.breed}} best doggo"
}
}
}

View File

@ -0,0 +1,15 @@
---
source: index-scheduler/src/lib.rs
expression: doc
---
{
"doggo": "kefir",
"breed": "patou",
"_vectors": {
"noise": [
0.1,
0.2,
0.3
]
}
}

View File

@ -0,0 +1,15 @@
---
source: index-scheduler/src/lib.rs
expression: fakerest_config.embedder_options
---
{
"Rest": {
"api_key": "My super secret",
"distribution": null,
"dimensions": 384,
"url": "http://localhost:7777",
"request": "{{text}}",
"response": "{{embedding}}",
"headers": {}
}
}

View File

@ -0,0 +1,11 @@
---
source: index-scheduler/src/lib.rs
expression: simple_hf_config.embedder_options
---
{
"HuggingFace": {
"model": "sentence-transformers/all-MiniLM-L6-v2",
"revision": "e4ce9877abf3edfe10b0d82785e83bdcb973e22e",
"distribution": null
}
}

View File

@ -0,0 +1,22 @@
---
source: index-scheduler/src/lib.rs
expression: task.details
---
{
"embedders": {
"A_fakerest": {
"source": "rest",
"apiKey": "MyXXXX...",
"dimensions": 384,
"url": "http://localhost:7777",
"request": "{{text}}",
"response": "{{embedding}}"
},
"B_small_hf": {
"source": "huggingFace",
"model": "sentence-transformers/all-MiniLM-L6-v2",
"revision": "e4ce9877abf3edfe10b0d82785e83bdcb973e22e",
"documentTemplate": "{{doc.doggo}} the {{doc.breed}} best doggo"
}
}
}


@@ -0,0 +1,16 @@
---
source: index-scheduler/src/lib.rs
expression: task.details
---
{
"embedders": {
"default": {
"source": "rest",
"apiKey": "MyXXXX...",
"dimensions": 4,
"url": "http://localhost:7777",
"request": "{{text}}",
"response": "{{embedding}}"
}
}
}


@@ -0,0 +1,15 @@
---
source: index-scheduler/src/lib.rs
expression: config.embedder_options
---
{
"Rest": {
"api_key": "My super secret",
"distribution": null,
"dimensions": 4,
"url": "http://localhost:7777",
"request": "{{text}}",
"response": "{{embedding}}",
"headers": {}
}
}


@@ -0,0 +1,16 @@
---
source: index-scheduler/src/lib.rs
expression: task.details
---
{
"embedders": {
"default": {
"source": "rest",
"apiKey": "MyXXXX...",
"dimensions": 4,
"url": "http://localhost:7777",
"request": "{{text}}",
"response": "{{embedding}}"
}
}
}


@@ -0,0 +1,44 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: succeeded, details: { received_documents: 3, indexed_documents: Some(3) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 3, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued []
succeeded [0,1,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 3, field_distribution: {"catto": 1, "doggo": 2, "id": 3} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------


@@ -0,0 +1,43 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: enqueued, details: { received_documents: 3, indexed_documents: None }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 3, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued [1,]
succeeded [0,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 0, field_distribution: {} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
----------------------------------------------------------------------
### File Store:
00000000-0000-0000-0000-000000000000
----------------------------------------------------------------------


@@ -0,0 +1,43 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: enqueued, details: { received_documents: 3, indexed_documents: None }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 3, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued [1,]
succeeded [0,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 0, field_distribution: {} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
----------------------------------------------------------------------
### File Store:
00000000-0000-0000-0000-000000000000
----------------------------------------------------------------------


@@ -0,0 +1,56 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: succeeded, details: { received_documents: 3, indexed_documents: Some(3) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 3, allow_index_creation: true }}
2 {uid: 2, status: succeeded, details: { received_document_ids: 1, deleted_documents: Some(1) }, kind: DocumentDeletion { index_uid: "doggos", documents_ids: ["1"] }}
3 {uid: 3, status: failed, error: ResponseError { code: 200, message: "Invalid type for filter subexpression: expected: String, Array, found: true.", error_code: "invalid_document_filter", error_type: "invalid_request", error_link: "https://docs.meilisearch.com/errors#invalid_document_filter" }, details: { original_filter: true, deleted_documents: Some(0) }, kind: DocumentDeletionByFilter { index_uid: "doggos", filter_expr: Bool(true) }}
4 {uid: 4, status: failed, error: ResponseError { code: 200, message: "Attribute `id` is not filterable. Available filterable attributes are: `catto`.\n1:3 id = 2", error_code: "invalid_document_filter", error_type: "invalid_request", error_link: "https://docs.meilisearch.com/errors#invalid_document_filter" }, details: { original_filter: "id = 2", deleted_documents: Some(0) }, kind: DocumentDeletionByFilter { index_uid: "doggos", filter_expr: String("id = 2") }}
5 {uid: 5, status: succeeded, details: { original_filter: "catto EXISTS", deleted_documents: Some(1) }, kind: DocumentDeletionByFilter { index_uid: "doggos", filter_expr: String("catto EXISTS") }}
----------------------------------------------------------------------
### Status:
enqueued []
succeeded [0,1,2,5,]
failed [3,4,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,]
"documentDeletion" [2,3,4,5,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,2,3,4,5,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 1, field_distribution: {"doggo": 1, "id": 1} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
[timestamp] [3,]
[timestamp] [4,]
[timestamp] [5,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,3,4,5,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,3,4,5,]
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------


@@ -0,0 +1,9 @@
---
source: index-scheduler/src/lib.rs
---
[
{
"id": 3,
"doggo": "bork"
}
]


@@ -0,0 +1,53 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: succeeded, details: { received_documents: 3, indexed_documents: Some(3) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 3, allow_index_creation: true }}
2 {uid: 2, status: enqueued, details: { received_document_ids: 1, deleted_documents: None }, kind: DocumentDeletion { index_uid: "doggos", documents_ids: ["1"] }}
3 {uid: 3, status: enqueued, details: { original_filter: true, deleted_documents: None }, kind: DocumentDeletionByFilter { index_uid: "doggos", filter_expr: Bool(true) }}
4 {uid: 4, status: enqueued, details: { original_filter: "id = 2", deleted_documents: None }, kind: DocumentDeletionByFilter { index_uid: "doggos", filter_expr: String("id = 2") }}
5 {uid: 5, status: enqueued, details: { original_filter: "catto EXISTS", deleted_documents: None }, kind: DocumentDeletionByFilter { index_uid: "doggos", filter_expr: String("catto EXISTS") }}
----------------------------------------------------------------------
### Status:
enqueued [2,3,4,5,]
succeeded [0,1,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,]
"documentDeletion" [2,3,4,5,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,2,3,4,5,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 3, field_distribution: {"catto": 1, "doggo": 2, "id": 3} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
[timestamp] [3,]
[timestamp] [4,]
[timestamp] [5,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------


@@ -0,0 +1,39 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: Set({"catto"}), sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: NotSet, search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: enqueued, details: { received_documents: 3, indexed_documents: None }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: ReplaceDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 3, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued [0,1,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,]
----------------------------------------------------------------------
### Index Mapper:
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Started At:
----------------------------------------------------------------------
### Finished At:
----------------------------------------------------------------------
### File Store:
00000000-0000-0000-0000-000000000000
----------------------------------------------------------------------


@@ -0,0 +1,48 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 1, allow_index_creation: true }}
2 {uid: 2, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: None, method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000001, documents_count: 1, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued []
succeeded [0,1,2,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,2,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,2,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 1, field_distribution: {"_vectors": 1, "breed": 1, "doggo": 1, "id": 1} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------


@@ -0,0 +1,47 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 1, allow_index_creation: true }}
2 {uid: 2, status: enqueued, details: { received_documents: 1, indexed_documents: None }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: None, method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000001, documents_count: 1, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued [2,]
succeeded [0,1,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,2,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,2,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 1, field_distribution: {"_vectors": 1, "breed": 1, "doggo": 1, "id": 1} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
[timestamp] [2,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### File Store:
00000000-0000-0000-0000-000000000001
----------------------------------------------------------------------


@@ -0,0 +1,44 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: succeeded, details: { received_documents: 1, indexed_documents: Some(1) }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 1, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued []
succeeded [0,1,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 1, field_distribution: {"_vectors": 1, "breed": 1, "doggo": 1, "id": 1} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------


@@ -0,0 +1,43 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
1 {uid: 1, status: enqueued, details: { received_documents: 1, indexed_documents: None }, kind: DocumentAdditionOrUpdate { index_uid: "doggos", primary_key: Some("id"), method: UpdateDocuments, content_file: 00000000-0000-0000-0000-000000000000, documents_count: 1, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued [1,]
succeeded [0,]
----------------------------------------------------------------------
### Kind:
"documentAdditionOrUpdate" [1,]
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,1,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 0, field_distribution: {} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
----------------------------------------------------------------------
### File Store:
00000000-0000-0000-0000-000000000000
----------------------------------------------------------------------


@@ -0,0 +1,35 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued [0,]
----------------------------------------------------------------------
### Kind:
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,]
----------------------------------------------------------------------
### Index Mapper:
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
----------------------------------------------------------------------
### Started At:
----------------------------------------------------------------------
### Finished At:
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------


@@ -0,0 +1,39 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: succeeded, details: { settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> } }, kind: SettingsUpdate { index_uid: "doggos", new_settings: Settings { displayed_attributes: WildcardSetting(NotSet), searchable_attributes: WildcardSetting(NotSet), filterable_attributes: NotSet, sortable_attributes: NotSet, ranking_rules: NotSet, stop_words: NotSet, non_separator_tokens: NotSet, separator_tokens: NotSet, dictionary: NotSet, synonyms: NotSet, distinct_attribute: NotSet, proximity_precision: NotSet, typo_tolerance: NotSet, faceting: NotSet, pagination: NotSet, embedders: Set({"A_fakerest": Set(EmbeddingSettings { source: Set(Rest), model: NotSet, revision: NotSet, api_key: Set("My super secret"), dimensions: Set(384), binary_quantized: NotSet, document_template: NotSet, document_template_max_bytes: NotSet, url: Set("http://localhost:7777"), request: Set(String("{{text}}")), response: Set(String("{{embedding}}")), headers: NotSet, distribution: NotSet }), "B_small_hf": Set(EmbeddingSettings { source: Set(HuggingFace), model: Set("sentence-transformers/all-MiniLM-L6-v2"), revision: Set("e4ce9877abf3edfe10b0d82785e83bdcb973e22e"), api_key: NotSet, dimensions: NotSet, binary_quantized: NotSet, document_template: Set("{{doc.doggo}} the {{doc.breed}} best doggo"), document_template_max_bytes: NotSet, url: NotSet, request: NotSet, response: NotSet, headers: NotSet, distribution: NotSet })}), search_cutoff_ms: NotSet, localized_attributes: NotSet, _kind: PhantomData<meilisearch_types::settings::Unchecked> }, is_deletion: false, allow_index_creation: true }}
----------------------------------------------------------------------
### Status:
enqueued []
succeeded [0,]
----------------------------------------------------------------------
### Kind:
"settingsUpdate" [0,]
----------------------------------------------------------------------
### Index Tasks:
doggos [0,]
----------------------------------------------------------------------
### Index Mapper:
doggos: { number_of_documents: 0, field_distribution: {} }
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------

Some files were not shown because too many files have changed in this diff.