Compare commits

...

599 Commits

Author SHA1 Message Date
ca4876fd10 Do not reindex when modifying unknown faceted field 2024-03-12 16:18:58 +01:00
ee3076d5ba Merge #4462
4462: Divide threshold by ten r=dureuill a=ManyTheFish

Change the facet incremental vs bulk indexing threshold to better fit our user needs, it might be changed in the future if we have more insights


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-03-06 13:05:38 +00:00
ab1224bfa7 Merge #4458
4458: Replace logging timer by spans r=Kerollmops a=dureuill

- Remove logging timer dependency.
- Remplace last uses in search by spans

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-05 16:43:23 +00:00
eefc1c421e Merge #4459
4459: Put a bound on OpenAI timeout r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4460 

## What does this PR do?
- Makes sure that the timeout of the openai embedder is limited to max 1min, rather than the prior 15min+



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-05 15:18:51 +00:00
4d42a7af7c Merge #4445
4445: Add subcommand to run benchmarks r=irevoire a=dureuill

# Pull Request

## Related issue
Not user-facing, no issue

## What does this PR do?
- Adds a new `cargo xtask bench` subcommand that can run one or multiple workload files and report the results to a server
- A workload file is a JSON file with a specific schema
- Refactor our use of the `vergen` crate:
  - update to the beta `vergen-git2` crate
  - VERGEN_GIT_SEMVER_LIGHTWEIGHT => VERGEN_GIT_DESCRIBE
  - factor logic in a single `build-info` crate that is used both by meilisearch and xtask (prevents vergen variables from overriding themselves)
  - checked that defining the variables by hand when no git repo is available (docker build case) still works.
- Add CI to run `cargo xtask bench`

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-05 14:03:57 +00:00
7408db2a46 Meilisearch: fix date formatting 2024-03-05 14:56:48 +01:00
663629a9d6 Remove unused build dependency from xtask
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-05 14:45:06 +01:00
15c38dca78 Output RFC 3339 dates where we can
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-05 14:44:48 +01:00
7ee20b0895 Refactor xtask bench 2024-03-05 14:42:06 +01:00
0c216048b5 Cap timeout duration 2024-03-05 12:19:25 +01:00
36d17110d8 openai: Handle BAD_GETAWAY, be more resilient to failure 2024-03-05 12:18:54 +01:00
bdd428c22e Merge #4450
4450: Add the content type in the webhook + improve the test r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4436

## What does this PR do?
- Specify the content type of the webhook
- Ensure it’s the case in the test

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-05 10:36:53 +00:00
b130917933 add the content type in the webhook + improve the test 2024-03-05 11:22:29 +01:00
25f64ce7df Replace logging timer by spans 2024-03-05 11:05:42 +01:00
adcd848809 CI: Add bench workflows 2024-03-05 11:02:05 +01:00
eee46b7537 Add first workloads 2024-03-05 10:13:11 +01:00
55f60a3638 Update .gitignore
- Ignore `/bench` directory for git purposes
- Ignore benchmark DB
2024-03-05 10:12:52 +01:00
c608b3f9b5 Factor vergen stuff to a build-info crate 2024-03-05 10:11:43 +01:00
86ce843f3d Add cargo xtask bench 2024-03-05 10:11:43 +01:00
b11df7ec34 Meilisearch: fix some wrong spans 2024-03-05 10:11:43 +01:00
6862caef64 Span Stats compute self-time 2024-03-05 10:11:43 +01:00
f75c7ac979 Compile xtask in --release 2024-03-05 10:11:43 +01:00
eada6de261 Divide threshold by ten 2024-03-04 18:02:54 +01:00
f4a6261dea Merge #4453
4453: Don't test on nightly r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4441 better 😅 

## What does this PR do?
- No longer run tests on nightly

The motivation for this change is that we are now updating Rust at fixed points in time, and so no longer need nightly runs to ensure that a change won't get into stable and break our build at the worst possible moment.


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-29 14:41:59 +00:00
9806a3e5f6 Don't test on nightly 2024-02-29 14:24:50 +01:00
a96b45dda7 Merge #4451
4451: Fix nightly build r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4441 

## What does this PR do?
- Change imports following https://github.com/rust-lang/rust/pull/117772

## Note

This one is going to be annoying a bit until the lint stabilizes:

- We only get the warning on nightly, so we will discover them when it runs in the CI that uses the nightly compiler (not on regular PRs)
- There's the case of `TryInto`/`TryFrom` traits. They have been added to the prelude in Rust edition 2021, so it means that `use`ing them is a warning on nightly for 2021 edition crates (most crates), but not `use`ing them is an error anywhere for 2018 Rust edition crates, such as `milli`

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-29 07:20:22 +00:00
452a343a2b Fix imports 2024-02-28 18:09:40 +01:00
b87485e80d Merge #4433
4433: Enhance facet incremental r=Kerollmops a=ManyTheFish

# Pull Request

## Related issue
Fixes #4367
Fixes #4409

## What does this PR do?

- Add a test reproducing #4409
- Fix #4409 by removing a document from a level only if it is no more present in all the linked sub-level nodes
- Optimize facet Incremental indexing by creating or deleting a complete level once per field id instead of for each facet value
- Optimize facet Incremental indexing by doing the additions and the deletions in the same process instead of doing them separately


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-02-28 15:28:46 +00:00
147a67dc82 Merge #4446
4446: Do not omit vectors when importing a dump r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4447 

## What does this PR do?
- Correctly populate the maps of embedders before starting the indexing operations, while importing a dump


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-27 09:11:00 +00:00
716ffc07ee Build the embedders when importing a dump 2024-02-26 22:15:57 +01:00
b005eb3289 Merge #4435
4435: Make update file deletion atomic r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4432
Fixes https://github.com/meilisearch/meilisearch/issues/4438 by adding the logs the user asked

## What does this PR do?
- Adds a bunch of logs to help debug this kind of issue in the future
- Delete the update files AFTER committing the update in the `index-scheduler` (thus, if a restart happens, we are able to re-process the batch successfully)
- Multi-thread the deletion of all update files.


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-26 17:54:40 +00:00
9e664d87eb Merge #4443
4443: Add GPU analytics r=dureuill a=dureuill

# Pull Request

## Related issue

Adds analytics indicating whether Meilisearch  was compiled with the `milli/cuda` feature.

Cc `@macraig` 

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-26 17:13:45 +00:00
6dcb5219a0 Merge #4442
4442: Send custom task r=ManyTheFish a=irevoire

This PR has already been merged on main but was supposed to be merged on `release-v1.7.0` thus we need to merge it a second time; sorry 😓 

### This PR implements the necessary parameters for the High Availability

Introduce a new CLI flag called `--experimental-replication-parameters` that changes a few behaviors in the engine:
- [The auto-deletion of tasks is disabled](https://specs.meilisearch.com/specifications/text/0060-tasks-api.html#_2-technical-details)
- Upon registering a task, you can choose its task ID by sending a new header: `TaskId: 456645`. It must be a valid number, which must be superior to the last task id ever seen.
- Add the ability to « dry-register » a task. That means meilisearch will answer to you with a valid task ID like everything went well, but won’t actually write anything in the database. To do that, you need to use the `DryRun: true` header.
- Specification’s here: https://github.com/meilisearch/specifications/pull/266

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-26 15:20:16 +00:00
5e83bac448 Fix PR comments 2024-02-26 15:40:15 +01:00
0562818c2a fix and remove the file-store hack of /dev/null 2024-02-26 13:59:41 +01:00
a478392b7a create a test with the dry-run parameter enabled 2024-02-26 13:59:41 +01:00
bbf3fb88ca rename the cli parameter 2024-02-26 13:59:40 +01:00
60510e037b update the discussion link 2024-02-26 13:58:04 +01:00
36c27a18a1 implement the dry run ha parameter 2024-02-26 13:58:04 +01:00
1eb1c043b5 disable the auto deletion of tasks when the ha mode is enabled 2024-02-26 13:58:04 +01:00
507739bd98 add an experimental cli parameter to allow specifying your task id 2024-02-26 13:58:03 +01:00
eb25b07390 let you specify your task id 2024-02-26 13:56:31 +01:00
066a7a3cde takes only one read transaction per thread 2024-02-26 10:43:04 +01:00
55796406c5 Add GPU analytics 2024-02-26 10:41:47 +01:00
91cdd502f8 When processing tasks, make the update file deletion atomic 2024-02-22 14:56:22 +01:00
a493a50825 Fix clippy 2024-02-22 14:53:33 +01:00
9d1f489a37 Fix facet incremental indexing 2024-02-21 18:42:16 +01:00
865b415b3f Add test rerpoducing bug 2024-02-15 16:00:48 +01:00
5ee6aaddc4 Merge #4418
4418: Output logs to stderr r=dureuill a=irevoire

Output the logs to `stderr` instead of `stdout`. This was introduced in the `v1.7.0-rc.0` and is a bug; logs should always be outputted to stderr.

Fix #4419

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-15 14:31:37 +00:00
4148d391b8 move logs to stderr 2024-02-15 15:24:16 +01:00
88c6165e20 Merge #4410
4410: Implement the experimental log mode cli flag and log level updates at runtime r=dureuill a=irevoire

# Pull Request
This PR fixes two issues at once because they’re highly correlated in the codebase.

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4415
Fixes https://github.com/meilisearch/meilisearch/issues/4413

## What does this PR do?
- It makes the fmt logger configurable to output json or human-readable logs (like we already do today)
- It moves the fmt logger under a `reload` layer so we can update its targets at runtime
- Add the possibility to stream logs in the json mode
- Adds an analytics for the new CLI flag

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-15 10:01:06 +00:00
d097431113 Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-15 10:58:43 +01:00
1f8af81ba9 update the log mode discussion link 2024-02-15 10:32:48 +01:00
5d3bad4120 Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-15 10:31:23 +01:00
d34692e30b Merge #4365
4365: Update charabia r=dureuill a=ManyTheFish

Update Charabia v0.8.7,

- Add Vietnamese Normalization (Ð and Đ into d)

Fixes #4357

Charabia versions:
- https://github.com/meilisearch/charabia/releases/tag/v0.8.6
- https://github.com/meilisearch/charabia/releases/tag/v0.8.7

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-02-14 16:57:25 +00:00
a081da0d90 add support for the json format in the stream route 2024-02-14 15:34:39 +01:00
78e04520fc Update charabia version 2024-02-14 15:16:16 +01:00
72c1674a31 Merge #4350
4350: Make several indexing optimizations r=Kerollmops a=ManyTheFish

# Summary

Implement several enhancements to reduce the indexing time.

# Steps

- Compute the indexing chunk size dynamically based on the available threads and the data size
- Remove the merging step before the writing step and merge at the writing time
- Remove append function
- Make Facet search indexing incremental

# Running Indexing process

## `main`
Each type of data is written after a merging phase:
![Capture d’écran 2024-01-23 à 10 18 08](https://github.com/meilisearch/meilisearch/assets/6482087/6203c3ce-407c-46b4-8b83-04282da1bb16)

> Highlighted parts are the writings

## `remove-merging-phase-from-indexing`
When the extraction of a chunk is finished, the data is written:
![Capture d’écran 2024-01-23 à 10 18 18](https://github.com/meilisearch/meilisearch/assets/6482087/ab1307b4-d0a9-42ac-abbb-fdeb27ddf0d4)

> Highlighted parts are the writings

## Related

This PR removes the appending writes on several indexing parts, which may fix https://github.com/meilisearch/meilisearch/issues/4300. However, all of the appending writes are not removed. There are 2 remaining calls that could trigger this bug:
- When [putting embedders in the settings](b6fc181993/milli/src/update/settings.rs (L996))
- when [bulk indexing the facets](b6fc181993/milli/src/update/facet/bulk.rs (L150))


Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2024-02-14 14:12:48 +00:00
03bb6372af Change is_batchable_with by mergeable_with 2024-02-14 11:50:22 +01:00
3beda8833d Fix and add logs 2024-02-14 11:46:30 +01:00
3b6544db6d Implement the experimental log mode cli flag 2024-02-13 18:09:15 +01:00
55e942cd45 buggy 2024-02-13 15:26:30 +01:00
48026aa75c fix PR comments 2024-02-13 15:19:01 +01:00
e5e811e2c9 Update milli/src/update/index_documents/extract/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:21 +01:00
55de96f74e Update milli/src/update/facet/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:10 +01:00
15dafde21d Merge #4401
4401: Update version for the next release (v1.7.0) in Cargo.toml r=irevoire a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: irevoire <irevoire@users.noreply.github.com>
2024-02-12 10:17:10 +00:00
290f6d15e7 Update version for the next release (v1.7.0) in Cargo.toml 2024-02-12 10:15:00 +00:00
39c83cb3d9 fix clippy 2024-02-12 09:12:54 +01:00
7efb1cae11 yield in loop when the channel is not disconnected 2024-02-12 09:12:54 +01:00
7877788510 fix logs 2024-02-12 09:12:54 +01:00
be1b054b05 Compute chunk size based on the input data size ant the number of indexing threads 2024-02-08 17:28:37 +01:00
023c2d755f Merge #4391
4391: Tracing r=dureuill a=irevoire

# Pull Request

- [ ] Hide the parameters of the process batch
- [x] Make actix-web trace every call on every route
- [x] Remove all `env_logger`/`logs` dependencies
- [x] Be able to enable or disable the memory measurement using the `/logs` route parameters

See the following product discussion: https://github.com/orgs/meilisearch/discussions/721

Supersedes https://github.com/meilisearch/meilisearch/pull/4338

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4317

## What does this PR do?

Update the format of the logs from:
```
[2024-02-06T14:54:11Z INFO  actix_server::builder] starting 10 workers
```

to

```
2024-02-06T13:58:14.710803Z  INFO actix_server::builder: 200: starting 10 workers
```

First, run meilisearch with the route enabled via the feature flag:
- `cargo run --experimental-enable-logs-route`
- Or at runtime by sending the following payload:
```
curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json'  \
--data-binary '{
    "logsRoute": true
  }'
```

Then gather data from meilisearch by calling for example:
```
curl \
	-X POST http://localhost:7700/logs \
	-H 'Content-Type: application/json' \
	--data-binary '{
	    "mode": "fmt",
            "target": "milli=trace"
    }'
```

Once your operation is over, tell meilisearch to stop the route:
```
curl \
	-X DELETE http://localhost:7700/logs
```

----

In the case you’re profiling code, you will be interested by the next command that converts the output of the route to a format that the firefox profiler can understand.

```bash
cargo run --release --bin trace-to-firefox -- 2024-01-17_17:07:55-indexing-trace.json
```

Then go to https://profiler.firefox.com and load it.
Note that we can also share the profiles using the https://share.firefox.dev website.


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-08 14:16:56 +00:00
407ad753ed rust fmt 2024-02-08 15:11:42 +01:00
285aa15d2f make the mode camelCase instead of lowercase 2024-02-08 15:04:06 +01:00
bf43a3f60a fix typo 2024-02-08 15:04:06 +01:00
2c88131bb1 rename the fmt mode to human 2024-02-08 15:04:06 +01:00
35aa9d5904 fix an error message 2024-02-08 15:04:06 +01:00
cfb3e6b51f update the actix-web trace 2024-02-08 15:04:06 +01:00
1502382316 use debug instead of debug_span 2024-02-08 15:04:06 +01:00
ef994d84d0 Change error messages and fix tests 2024-02-08 15:04:06 +01:00
1b74010e9e Remove "with_line_numbers" 2024-02-08 15:04:06 +01:00
08af0e690c Structures a bunch of logs 2024-02-08 15:04:06 +01:00
d71b77f18b Add panic hook to log panics 2024-02-08 15:04:06 +01:00
c443ed7e3f delete inner .gitignore 2024-02-08 15:04:06 +01:00
db722d201a Write entries into database downgraded to trace level 2024-02-08 15:04:05 +01:00
91eb67e981 logs route: make memory profiling toggling usable 2024-02-08 15:04:05 +01:00
902d700a24 Tracing trace: toggle the profiling of memory at runtime 2024-02-08 15:04:05 +01:00
f70a615ed9 update the github discussion links 2024-02-08 15:04:05 +01:00
7ff722b72e get rids of the log dependencies everywhere 2024-02-08 15:04:05 +01:00
bcf7909bba add a profile_memory parameter disabled by default 2024-02-08 15:04:05 +01:00
ceb211c515 move the /logs route to the /logs/stream route 2024-02-08 15:04:05 +01:00
f3c34d5b8c Simplify MemoryStats fetching 2024-02-08 15:04:05 +01:00
4de2db6786 add back the actix-web logs 2024-02-08 15:04:05 +01:00
661baa716b logs route profile mode: don't barf bytes if the buffer is not empty 2024-02-08 15:04:05 +01:00
02dcaf07db Replace the procfs by libproc 2024-02-08 15:04:05 +01:00
d78ada07b5 spanstats: change field names 2024-02-08 15:04:05 +01:00
bc097d90cb tracing-trace: Spanstats deserializable + public fields 2024-02-08 15:04:05 +01:00
b393823f36 Replace stats_alloc with procfs 2024-02-08 15:04:05 +01:00
e773dfa9ba get rids of log in milli and add logs for the bucket sort 2024-02-08 15:04:05 +01:00
f158e96fe7 fix the auth 2024-02-08 15:04:05 +01:00
e23ec4886d fix the tests and add tests on the experimental features 2024-02-08 15:04:03 +01:00
7793ba67a4 hide the route logs behind a feature flag 2024-02-08 15:03:33 +01:00
80774148fd handle and tests errors 2024-02-08 15:03:33 +01:00
bf5cea8b10 add a test 2024-02-08 15:03:33 +01:00
38e1c40f38 meilisearch: logs route disconnects in profile mode 2024-02-08 15:03:33 +01:00
afc0585c1c meilisearch: don't spawn a report everytime Meilisearch starts 2024-02-08 15:03:33 +01:00
0e7a411d4d tracing-trace: introduce TraceWriter, trace now only exposes the channel 2024-02-08 15:03:33 +01:00
0f327f2821 tracing-trace: implement Error on Error 2024-02-08 15:03:33 +01:00
77254765e8 get rids of env loggegr and fix the tests 2024-02-08 15:03:33 +01:00
ce6e6ec2c5 stops profiling in a file by default 2024-02-08 15:03:32 +01:00
91a8f74763 Add cancel log route 2024-02-08 15:03:32 +01:00
abaa72e2bf start handling reloads with profiling 2024-02-08 15:03:32 +01:00
3c3a258a22 start exposing the profiling layer 2024-02-08 15:03:32 +01:00
73e66d5a97 Add dummy log when calling tasks 2024-02-08 15:03:32 +01:00
b8da117b9c Simplify stream implementation 2024-02-08 15:03:32 +01:00
5e52107474 better than before??? 2024-02-08 15:03:32 +01:00
bcf1c4dae5 make it compile and runtime error 2024-02-08 15:03:32 +01:00
50f84d43f5 init commit 2024-02-08 15:03:32 +01:00
f76cc0806e WIP: first draft at introducing a new log route 2024-02-08 15:03:32 +01:00
2f1abd2c03 nelson is not used anymore 2024-02-08 15:03:32 +01:00
dedc91e2cf use json lines 2024-02-08 15:03:32 +01:00
a61d8c59ff Add span stats processor 2024-02-08 15:03:32 +01:00
6e23040464 Use with tokio channel in Meilisearch 2024-02-08 15:03:32 +01:00
8febbf64ce Switch to tokio channel 2024-02-08 15:03:32 +01:00
b141c82a04 Support Events in trace layer 2024-02-08 15:03:32 +01:00
cc79cd0b04 Switch to a single view indicating current usage 2024-02-08 15:03:32 +01:00
256538ccb9 Refactor memory handling and add markers 2024-02-08 15:03:31 +01:00
ca8990394e Remove the stats_alloc from the default features 2024-02-08 15:03:31 +01:00
83fb2949c3 Give the allocator to the tracer when necessary 2024-02-08 15:03:31 +01:00
6cf703387d Format the bytes as human readable bytes
Uses the same `byte_unit` version as `meilisearch`
2024-02-08 15:03:31 +01:00
771861599b Logging the memory usage over time 2024-02-08 15:03:31 +01:00
7e47cea0c4 Add tracing to Meilisearch 2024-02-08 15:03:31 +01:00
5d7061682e Add tracing to milli 2024-02-08 15:03:31 +01:00
02e6c8a440 Add tracing to index-scheduler 2024-02-08 15:03:31 +01:00
89401d097b Add tracing-trace 2024-02-08 15:03:30 +01:00
72ebac1fbb Merge #4388
4388: Cap the maximum memory of the grenad sorters r=curquiza a=Kerollmops

This PR clamps the memory usage of the grenad sorters to a reasonable maximum. Grenad sorters are opened on multiple threads at a time. This can result in higher memory usage than expected, even though it shouldn't consume more than the memory available.

Fixes #4152.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-08 13:19:28 +00:00
a616a1d37b Merge #4389
4389: Stabilize scoreDetails r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4359

## What does this PR do?

### User standpoint

- Users no longer need to enable the `scoreDetails` experimental feature to use `showRankingScoreDetails` in search queries.
- ⚠️ **Breaking change**: sending an object containing the key `"scoreDetails"` to the `/experimental-features` route is now an error. However, importing a dump of a database where that feature was enabled completes successfully.

### Implementation standpoint
- remove `scoreDetails` from the experimental features
- remove check on the experimental feature `scoreDetails` before accepting `showRankingScoreDetails`
- remove `scoreDetails` from the accepted fields in the `/experimental-features` route
- fix tests accordingly

## Manual tests

1. exported a dump with the `scoreDetails` feature enabled on `main`
    - tried to import the dump after the changes in this PR
    - the dump imported successfully
2. tried to make a search with `showRankingScoreDetails: true`
    - the ranking score details are displayed
    - an automated test case also exists and passes
3. tried to enable the `scoreDetails` in `/experimental-features`
    - get error message 
      ```
       Unknown field `scoreDetails`: expected one of `vectorStore`, `metrics`, `exportPuffinReports`
      ```

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-08 10:40:00 +00:00
3e120619fa Merge #4375
4375: Feat: add new OpenAI models and ability to override dimensions r=dureuill a=Gosti

# Pull Request

Fixes #4394 

## Related discussion
https://github.com/orgs/meilisearch/discussions/677#discussioncomment-8306384

## What does this PR do?
- Add text-embedding-3-small
- Add text-embedding-3-large
- Add optional dimensions parameter for both new models


## Note
As the dimensions option is not available for text-embedding-ada-002 I've added a manual check to prevent, but I feel it could be implemented in a more idiomatic rust 

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Gosti <gostitsog@gmail.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-07 16:20:15 +00:00
a1caac9bfb Correct distribution shifts for new models 2024-02-07 15:09:16 +01:00
88d03c56ab Don't accept dimensions of 0 (ever) or dimensions greater than the default dimensions of the model 2024-02-07 11:52:09 +01:00
32ee05ccef Fix default dimensions for models 2024-02-07 11:52:09 +01:00
74c180267e pass dimensions only when defined 2024-02-07 11:52:08 +01:00
517f5332d6 Allow actually passing dimensions for OpenAI source
-> make sure the settings change is rejected or the settings task fails when the specified model doesn't support
overriding `dimensions` and the passed `dimensions` differs from the model's default dimensions.
2024-02-07 11:51:44 +01:00
9ac5750096 Retrieve the overriden dimensions from the configuration when fetching settings 2024-02-07 11:51:44 +01:00
7ae4013478 Make sure the overriden dimensions are always used when embedding 2024-02-07 11:51:44 +01:00
fb705116a6 feat: add new models and ability to override dimensions 2024-02-07 11:51:42 +01:00
053306c0e7 Try with 500MiB 2024-02-07 11:24:43 +01:00
84235a63df Merge #4360
4360: fix readme broken links r=curquiza a=Elliot67

# Pull Request

## What does this PR do?
- fix some links in the readme

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: Elliot Lintz <45725915+Elliot67@users.noreply.github.com>
Co-authored-by: gui machiavelli <hey@guimachiavelli.com>
2024-02-06 16:00:16 +00:00
29f8300ac7 Update README.md 2024-02-06 16:49:29 +01:00
05edd85d75 Stabilize scoreDetails 2024-02-06 11:15:19 +01:00
9eeb75d501 Clamp the max memory of the grenad sorters to a reasonable maximum 2024-02-06 10:47:04 +01:00
4792651462 Merge #4384
4384: Bump peter-evans/repository-dispatch from 2 to 3 r=curquiza a=dependabot[bot]

Bumps [peter-evans/repository-dispatch](https://github.com/peter-evans/repository-dispatch) from 2 to 3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/peter-evans/repository-dispatch/releases">peter-evans/repository-dispatch's releases</a>.</em></p>
<blockquote>
<h2>Repository Dispatch v3.0.0</h2>
<p>⚙️  Updated runtime to Node.js 20</p>
<ul>
<li>The action now requires a minimum version of <a href="https://github.com/actions/runner/releases/tag/v2.308.0">v2.308.0</a> for the Actions runner. Update self-hosted runners to v2.308.0 or later to ensure compatibility.</li>
</ul>
<h2>What's Changed</h2>
<ul>
<li>Bump prettier to fix deps by <a href="https://github.com/peter-evans"><code>`@​peter-evans</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/255">peter-evans/repository-dispatch#255</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.17.12 to 18.17.14 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/257">peter-evans/repository-dispatch#257</a></li>
<li>build(deps-dev): bump <code>`@​vercel/ncc</code>` from 0.36.1 to 0.38.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/258">peter-evans/repository-dispatch#258</a></li>
<li>build(deps): bump actions/checkout from 3 to 4 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/259">peter-evans/repository-dispatch#259</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.17.14 to 18.17.16 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/261">peter-evans/repository-dispatch#261</a></li>
<li>build(deps): bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/262">peter-evans/repository-dispatch#262</a></li>
<li>build(deps-dev): bump jest-circus from 29.6.4 to 29.7.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/263">peter-evans/repository-dispatch#263</a></li>
<li>build(deps-dev): bump eslint from 8.48.0 to 8.49.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/264">peter-evans/repository-dispatch#264</a></li>
<li>Update distribution by <a href="https://github.com/actions-bot"><code>`@​actions-bot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/265">peter-evans/repository-dispatch#265</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.17.16 to 18.17.18 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/266">peter-evans/repository-dispatch#266</a></li>
<li>build(deps-dev): bump eslint-plugin-github from 4.10.0 to 4.10.1 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/267">peter-evans/repository-dispatch#267</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.17.18 to 18.18.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/268">peter-evans/repository-dispatch#268</a></li>
<li>build(deps-dev): bump eslint from 8.49.0 to 8.50.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/269">peter-evans/repository-dispatch#269</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.18.0 to 18.18.3 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/271">peter-evans/repository-dispatch#271</a></li>
<li>build(deps-dev): bump eslint-plugin-prettier from 5.0.0 to 5.0.1 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/275">peter-evans/repository-dispatch#275</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.18.3 to 18.18.5 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/274">peter-evans/repository-dispatch#274</a></li>
<li>build(deps-dev): bump eslint from 8.50.0 to 8.51.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/276">peter-evans/repository-dispatch#276</a></li>
<li>build(deps-dev): bump <code>`@​babel/traverse</code>` from 7.16.3 to 7.23.2 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/278">peter-evans/repository-dispatch#278</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.18.5 to 18.18.6 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/279">peter-evans/repository-dispatch#279</a></li>
<li>build(deps-dev): bump <code>`@​vercel/ncc</code>` from 0.38.0 to 0.38.1 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/280">peter-evans/repository-dispatch#280</a></li>
<li>build(deps-dev): bump eslint from 8.51.0 to 8.52.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/281">peter-evans/repository-dispatch#281</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.18.6 to 18.18.7 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/282">peter-evans/repository-dispatch#282</a></li>
<li>build(deps): bump actions/setup-node from 3 to 4 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/283">peter-evans/repository-dispatch#283</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.18.7 to 18.18.8 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/284">peter-evans/repository-dispatch#284</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.18.8 to 18.18.9 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/285">peter-evans/repository-dispatch#285</a></li>
<li>build(deps-dev): bump eslint from 8.52.0 to 8.53.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/286">peter-evans/repository-dispatch#286</a></li>
<li>build(deps-dev): bump prettier from 3.0.3 to 3.1.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/287">peter-evans/repository-dispatch#287</a></li>
<li>build(deps-dev): bump eslint from 8.53.0 to 8.54.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/289">peter-evans/repository-dispatch#289</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.18.9 to 18.18.13 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/290">peter-evans/repository-dispatch#290</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.18.13 to 18.19.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/291">peter-evans/repository-dispatch#291</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.19.0 to 18.19.3 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/292">peter-evans/repository-dispatch#292</a></li>
<li>build(deps-dev): bump eslint from 8.54.0 to 8.55.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/293">peter-evans/repository-dispatch#293</a></li>
<li>build(deps-dev): bump prettier from 3.1.0 to 3.1.1 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/296">peter-evans/repository-dispatch#296</a></li>
<li>build(deps): bump actions/upload-artifact from 3 to 4 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/295">peter-evans/repository-dispatch#295</a></li>
<li>build(deps-dev): bump eslint from 8.55.0 to 8.56.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/297">peter-evans/repository-dispatch#297</a></li>
<li>build(deps-dev): bump eslint-plugin-prettier from 5.0.1 to 5.1.1 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/298">peter-evans/repository-dispatch#298</a></li>
<li>build(deps-dev): bump eslint-plugin-prettier from 5.1.1 to 5.1.2 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/299">peter-evans/repository-dispatch#299</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.19.3 to 18.19.4 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/300">peter-evans/repository-dispatch#300</a></li>
<li>build(deps-dev): bump eslint-plugin-prettier from 5.1.2 to 5.1.3 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/301">peter-evans/repository-dispatch#301</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.19.4 to 18.19.6 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/302">peter-evans/repository-dispatch#302</a></li>
<li>build(deps-dev): bump prettier from 3.1.1 to 3.2.4 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/303">peter-evans/repository-dispatch#303</a></li>
<li>build(deps-dev): bump <code>`@​types/node</code>` from 18.19.6 to 18.19.8 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/304">peter-evans/repository-dispatch#304</a></li>
<li>feat: update runtime to node 20 by <a href="https://github.com/peter-evans"><code>`@​peter-evans</code></a>` in <a href="https://redirect.github.com/peter-evans/repository-dispatch/pull/305">peter-evans/repository-dispatch#305</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="ff45666b94"><code>ff45666</code></a> feat: update runtime to node 20 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/305">#305</a>)</li>
<li><a href="a4a90276d0"><code>a4a9027</code></a> build(deps-dev): bump <code>`@​types/node</code>` from 18.19.6 to 18.19.8 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/304">#304</a>)</li>
<li><a href="2605253283"><code>2605253</code></a> build(deps-dev): bump prettier from 3.1.1 to 3.2.4 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/303">#303</a>)</li>
<li><a href="ab3258eeef"><code>ab3258e</code></a> build(deps-dev): bump <code>`@​types/node</code>` from 18.19.4 to 18.19.6 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/302">#302</a>)</li>
<li><a href="240bc73193"><code>240bc73</code></a> build(deps-dev): bump eslint-plugin-prettier from 5.1.2 to 5.1.3 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/301">#301</a>)</li>
<li><a href="8aa15c54a0"><code>8aa15c5</code></a> build(deps-dev): bump <code>`@​types/node</code>` from 18.19.3 to 18.19.4 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/300">#300</a>)</li>
<li><a href="22aa07cf23"><code>22aa07c</code></a> build(deps-dev): bump eslint-plugin-prettier from 5.1.1 to 5.1.2 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/299">#299</a>)</li>
<li><a href="ba0298574b"><code>ba02985</code></a> build(deps-dev): bump eslint-plugin-prettier from 5.0.1 to 5.1.1 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/298">#298</a>)</li>
<li><a href="accfd7b5bf"><code>accfd7b</code></a> build(deps-dev): bump eslint from 8.55.0 to 8.56.0 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/297">#297</a>)</li>
<li><a href="3c7d964ae9"><code>3c7d964</code></a> build(deps): bump actions/upload-artifact from 3 to 4 (<a href="https://redirect.github.com/peter-evans/repository-dispatch/issues/295">#295</a>)</li>
<li>Additional commits viewable in <a href="https://github.com/peter-evans/repository-dispatch/compare/v2...v3">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=peter-evans/repository-dispatch&package-manager=github_actions&previous-version=2&new-version=3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-05 14:13:35 +00:00
58c3501b54 Bump peter-evans/repository-dispatch from 2 to 3
Bumps [peter-evans/repository-dispatch](https://github.com/peter-evans/repository-dispatch) from 2 to 3.
- [Release notes](https://github.com/peter-evans/repository-dispatch/releases)
- [Commits](https://github.com/peter-evans/repository-dispatch/compare/v2...v3)

---
updated-dependencies:
- dependency-name: peter-evans/repository-dispatch
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-01 17:05:50 +00:00
ff76d8f21a Merge #4382
4382: Bring back changes from `release-v1.6.1` into `main` r=curquiza a=dureuill

Bring back changes from release-v1.6.1 into main

Supersedes https://github.com/meilisearch/meilisearch/pull/4380 and #4381 

Third time's the charm

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2024-02-01 11:16:31 +00:00
698ea5139d Update Cargo.lock 2024-02-01 10:40:23 +01:00
880e790bff Update Cargo.toml 2024-02-01 10:33:27 +01:00
fbf5f2a392 Don't use a runtime in extract_embedder, use it only for OpenAI 2024-02-01 10:33:27 +01:00
1555870088 Truncate HuggingFace vectors that are too long 2024-02-01 10:33:27 +01:00
9f8f3105d5 make clippy happy 2024-02-01 10:33:27 +01:00
318843aacd add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there is malformed documents in the DB 2024-02-01 10:33:27 +01:00
6d111139b5 Add test 2024-02-01 10:33:27 +01:00
dff2707471 Use MatchingWords from keyword search instead of the one from vector search 2024-02-01 10:33:27 +01:00
c57f7f7379 Update version for the next release (v1.6.1) in Cargo.toml 2024-02-01 10:33:26 +01:00
b968616a99 Merge #4364
4364: Revert "Remove panic on the geosearch" r=curquiza a=irevoire

After more thought about it, we want to fix this bug in a patch release instead of `main`.
I revert this PR for now, but the fix will still land on `main` once we bring back the change of the `v1.6.1` on `main`.

Reverts meilisearch/meilisearch#4337

Co-authored-by: Tamo <irevoire@protonmail.ch>
2024-01-25 18:01:08 +00:00
c1bf33a112 Revert "Remove panic on the geosearch" 2024-01-25 18:51:19 +01:00
ddc2b7129a fix readme broken links 2024-01-24 22:50:18 +01:00
b6fc181993 Merge #4304
4304: Add CUDA GPU support for Hugging Face embedders r=Kerollmops a=dureuill

Adds a "cuda" feature to `milli`.

Compiling with this feature requires that the CUDA support library be installed (see "with CUDA support" paragraph in https://huggingface.github.io/candle/guide/installation.html), and adds CUDA support to the `huggingFace` embedder.

To enable GPU support, users will need to:

1. Have a compatible NVidia GPU under Linux
2. Follow [the guide](https://huggingface.github.io/candle/guide/installation.html) to install the CUDA dependencies
3. Compile Meilisearch with the `cuda` feature: `cargo build --release --features cuda`

# Impact

Enabling the CUDA feature allows to use an available GPU to compute embeddings with a `huggingFace` embedder. 
On an AWS Graviton 2, this yields a x3 - x5 improvement on indexing time.

# Technical details

- I had to change the CI so that the cuda feature is not included in the `Tests all features` workflow
- To achieve that, I had to add a binary following the `cargo xtask` design pattern, to list all features excepted the cuda one.
- I then changed the workflow accordingly (renamed to "Tests almost all features" 😉)
- A test run of the new feature was done on a temporary version of this PR that had it enabled for PRs: [See the results here](https://github.com/meilisearch/meilisearch/actions/runs/7461331929/job/20301216732)

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-01-22 13:55:04 +00:00
388fce9e46 Merge #4345
4345: Bump h2 from 0.3.20 to 0.3.24 r=curquiza a=dependabot[bot]

Bumps [h2](https://github.com/hyperium/h2) from 0.3.20 to 0.3.24.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/hyperium/h2/releases">h2's releases</a>.</em></p>
<blockquote>
<h2>v0.3.24</h2>
<h2>Fixed</h2>
<ul>
<li>Limit error resets for misbehaving connections.</li>
</ul>
<h2>v0.3.23</h2>
<h2>What's Changed</h2>
<ul>
<li>cherry-pick fix: streams awaiting capacity lockout in <a href="https://redirect.github.com/hyperium/h2/pull/734">hyperium/h2#734</a></li>
</ul>
<h2>v0.3.22</h2>
<h2>What's Changed</h2>
<ul>
<li>Add <code>header_table_size(usize)</code> option to client and server builders.</li>
<li>Improve throughput when vectored IO is not available.</li>
<li>Update indexmap to 2.</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/tottoto"><code>`@​tottoto</code></a>` made their first contribution in <a href="https://redirect.github.com/hyperium/h2/pull/714">hyperium/h2#714</a></li>
<li><a href="https://github.com/xiaoyawei"><code>`@​xiaoyawei</code></a>` made their first contribution in <a href="https://redirect.github.com/hyperium/h2/pull/712">hyperium/h2#712</a></li>
<li><a href="https://github.com/Protryon"><code>`@​Protryon</code></a>` made their first contribution in <a href="https://redirect.github.com/hyperium/h2/pull/719">hyperium/h2#719</a></li>
<li><a href="https://github.com/4JX"><code>`@​4JX</code></a>` made their first contribution in <a href="https://redirect.github.com/hyperium/h2/pull/638">hyperium/h2#638</a></li>
<li><a href="https://github.com/vuittont60"><code>`@​vuittont60</code></a>` made their first contribution in <a href="https://redirect.github.com/hyperium/h2/pull/724">hyperium/h2#724</a></li>
</ul>
<h2>v0.3.21</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix opening of new streams over peer's max concurrent limit.</li>
<li>Fix <code>RecvStream</code> to return data even if it has received a <code>CANCEL</code> stream error.</li>
<li>Update MSRV to 1.63.</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/DDtKey"><code>`@​DDtKey</code></a>` made their first contribution in <a href="https://redirect.github.com/hyperium/h2/pull/703">hyperium/h2#703</a></li>
<li><a href="https://github.com/jwilm"><code>`@​jwilm</code></a>` made their first contribution in <a href="https://redirect.github.com/hyperium/h2/pull/707">hyperium/h2#707</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/hyperium/h2/blob/v0.3.24/CHANGELOG.md">h2's changelog</a>.</em></p>
<blockquote>
<h1>0.3.24 (January 17, 2024)</h1>
<ul>
<li>Limit error resets for misbehaving connections.</li>
</ul>
<h1>0.3.23 (January 10, 2024)</h1>
<ul>
<li>Backport fix from 0.4.1 for stream capacity assignment.</li>
</ul>
<h1>0.3.22 (November 15, 2023)</h1>
<ul>
<li>Add <code>header_table_size(usize)</code> option to client and server builders.</li>
<li>Improve throughput when vectored IO is not available.</li>
<li>Update indexmap to 2.</li>
</ul>
<h1>0.3.21 (August 21, 2023)</h1>
<ul>
<li>Fix opening of new streams over peer's max concurrent limit.</li>
<li>Fix <code>RecvStream</code> to return data even if it has received a <code>CANCEL</code> stream error.</li>
<li>Update MSRV to 1.63.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="7243ab5854"><code>7243ab5</code></a> Prepare v0.3.24</li>
<li><a href="d919cd6fd8"><code>d919cd6</code></a> streams: limit error resets for misbehaving connections</li>
<li><a href="a7eb14a487"><code>a7eb14a</code></a> v0.3.23</li>
<li><a href="b668c7fbe2"><code>b668c7f</code></a> fix: streams awaiting capacity lockout (<a href="https://redirect.github.com/hyperium/h2/issues/730">#730</a>) (<a href="https://redirect.github.com/hyperium/h2/issues/734">#734</a>)</li>
<li><a href="0f412d8b9c"><code>0f412d8</code></a> v0.3.22</li>
<li><a href="c7ca62f69b"><code>c7ca62f</code></a> docs: fix typos (<a href="https://redirect.github.com/hyperium/h2/issues/724">#724</a>)</li>
<li><a href="ef743ecb22"><code>ef743ec</code></a> Add a setter for header_table_size (<a href="https://redirect.github.com/hyperium/h2/issues/638">#638</a>)</li>
<li><a href="56651e6e51"><code>56651e6</code></a> fix lint about unused import</li>
<li><a href="4aa7b16342"><code>4aa7b16</code></a> Fix documentation for max_send_buffer_size (<a href="https://redirect.github.com/hyperium/h2/issues/718">#718</a>)</li>
<li><a href="d03c54a80d"><code>d03c54a</code></a> chore(dependencies): update tracing minimal version to 0.1.35</li>
<li>Additional commits viewable in <a href="https://github.com/hyperium/h2/compare/v0.3.20...v0.3.24">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=h2&package-manager=cargo&previous-version=0.3.20&new-version=0.3.24)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/meilisearch/meilisearch/network/alerts).

</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-22 11:53:51 +00:00
d35fe43fd5 Update lock file 2024-01-22 10:49:17 +01:00
f692021bfc Implement PR comments 2024-01-22 10:25:56 +01:00
1b90778bf5 Change CI 2024-01-22 10:25:56 +01:00
66ae81a909 Make it so binary can be used with cargo xtask 2024-01-22 10:25:56 +01:00
4aa4a15dc9 Add to Cargo.lock 2024-01-22 10:25:54 +01:00
4b4e8ea2a4 Add binary to list features 2024-01-22 10:25:16 +01:00
84f49d76cd Add cuda feature 2024-01-22 10:25:16 +01:00
afb0e8eab9 Merge #4325
4325: Add Setting API reminder in issue template r=ManyTheFish a=ManyTheFish

When adding a new setting, several important points can be easily forgotten.
This PR adds a small reminder list of some of these points in the issue template.


Co-authored-by: Many the fish <many@meilisearch.com>
2024-01-22 09:02:27 +00:00
b5b2333a05 Bump h2 from 0.3.20 to 0.3.24
Bumps [h2](https://github.com/hyperium/h2) from 0.3.20 to 0.3.24.
- [Release notes](https://github.com/hyperium/h2/releases)
- [Changelog](https://github.com/hyperium/h2/blob/v0.3.24/CHANGELOG.md)
- [Commits](https://github.com/hyperium/h2/compare/v0.3.20...v0.3.24)

---
updated-dependencies:
- dependency-name: h2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-19 16:20:22 +00:00
40fa0b4df6 Update .github/ISSUE_TEMPLATE/sprint_issue.md 2024-01-18 11:17:29 +01:00
ab4d614599 Update .github/ISSUE_TEMPLATE/sprint_issue.md
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-01-18 10:28:30 +01:00
262b20fdba Merge #4330
4330: Add job variable to grafana dashboard r=irevoire a=capJavert

# Pull Request

## Related issue
Fixes https://github.com/orgs/meilisearch/discussions/625#discussioncomment-8143282

## What does this PR do?

"meilisearch" as [job_name](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#job_name) was hardcoded in the dashboard config so if user sets anything but "meilisearch" as job_name on prometheus side the dashboard does not work.

With this change dasboard will auto load the values from data source (much like instance variable) and show the correct data. This now also adds support for multiple meilisearch jobs in single dashboard. 

See: https://github.com/orgs/meilisearch/discussions/625#discussioncomment-8143282

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: capJavert <ante@kickass.website>
2024-01-17 15:48:24 +00:00
9020606c45 Merge #4337
4337: Remove panic on the geosearch r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes  #4333

## What does this PR do?
- Add tests for the enrich pipeline on malformed documents with `null` value
- Reproduce the issue when updating the settings while there is malformed documents in the DB
- Fix the bug


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-01-17 15:09:46 +00:00
0887186ecf make clippy happy 2024-01-17 16:07:10 +01:00
7d190d8078 add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there is malformed documents in the DB 2024-01-17 15:51:52 +01:00
3b8a9597e2 Merge #4332
4332: Update the dependencies r=irevoire a=Kerollmops

This PR upgrades the dependencies and fixes #4287.

 - ~We keep arroy at the current commit. We will release and use the latest version published when possible~
 - We also updated arroy to 0.2.0.
 - I rolled back the version of rustls has too many breaking changes.
 - I had to keep HTTP to 0.2.11 due to actix-cors.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-01-17 13:42:02 +00:00
f275554982 Make sure we override the default Rust version 2024-01-16 18:10:30 +01:00
d997ea1f01 Make Clippy happy 2024-01-16 17:10:48 +01:00
50e1d34c66 Rollback http to 0.2.11 2024-01-16 16:57:33 +01:00
406531c991 Fix sysinfo 2024-01-16 16:49:51 +01:00
01e2c3d6bb Bump arroy to v0.2.0 2024-01-16 16:45:55 +01:00
cfaa522d68 Bump the Rust version to 1.75.0 2024-01-16 16:36:54 +01:00
0c8d1644a6 Rollback rustls to 0.20.9 2024-01-16 15:55:16 +01:00
5e0268d40e Fix the sysinfo errors 2024-01-16 15:43:03 +01:00
9f9ad4cc05 Fix Clippy warnings 2024-01-16 15:27:24 +01:00
3ee7682fa7 Fix some integer comparisons 2024-01-16 15:22:23 +01:00
7f125bfb12 Update incompatible dependencies 2024-01-16 15:15:54 +01:00
5869ca7716 Upgrade all compatible dependencies 2024-01-16 15:05:03 +01:00
7a89abd2a0 Merge #4263
4263: Bump rustls-webpki from 0.101.3 to 0.101.7 r=irevoire a=dependabot[bot]

Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.101.3 to 0.101.7.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/rustls/webpki/releases">rustls-webpki's releases</a>.</em></p>
<blockquote>
<h2>0.101.7</h2>
<ul>
<li>Upgrades <code>*ring*</code> to 0.17, and <code>untrusted</code> to 0.9. Note: since <code>untrusted</code> appears in the <code>Error</code> API this may be a breaking change for applications using two <code>untrusted</code> versions.</li>
</ul>
<h2>What's Changed</h2>
<ul>
<li>Simplify tests for DER errors by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/rustls/webpki/pull/193">rustls/webpki#193</a></li>
<li>Upgrade to ring 0.17, untrusted 0.9 by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/rustls/webpki/pull/193">rustls/webpki#193</a></li>
<li>Bump MSRV to 1.61 by <a href="https://github.com/djc"><code>`@​djc</code></a>` in <a href="https://redirect.github.com/rustls/webpki/pull/193">rustls/webpki#193</a></li>
<li>Upgrade to rcgen 0.11.3 by <a href="https://github.com/cpu"><code>`@​cpu</code></a>` in <a href="https://redirect.github.com/rustls/webpki/pull/189">rustls/webpki#189</a>, <a href="https://redirect.github.com/rustls/webpki/pull/195">rustls/webpki#195</a></li>
<li>v0.101.7 preparation by <a href="https://github.com/cpu"><code>`@​cpu</code></a>` in <a href="https://redirect.github.com/rustls/webpki/pull/199">rustls/webpki#199</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/rustls/webpki/compare/v/0.101.6...v/0.101.7">https://github.com/rustls/webpki/compare/v/0.101.6...v/0.101.7</a></p>
<h2>0.101.6</h2>
<ul>
<li>The <code>CertificateRevocationList</code> trait's <code>verify_signature</code> <code>Budget</code> argument was removed. This was a semver incompatible change mistakenly introduced in v0.101.5.</li>
</ul>
<h2>What's Changed</h2>
<ul>
<li>crl: rm Budget from verify_signature fn by <a href="https://github.com/cpu"><code>`@​cpu</code></a>` in <a href="https://redirect.github.com/rustls/webpki/pull/187">rustls/webpki#187</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/rustls/webpki/compare/v/0.101.5...v/0.101.6">https://github.com/rustls/webpki/compare/v/0.101.5...v/0.101.6</a></p>
<h2>0.101.5</h2>
<ul>
<li>Path building complexity is now limited to a maximum budget of path finding operations, avoiding exponential processing time when encountering certificate chains containing many certificates with the same subject/issuer distinguished name but different subject public key information.</li>
<li>Name constraints evaluation is now limited to a maximum number of comparison operations, avoiding exponential processing time when encountering certificate chains containing many name constraints and subject alternate names.</li>
<li>Subject common names are no longer parsed for name iteration, or applying name constraints. Webpki only uses Subject Alternate Names when validating certificates, and the common name handling was buggy, producing <code>Error::BadDer</code> when iterating certificates with printable string subject common names, or omitted common names encoded as an empty sequence.</li>
</ul>
<h2>What's Changed</h2>
<p>The following PRs were backported to the rel-0.101 branch in <a href="https://redirect.github.com/rustls/webpki/issues/170">#170</a>:</p>
<ul>
<li>Further limits on expensive path building (<a href="https://redirect.github.com/rustls/webpki/issues/163">#163</a>)</li>
<li>Budget tweaks (<a href="https://redirect.github.com/rustls/webpki/issues/164">#164</a>)</li>
<li>Bound name constraint comparisons (<a href="https://redirect.github.com/rustls/webpki/issues/165">#165</a>)</li>
<li>Remove subject common name parsing (<a href="https://redirect.github.com/rustls/webpki/issues/169">#169</a>, thanks to <a href="https://github.com/hawkw"><code>`@​hawkw</code></a>)</li>`
<li>Correct handling of fatal errors (<a href="https://redirect.github.com/rustls/webpki/issues/168">#168</a>)</li>
</ul>
<p>Thanks to all who have contributed, on behalf of the rustls team (<a href="https://github.com/ctz"><code>`@​ctz</code></a>,` <a href="https://github.com/cpu"><code>`@​cpu</code></a>` and <a href="https://github.com/djc"><code>`@​djc</code></a>)!</p>`
<h2>0.101.4</h2>
<h2>Release notes</h2>
<ul>
<li>certificate path building and verification is now capped at 100 signature validation operations to avoid the risk of CPU usage denial-of-service attack when validating crafted certificate chains producing quadratic runtime. This risk affected both clients, as well as servers that verified client certificates.</li>
</ul>
<h2>What's Changed</h2>
<ul>
<li>v0.101.4 prep by <a href="https://github.com/cpu"><code>`@​cpu</code></a>` in <a href="https://redirect.github.com/rustls/webpki/pull/153">rustls/webpki#153</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/rustls/webpki/compare/v/0.101.3...v/0.101.4">https://github.com/rustls/webpki/compare/v/0.101.3...v/0.101.4</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="ee5aab1dff"><code>ee5aab1</code></a> Cargo: v0.101.6 -&gt; v0.101.7</li>
<li><a href="4f721a901f"><code>4f721a9</code></a> Upgrade to rcgen 0.11.3</li>
<li><a href="3be3625584"><code>3be3625</code></a> Bump MSRV to 1.61</li>
<li><a href="bb7c7f47ab"><code>bb7c7f4</code></a> Upgrade to ring 0.17, untrusted 0.9</li>
<li><a href="2eeb2920cf"><code>2eeb292</code></a> Simplify tests for DER errors</li>
<li><a href="7956538ee7"><code>7956538</code></a> Cargo: v0.101.5 -&gt; v0.101.6</li>
<li><a href="7f8208ec06"><code>7f8208e</code></a> crl: rm <code>Budget</code> from <code>verify_signature</code> fn</li>
<li><a href="7cb6c646a0"><code>7cb6c64</code></a> Cargo: bump version 0.101.4 -&gt; 0.101.5</li>
<li><a href="2dd2a06016"><code>2dd2a06</code></a> verify_cert: use enum for build chain error</li>
<li><a href="c255d61a6a"><code>c255d61</code></a> verify_cert: correct handling of fatal errors</li>
<li>Additional commits viewable in <a href="https://github.com/rustls/webpki/compare/v/0.101.3...v/0.101.7">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=rustls-webpki&package-manager=cargo&previous-version=0.101.3&new-version=0.101.7)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/meilisearch/meilisearch/network/alerts).

</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-16 13:55:49 +00:00
d9d0419845 Update the dependencies 2024-01-16 14:38:48 +01:00
5dc8d9e9bf feat: add job variable to dashboard
meilisearch job_name was hardcoded in the dashboard config

so if user sets anything but meilisearch as job_name on

prometheus side the dashboard does not work

see: https://github.com/orgs/meilisearch/discussions/625#discussioncomment-8143282
2024-01-16 12:44:37 +01:00
9e12a91afb Update .github/ISSUE_TEMPLATE/sprint_issue.md 2024-01-16 11:04:50 +01:00
8e016fbfeb Merge #4319
4319: Update README r=curquiza a=codesmith-emmy

# Pull Request

## Related issue
Fixes #<issue_number>

## What does this PR do?
- ...

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: emmanuel <154705254+codesmith-emmy@users.noreply.github.com>
2024-01-15 18:41:14 +00:00
1ccde9bf0b Merge #4316
4316: Autobatch the task deletions r=curquiza a=irevoire

# Pull Request

## Related issue
Fix part of https://github.com/meilisearch/meilisearch-support/issues/69
Fix #4315 

## What does this PR do?
- Autobatch the task deletions

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-01-15 17:54:50 +00:00
34e814f400 Merge #4327
4327: Bring back changes from `release-v1.6.0` to `main` r=dureuill a=curquiza



Co-authored-by: Paul Sanders <psanders1@gmail.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Louis Dureuil <louis.dureuil@xinra.net>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2024-01-15 16:52:05 +00:00
857cd09285 Add Setting API reminder in issue template
When adding a new setting, there are several important points that can be easily forgotten.
This PR adds a small reminder list of some of these points.
2024-01-15 11:19:13 +01:00
a6fa0b97ec Merge #4318
4318: Hide embedders r=ManyTheFish a=dureuill

Hides `embedders` when it is an empty dictionary.

Manual tests:

- getting settings with empty embedders: not displayed
- getting settings with non-empty embedders: displayed like before
- dump with empty embedders: can be imported
- dump with non-empty embedders: can be imported

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-01-15 09:37:31 +00:00
552127021f Update 2024-01-12 16:03:23 +01:00
38abfec611 Fix tests 2024-01-11 21:35:30 +01:00
84a5c304fc Don't display the embedders setting when it is an empty dict 2024-01-11 21:35:06 +01:00
e93d36d5b9 Merge #4313
4313: Fix document formatting performances r=Kerollmops a=ManyTheFish

reduce the formatted option list to the attributes that should be formatted,
instead of all the attributes to display.
The time to compute the `format` list scales with the number of fields to format;
cumulated with `map_leaf_values` that iterates over all the nested fields, it gives a quadratic complexity:
`d*f` where `d` is the total number of fields to display and `f` is the total number of fields to format.

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-01-11 14:19:44 +00:00
95f8e21533 fix typos 2024-01-11 15:07:08 +01:00
b4d7d80ad9 autobatch the task deletions 2024-01-11 14:58:07 +01:00
68f197624e Merge #4314
4314: Fix proximity precision telemetry r=Kerollmops a=ManyTheFish

The proximity precision telemetry was partially missing in the global setting route.
This PR adds the missing field and return the default value when the value is not set.


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-01-11 13:50:03 +00:00
b79b03d4e2 Fix proximity precision telemetry 2024-01-11 13:24:26 +01:00
86270e6878 Transform fields contained into _format into strings 2024-01-11 12:44:56 +01:00
81b6128b29 Update tests 2024-01-11 12:28:32 +01:00
5f5a486895 Reduce formatting time 2024-01-11 11:36:41 +01:00
5f4fc6c955 Add timer logs 2024-01-11 09:44:16 +01:00
1f5e8fc072 Merge #4311
4311: Limit the number of values returned by the facet search r=dureuill a=Kerollmops

This PR fixes a bug where the number of values per facet returned by the `indexes/{index}/facet-search` route was not tacking the `faceting.maxValuePerFacet` setting. It also adds a test.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-01-10 16:04:06 +00:00
3f3462ab62 Limit the number of values returned by the facet search 2024-01-10 16:54:08 +01:00
93363b0201 Merge #4308
4308: Fix hang on `/indexes` and `/stats` routes r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #4218 

## Context

- A previous fix added a field to the `IndexScheduler` to memorize the `currently_updating_index`, so that accessing it through the search would return the handle without trying to open it. This resolved a hang on the search, but #4218 reported further hangs on the `/indexes` and `/stats` routes
- These routes were shunting the `IndexScheduler` and using internal `IndexMapper` logic to access the indexes, again trying to reopen the updating index.

## What does this PR do?

- Moves the logic relative to the `currently_updating_index` from the `IndexScheduler` to the `IndexMapper`, so that any index request to the `IndexMapper` can benefit from it.

## Test

1. Follow reproducer from #4218 
2. Before this PR, notice a hang on `/stats` and `/indexes`, but not on `/indexes/<updating_index>/search`
3. After this PR, notice no hang on either of `/stats`, `/indexes` or `/indexes/<updating_index>/search`



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-01-10 10:46:20 +00:00
97bb1ff9e2 Move currently_updating_index to IndexMapper 2024-01-09 15:37:27 +01:00
5ee1378856 Merge #4303
4303: Display default value when proximityPrecision is not set r=dureuill a=ManyTheFish

# Pull Request

## Related
Issue: #4187
Spec change requests: https://github.com/meilisearch/specifications/pull/261#discussion_r1441725272

## What does this PR do?
- Display default value when proximityPrecision is not set instead of Null


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-01-08 14:29:57 +00:00
e27b850b09 move the default display strategy on setting getter function 2024-01-08 14:03:47 +01:00
f75f22e026 Display default value when proximityPrecision is not set 2024-01-08 11:09:37 +01:00
6203f4acef Merge #4296
4296: Fix single element search r=irevoire a=dureuill

# Pull Request

Before this PR, indexing a single vector in a single document would result in the vector not being found by the vector search.

This PR adds a test case for this condition, and resolves it by bumping arroy to a version containing the fix.

# Test case

Output of the test before and after this PR:

```diff
diff --git a/meilisearch/tests/search/hybrid.rs b/meilisearch/tests/search/hybrid.rs
index 2cd4b83e7..79819cab2 100644
--- a/meilisearch/tests/search/hybrid.rs on release-v1.6.0
+++ b/meilisearch/tests/search/hybrid.rs on fix-single-element-search
`@@` -171,5 +171,5 `@@` async fn single_document() {
     .await;

     snapshot!(code, `@"200` OK");
-    snapshot!(response["hits"][0], `@r###"{"title":"Shazam!","desc":"a` Captain Marvel ersatz","id":"1","_vectors":{"default":[1.0,3.0]},"_rankingScore":0.0}"###);
+    snapshot!(response["hits"][0], `@r###"{"title":"Shazam!","desc":"a` Captain Marvel ersatz","id":"1","_vectors":{"default":[1.0,3.0]},"_rankingScore":1.0,"_semanticScore":1.0}"###);
 }
```



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-01-03 15:01:43 +00:00
12edc2c20a Update arroy to a fixed version 2024-01-03 15:59:37 +01:00
94b9f3b310 Add test 2024-01-03 15:56:20 +01:00
5204c0b60b Merge #4297
4297: Update license for 2024 r=curquiza a=meili-bot

_This PR is auto-generated._


Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com>
2024-01-03 13:54:19 +00:00
e73cd692db Update LICENSE 2024-01-03 14:32:41 +01:00
29b453346b Merge #4293
4293: Update SDK test dependencies r=curquiza a=curquiza

Replace dependabot updates

The changes are really un-impactful for the engine team velocity because is about a CI
- that does not run during release deployment
- that does not run to merge a PR

It's only a weekly scheduled CI to check the breaking we introduced in the integrations.

I updated the dependencies based on what we do on the integration CIs
For example for dart, I looked at what we have in the [Dart CI](63fd758882/.github/workflows/tests.yml (L16-L54)) and I updated our CI in this repo accordingly. I did the same for each repository. This ensures we test the same things.


Co-authored-by: curquiza <clementine@meilisearch.com>
2024-01-03 13:26:50 +00:00
c4bb435374 Merge #4295
4295: fix compilation warnings on main r=curquiza a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4292

## What does this PR do?
- Removed unused imports

#4294 fixes the issue for the release v1.6

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-01-02 15:33:06 +00:00
da99a04eb3 Merge #4294
4294: fix compilation warnings for release v1.6 r=curquiza a=irevoire

# Pull Request

## Related issue
Fixes #4292

## What does this PR do?
- Removed unused imports

#4295 fixes the issue no main

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-01-02 15:00:40 +00:00
54ae6951eb fix warning 2024-01-02 15:19:30 +01:00
2bcff2ea46 fix warning 2024-01-02 15:19:00 +01:00
1275e72e0b Update SDK test dependencies 2024-01-02 09:59:46 +01:00
658ec6e0a4 Merge #4279
4279: Check experimental feature on setting update query rather than in the task. r=ManyTheFish a=dureuill

Improve the UX by checking for the vector store feature and returning an error synchronously when sending a setting update, rather than in the indexing task.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-12-22 11:36:12 +00:00
43e822e802 Merge #4238
4238: Task queue webhook r=dureuill a=irevoire

# Prototype `prototype-task-queue-webhook-1`

The prototype is available through Docker by using the following command:

```bash
docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-task-queue-webhook-1
```

# Pull Request

Implements the task queue webhook.

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4236

## What does this PR do?
- Provide a new cli and env var for the webhook, respectively called `--task-webhook-url` and `MEILI_TASK_WEBHOOK_URL`
- Also supports sending the requests with a custom `Authorization` header by specifying the optional `--task-webhook-authorization-header` CLI parameter or `MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER` env variable.
- Throw an error if the specified URL is invalid
- Every time a batch is processed, send all the finished tasks into the webhook with our public `TaskView` type as a JSON Line GZIPed body.
- Add one test.

## PR checklist

### Before becoming ready to review
- [x] Add a test
- [x] Compress the data we send
- [x] Chunk and stream the data we send
- [x] Remove the unwrap in the index-scheduler when sending the data fails
- [x] The analytics are missing

### Before merging
- [x] Release a prototype



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-12-21 14:43:46 +00:00
ee54d3171e Check experimental feature at query time 2023-12-21 15:26:12 +01:00
a0e713c4e7 Merge #4277
4277: Update mini-dashboard to v0.2.12 r=curquiza a=mdubus

# Pull Request

## Related issue
Fixes #4276

## What does this PR do?
Upgrade mini-dashboard to version 0.2.12 ([see changes](https://github.com/meilisearch/mini-dashboard/releases/tag/v0.2.12))

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2023-12-21 11:03:46 +00:00
d4cb0a885b Merge #4275
4275: Flatten settings r=dureuill a=dureuill

# Pull Request

## Related issue
Initial internal feedback seems to indicate that the current shape of the `embedders` setting is undesirable: it has too much depth.

This PR changes this by flattening the structure of the embedders to the following:

```json5
// NEW structure
"embedders": {
  // still starts with the embedder name
  "default": {
    "source": "huggingFace", // now a string
    // properties of the source are all at the same level as the source
    "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "revision": "a9c555277f9bcf24f28fa5e092e665fc6f7c49cd",
    "documentTemplate": "A product titled '{{doc.title}}'" // now a string
  }
}
```

By comparison, the old structure was:

```json5
// PREVIOUS version, no longer working with this PR
"embedders": {
  // still starts with the embedder name
  "default": {
    "source": {
      "huggingFace": {
        "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
        "revision": "a9c555277f9bcf24f28fa5e092e665fc6f7c49cd"
      },
    "documentTemplate": { 
      "template": "A product titled '{{doc.title}}'" // now a string
    }
  }
}
```

The fields that are accepted in the new version of the `embedders` setting are depending on the value of the `source` field:

```json5
// huggingFace
"embedders": {
   "default": {
    "source": "huggingFace",
    "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "revision": "a9c555277f9bcf24f28fa5e092e665fc6f7c49cd",
    "documentTemplate": "A product titled '{{doc.title}}'"
  }
}

// openAi
"embedders": {
   "default": {
    "source": "openAi",
    "model": "text-embedding-ada-002",
    "apiKey": "open_ai_api_key",
    "documentTemplate": "A product titled '{{doc.title}}'"
  }
}

// userProvided
"embedders": {
   "default": {
    "source": "userProvided",
    "dimensions": 42, // mandatory
  }
}
```

## What does this PR do?
- Flatten the settings structure
- Validate the prompt earlier to return a synchronous error on setting change rather than in the failing task
- Make it an error to pass a field for the wrong source (see above for allowed fields for each source)
- Not changed: It is still an error not to pass `dimensions` to the `userProvided` embedder
- If `source` was specified in the settings, validate the setting early to return a synchronous error in case of a missing mandatory field for the userProvided source (dimensions) or a forbidden field for the specified source.
- If `source` was not specified in the settings, still validate the setting, but only at indexing time, by using the source stored in the DB.
- Resets all values if the source changes, even if the user did not reset them explicitly.

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Change the public facing guide for using the API
- [ ] Change examples of use in the changelog


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-12-21 09:58:01 +00:00
f52dee2b3b Update Cargo.toml
Update mini-dashboard with v0.2.12
2023-12-21 09:53:13 +01:00
0bf879fb88 Fix warning on rust stable 2023-12-20 17:48:09 +01:00
6ff81de401 Fix tests 2023-12-20 17:16:46 +01:00
2e4c9651df Validate settings in route 2023-12-20 17:16:46 +01:00
ec9649c922 Add function to validate settings in Meilisearch, to be used in the routes 2023-12-20 17:16:46 +01:00
9123370e90 Validate fused settings in settings task after fusing with existing setting 2023-12-20 17:16:46 +01:00
14b396d302 Add new errors 2023-12-20 17:16:45 +01:00
393216bf30 Flatten embedders settings 2023-12-20 17:16:43 +01:00
e249e4db7b Change Setting::apply function signature 2023-12-20 17:15:24 +01:00
de2ca7006e Merge #4272
4272: Don't pass default revision when the model is explicitly set in config r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #4271 

## What does this PR do?

- When the `model` is explicitly set in the `embedders` setting, we reset the `revision` to `None`, such that if the user doesn't specify a revision, the head of the model repository is chosen. 
- Not changed: If the user specifies a revision, it applies, like previously. 
- Not changed: If the user doesn't specify a model, the default model with the default revision applies, like previously.

## Manual testing on a fresh DB

1. Enable experimental feature:
```sh
curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' -H 'Authorization: Bearer foo' \
--data-binary '{ "vectorStore": true
  }'
```
2. Send settings with a specified model but no specified revision:
```sh
curl \
-X PATCH 'http://localhost:7700/indexes/products/settings' \
-H 'Content-Type: application/json' --data-binary \
'{ "embedders": { "default": { "source": { "huggingFace": { "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2" } }, "documentTemplate": { "template": "A product titled '{{doc.title}}'"} } } }'
```
3. Check that the task was successful:
```sh
curl 'http://localhost:7700/tasks/0'

{"uid":0,"indexUid":"products","status":"succeeded","type":"settingsUpdate","canceledBy":null,"details":{"embedders":{"default":{"source":{"huggingFace":{"model":"sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"}},"documentTemplate":{"template":"A product titled {{doc.title}}"}}}},"error":null,"duration":"PT0.001892S","enqueuedAt":"2023-12-20T09:17:01.73789Z","startedAt":"2023-12-20T09:17:01.73854Z","finishedAt":"2023-12-20T09:17:01.740432Z"}
```
4. Send documents to index:
```sh
curl 'https://localhost:7700/indexes/products/documents' -H 'Content-Type: application/json' --data-binary '{"id": 0, "title": "Best product"}'
```

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-12-20 14:27:51 +00:00
333ce12eb2 Fixed issue where the default revision is always the one we picked for the default model 2023-12-20 10:17:49 +01:00
fb9db1eba6 Merge #4269
4269: Remove dependency that requires libstdc++ r=dureuill a=dureuill

Removes the dependency that caused the additional runtime dependency on libstdc++ by disabling the default features of the hf tokenizer.

## Discussion

- This removes a feature that is using a C++ dependency and is supposed to accelerate the tokenizer. As the tokenizer is likely to be a significant bottleneck for embedding texts using a HF model, this is an issue.
- We should at least rerun the movies vector indexing and check that it still works correctly and that it has a runtime in the ballpark of what it used to be.

Co-authored-by: Louis Dureuil <louis.dureuil@xinra.net>
2023-12-19 12:26:48 +00:00
fa2b96b9a5 Add an Authorization Header along with the webhook calls 2023-12-19 12:18:45 +01:00
19736cefe8 add the analytics 2023-12-19 10:36:04 +01:00
4fb25b8782 fix clippy 2023-12-19 10:35:51 +01:00
c83a33017e stream and chunk the data 2023-12-19 10:35:51 +01:00
be72326c0a gzip the tasks 2023-12-19 10:35:51 +01:00
547379abb0 parse the url correctly 2023-12-19 10:35:51 +01:00
0b2fff27f2 update and fix the test 2023-12-19 10:35:51 +01:00
3adbc2b942 return a task view instead of a task 2023-12-19 10:35:51 +01:00
fbea721378 add a first working test with actixweb 2023-12-19 10:35:51 +01:00
391eb72137 start writing a test with actix but it doesn't works 2023-12-19 10:35:50 +01:00
d78ad51082 Implement the webhook 2023-12-19 10:35:50 +01:00
1956045a06 add the option 2023-12-19 10:23:56 +01:00
b2193e612f Revert "Add libstdc++ in Dockerfile" as it is no longer needed
This reverts commit 9df8cfc013.
2023-12-18 22:17:29 +01:00
942d49314c Remove dependency that requires libstdc++ 2023-12-18 22:17:18 +01:00
9a846e82bc Merge #4268
4268: Add libstdc++ in Dockerfile r=curquiza a=sanders41

# Pull Request

## Related issue
Fixes #4267

## What does this PR do?
- Add libstdc++ in the Dockerfile

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Paul Sanders <psanders1@gmail.com>
2023-12-18 18:35:53 +00:00
9df8cfc013 Add libstdc++ in Dockerfile 2023-12-18 13:05:46 -05:00
d868131bb7 Bump rustls-webpki from 0.101.3 to 0.101.7
Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.101.3 to 0.101.7.
- [Release notes](https://github.com/rustls/webpki/releases)
- [Commits](https://github.com/rustls/webpki/compare/v/0.101.3...v/0.101.7)

---
updated-dependencies:
- dependency-name: rustls-webpki
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-12-18 14:57:38 +00:00
248aaa6d45 Merge #4262
4262: Update version for the next release (v1.6.0) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2023-12-18 14:00:19 +00:00
50d6317ec0 Update version for the next release (v1.6.0) in Cargo.toml 2023-12-18 13:57:46 +00:00
b734bd9891 Merge #4261
4261: Set rust toolchain to 1.71.1 in dockerfile r=curquiza a=dureuill

Fixes docker [CI](https://github.com/meilisearch/meilisearch/actions/workflows/publish-docker-images.yml)

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-12-18 12:32:26 +00:00
9800d5a103 Set rust toolchain to 1.71.1 in dockerfile 2023-12-18 10:59:25 +01:00
7c4ed07617 Merge #4257
4257: Change proximity precision settings r=dureuill a=ManyTheFish

- [x] Add proximity_precision value into the analytics
- [x] Change the naming of `attributeScale` and `wordScale` into `byAttribute` and `byWord`
- [x] Remove proximityPrecision from the experimental feature

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2023-12-18 09:07:28 +00:00
3a99a555a2 Fix experimental features snapshots in tests 2023-12-18 10:05:51 +01:00
9e1b458010 Merge branch 'main' into change-proximity-precision-settings 2023-12-18 09:08:47 +01:00
2aede03bc2 Merge #4226
4226: Hybrid search r=dureuill a=dureuill

Allows to perform hybrid search requests that combine the results of semantic and keyword search and automatically generate embeddings.

## How to use

See [feature description](https://meilisearch.notion.site/v1-6-Hybrid-Search-Embedders-ea42c82f90cc4bc0be1eeb917c1118c8)

## Changes

- work is based on #4213 
- milli::new search now takes an input universe directly, rather than computing it from a filter. This adds flexibility to require results on a subset of documents
- vector search is now a regular ranking rule (akin to sort and geosort) and reports its score as a ScoreDetail
- separate keyword search and vector search functions, vector search now respects (geo)sort ranking rules
- add automatic embedding
- add hybrid search

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-12-14 16:24:56 +00:00
e741bc1c62 Add proximity_precision value into the analytics 2023-12-14 16:48:06 +01:00
6425996e36 Change the naming of attributeScale and wordScale into byAttribute and byWord 2023-12-14 16:31:00 +01:00
eb5cb91da2 Switch default from hf to openai 2023-12-14 16:19:46 +01:00
87bba98bd8 Various changes
- fixed seed for arroy
- check vector dimensions as soon as it is provided to search
- don't embed whitespace
2023-12-14 16:08:42 +01:00
217105b7da hybrid search uses semantic ratio, error handling 2023-12-14 16:08:42 +01:00
1b7c164a55 Pass the semantic ratio to milli 2023-12-14 16:08:42 +01:00
f3f3944469 Fix error checking 2023-12-14 16:08:42 +01:00
93dcbf598d Deserialize semantic ratio 2023-12-14 16:08:42 +01:00
ac68f33194 Add simple test 2023-12-14 16:08:42 +01:00
9991152bbe Add TODOs 2023-12-14 16:08:42 +01:00
a4536b1381 Small adjustments to respect the spec 2023-12-14 16:08:42 +01:00
5b51cb04af Remove some settings 2023-12-14 16:08:42 +01:00
3c1a14f1cd Add settings routes 2023-12-14 16:08:42 +01:00
b8e4709dfa Remove prompt strategy and fallback 2023-12-14 16:08:41 +01:00
806e5b6899 Tests pass 2023-12-14 16:08:41 +01:00
61bd2fb7a9 Update arroy 2023-12-14 16:08:41 +01:00
e0cc775dc4 Various changes
- DistributionShift in Search object (to be set from model in embed?)
- Fix issue where embedder index wasn't computed at search time
- Accept as default embedder either the "default" one, or the only embedder when there is only one
2023-12-14 16:08:41 +01:00
12940d79a9 WIP
- manual embedder
- multi embedders OK
- clippy + tests OK
2023-12-14 16:08:41 +01:00
922a640188 WIP multi embedders
fixed template bugs
2023-12-14 16:08:41 +01:00
abbe131084 Cosmetic change 2023-12-14 16:08:41 +01:00
d4715e0c4d Fix same vector sort bug 2023-12-14 16:08:41 +01:00
11e2a2c1aa Fix geosort bug 2023-12-14 16:08:41 +01:00
65e49b7092 Remove stuff, add distribution shift (WIP) 2023-12-14 16:08:38 +01:00
e56f160032 Actually pass embedders on reindex 2023-12-14 16:07:49 +01:00
687d92f217 prompt bifluor+ 2023-12-14 16:07:49 +01:00
fb539f61fe WIP 2023-12-14 16:07:49 +01:00
cb4ebe163e WIP 2023-12-14 16:07:49 +01:00
dde3a04679 WIP arroy integration 2023-12-14 16:07:49 +01:00
13c2c6c16b Small commit to add hybrid search and autoembedding 2023-12-14 16:07:48 +01:00
21bcf32109 Add candle and hg_hub, updating a lot of deps in the process 2023-12-14 16:07:48 +01:00
35e1981488 Remove proximityPrecision form the experimental feature 2023-12-14 15:52:42 +01:00
e0f712b9d3 Merge #4254
4254: Bring back v1.5.1 changes into main r=ManyTheFish a=Kerollmops

This pull request brings back changes from the _release-v1.5.1_ branch into _main_.

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-12-14 09:41:57 +00:00
56571f762a Merge remote-tracking branch 'origin/main' into tmp-release-v1.5.1 2023-12-13 11:57:01 +01:00
005800634d Merge pull request #4249 from meilisearch/flag-limit-batch-size
Introduce parameters to limit the number of batched tasks
2023-12-13 10:32:14 +01:00
976af4fa8f Add the default commented experimental batched tasks limit parameter to the config file 2023-12-12 10:59:00 +01:00
99fec27788 Make the --max-number-of-batched-tasks argument experimental 2023-12-12 10:55:39 +01:00
afa8f273a8 Merge #4250
4250: Update version for the next release (v1.5.1) in Cargo.toml r=dureuill a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2023-12-12 08:26:06 +00:00
4b644f6bc0 Update version for the next release (v1.5.1) in Cargo.toml 2023-12-11 17:15:11 +00:00
7e259cb0d2 Expose the --max-number-of-batched-tasks argument 2023-12-11 16:08:39 +01:00
0fbc1511d7 Merge #4225
4225: [EXP] Let the user customize the proximity precision r=dureuill a=ManyTheFish

# Pull Request
This PR introduces a new setting `proximityPrecision` allowing the user to trade indexing time with search precision on proximity-based features:
- proximity ranking rules
- multi-word synonyms
- phrase search
- split-words

I put the API PRD below:
https://www.notion.so/meilisearch/3988b345b5b248948a4a0dc5932a18ce?v=45d79150adb84b0aa27826ff6da2e029&p=aa69c2bab2c3402bab9340ae4def4577&pm=s

## Related issue
Fixes #4187

Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-12-06 17:21:43 +00:00
c9860c7913 Small test fixes 2023-12-06 15:49:05 +01:00
03ffabe889 Add a new dump test 2023-12-06 15:49:05 +01:00
1f4fc9c229 Make the feature experimental 2023-12-06 15:49:05 +01:00
8cc3c54117 Add proximityPrecision setting in settings route 2023-12-06 15:49:05 +01:00
467b49153d Implement proximityPrecision setting on milli side 2023-12-06 15:49:02 +01:00
0c3fa8cbc4 Add tests on proximityPrecision setting 2023-12-06 14:59:23 +01:00
bddc168d83 List TODOs 2023-12-06 14:59:23 +01:00
84a36002d7 Merge #4239
4239: Remove the actix-web dependency from milli r=dureuill a=Kerollmops

Just remove actix-web from milli.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-11-29 10:19:40 +00:00
c95d68e244 Merge #4233
4233: Add test reproducing #4232 r=dureuill a=ManyTheFish

- add a test reproducing the bug
- fix the bug by creating 2 different restricting lists of attributes, one for the exact attributes, and the other for the tolerant attributes

## Related issue
Fixes #4232


Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-11-29 08:47:17 +00:00
3b3fa38f27 Put the restrict list in a sub-struct 2023-11-28 18:37:57 +01:00
170e063b80 Remove the actix-web dependency from milli 2023-11-28 17:19:57 +01:00
d6c2ee15a9 Filter on attributes before computing the docids when attribute restriction is on 2023-11-28 14:55:29 +01:00
6376c342c1 Merge #4223
4223: Update to heed 0.20 r=dureuill a=Kerollmops

This PR brings the v0.20-alpha.9 version of heed into Meilisearch 🎉 The main goal is to test it in a real environment to make the necessary changes if needed. We also want to merge it as soon as possible during the pre-release phase to ensure we catch bugs before the release.

Most of the calls to heed are the same as before, except:
 - The `PolyDatabase` has been replaced with a `Database<Unspecified, Unspecified>`. We replaced the `get<T, U>()` by a `remap<T, U>().get()` calls.
 - The `Database` `append(...)` method has been replaced with a `put_with_flags(PutFlags::APPEND, ...)`.
 - The `RwTxn<'e, 'p>` has been simplified into a `RwTxn<'e>`.
 - The `BytesEncode/Decode` traits return a `Result<_, BoxedError>` instead of an `Option<_>`.
 - We no longer need to wrap and unwrap the `BEU32` integer when storing/getting them from heed.

### TODO
 - [x] Create actual, simple error types instead of using strings in the codecs.

### Follow-up work
 - Move the codecs into another member crate (we depend on the uuid one in the meilitool crate).
 - Display the internal decoding error in the `SerializationError` internal error variant.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-11-28 13:39:44 +00:00
5b563f872b Move the clippy attribute on the problematic part of the code 2023-11-28 14:37:58 +01:00
ec9b52d608 Rename copy_to_path to copy_to_file 2023-11-28 14:32:30 +01:00
34c67ac389 Remove the possibility to fail fetching the env info 2023-11-28 14:31:23 +01:00
d050c9b4ae Only remap the main database once 2023-11-28 14:27:30 +01:00
7dd1226faf Clarify an unreachable unwrap 2023-11-28 14:26:31 +01:00
1575456594 Further reduce an async block 2023-11-28 14:23:32 +01:00
add2ceef67 Introduce error types to avoid panics 2023-11-28 14:21:49 +01:00
548c8247c2 Create and use real error types in the codecs 2023-11-28 10:11:17 +01:00
181ca48482 Merge #4234
4234: Fix puffin in the index scheduler r=dureuill a=irevoire

Currently, we can't compile the index scheduler without this feature.

It could be cool to specify the dependencies in the main workspace cargo toml like quickwit does to avoid this kind of error in the future; https://github.com/quickwit-oss/quickwit/blob/main/quickwit/Cargo.toml#L41

Co-authored-by: Tamo <tamo@meilisearch.com>
2023-11-28 08:23:48 +00:00
5751f5c640 fix puffin in the index scheduler 2023-11-27 15:18:33 +01:00
d32eb11329 Move to the v0.20.0-alpha.9 of heed 2023-11-27 11:52:22 +01:00
dc07790133 Add test reproducing #4232 2023-11-27 11:39:11 +01:00
3d23b388bc Merge #4231
4231: Fixed payload limit setting being ignored for delete documents by batch r=Kerollmops a=Karribalu


# Pull Request

## Related issue
Fixes #4224

## What does this PR do?
- Added http_payload_size_limit to JsonConfig to allow deleting documents in batches with a payload size greater than 2MB, which is the default limit set in the JsonConfig crate.

## PR checklist
Please check if your PR fulfills the following requirements:
- [Y] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [Y] Have you read the contributing guidelines?
- [Y] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: karribalu <karri.balu123456@gmail.com>
2023-11-27 09:26:21 +00:00
85626cff8e Fixed payload limit setting being ignored for delete documents by batch route 2023-11-25 18:41:16 +00:00
58dac8af42 Remove the panics and unwraps 2023-11-23 15:00:48 +01:00
0dbf1a16ff Make clippy happy 2023-11-23 14:11:38 +01:00
462b4c0080 Fix the tests 2023-11-23 12:07:35 +01:00
0d4482625a Make the changes to use heed v0.20-alpha.6 2023-11-23 11:43:58 +01:00
56a0d91ecd Update the heed dependency and lock file 2023-11-22 15:11:09 +01:00
b366acdae6 Merge #4220
4220: Bring back changes from v1.5.0 into main r=dureuill a=Kerollmops

This will bring the fixes from v1.5.0 into main. By [following this guide](https://github.com/meilisearch/engine-team/blob/main/resources/meilisearch-release.md#after-the-release) I decided to create a temporary branch to fix the git conflicts and merge into main afterward.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis.dureuil@xinra.net>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-11-22 07:46:22 +00:00
7cb7e37ba8 Merge branch 'main' into tmp-release-v1.5.0 2023-11-21 16:30:46 +01:00
33b7c574ea Merge #4090
4090: Diff indexing r=ManyTheFish a=ManyTheFish

This pull request aims to reduce the indexing time by computing a difference between the data added to the index and the data removed from the index before writing in LMDB.

## Why focus on reducing the writings in LMDB?

The indexing in Meilisearch is split into 3 main phases:
1) The computing or the extraction of the data (Multi-threaded)
2) The writing of the data in LMDB (Mono-threaded)
3) The processing of the prefix databases (Mono-threaded)

see below:
![Capture d’écran 2023-09-28 à 20 01 45](https://github.com/meilisearch/meilisearch/assets/6482087/51513162-7c39-4244-978b-2c6b60c43a56)


Because the writing is mono-threaded, it represents a bottleneck in the indexing, reducing the number of writes in LMDB will reduce the pressure on the main thread and should reduce the global time spent on the indexing.

## Give Feedback

We created [a dedicated discussion](https://github.com/meilisearch/meilisearch/discussions/4196) for users to try this new feature and to give feedback on bugs or performance issues.

## Technical approach
### Part 1: merge the addition and the deletion process
This part:
a) Aims to reduce the time spent on indexing only the filterable/sortable fields of documents, for example:
  - Updating the number of "likes" or "stars" of a song or a movie
  - Updating the "stock count" or the "price" of a product

b) Aims to reduce the time spent on writing in LMDB which should reduce the global indexing time for the highly multi-threaded machines by reducing the writing bottleneck.

c) Aims to reduce the average time spent to delete documents without having to keep the soft-deleted documents implementation

- [x] Create a preprocessing function that creates the diff-based documents chuck (`OBKV<fid, OBKV<AddDel, value>>`)
  - [x] and clearly separate the faceted fields and the searchable fields in two different chunks
- Change the parameters of the input extractor by taking an `OBKV<fid, OBKV<AddDel, value>>` instead of  `OBKV<fid, value>`.
  - [x] extract_docid_word_positions
  - [x] extract_geo_points
  - [x] extract_vector_points
  - [x] extract_fid_docid_facet_values
- Adapt the searchable extractors to the new diff-chucks
  - [x] extract_fid_word_count_docids
  - [x] extract_word_pair_proximity_docids
  - [x] extract_word_position_docids
  - [x] extract_word_docids
- Adapt the facet extractors to the new diff-chucks
  - [x] extract_facet_number_docids
  - [x] extract_facet_string_docids
  - [x] extract_fid_docid_facet_values
  - [x] FacetsUpdate
- [x] Adapt the prefix database extractors ⚠️ ⚠️ 
- [x] Make the LMDB writer remove the document_ids to delete at the same time the new document_ids are added
- [x] Remove document deletion pipeline
  - [x] remove `new_documents_ids` entirely and `replaced_documents_ids`
  - [x] reuse extracted external id from transform instead of re-extracting in `TypedChunks::Documents`
  - [x] Remove deletion pipeline after autobatcher
  - [x] remove autobatcher deletion pipeline
    - [x] everything uses `IndexOperation::DocumentOperation`
    - [x] repair deletion by internal id for filter by delete
    - [x] Improve the deletion via internal ids by avoiding iterating over the whole set of external document ids.  
- [x] Remove soft-deleted documents

#### FIXME

- [x] field distribution is not correctly updated after deletion
- [x] missing documents in the tests of tokenizer_customization

### Part 2: Only compute the documents field by field
This part aims to reduce the global indexing time for any kind of partial document modification on any size of machine from the mono-threaded one to the highly multi-threaded one.

- [ ] Make the preprocessing function only send the fields that changed to the extractors
- [ ] remove the `word_docids` and `exact_word_docids` database and adapt the search (⚠️ could impact the search performances)
- [ ] replace the `word_pair_proximity_docids` database with a `word_pair_proximity_fid_docids` database and adapt the search (⚠️ could impact the search performances)
- [ ] Adapt the prefix database extractors ⚠️ ⚠️

## Technical Concerns
- The part 1 implementation could increase the indexing time for the smallest machines (with few threads) by increasing the extracting time (multi-threaded) more than the writing time (mono-threaded)
- The part 2 implementation needs to change the databases which could have a significant impact on the search performances
- The prefix databases are a bit special to process and may be a pain to adapt to the difference-based indexing

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-11-21 09:44:38 +00:00
d3575fb028 Make into_del_add_obkv parameters more human readable 2023-11-20 16:10:39 +01:00
39cbb499c2 Small fixes 2023-11-20 10:20:39 +01:00
ebef6bc24d Simplify documents database writing 2023-11-20 10:14:57 +01:00
d59b7db8d0 remove unused code 2023-11-20 10:10:45 +01:00
263e825619 Fix typos in comments 2023-11-20 10:06:29 +01:00
69354a6144 Add the benchmarck name to the bot message 2023-11-15 13:56:54 +01:00
b0adc73ce6 Merge pull request #4207 from meilisearch/diff-indexing-prefix-databases
Diff indexing prefix databases
2023-11-14 16:04:05 +01:00
2b5d9042d1 Merge #4208
4208: Makes the dump cancellable r=Kerollmops a=irevoire

# Pull Request

Make the dump tasks cancellable even when they have already started processing.

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4157


Co-authored-by: Tamo <tamo@meilisearch.com>
2023-11-14 13:31:45 +00:00
5b57fbab08 makes the dump cancellable 2023-11-14 11:23:13 +01:00
72d3fa4898 Merge #4203
4203: Extract external document docids from docs on deletion by filter r=Kerollmops a=dureuill

This fixes some of the performance regression observed on `diff-indexing` when doing delete-by-filter with a filter matching many documents.

To delete 19 768 771 documents (hackernews dataset, all documents matching `type = comment`), here are the observed time:

|branch (commit sha1sum)|time|speed-down factor (lower is better)|
|--|--|--|
|`main` (48865470d7)|1212.885536s (~20min)|x1.0 (baseline)|
|`diff-indexing` (523519fdbf)|5385.550543s (90min)|x4.44|
|**`diff-indexing-extract-primary-key`**(f8289cd974)|2582.323324s (43min) | x2.13|

So we're still suffering a speed-down of x2.13, but that's much better than x4.44.

---

Changes:

- Refactor the logic of PrimaryKey extraction to a struct
- Add a trait to abstract the extraction of field id from a name between `DocumentBatch` and `FieldIdMap`.
- Add `Index::external_id_of` to get the external ids of a bitmap of internal ids.
- Use this new method to add new Transform and Batch methods to remove documents that are known to be from the DB.
- Modify delete-by-filter to use the new method

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-11-13 13:02:10 +00:00
772964125d Factor removal of document from DB 2023-11-13 13:51:22 +01:00
378deb0bef Rename trait 2023-11-13 13:38:36 +01:00
1f36410541 Update tests 2023-11-13 13:36:39 +01:00
b11f85a635 Merge #4205
4205: Prevent search hang on the processing index r=Kerollmops a=dureuill

Fixes #4206, an issue originally [reported on Discord](https://discord.com/channels/1006923006964154428/1148983671026618579/1148983671026618579) where having parallel search requests on more indexes than the index cache capacity would cause search requests on the currently updating index to hang until the index is done updating.

## Test setup

- Create 20 empty indexes by sending settings to them
- repeatedly send placeholder search requests to each of the indexes in a loop
- Create another index and send a significant batch of documents to index.
- Attempt to perform a search request on that last index.
  - Before this PR, the search request hangs while the index update task is processing
  - After this PR, the search request respond immediately even while the index update task is processing

## Changes

- When getting the handle to an index for some potentially long running batches of tasks, save it in the index scheduler.
- Drop the handle from the index-scheduler when the task is done so that we don't leak indexes.
- When getting an index from outside the task queue processor, check if there is such an handle matching the requested index. If so, skip the cache entirely and clone the handle.

Co-authored-by: Louis Dureuil <louis.dureuil@xinra.net>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-11-13 10:36:01 +00:00
a2d6dc8571 Fix typo, remove caching for the change of index 2023-11-13 10:44:36 +01:00
ee1701157f Merge #4204
4204: Throw error when the vector search is sent with the wrong size r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #4201 


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-11-13 09:43:20 +00:00
8c649d8061 Throw error when the vector search is sent with the wrong size 2023-11-13 09:57:42 +01:00
492fc086f0 cargo fmt 2023-11-12 21:53:11 +01:00
a2d0c73b41 Save the currently updating index so that the search can access it at all times 2023-11-10 10:52:03 +01:00
264b10ec20 Fixup documentation 2023-11-09 16:23:20 +01:00
825257da76 Use more efficient method for deletion in benchmarks 2023-11-09 16:13:15 +01:00
f8289cd974 Use it from delete-by-filter 2023-11-09 14:23:15 +01:00
3053e01c05 Batch::remove_documents_from_db_no_batch 2023-11-09 14:23:02 +01:00
b11c2afac0 Index::external_id_of 2023-11-09 14:22:43 +01:00
9cef800b2a Enrich uses the new type 2023-11-09 14:22:05 +01:00
db2fb86b8b Extract PrimaryKey logic to a type 2023-11-09 14:19:16 +01:00
882ab9cc85 remove warnings 2023-11-09 11:35:33 +01:00
5a9c96e1db Compute word integer prefix cache 2023-11-09 11:34:26 +01:00
70ce40828c Compute word docids prefix cache 2023-11-08 17:01:00 +01:00
688266c83e Remove word pair proximity prefix cache and compute it at search time 2023-11-08 14:16:01 +01:00
6dab826908 Reactivate prefix databases 2023-11-08 13:58:01 +01:00
1e2fbc6a42 revert "REVERT ME: ignore prefix pair databases tests"
This reverts commit 1b2ea6cf19.
2023-11-08 11:50:52 +01:00
523519fdbf Merge pull request #4195 from meilisearch/diff-indexing-remove-from-batch
Remove `IndexOperation::DocumentDeletion`
2023-11-08 10:29:49 +01:00
ef6fa10f7a Remove IndexOperation::DocumentDeletion 2023-11-06 12:16:15 +01:00
620fee35f9 Fix benches 2023-11-06 11:56:46 +01:00
cbaa54cafd Fix clippy issues 2023-11-06 11:19:31 +01:00
1bccf2079e Correctly mark non-tests as non-tests 2023-11-06 11:03:56 +01:00
1b2ea6cf19 REVERT ME: ignore prefix pair databases tests 2023-11-06 10:46:22 +01:00
1ad1fcc8c8 Remove all warnings 2023-11-06 10:31:14 +01:00
48865470d7 Merge #4191
4191: Remove banner r=Kerollmops a=curquiza



Co-authored-by: Clémentine U. - curqui <clementine@meilisearch.com>
2023-11-02 17:14:23 +00:00
c810df4d9f Update README.md 2023-11-02 17:40:18 +01:00
87610a5f98 Don't try to delete a document that is not in the database 2023-11-02 16:49:03 +01:00
2544bc1416 Merge pull request #4160 from meilisearch/diff-indexing-vector-points
Diff Indexing for the vector points
2023-11-02 16:01:51 +01:00
ff522c919d Fix the vector extractions for the diff indexing 2023-11-02 15:58:08 +01:00
1c39459cf4 Merge pull request #4179 from meilisearch/diff-indexing-fix-nested-primary-key
Diff indexing fix nested primary key
2023-11-02 15:39:50 +01:00
bf0651f23c Implement iter method on ExternalDocumentsIds 2023-11-02 15:38:00 +01:00
5b20e625f3 fix merge 2023-11-02 15:31:37 +01:00
bc51d6157a Fix transform reindexing path 2023-11-02 15:26:20 +01:00
1b4ff991c0 update typed chunks 2023-11-02 15:26:20 +01:00
4b64c33aa2 update vector extractor 2023-11-02 15:26:20 +01:00
12323d610e Change the original document sorter key from the internal docid to a concatenation of the internal and the external docid 2023-11-02 15:26:20 +01:00
44e9033b3a Merge pull request #4181 from meilisearch/diff-indexing-parallel-transform
Use rayon to sort entries in parallel
2023-11-02 15:16:10 +01:00
4d864f0702 Always sort internal Sorter entries in parallel 2023-11-02 14:47:43 +01:00
5e3df76699 Merge #4183
4183: Bump docker/login-action from 2 to 3 r=curquiza a=dependabot[bot]

Bumps [docker/login-action](https://github.com/docker/login-action) from 2 to 3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/docker/login-action/releases">docker/login-action's releases</a>.</em></p>
<blockquote>
<h2>v3.0.0</h2>
<ul>
<li>Node 20 as default runtime (requires <a href="https://github.com/actions/runner/releases/tag/v2.308.0">Actions Runner v2.308.0</a> or later) by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/login-action/pull/593">docker/login-action#593</a></li>
<li>Bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1 in <a href="https://redirect.github.com/docker/login-action/pull/598">docker/login-action#598</a></li>
<li>Bump <code>`@​aws-sdk/client-ecr</code>` and <code>`@​aws-sdk/client-ecr-public</code>` to 3.410.0 in <a href="https://redirect.github.com/docker/login-action/pull/555">docker/login-action#555</a> <a href="https://redirect.github.com/docker/login-action/pull/560">docker/login-action#560</a> <a href="https://redirect.github.com/docker/login-action/pull/582">docker/login-action#582</a> <a href="https://redirect.github.com/docker/login-action/pull/599">docker/login-action#599</a></li>
<li>Bump semver from 6.3.0 to 6.3.1 in <a href="https://redirect.github.com/docker/login-action/pull/556">docker/login-action#556</a></li>
<li>Bump https-proxy-agent to 7.0.2 <a href="https://redirect.github.com/docker/login-action/pull/561">docker/login-action#561</a> <a href="https://redirect.github.com/docker/login-action/pull/588">docker/login-action#588</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/login-action/compare/v2.2.0...v3.0.0">https://github.com/docker/login-action/compare/v2.2.0...v3.0.0</a></p>
<h2>v2.2.0</h2>
<ul>
<li>Switch to actions-toolkit implementation by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/login-action/pull/409">docker/login-action#409</a> <a href="https://redirect.github.com/docker/login-action/pull/470">docker/login-action#470</a> <a href="https://redirect.github.com/docker/login-action/pull/476">docker/login-action#476</a></li>
<li>Bump <code>`@​aws-sdk/client-ecr</code>` and <code>`@​aws-sdk/client-ecr-public</code>` to 3.347.1 in <a href="https://redirect.github.com/docker/login-action/pull/524">docker/login-action#524</a> <a href="https://redirect.github.com/docker/login-action/pull/364">docker/login-action#364</a> <a href="https://redirect.github.com/docker/login-action/pull/363">docker/login-action#363</a></li>
<li>Bump minimatch from 3.0.4 to 3.1.2 in <a href="https://redirect.github.com/docker/login-action/pull/354">docker/login-action#354</a></li>
<li>Bump json5 from 2.2.0 to 2.2.3 in <a href="https://redirect.github.com/docker/login-action/pull/378">docker/login-action#378</a></li>
<li>Bump http-proxy-agent from 5.0.0 to 7.0.0 in <a href="https://redirect.github.com/docker/login-action/pull/509">docker/login-action#509</a></li>
<li>Bump https-proxy-agent from 5.0.1 to 7.0.0 in <a href="https://redirect.github.com/docker/login-action/pull/508">docker/login-action#508</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/login-action/compare/v2.1.0...v2.2.0">https://github.com/docker/login-action/compare/v2.1.0...v2.2.0</a></p>
<h2>v2.1.0</h2>
<ul>
<li>Ensure AWS temp credentials are redacted in workflow logs by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` (<a href="https://redirect.github.com/docker/login-action/issues/275">#275</a>)</li>
<li>Bump <code>`@​actions/core</code>` from 1.6.0 to 1.10.0 (<a href="https://redirect.github.com/docker/login-action/issues/252">#252</a> <a href="https://redirect.github.com/docker/login-action/issues/292">#292</a>)</li>
<li>Bump <code>`@​aws-sdk/client-ecr</code>` from 3.53.0 to 3.186.0 (<a href="https://redirect.github.com/docker/login-action/issues/298">#298</a>)</li>
<li>Bump <code>`@​aws-sdk/client-ecr-public</code>` from 3.53.0 to 3.186.0 (<a href="https://redirect.github.com/docker/login-action/issues/299">#299</a>)</li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/login-action/compare/v2.0.0...v2.1.0">https://github.com/docker/login-action/compare/v2.0.0...v2.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="343f7c4344"><code>343f7c4</code></a> Merge pull request <a href="https://redirect.github.com/docker/login-action/issues/599">#599</a> from docker/dependabot/npm_and_yarn/aws-sdk-dependenc...</li>
<li><a href="aad0f974f2"><code>aad0f97</code></a> chore: update generated content</li>
<li><a href="2e0cd39144"><code>2e0cd39</code></a> build(deps): bump the aws-sdk-dependencies group with 2 updates</li>
<li><a href="203bc9c4ef"><code>203bc9c</code></a> Merge pull request <a href="https://redirect.github.com/docker/login-action/issues/588">#588</a> from docker/dependabot/npm_and_yarn/proxy-agent-depen...</li>
<li><a href="2199648fc8"><code>2199648</code></a> chore: update generated content</li>
<li><a href="b489376173"><code>b489376</code></a> build(deps): bump the proxy-agent-dependencies group with 1 update</li>
<li><a href="7c309e74e6"><code>7c309e7</code></a> Merge pull request <a href="https://redirect.github.com/docker/login-action/issues/598">#598</a> from docker/dependabot/npm_and_yarn/actions/core-1.10.1</li>
<li><a href="0ccf222961"><code>0ccf222</code></a> chore: update generated content</li>
<li><a href="56d703e106"><code>56d703e</code></a> Merge pull request <a href="https://redirect.github.com/docker/login-action/issues/597">#597</a> from docker/dependabot/github_actions/aws-actions/con...</li>
<li><a href="24d3b3519e"><code>24d3b35</code></a> build(deps): bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1</li>
<li>Additional commits viewable in <a href="https://github.com/docker/login-action/compare/v2...v3">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=docker/login-action&package-manager=github_actions&previous-version=2&new-version=3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-02 13:18:13 +00:00
02765fb267 Merge #4184
4184: Bump actions/setup-node from 3 to 4 r=curquiza a=dependabot[bot]

Bumps [actions/setup-node](https://github.com/actions/setup-node) from 3 to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/actions/setup-node/releases">actions/setup-node's releases</a>.</em></p>
<blockquote>
<h2>v4.0.0</h2>
<h2>What's Changed</h2>
<p>In scope of this release we changed version of node runtime for action from node16 to node20 and updated dependencies in <a href="https://redirect.github.com/actions/setup-node/pull/866">actions/setup-node#866</a></p>
<p>Besides, release contains such changes as:</p>
<ul>
<li>Upgrade actions/checkout to v4 by <a href="https://github.com/gmembre-zenika"><code>`@​gmembre-zenika</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/868">actions/setup-node#868</a></li>
<li>Update actions/checkout for documentation and yaml by <a href="https://github.com/dmitry-shibanov"><code>`@​dmitry-shibanov</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/876">actions/setup-node#876</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/gmembre-zenika"><code>`@​gmembre-zenika</code></a>` made their first contribution in <a href="https://redirect.github.com/actions/setup-node/pull/868">actions/setup-node#868</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/actions/setup-node/compare/v3...v4.0.0">https://github.com/actions/setup-node/compare/v3...v4.0.0</a></p>
<h2>v3.8.2</h2>
<h2>What's Changed</h2>
<ul>
<li>Update semver by <a href="https://github.com/dmitry-shibanov"><code>`@​dmitry-shibanov</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/861">actions/setup-node#861</a></li>
<li>Update temp directory creation by <a href="https://github.com/nikolai-laevskii"><code>`@​nikolai-laevskii</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/859">actions/setup-node#859</a></li>
<li>Bump <code>`@​babel/traverse</code>` from 7.15.4 to 7.23.2 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/870">actions/setup-node#870</a></li>
<li>Add notice about binaries not being updated yet by <a href="https://github.com/nikolai-laevskii"><code>`@​nikolai-laevskii</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/872">actions/setup-node#872</a></li>
<li>Update toolkit cache and core by <a href="https://github.com/dmitry-shibanov"><code>`@​dmitry-shibanov</code></a>` and <a href="https://github.com/seongwon-privatenote"><code>`@​seongwon-privatenote</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/875">actions/setup-node#875</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/actions/setup-node/compare/v3...v3.8.2">https://github.com/actions/setup-node/compare/v3...v3.8.2</a></p>
<h2>v3.8.1</h2>
<h2>What's Changed</h2>
<p>In scope of this release, the filter was removed within the cache-save step by <a href="https://github.com/dmitry-shibanov"><code>`@​dmitry-shibanov</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/831">actions/setup-node#831</a>. It is filtered and checked in the toolkit/cache library.</p>
<p><strong>Full Changelog</strong>: <a href="https://github.com/actions/setup-node/compare/v3...v3.8.1">https://github.com/actions/setup-node/compare/v3...v3.8.1</a></p>
<h2>v3.8.0</h2>
<h2>What's Changed</h2>
<h3>Bug fixes:</h3>
<ul>
<li>Add check for existing paths by <a href="https://github.com/dmitry-shibanov"><code>`@​dmitry-shibanov</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/803">actions/setup-node#803</a></li>
<li>Resolve SymbolicLink by <a href="https://github.com/dmitry-shibanov"><code>`@​dmitry-shibanov</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/809">actions/setup-node#809</a></li>
<li>Change passing logic for cache input by <a href="https://github.com/dmitry-shibanov"><code>`@​dmitry-shibanov</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/816">actions/setup-node#816</a></li>
<li>Fix armv7 cache issue by <a href="https://github.com/louislam"><code>`@​louislam</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/794">actions/setup-node#794</a></li>
<li>Update check-dist workflow name by <a href="https://github.com/sinchang"><code>`@​sinchang</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/710">actions/setup-node#710</a></li>
</ul>
<h3>Feature implementations:</h3>
<ul>
<li>feat: handling the case where &quot;node&quot; is used for tool-versions file. by <a href="https://github.com/xytis"><code>`@​xytis</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/812">actions/setup-node#812</a></li>
</ul>
<h3>Documentation changes:</h3>
<ul>
<li>Refer to semver package name in README.md by <a href="https://github.com/olleolleolle"><code>`@​olleolleolle</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/808">actions/setup-node#808</a></li>
</ul>
<h3>Update dependencies:</h3>
<ul>
<li>Update toolkit cache to fix zstd by <a href="https://github.com/dmitry-shibanov"><code>`@​dmitry-shibanov</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/804">actions/setup-node#804</a></li>
<li>Bump tough-cookie and <code>`@​azure/ms-rest-js</code>` by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/802">actions/setup-node#802</a></li>
<li>Bump semver from 6.1.2 to 6.3.1 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/actions/setup-node/pull/807">actions/setup-node#807</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="8f152de45c"><code>8f152de</code></a> Update actions/checkout for documentation and yaml (<a href="https://redirect.github.com/actions/setup-node/issues/876">#876</a>)</li>
<li><a href="23755b521f"><code>23755b5</code></a> upgrade actions/checkout to v4 (<a href="https://redirect.github.com/actions/setup-node/issues/868">#868</a>)</li>
<li><a href="54534a2a9b"><code>54534a2</code></a> Change node version for action to node20 (<a href="https://redirect.github.com/actions/setup-node/issues/866">#866</a>)</li>
<li>See full diff in <a href="https://github.com/actions/setup-node/compare/v3...v4">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-node&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-02 11:28:03 +00:00
841165d529 Merge #4185
4185: Bump Swatinem/rust-cache from 2.6.2 to 2.7.1 r=curquiza a=dependabot[bot]

Bumps [Swatinem/rust-cache](https://github.com/swatinem/rust-cache) from 2.6.2 to 2.7.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/swatinem/rust-cache/releases">Swatinem/rust-cache's releases</a>.</em></p>
<blockquote>
<h2>v2.7.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix save-if documentation in readme by <a href="https://github.com/rukai"><code>`@​rukai</code></a>` in <a href="https://redirect.github.com/Swatinem/rust-cache/pull/166">Swatinem/rust-cache#166</a></li>
<li>Support for <code>trybuild</code> and similar macro testing tools by <a href="https://github.com/neysofu"><code>`@​neysofu</code></a>` in <a href="https://redirect.github.com/Swatinem/rust-cache/pull/168">Swatinem/rust-cache#168</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/rukai"><code>`@​rukai</code></a>` made their first contribution in <a href="https://redirect.github.com/Swatinem/rust-cache/pull/166">Swatinem/rust-cache#166</a></li>
<li><a href="https://github.com/neysofu"><code>`@​neysofu</code></a>` made their first contribution in <a href="https://redirect.github.com/Swatinem/rust-cache/pull/168">Swatinem/rust-cache#168</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/Swatinem/rust-cache/compare/v2.6.2...v2.7.0">https://github.com/Swatinem/rust-cache/compare/v2.6.2...v2.7.0</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/Swatinem/rust-cache/blob/master/CHANGELOG.md">Swatinem/rust-cache's changelog</a>.</em></p>
<blockquote>
<h2>2.7.1</h2>
<ul>
<li>Update toml parser to fix parsing errors.</li>
</ul>
<h2>2.7.0</h2>
<ul>
<li>Properly cache <code>trybuild</code> tests.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="3cf7f8cc28"><code>3cf7f8c</code></a> 2.7.1</li>
<li><a href="e03705e031"><code>e03705e</code></a> changelog</li>
<li><a href="b86d1c6caa"><code>b86d1c6</code></a> bump all the other dependencies too</li>
<li><a href="f27990c89a"><code>f27990c</code></a> Update Dependencies (<a href="https://redirect.github.com/swatinem/rust-cache/issues/172">#172</a>)</li>
<li><a href="a95ba19544"><code>a95ba19</code></a> 2.7.0</li>
<li><a href="82c8487d00"><code>82c8487</code></a> changelog</li>
<li><a href="67c46e7159"><code>67c46e7</code></a> Support for <code>trybuild</code> and similar macro testing tools (<a href="https://redirect.github.com/swatinem/rust-cache/issues/168">#168</a>)</li>
<li><a href="44b6087283"><code>44b6087</code></a> Fix save-if documentation in readme (<a href="https://redirect.github.com/swatinem/rust-cache/issues/166">#166</a>)</li>
<li>See full diff in <a href="https://github.com/swatinem/rust-cache/compare/v2.6.2...v2.7.1">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=Swatinem/rust-cache&package-manager=github_actions&previous-version=2.6.2&new-version=2.7.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-02 10:48:25 +00:00
ea4a266f08 Merge #4182
4182: Bump mislav/bump-homebrew-formula-action from 2 to 3 r=curquiza a=dependabot[bot]

Bumps [mislav/bump-homebrew-formula-action](https://github.com/mislav/bump-homebrew-formula-action) from 2 to 3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/mislav/bump-homebrew-formula-action/releases">mislav/bump-homebrew-formula-action's releases</a>.</em></p>
<blockquote>
<h2>bump-homebrew-formula 3.0</h2>
<h2>What's Changed</h2>
<ul>
<li>feat: bump to use node20 runtime by <a href="https://github.com/chenrui333"><code>`@​chenrui333</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/61">mislav/bump-homebrew-formula-action#61</a></li>
<li>Bump actions/checkout from 3 to 4 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/63">mislav/bump-homebrew-formula-action#63</a></li>
<li>Bump <code>`@​vercel/ncc</code>` from 0.34.0 to 0.38.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/67">mislav/bump-homebrew-formula-action#67</a></li>
<li>Bump <code>`@​actions/core</code>` from 1.9.1 to 1.10.1 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/68">mislav/bump-homebrew-formula-action#68</a></li>
<li>Bump <code>`@​octokit/core</code>` from 3.5.1 to 5.0.0 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/65">mislav/bump-homebrew-formula-action#65</a></li>
<li>Bump TypeScript from 4.7 to 5.2</li>
<li>Bump <code>`@​typescript-eslint/eslint-plugin</code>` from 5.43.0 to 6.7.2 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/66">mislav/bump-homebrew-formula-action#66</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/mislav/bump-homebrew-formula-action/compare/v2.4...v3.0">https://github.com/mislav/bump-homebrew-formula-action/compare/v2.4...v3.0</a></p>
<h2>bump-homebrew-formula 2.4</h2>
<h2>What's Changed</h2>
<ul>
<li>chore: use <code>/archive/refs/tags/${tagName}.tar.gz</code> rather than <code>/archive/${tagName}.tar.gz</code> by <a href="https://github.com/chenrui333"><code>`@​chenrui333</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/53">mislav/bump-homebrew-formula-action#53</a></li>
<li>Fix extracting version tags from GitHub download URLs by <a href="https://github.com/mislav"><code>`@​mislav</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/62">mislav/bump-homebrew-formula-action#62</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/mislav/bump-homebrew-formula-action/compare/v2.3...v2.4">https://github.com/mislav/bump-homebrew-formula-action/compare/v2.3...v2.4</a></p>
<h2>bump-homebrew-formula 2.3</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix formula path after sharding of homebrew-core by <a href="https://github.com/williammartin"><code>`@​williammartin</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/59">mislav/bump-homebrew-formula-action#59</a></li>
<li>(docs): fix if condition in example by <a href="https://github.com/christian-bromann"><code>`@​christian-bromann</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/54">mislav/bump-homebrew-formula-action#54</a></li>
<li>(docs): use environment files instead of set-output by <a href="https://github.com/kyu08"><code>`@​kyu08</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/57">mislav/bump-homebrew-formula-action#57</a></li>
<li>Bump word-wrap from 1.2.3 to 1.2.4 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/55">mislav/bump-homebrew-formula-action#55</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/christian-bromann"><code>`@​christian-bromann</code></a>` made their first contribution in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/54">mislav/bump-homebrew-formula-action#54</a></li>
<li><a href="https://github.com/kyu08"><code>`@​kyu08</code></a>` made their first contribution in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/57">mislav/bump-homebrew-formula-action#57</a></li>
<li><a href="https://github.com/williammartin"><code>`@​williammartin</code></a>` made their first contribution in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/59">mislav/bump-homebrew-formula-action#59</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/mislav/bump-homebrew-formula-action/compare/v2.2...v2.3">https://github.com/mislav/bump-homebrew-formula-action/compare/v2.2...v2.3</a></p>
<h2>bump-homebrew-formula 2.2</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix scenario with generated GITHUB_TOKEN by <a href="https://github.com/mislav"><code>`@​mislav</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/45">mislav/bump-homebrew-formula-action#45</a></li>
<li>Bump <code>`@​actions/core</code>` from 1.6.0 to 1.9.1 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/39">mislav/bump-homebrew-formula-action#39</a></li>
<li>Bump minimatch from 3.0.4 to 3.1.2 by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/40">mislav/bump-homebrew-formula-action#40</a></li>
<li>Bump got and ava by <a href="https://github.com/dependabot"><code>`@​dependabot</code></a>` in <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/pull/41">mislav/bump-homebrew-formula-action#41</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/mislav/bump-homebrew-formula-action/compare/v2.1...v2.2">https://github.com/mislav/bump-homebrew-formula-action/compare/v2.1...v2.2</a></p>
<h2>bump-homebrew-formula 2.1</h2>
<ul>
<li>Fix extracting complex tag names from GitHub archive and release download URLs <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/issues/37">mislav/bump-homebrew-formula-action#37</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="b3327118b2"><code>b332711</code></a> lib</li>
<li><a href="d1d8ac114e"><code>d1d8ac1</code></a> Merge remote-tracking branch 'origin/main' into v3</li>
<li><a href="cf2d00157f"><code>cf2d001</code></a> Fix calculating checksum for resource at download URL (<a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/issues/77">#77</a>)</li>
<li><a href="2bcfdc9312"><code>2bcfdc9</code></a> Merge pull request <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/issues/72">#72</a> from mislav/dependabot/npm_and_yarn/octokit/plugin-res...</li>
<li><a href="5678601dcb"><code>5678601</code></a> Merge pull request <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/issues/74">#74</a> from mislav/dependabot/npm_and_yarn/eslint-8.50.0</li>
<li><a href="addc60eb43"><code>addc60e</code></a> Bump <code>`@​octokit/plugin-rest-endpoint-methods</code>` from 9.0.0 to 10.0.0</li>
<li><a href="44b3287225"><code>44b3287</code></a> Merge pull request <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/issues/75">#75</a> from mislav/dependabot/npm_and_yarn/octokit/core-5.0.1</li>
<li><a href="fda81994d7"><code>fda8199</code></a> Merge pull request <a href="https://redirect.github.com/mislav/bump-homebrew-formula-action/issues/71">#71</a> from mislav/dependabot/npm_and_yarn/octokit/request-er...</li>
<li><a href="2fd87fd7ea"><code>2fd87fd</code></a> Bump <code>`@​octokit/core</code>` from 5.0.0 to 5.0.1</li>
<li><a href="0c20930845"><code>0c20930</code></a> Bump eslint from 8.49.0 to 8.50.0</li>
<li>Additional commits viewable in <a href="https://github.com/mislav/bump-homebrew-formula-action/compare/v2...v3">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=mislav/bump-homebrew-formula-action&package-manager=github_actions&previous-version=2&new-version=3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-02 08:48:19 +00:00
49f069ed97 Bump Swatinem/rust-cache from 2.6.2 to 2.7.1
Bumps [Swatinem/rust-cache](https://github.com/swatinem/rust-cache) from 2.6.2 to 2.7.1.
- [Release notes](https://github.com/swatinem/rust-cache/releases)
- [Changelog](https://github.com/Swatinem/rust-cache/blob/master/CHANGELOG.md)
- [Commits](https://github.com/swatinem/rust-cache/compare/v2.6.2...v2.7.1)

---
updated-dependencies:
- dependency-name: Swatinem/rust-cache
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-01 17:57:42 +00:00
be16b99d40 Bump actions/setup-node from 3 to 4
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 3 to 4.
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](https://github.com/actions/setup-node/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/setup-node
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-01 17:57:38 +00:00
ec0c09d17c Bump docker/login-action from 2 to 3
Bumps [docker/login-action](https://github.com/docker/login-action) from 2 to 3.
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](https://github.com/docker/login-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: docker/login-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-01 17:57:33 +00:00
a9230f6e6c Bump mislav/bump-homebrew-formula-action from 2 to 3
Bumps [mislav/bump-homebrew-formula-action](https://github.com/mislav/bump-homebrew-formula-action) from 2 to 3.
- [Release notes](https://github.com/mislav/bump-homebrew-formula-action/releases)
- [Commits](https://github.com/mislav/bump-homebrew-formula-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: mislav/bump-homebrew-formula-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-01 17:57:30 +00:00
b10c060bf7 Cleanup TOML 2023-11-01 14:03:04 +01:00
e507ef5932 Slow the logging down 2023-11-01 13:49:32 +01:00
c71b1d33ae Sort entries using rayon in the transform sorters 2023-11-01 11:07:16 +01:00
0fc446c62f Add more timing logs to the Transform 2023-11-01 11:07:16 +01:00
0fb6acefc3 Add snapshots for facets 2023-10-31 17:11:08 +01:00
b1d1355b69 remove tests on soft-deleted 2023-10-31 16:36:27 +01:00
f19332466e Extract field value as values instead of Option<Value> 2023-10-31 16:36:27 +01:00
03ddb4f310 use deladd in facet update tests 2023-10-31 16:36:27 +01:00
c855cc2721 Remove unused test 2023-10-31 16:36:27 +01:00
da0503ef80 Fix document count 2023-10-31 16:36:27 +01:00
54f0ee1ed2 Merge #4167
4167: Introduce the `meilitool` command line interface r=Kerollmops a=Kerollmops

This PR introduces a small tool to help the Cloud team:
 - Clear the tasks queue by removing all the tasks
 - Dump a Meilisearch database without having to enqueue the task
 - Access this `meilitool` binary from the Docker Image

## TODO
 - [x] Modify the Docker File to ship with this new tool (`@curquiza,` could you review that, please?)
 - [x] Clear the tasks queue by removing all the tasks
   - [x] Add more logs to explain what is happening
   - [x] Clear the `update_files` folder
 - [x] Dump a Meilisearch database without having to enqueue the task
   - [x] Add more logs to explain what is happening
   - [x] Introduce a flag to skip dumping enqueued and processing tasks.
   - [x] Dump the instance uid.
   - [x] Dump the keys.
   - [x] Dump the tasks with the update files.
   - [x] Dump the index documents and settings.
   - [ ] ~Dump the experimental features~

Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-10-31 14:05:22 +00:00
94206b0055 Update tests 2023-10-31 13:48:47 +01:00
b40253bf18 update snapshots 2023-10-31 10:30:48 +01:00
d8bf3f3fc2 Remove unused snapshots 2023-10-31 10:12:49 +01:00
9d59e8011a fix some tests 2023-10-31 10:08:36 +01:00
dad78cbf8d Bulk facet remove deletes keys from DB when value empty 2023-10-31 09:53:55 +01:00
4e91707a06 Rename test 2023-10-31 09:41:17 +01:00
de10f20732 Fix field distribution again 2023-10-30 17:47:22 +01:00
ce5647e730 Fix Dockerfile WORKDIR path 2023-10-30 17:27:59 +01:00
b57b818b67 Don't use the last version of clap 2023-10-30 16:57:31 +01:00
f7ea94e5f4 Modify the Dockerfile to compile meilisearch and meilitool 2023-10-30 16:32:17 +01:00
be395c7944 Change order of arguments to tokenizer_builder 2023-10-30 16:26:29 +01:00
9fedd8101a Fix tests 2023-10-30 15:11:07 +01:00
54d07a8da3 Update field distribution taking into account both deletions and additions 2023-10-30 14:47:51 +01:00
53382bb1b8 Introduce a new flag to skip dumping enqueued/processing tasks 2023-10-30 14:32:10 +01:00
5b004a2583 Add more logs to the dump exporter 2023-10-30 14:31:55 +01:00
13416ccbf7 Introduce a new meilitool to help the cloud team 2023-10-30 14:30:20 +01:00
58690dfb19 Fix tests compilation after changes to ExternalDocumentsIds API 2023-10-30 13:34:07 +01:00
abf424ebfc Remove unused FromIterator 2023-10-30 11:41:56 +01:00
dfab6293c9 Use an LMDB database to store the external documents ids 2023-10-30 11:41:23 +01:00
fdf3f7f627 Fix facet distribution test 2023-10-30 11:41:23 +01:00
6260cff65f Actually delete documents from DB when the merge function says so 2023-10-30 11:41:22 +01:00
8e0d9c9a5e Recover delete_documents tests that were too eagerly deleted 2023-10-30 11:41:22 +01:00
ae4ec8ea55 Add delete_document_using_wtxn to TempIndex 2023-10-30 11:41:22 +01:00
652ac3052d use new iterator in batch 2023-10-30 11:41:22 +01:00
9a2dccc3bc Add iterator to find external ids of a bitmap of internal ids 2023-10-30 11:41:22 +01:00
a35988550c Fix some snapshots 2023-10-30 11:41:22 +01:00
e78281785c Actually execute the transform even if there are only documents to delete 2023-10-30 11:41:22 +01:00
3c15881818 Add simple delete test 2023-10-30 11:41:22 +01:00
73c06d31d9 snapshot always display stuff in consistent order 2023-10-30 11:41:22 +01:00
290e773d23 remove more warnings and fix some tests 2023-10-30 11:41:22 +01:00
fa6c7f65ca Add TmpIndex::delete_documents 2023-10-30 11:41:22 +01:00
113527f466 Remove soft-deleted related methods from Index 2023-10-30 11:41:22 +01:00
c534a1b687 Stop using delete documents pipeline in batch runner 2023-10-30 11:41:22 +01:00
2263dff02b Stop using removed delete pipelines almost everywhere 2023-10-30 11:41:22 +01:00
d651b3ef01 Remove delete documents files 2023-10-30 11:41:20 +01:00
762b0b47e6 Use deladd merging function in chunks mergers 2023-10-30 11:40:20 +01:00
01d5eedf2f Remove some warnings 2023-10-30 11:40:20 +01:00
073f89db79 Fix facet tests 2023-10-30 11:40:20 +01:00
8370fbc92b Fix snaps 2023-10-30 11:40:20 +01:00
85f42fbc03 Handle external to internal id mapping from TypedChunk::Documents 2023-10-30 11:40:20 +01:00
c6b3c18c85 WIP: Comment out document deletion in other pipelines than update
TODO: fix calls to DELETE route
2023-10-30 11:40:20 +01:00
bafeb892a7 Modify Index after changes to ExternalDocumentsIds 2023-10-30 11:40:20 +01:00
8fb221dae3 Refactor ExternalDocumentsIds
- Remove soft deleted
- Add apply method that takes a list of operations to encapsulate modifications to the external -> internal mapping
2023-10-30 11:40:20 +01:00
5be569e3e2 Update obkv 2023-10-30 11:40:20 +01:00
946c762d28 WIP: reset documents in TypedChunk::Documents 2023-10-30 11:40:20 +01:00
cda6ca1ee6 Remove TypedChunk::NewDocumentIds 2023-10-30 11:40:18 +01:00
696fcf4d18 Fix document insertion into LMDB 2023-10-30 11:39:31 +01:00
476e4d3dbe Use value buffer instead of the initial value when writting the final result in the sorter 2023-10-30 11:39:31 +01:00
576fa9c6da Remove useless comment 2023-10-30 11:39:31 +01:00
77dcbff6b2 Remove and Insert the DelAdd geo points 2023-10-30 11:39:31 +01:00
544440c363 Ignore geo fields when the Del and Add content is the same 2023-10-30 11:39:31 +01:00
a3dae4db9b Extract the geo fields DelAdd and generate a new DelAdd obkv with it 2023-10-30 11:39:31 +01:00
ba90a5ec0e update extract fid word count docids 2023-10-30 11:39:31 +01:00
b26dc9aabe Explanatory code comment 2023-10-30 11:39:31 +01:00
66abac9364 Use specialized KvReaderDelAdd type
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-10-30 11:39:31 +01:00
59f88c14b3 Simplify facet update after removing Index::faceted_documents_ids 2023-10-30 11:39:29 +01:00
14832cb324 Remove Index::faceted_documents_ids 2023-10-30 11:37:32 +01:00
04ec293024 Facet Incremental update 2023-10-30 11:37:30 +01:00
f67ff3a738 Facets Bulk update 2023-10-30 11:36:40 +01:00
560e8f5613 Introduce the CboRoaringBitmapCodec merge_deladd_into and use it 2023-10-30 11:34:55 +01:00
2d3f15f82c Introduce a function to only serialize the Add side of a DelAdd obkv 2023-10-30 11:34:55 +01:00
40186bf403 Rename FieldIdWordCountDocids correctly 2023-10-30 11:34:50 +01:00
87e3d27878 update extract word pair proximity to support deladd obkvs 2023-10-30 11:34:02 +01:00
6bcf8b4f8c update extract word position docids 2023-10-30 11:34:02 +01:00
46aa75abdb update extract word docids 2023-10-30 11:34:02 +01:00
2597bbd107 Make script language docids map taking a tuple of roaring bitmaps expressing the deletions and the additions 2023-10-30 11:34:00 +01:00
e2bc054604 Update extract_facet_string_docids to support deladd obkvs 2023-10-30 11:32:36 +01:00
fcd3a1434d Update extract_facet_number_docids to support deladd obkvs 2023-10-30 11:31:04 +01:00
a82dee21e0 Rename docid_fid into fid_docid 2023-10-30 11:31:02 +01:00
bc45c1206d Implement all the facet extraction paths and simplify them 2023-10-30 11:29:08 +01:00
6ae4100f07 Generate the DelAdd for is_null, is_empty, and exists 2023-10-30 11:29:08 +01:00
0c47defeee Work on fid docid facet values rewrite 2023-10-30 11:29:06 +01:00
313b16bec2 Support diff indexing on extract_docid_word_positions 2023-10-30 11:24:19 +01:00
1dd97578a8 Make the transform struct return diff-based documents obkvs 2023-10-30 11:22:07 +01:00
f5ef69293b deactivate prefix dbs 2023-10-30 11:22:07 +01:00
1c5705c164 clean PR warnings 2023-10-30 11:22:05 +01:00
66c2c82a18 Split wpp in several sorters 2023-10-30 11:15:02 +01:00
28a8d0ccda Fix word pair proximity 2023-10-30 11:15:02 +01:00
96be85396d Use a vecDeque in wpp database 2023-10-30 11:15:02 +01:00
df9e5c8651 Generalize usage of CboRoaringBitmap codec to ease the use 2023-10-30 11:15:02 +01:00
b541d48847 Add buffer to the obkv writter 2023-10-30 11:15:02 +01:00
8ccf32d1a0 Compute word_fid_docids before word_docids and exact_word_docids 2023-10-30 11:15:02 +01:00
db1ca21231 add puffin in sorter into reeder function 2023-10-30 11:15:00 +01:00
11ea5acff9 Fix 2023-10-30 11:13:10 +01:00
8d77736a67 Fix fid_word_docids 2023-10-30 11:13:10 +01:00
748b333161 Add usefull debug assert before key insertion in database 2023-10-30 11:13:10 +01:00
17b647dfe5 Wip 2023-10-30 11:13:08 +01:00
2614e7d9ca Merge #4174
4174: Fix warnings r=dureuill a=irevoire

Fix all the warnings found in the CI: https://github.com/meilisearch/meilisearch/actions/runs/6622576021/job/17988323623

Co-authored-by: Tamo <tamo@meilisearch.com>
2023-10-30 10:12:54 +00:00
e7244aa485 fix warnings 2023-10-30 11:00:46 +01:00
9cacc82307 Merge #4169
4169: update charabia r=curquiza a=ManyTheFish

Update Charabia to v0.8.5 and add the new khmer tokenizer

Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-10-26 17:21:30 +00:00
4c6fddb1cb update charabia 2023-10-26 17:01:10 +02:00
62ea81bef6 Merge #4132
4132: Extract the creation and last updated timestamp from v2 dumps r=irevoire a=vivek-26

# Pull Request

## Related issue
Fixes #2989

## What does this PR do?
This PR - 
- extracts the `created_at` and `updated_at` dates from v2 dumps.
- updates the unit tests.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
2023-10-24 08:50:57 +00:00
f28f09ae2f update tests for v2 dumps 2023-10-24 14:10:46 +05:30
ca52021079 Merge #4154
4154: Update version for the next release (v1.5.0) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2023-10-23 12:00:50 +00:00
ee6f79d60b Update version for the next release (v1.5.0) in Cargo.toml 2023-10-23 11:49:07 +00:00
e4c24ca6a3 Merge #4151
4151: Bring back changes from v1.4.2 into `release-v1.5.0` r=dureuill a=curquiza

This will bring the fixes in v1.4.2 for v1.5.0 release

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>
2023-10-23 10:11:11 +00:00
2bae9550c8 Add explanatory comment 2023-10-23 12:06:28 +02:00
32c78ac8b1 add/update tests when search with distinct attribute & pagination with no ranking 2023-10-23 12:06:27 +02:00
5fe7c4545a compute all candidates correctly when skipping 2023-10-23 12:02:45 +02:00
2042229927 Update version for the next release (v1.4.2) in Cargo.toml 2023-10-23 12:02:45 +02:00
eae9eab181 Merge #4126
4126: Make the experimental route /metrics activable via HTTP r=dureuill a=braddotcoffee

# Pull Request

## Related issue
Closes #4086

## What does this PR do?
- [x] Make `/metrics` available via HTTP as described in #4086 
- [x] The users can still launch Meilisearch using the `--experimental-enable-metrics` flag.
- [x] If the flag `--experimental-enable-metrics` is activated, a call to the `GET /experimental-features` route right after the launch will show `"metrics": true` even if the user has not called the `PATCH /experimental-features` route yet.
- [x] Even if the --experimental-enable-metrics flag is present at launch, calling the `PATCH /experimental-features` route with `"metrics": false` disables the experimental feature.
- [x] Update the spec
    - I was unable to find docs in this repository to update about the `/experimental-features` endpoint. I'll happily update if you point me in the right direction!

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: bwbonanno <bradfordbonanno@gmail.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-10-23 08:51:37 +00:00
cf8dad1ca0 index_scheduler.features() is no longer fallible 2023-10-23 10:38:56 +02:00
dd619913da Use RwLock to never persist cli state to db 2023-10-19 12:45:57 -07:00
9b55ff16e9 Merge #4134
4134: Bump rustix from 0.36.15 to 0.36.16 r=Kerollmops a=dependabot[bot]

Bumps [rustix](https://github.com/bytecodealliance/rustix) from 0.36.15 to 0.36.16.
<details>
<summary>Commits</summary>
<ul>
<li><a href="6534992521"><code>6534992</code></a> chore: Release rustix version 0.36.16</li>
<li><a href="4928cf7a38"><code>4928cf7</code></a> Disable riscv64 testing.</li>
<li><a href="8cc159c4c3"><code>8cc159c</code></a> Fix the <code>test_ttyname_ok</code> test when /dev/stdin is inaccessable. (<a href="https://redirect.github.com/bytecodealliance/rustix/issues/821">#821</a>)</li>
<li><a href="6dc7ba9478"><code>6dc7ba9</code></a> Downgrade dependencies and disable tests to compile under Rust 1.48.</li>
<li><a href="ded8986e7e"><code>ded8986</code></a> Disable MIPS in CI. (<a href="https://redirect.github.com/bytecodealliance/rustix/issues/793">#793</a>)</li>
<li><a href="739f9c3ba0"><code>739f9c3</code></a> Fixes for <code>Dir</code> on macOS, FreeBSD, and WASI.</li>
<li><a href="87481a97f4"><code>87481a9</code></a> Merge pull request from GHSA-c827-hfw6-qwvm</li>
<li>See full diff in <a href="https://github.com/bytecodealliance/rustix/compare/v0.36.15...v0.36.16">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=rustix&package-manager=cargo&previous-version=0.36.15&new-version=0.36.16)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/meilisearch/meilisearch/network/alerts).

</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-19 08:01:36 +00:00
e761db582f Bump rustix from 0.36.15 to 0.36.16
Bumps [rustix](https://github.com/bytecodealliance/rustix) from 0.36.15 to 0.36.16.
- [Release notes](https://github.com/bytecodealliance/rustix/releases)
- [Commits](https://github.com/bytecodealliance/rustix/compare/v0.36.15...v0.36.16)

---
updated-dependencies:
- dependency-name: rustix
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-18 18:42:12 +00:00
d8c649b3cd Return recoverable error if we fail to retrieve metrics state 2023-10-18 08:28:24 -07:00
5e0485d8dd Merge #4131
4131: Reduce proximity range from 7 to 3 r=Kerollmops a=ManyTheFish

## Summary
This PR aims to reduce the impact of the proximity databases on the indexing time and on the database size by reducing the maximum distance between two words to be indexed in the proximity database.

## Stats

### Impact on database size and indexing time
![Impact on datasets](https://github.com/meilisearch/meilisearch/assets/6482087/28ed3d96-bdde-41c1-bdac-e90c1b1dbb23)

### Impact on search relevancy

<details>

| dataset_name | host_name        | Relevancy rate (Precision) | completion_rate  25.00% | completion_rate 50.00% | completion_rate 75.00% | completion_rate 100.00% |
|--------------|------------------|------------------------------------|-----------------|-----------------|-----------------|-----------------|
| FBIS         | 1_4_0            | percentile-10 |           0.00% |           0.00% |           0.00% |           0.00% |
| FBIS         | 1_4_0            | percentile-25 |           0.00% |           0.00% |           0.00% |           0.00% |
| FBIS         | 1_4_0            | percentile-50 |           0.00% |           0.00% |           5.00% |           5.56% |
| FBIS         | 1_4_0            | percentile-75 |           0.00% |          12.50% |          35.00% |          45.00% |
| FBIS         | 1_4_0            | percentile-90 |          20.00% |          40.00% |                 |         100.00% |
| FBIS         | 1_4_0            | average       |           5.78% |          11.16% |          21.90% |          26.29% |
| FBIS         | reduce_proximity | percentile-10 |           0.00% |           0.00% |           0.00% |           0.00% |
| FBIS         | reduce_proximity | percentile-25 |           0.00% |           0.00% |           0.00% |           0.00% |
| FBIS         | reduce_proximity | percentile-50 |           0.00% |           0.00% |           5.00% |           5.56% |
| FBIS         | reduce_proximity | percentile-75 |           0.00% |          15.00% |          35.00% |          40.00% |
| FBIS         | reduce_proximity | percentile-90 |          20.00% |          40.00% |          85.00% |         100.00% |
| FBIS         | reduce_proximity | average       |           5.55% |          11.34% |          21.75% |          26.14% |
| FR94         | 1_4_0            | percentile-10 |           0.00% |           0.00% |           0.00% |           0.00% |
| FR94         | 1_4_0            | percentile-25 |           0.00% |           0.00% |           0.00% |           0.00% |
| FR94         | 1_4_0            | percentile-50 |           0.00% |           0.00% |           0.00% |           0.00% |
| FR94         | 1_4_0            | percentile-75 |           0.00% |           5.00% |          15.00% |          42.11% |
| FR94         | 1_4_0            | percentile-90 |          15.00% |          54.55% |         100.00% |         100.00% |
| FR94         | 1_4_0            | average       |           5.95% |          12.07% |          18.70% |          25.57% |
| FR94         | reduce_proximity | percentile-10 |           0.00% |           0.00% |           0.00% |           0.00% |
| FR94         | reduce_proximity | percentile-25 |           0.00% |           0.00% |           0.00% |           0.00% |
| FR94         | reduce_proximity | percentile-50 |           0.00% |           0.00% |           0.00% |           0.00% |
| FR94         | reduce_proximity | percentile-75 |           0.00% |           5.00% |          15.00% |          42.11% |
| FR94         | reduce_proximity | percentile-90 |          15.00% |          54.55% |         100.00% |         100.00% |
| FR94         | reduce_proximity | average       |           5.79% |          12.00% |          18.70% |          25.53% |
| FT           | 1_4_0            | percentile-10 |           0.00% |           0.00% |           0.00% |           0.00% |
| FT           | 1_4_0            | percentile-25 |           0.00% |           0.00% |           0.00% |           0.00% |
| FT           | 1_4_0            | percentile-50 |           0.00% |           0.00% |           5.00% |          10.00% |
| FT           | 1_4_0            | percentile-75 |           0.00% |          15.00% |          30.00% |          40.00% |
| FT           | 1_4_0            | percentile-90 |          20.00% |          50.00% |          65.00% |         100.00% |
| FT           | 1_4_0            | average       |           5.08% |          12.58% |          20.00% |          25.49% |
| FT           | reduce_proximity | percentile-10 |           0.00% |           0.00% |           0.00% |           0.00% |
| FT           | reduce_proximity | percentile-25 |           0.00% |           0.00% |           0.00% |           0.00% |
| FT           | reduce_proximity | percentile-50 |           0.00% |           0.00% |           5.00% |          10.00% |
| FT           | reduce_proximity | percentile-75 |           0.00% |          15.00% |          30.00% |          40.00% |
| FT           | reduce_proximity | percentile-90 |          10.00% |          45.00% |          60.00% |         100.00% |
| FT           | reduce_proximity | average       |           5.01% |          12.64% |          20.10% |          25.53% |
| LAT          | 1_4_0            | percentile-10 |           0.00% |           0.00% |           0.00% |           0.00% |
| LAT          | 1_4_0            | percentile-25 |           0.00% |           0.00% |           0.00% |           0.00% |
| LAT          | 1_4_0            | percentile-50 |           0.00% |           0.00% |           5.00% |           5.00% |
| LAT          | 1_4_0            | percentile-75 |           5.00% |          15.00% |          30.00% |          30.00% |
| LAT          | 1_4_0            | percentile-90 |          15.00% |          45.00% |          60.00% |          80.00% |
| LAT          | 1_4_0            | average       |           4.80% |          11.80% |          17.88% |          21.62% |
| LAT          | reduce_proximity | percentile-10 |           0.00% |           0.00% |           0.00% |           0.00% |
| LAT          | reduce_proximity | percentile-25 |           0.00% |           0.00% |           0.00% |           0.00% |
| LAT          | reduce_proximity | percentile-50 |           0.00% |           0.00% |           5.00% |           5.00% |
| LAT          | reduce_proximity | percentile-75 |           0.00% |          11.11% |          25.00% |          35.00% |
| LAT          | reduce_proximity | percentile-90 |          15.00% |          45.00% |          55.00% |          80.00% |
| LAT          | reduce_proximity | average       |           4.43% |          11.23% |          17.32% |          21.45% |

</details>

### Impact on Search time

| dataset_name | host_name        |      25.00% |      50.00% |      75.00% |     100.00% | Average     |
|--------------|------------------|------------:|------------:|------------:|------------:|-------------|
| FBIS         | 1_4_0            |        3.45 | 7.446666667 | 9.773489933 | 9.620300752 | 7.572614338 |
| FBIS         | reduce_proximity | 2.983333333 | 5.316666667 | 6.911073826 | 7.637218045 | 5.712072968 |
| FR94         | 1_4_0            | 2.236666667 |        4.45 | 5.523489933 | 4.560150376 | 4.192576744 |
| FR94         | reduce_proximity |        2.09 | 3.991666667 | 4.981543624 | 4.266917293 | 3.832531896 |
| FT           | 1_4_0            | 5.956666667 | 9.656666667 | 13.86912752 | 10.83270677 |  10.0787919 |
| FT           | reduce_proximity |        4.51 | 5.981666667 | 7.701342282 | 6.766917293 |  6.23998156 |
| LAT          | 1_4_0            | 5.856666667 | 9.233333333 | 12.98322148 | 10.78759398 | 9.715203865 |
| LAT          | reduce_proximity |        6.91 | 6.706666667 | 8.463087248 | 8.265037594 | 7.586197877 |

## Technical approach

- Ensure the MAX_DISTANCE constant is used everywhere needed
- Reduce the MAX_DISTANCE from 8 to 4

## Related

TBD

Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-10-18 14:56:08 +00:00
27eec21415 Fix tests 2023-10-18 16:03:22 +02:00
62cc97ba70 update tests to include created_at and updated-at in v2 dumps 2023-10-18 13:31:39 +05:30
fed59cc1d5 extract created_at and updated_at dates from v2 dumps 2023-10-18 13:30:24 +05:30
2b3adef796 Use index_scheduler from configured app_data in middleware 2023-10-17 08:17:13 -07:00
956cfc5487 Add runtime check to metrics middleware 2023-10-16 13:48:57 -07:00
12fc878640 Merge remote-tracking branch 'origin/main' into enable-metrics-http 2023-10-16 13:48:01 -07:00
0a2e8b92a9 Merge #4129
4129: Add webinar banner in README r=curquiza a=curquiza



Co-authored-by: curquiza <clementine@meilisearch.com>
2023-10-16 17:35:48 +00:00
c7a3f80de6 Merge #4073
4073: Simplify Puffin report exports r=ManyTheFish a=Kerollmops

This PR changes how we export Puffin reports by directly writing them to disk when the `exportPuffinReports` [experimental feature is enabled](https://www.meilisearch.com/docs/learn/experimental/overview) on the `/experimental-features` route. It also adds more puffing logging to the deletion phase and grenad helpers. The puffin reports are identified by the date and time at which they are exported.

## Todo List
 - [x] Change the CLI flag to be an API experimental option.
 - [x] Create [a PRD for this experimental feature (private)](https://www.notion.so/meilisearch/Export-Puffin-Reports-091df151e71c4edfb7d72f4bf995b3ea).
 - [x] Create and complete [a product discussion](https://github.com/meilisearch/product/discussions/693) (copy/paste PROFILING markdown?).
 - [x] Update the _PROFILING.md_ markdown file instructions.
 - [x] Change the debug logs of the processing operation (visible in puffin viewer).

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-10-16 15:48:15 +00:00
029d4de043 Add webinar banner in README 2023-10-16 14:38:10 +02:00
549f1bcccf Merge #4125
4125: Rename benchmark CI file to find it easily in the manifest list r=Kerollmops a=curquiza



Co-authored-by: curquiza <clementine@meilisearch.com>
2023-10-16 11:38:28 +00:00
689ec7c7ad Make the experimental route /metrics activable via HTTP 2023-10-13 22:12:54 +00:00
3655d4bdca Move the puffin file export logic into the run function 2023-10-13 13:11:30 +02:00
055ca3935b Update index-scheduler/src/batch.rs
Co-authored-by: Tamo <tamo@meilisearch.com>
2023-10-13 13:11:30 +02:00
1b8871a585 Make cargo insta happy 2023-10-13 13:11:30 +02:00
bf8fac6676 Fix the tests 2023-10-13 13:11:30 +02:00
f2a9e1ebbb Improve the debugging experience in the puffin reports 2023-10-13 13:11:30 +02:00
c45c6cf54c Update the PROFILING.md file 2023-10-13 13:11:30 +02:00
513e61e9a3 Remove the experimental CLI flag 2023-10-13 13:11:29 +02:00
90a626bf80 Use the runtime feature to enable puffin report exporting 2023-10-13 13:11:29 +02:00
0d4acf2daa Fix the metrics product URL 2023-10-13 13:11:29 +02:00
58db8d85ec Add the exportPuffinReports option to the runtime features route 2023-10-13 13:11:29 +02:00
62dfd09dc6 Add more puffin logs to the deletion functions 2023-10-13 13:11:09 +02:00
656dadabea Expose an experimental flag to write the puffin reports to disk 2023-10-13 13:11:09 +02:00
c5f7893fbb Remove the puffin http dependency 2023-10-13 13:11:08 +02:00
8cf2ccf168 Rename benchmark CI file to find it easily in the manifest list 2023-10-12 18:41:26 +02:00
0913373a5e Merge #4122
4122: Bring back changes from `release-v1.4.1` into `main` r=Kerollmops a=curquiza



Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-10-12 15:57:47 +00:00
1a7f1282af Fix test to use new common Value type 2023-10-12 17:37:04 +02:00
bc747aac3a Cut the first 8 characters 2023-10-12 15:04:37 +02:00
be92376ab3 Fix originating commit branch 2023-10-12 13:51:41 +02:00
cf7e355735 Fix originating commit command 2023-10-12 13:12:53 +02:00
5f09d89ad1 Fetch the whole git history when cloning 2023-10-12 12:25:26 +02:00
6ecb26a3f8 Add more info on the commenting CI command 2023-10-12 11:54:56 +02:00
76c6f554d6 Merge #4101
4101: Bump webpki from 0.22.1 to 0.22.2 r=curquiza a=dependabot[bot]

Bumps [webpki](https://github.com/briansmith/webpki) from 0.22.1 to 0.22.2.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a href="https://github.com/briansmith/webpki/commits">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=webpki&package-manager=cargo&previous-version=0.22.1&new-version=0.22.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/meilisearch/meilisearch/network/alerts).

</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-12 08:46:04 +00:00
f343ef5f2f Merge #4108
4108: Fix bug where search with distinct attribute and no ranking, returns offset+limit hits r=curquiza a=vivek-26

# Pull Request

## Related issue
Fixes #4078 

## What does this PR do?
This PR - 
- Fixes bug where search with distinct attribute and no ranking, returns offset+limit hits.
- Adds unit and integration tests.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
2023-10-12 07:51:29 +00:00
96982a768a Triggers for every type of issue_comment 2023-10-11 23:18:29 +02:00
fca78fbc46 Merge #4082
4082: Update sprint_issue.md r=curquiza a=curquiza

Following internal recent discussions

Co-authored-by: Clémentine U. - curqui <clementine@meilisearch.com>
2023-10-11 15:12:38 +00:00
67a678cfb6 Merge #4089
4089: Use a bufreader and bufwriter everytime there is a grenad<file> r=curquiza a=irevoire

# Pull Request
Wrap all the files we give to a grenad in a `BufReader` or `BufWriter`.

The dump import I tried in the issue went from 2h to 10 minutes on my machine.

I also ran a bunch of benchmarks on my machine, and we're faster by a few seconds everywhere but nothing huge.

-----

The one thing I’m afraid about is if we used to get the inner file in a grenad and then do a read right after without a seek at the beginning of the file or a reopen.
Since we now use a bufreader our read would return the bytes one buffer later and probably completely corrupt what we were supposed to read.

From what I see, it looks like it works, but I may have missed something, I don't know much about this part of the codebase.

This issue should not arise on the bufwriter, though, because if we're not able to write the content of the buffer I ensured that the `into_inner` of the bufwriter should return an internal error.

## Related issue
Fixes #4087


Co-authored-by: Tamo <tamo@meilisearch.com>
2023-10-11 14:27:00 +00:00
d1331d8abf add integration test for distinct search with no ranking 2023-10-11 19:12:56 +05:30
19ba129165 add unit test for distinct search with no ranking 2023-10-11 19:02:27 +05:30
d4da06ff47 fix bug where distinct search with no ranking returns offset+limit hits 2023-10-11 19:02:16 +05:30
3e0471edae Only trigger CI on created or edited comments 2023-10-11 15:15:15 +02:00
432df03c4c Use the correct base filename in the comment bench CI 2023-10-11 14:57:03 +02:00
11958016dd Force a small if to evoid triggering the CI every time 2023-10-11 14:27:51 +02:00
63c250a04d Do not use the GITHUB_REF variable 2023-10-11 13:05:54 +02:00
06d8cd5b72 Make sure that we checkout on the right branch 2023-10-11 12:02:44 +02:00
c0f2724c2d get rids of the new introduced error code in favor of an io::Error 2023-10-10 15:12:23 +02:00
d772073dfa use a bufreader everytime there is a grenad<file> 2023-10-10 15:00:30 +02:00
8fe8ddea79 Merge #4112
4112: Update version for the next release (v1.4.1) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2023-10-10 09:05:10 +00:00
8a95bf28e5 Update version for the next release (v1.4.1) in Cargo.toml 2023-10-10 09:01:45 +00:00
c0fd3dffb8 Setup a Github Token env var 2023-10-09 18:04:49 +02:00
c42fd5375f Fix the git commands again 2023-10-09 17:36:19 +02:00
b418c3a756 Use the PAT token instead 2023-10-09 16:52:04 +02:00
1cde455758 Fix workflow CI 2023-10-09 16:30:46 +02:00
ca19bae72f Prefer using a action to manage commands 2023-10-09 14:56:41 +02:00
705878ff59 Merge #4102
4102: Introduce the first bot that shows benchmarks results r=curquiza a=Kerollmops

TBD

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-10-05 10:11:06 +00:00
92c280d1c8 Update .github/workflows/trigger-benchmarks-on-message.yml
Co-authored-by: Clémentine U. - curqui <clementine@meilisearch.com>
2023-10-05 12:09:52 +02:00
181e7a1e53 Introduce the first bot that triggers benchmarks 2023-10-05 12:05:38 +02:00
2e5abb4d2c Merge #4098
4098: Bump docker/metadata-action from 4 to 5 r=curquiza a=dependabot[bot]

Bumps [docker/metadata-action](https://github.com/docker/metadata-action) from 4 to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/docker/metadata-action/releases">docker/metadata-action's releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<ul>
<li>Node 20 as default runtime (requires <a href="https://github.com/actions/runner/releases/tag/v2.308.0">Actions Runner v2.308.0</a> or later) by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/metadata-action/pull/328">docker/metadata-action#328</a></li>
<li>Bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1 in <a href="https://redirect.github.com/docker/metadata-action/pull/333">docker/metadata-action#333</a></li>
<li>Bump csv-parse from 5.4.0 to 5.5.0 in <a href="https://redirect.github.com/docker/metadata-action/pull/320">docker/metadata-action#320</a></li>
<li>Bump semver from 7.5.1 to 7.5.2 in <a href="https://redirect.github.com/docker/metadata-action/pull/304">docker/metadata-action#304</a></li>
<li>Bump handlebars from 4.7.7 to 4.7.8 in <a href="https://redirect.github.com/docker/metadata-action/pull/315">docker/metadata-action#315</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/metadata-action/compare/v4.6.0...v5.0.0">https://github.com/docker/metadata-action/compare/v4.6.0...v5.0.0</a></p>
<h2>v4.6.0</h2>
<ul>
<li>Dedup and sort labels by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/metadata-action/pull/301">docker/metadata-action#301</a></li>
<li>Bump <code>`@​docker/actions-toolkit</code>` from 0.3.0 to 0.5.0 in <a href="https://redirect.github.com/docker/metadata-action/pull/302">docker/metadata-action#302</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/metadata-action/compare/v4.5.0...v4.6.0">https://github.com/docker/metadata-action/compare/v4.5.0...v4.6.0</a></p>
<h2>v4.5.0</h2>
<ul>
<li>Bump <code>`@​docker/actions-toolkit</code>` from 0.1.0 to 0.3.0 in <a href="https://redirect.github.com/docker/metadata-action/pull/296">docker/metadata-action#296</a></li>
<li>Bump csv-parse from 5.3.8 to 5.4.0 in <a href="https://redirect.github.com/docker/metadata-action/pull/294">docker/metadata-action#294</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/metadata-action/compare/v4.4.0...v4.5.0">https://github.com/docker/metadata-action/compare/v4.4.0...v4.5.0</a></p>
<h2>v4.4.0</h2>
<ul>
<li>Add <code>context</code> input to define the metadata provider by <a href="https://github.com/neilime"><code>`@​neilime</code></a>` in <a href="https://redirect.github.com/docker/metadata-action/pull/248">docker/metadata-action#248</a></li>
<li>Switch to actions-toolkit implementation by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/metadata-action/pull/266">docker/metadata-action#266</a> <a href="https://redirect.github.com/docker/metadata-action/pull/273">docker/metadata-action#273</a> <a href="https://redirect.github.com/docker/metadata-action/pull/284">docker/metadata-action#284</a></li>
<li>Bump csv-parse from 5.3.3 to 5.3.8 in <a href="https://redirect.github.com/docker/metadata-action/pull/271">docker/metadata-action#271</a> <a href="https://redirect.github.com/docker/metadata-action/pull/286">docker/metadata-action#286</a></li>
<li>Bump moment-timezone from 0.5.40 to 0.5.43 in <a href="https://redirect.github.com/docker/metadata-action/pull/268">docker/metadata-action#268</a> <a href="https://redirect.github.com/docker/metadata-action/pull/278">docker/metadata-action#278</a> <a href="https://redirect.github.com/docker/metadata-action/pull/281">docker/metadata-action#281</a></li>
<li>Bump semver from 7.4.0 to 7.5.0 in <a href="https://redirect.github.com/docker/metadata-action/pull/285">docker/metadata-action#285</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/metadata-action/compare/v4.3.0...v4.4.0">https://github.com/docker/metadata-action/compare/v4.3.0...v4.4.0</a></p>
<h2>v4.3.0</h2>
<ul>
<li>Provide outputs as env vars by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` (<a href="https://redirect.github.com/docker/metadata-action/issues/257">#257</a>)</li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/metadata-action/compare/v4.2.0...v4.3.0">https://github.com/docker/metadata-action/compare/v4.2.0...v4.3.0</a></p>
<h2>v4.2.0</h2>
<ul>
<li>Add <code>tz</code> attribute to handlebar date function by <a href="https://github.com/chroju"><code>`@​chroju</code></a>` (<a href="https://redirect.github.com/docker/metadata-action/issues/251">#251</a>)</li>
<li>Bump minimatch from 3.0.4 to 3.1.2 (<a href="https://redirect.github.com/docker/metadata-action/issues/242">#242</a>)</li>
<li>Bump csv-parse from 5.3.1 to 5.3.3 (<a href="https://redirect.github.com/docker/metadata-action/issues/245">#245</a>)</li>
<li>Bump json5 from 2.2.0 to 2.2.3 (<a href="https://redirect.github.com/docker/metadata-action/issues/252">#252</a>)</li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/metadata-action/compare/v4.1.1...v4.2.0">https://github.com/docker/metadata-action/compare/v4.1.1...v4.2.0</a></p>
<h2>v4.1.1</h2>
<ul>
<li>Revert changes to set associated head sha on pull request event by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` (<a href="https://redirect.github.com/docker/metadata-action/issues/239">#239</a>)
<ul>
<li>User can still set associated head sha on PR by setting the env var <code>DOCKER_METADATA_PR_HEAD_SHA=true</code></li>
</ul>
</li>
<li>Bump csv-parse from 5.3.0 to 5.3.1 (<a href="https://redirect.github.com/docker/metadata-action/issues/237">#237</a>)</li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/metadata-action/compare/v4.1.0...v4.1.1">https://github.com/docker/metadata-action/compare/v4.1.0...v4.1.1</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Upgrade guide</summary>
<p><em>Sourced from <a href="https://github.com/docker/metadata-action/blob/master/UPGRADE.md">docker/metadata-action's upgrade guide</a>.</em></p>
<blockquote>
<h1>Upgrade notes</h1>
<h2>v2 to v3</h2>
<ul>
<li>Repository has been moved to docker org. Replace <code>crazy-max/ghaction-docker-meta@v2</code>
with <code>docker/metadata-action@v5</code></li>
<li>The default bake target has been changed: <code>ghaction-docker-meta</code> &gt; <code>docker-metadata-action</code></li>
</ul>
<h2>v1 to v2</h2>
<ul>
<li><a href="https://github.com/docker/metadata-action/blob/master/#inputs">inputs</a>
<ul>
<li><a href="https://github.com/docker/metadata-action/blob/master/#tag-sha"><code>tag-sha</code></a></li>
<li><a href="https://github.com/docker/metadata-action/blob/master/#tag-edge--tag-edge-branch"><code>tag-edge</code> / <code>tag-edge-branch</code></a></li>
<li><a href="https://github.com/docker/metadata-action/blob/master/#tag-semver"><code>tag-semver</code></a></li>
<li><a href="https://github.com/docker/metadata-action/blob/master/#tag-match--tag-match-group"><code>tag-match</code> / <code>tag-match-group</code></a></li>
<li><a href="https://github.com/docker/metadata-action/blob/master/#tag-latest"><code>tag-latest</code></a></li>
<li><a href="https://github.com/docker/metadata-action/blob/master/#tag-schedule"><code>tag-schedule</code></a></li>
<li><a href="https://github.com/docker/metadata-action/blob/master/#tag-custom--tag-custom-only"><code>tag-custom</code> / <code>tag-custom-only</code></a></li>
<li><a href="https://github.com/docker/metadata-action/blob/master/#label-custom"><code>label-custom</code></a></li>
</ul>
</li>
<li><a href="https://github.com/docker/metadata-action/blob/master/#basic-workflow">Basic workflow</a></li>
<li><a href="https://github.com/docker/metadata-action/blob/master/#semver-workflow">Semver workflow</a></li>
</ul>
<h3>inputs</h3>
<table>
<thead>
<tr>
<th>New</th>
<th>Unchanged</th>
<th>Removed</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>tags</code></td>
<td><code>images</code></td>
<td><code>tag-sha</code></td>
</tr>
<tr>
<td><code>flavor</code></td>
<td><code>sep-tags</code></td>
<td><code>tag-edge</code></td>
</tr>
<tr>
<td><code>labels</code></td>
<td><code>sep-labels</code></td>
<td><code>tag-edge-branch</code></td>
</tr>
<tr>
<td></td>
<td></td>
<td><code>tag-semver</code></td>
</tr>
<tr>
<td></td>
<td></td>
<td><code>tag-match</code></td>
</tr>
<tr>
<td></td>
<td></td>
<td><code>tag-match-group</code></td>
</tr>
<tr>
<td></td>
<td></td>
<td><code>tag-latest</code></td>
</tr>
<tr>
<td></td>
<td></td>
<td><code>tag-schedule</code></td>
</tr>
<tr>
<td></td>
<td></td>
<td><code>tag-custom</code></td>
</tr>
<tr>
<td></td>
<td></td>
<td><code>tag-custom-only</code></td>
</tr>
<tr>
<td></td>
<td></td>
<td><code>label-custom</code></td>
</tr>
</tbody>
</table>
<h4><code>tag-sha</code></h4>
<pre lang="yaml"><code>tags: |
  type=sha
</code></pre>
<h4><code>tag-edge</code> / <code>tag-edge-branch</code></h4>
<pre lang="yaml"><code>tags: |
  # default branch
&lt;/tr&gt;&lt;/table&gt; 
</code></pre>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="96383f4557"><code>96383f4</code></a> Merge pull request <a href="https://redirect.github.com/docker/metadata-action/issues/320">#320</a> from docker/dependabot/npm_and_yarn/csv-parse-5.5.0</li>
<li><a href="f138b9677b"><code>f138b96</code></a> chore: update generated content</li>
<li><a href="9cf7015b15"><code>9cf7015</code></a> Bump csv-parse from 5.4.0 to 5.5.0</li>
<li><a href="5a8a5ff8df"><code>5a8a5ff</code></a> Merge pull request <a href="https://redirect.github.com/docker/metadata-action/issues/315">#315</a> from docker/dependabot/npm_and_yarn/handlebars-4.7.8</li>
<li><a href="2279d9af58"><code>2279d9a</code></a> chore: update generated content</li>
<li><a href="c659933213"><code>c659933</code></a> Bump handlebars from 4.7.7 to 4.7.8</li>
<li><a href="48d23ccc05"><code>48d23cc</code></a> Merge pull request <a href="https://redirect.github.com/docker/metadata-action/issues/333">#333</a> from docker/dependabot/npm_and_yarn/actions/core-1.10.1</li>
<li><a href="b83ffb48d6"><code>b83ffb4</code></a> chore: update generated content</li>
<li><a href="3207f2405f"><code>3207f24</code></a> Bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1</li>
<li><a href="63f4a263e5"><code>63f4a26</code></a> Merge pull request <a href="https://redirect.github.com/docker/metadata-action/issues/328">#328</a> from crazy-max/update-node20</li>
<li>Additional commits viewable in <a href="https://github.com/docker/metadata-action/compare/v4...v5">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=docker/metadata-action&package-manager=github_actions&previous-version=4&new-version=5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-04 08:40:29 +00:00
44aaf5d9e3 Bump docker/metadata-action from 4 to 5
Bumps [docker/metadata-action](https://github.com/docker/metadata-action) from 4 to 5.
- [Release notes](https://github.com/docker/metadata-action/releases)
- [Upgrade guide](https://github.com/docker/metadata-action/blob/master/UPGRADE.md)
- [Commits](https://github.com/docker/metadata-action/compare/v4...v5)

---
updated-dependencies:
- dependency-name: docker/metadata-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-03 23:09:04 +00:00
ff0ababf65 Merge #4097
4097: Bump docker/setup-buildx-action from 2 to 3 r=curquiza a=dependabot[bot]

Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 2 to 3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/docker/setup-buildx-action/releases">docker/setup-buildx-action's releases</a>.</em></p>
<blockquote>
<h2>v3.0.0</h2>
<ul>
<li>Node 20 as default runtime (requires <a href="https://github.com/actions/runner/releases/tag/v2.308.0">Actions Runner v2.308.0</a> or later) by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/264">docker/setup-buildx-action#264</a></li>
<li>Bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1 in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/267">docker/setup-buildx-action#267</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-buildx-action/compare/v2.10.0...v3.0.0">https://github.com/docker/setup-buildx-action/compare/v2.10.0...v3.0.0</a></p>
<h2>v2.10.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Bump <code>`@​docker/actions-toolkit</code>` from 0.7.1 to 0.10.0 by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/258">docker/setup-buildx-action#258</a></li>
<li>Bump word-wrap from 1.2.3 to 1.2.5 in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/253">docker/setup-buildx-action#253</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-buildx-action/compare/v2.9.1...v2.10.0">https://github.com/docker/setup-buildx-action/compare/v2.9.1...v2.10.0</a></p>
<h2>v2.9.1</h2>
<ul>
<li>Bump <code>`@​docker/actions-toolkit</code>` from 0.7.0 to 0.7.1 in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/248">docker/setup-buildx-action#248</a>
<ul>
<li>Fixes an issue where building Buildx does not match the local platform (<a href="https://redirect.github.com/docker/actions-toolkit/pull/135">docker/actions-toolkit#135</a>)</li>
</ul>
</li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-buildx-action/compare/v2.9.0...v2.9.1">https://github.com/docker/setup-buildx-action/compare/v2.9.0...v2.9.1</a></p>
<h2>v2.9.0</h2>
<ul>
<li>Bump <code>`@​docker/actions-toolkit</code>` from 0.6.0 to 0.7.0 in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/246">docker/setup-buildx-action#246</a>
<ul>
<li>Adds support to cache Buildx binary to hosted tool cache and GHA cache backend</li>
</ul>
</li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-buildx-action/compare/v2.8.0...v2.9.0">https://github.com/docker/setup-buildx-action/compare/v2.8.0...v2.9.0</a></p>
<h2>v2.8.0</h2>
<ul>
<li>Only set specific flags for drivers supporting them by <a href="https://github.com/nicks"><code>`@​nicks</code></a>` in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/241">docker/setup-buildx-action#241</a></li>
<li>Bump <code>`@​docker/actions-toolkit</code>` from 0.5.0 to 0.6.0 in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/242">docker/setup-buildx-action#242</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-buildx-action/compare/v2.7.0...v2.8.0">https://github.com/docker/setup-buildx-action/compare/v2.7.0...v2.8.0</a></p>
<h2>v2.7.0</h2>
<ul>
<li>Bump <code>`@​docker/actions-toolkit</code>` from 0.3.0 to 0.5.0 in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/237">docker/setup-buildx-action#237</a> <a href="https://redirect.github.com/docker/setup-buildx-action/pull/238">docker/setup-buildx-action#238</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-buildx-action/compare/v2.6.0...v2.7.0">https://github.com/docker/setup-buildx-action/compare/v2.6.0...v2.7.0</a></p>
<h2>v2.6.0</h2>
<ul>
<li>Set node name for k8s driver when appending nodes by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/219">docker/setup-buildx-action#219</a></li>
<li>Bump <code>`@​docker/actions-toolkit</code>` from 0.1.0-beta.18 to 0.3.0 in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/220">docker/setup-buildx-action#220</a> <a href="https://redirect.github.com/docker/setup-buildx-action/pull/229">docker/setup-buildx-action#229</a> <a href="https://redirect.github.com/docker/setup-buildx-action/pull/231">docker/setup-buildx-action#231</a> <a href="https://redirect.github.com/docker/setup-buildx-action/pull/236">docker/setup-buildx-action#236</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-buildx-action/compare/v2.5.0...v2.6.0">https://github.com/docker/setup-buildx-action/compare/v2.5.0...v2.6.0</a></p>
<h2>v2.5.0</h2>
<ul>
<li><code>cleanup</code> input to remove builder and temp files by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/213">docker/setup-buildx-action#213</a></li>
<li>do not remove builder using the <code>docker</code> driver by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/218">docker/setup-buildx-action#218</a></li>
<li>fix current context as builder name for <code>docker</code> driver by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/setup-buildx-action/pull/209">docker/setup-buildx-action#209</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-buildx-action/compare/v2.4.1...v2.5.0">https://github.com/docker/setup-buildx-action/compare/v2.4.1...v2.5.0</a></p>
<h2>v2.4.1</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="f95db51fdd"><code>f95db51</code></a> Merge pull request <a href="https://redirect.github.com/docker/setup-buildx-action/issues/267">#267</a> from docker/dependabot/npm_and_yarn/actions/core-1.10.1</li>
<li><a href="998a87c2c1"><code>998a87c</code></a> chore: update generated content</li>
<li><a href="28bae59336"><code>28bae59</code></a> build(deps): bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1</li>
<li><a href="c215341715"><code>c215341</code></a> Merge pull request <a href="https://redirect.github.com/docker/setup-buildx-action/issues/264">#264</a> from crazy-max/update-node20</li>
<li><a href="02e9319239"><code>02e9319</code></a> chore: node 20 as default runtime</li>
<li><a href="5c9160effc"><code>5c9160e</code></a> chore: update generated content</li>
<li><a href="1283140f57"><code>1283140</code></a> chore: fix author in package.json</li>
<li><a href="c6afe06e4a"><code>c6afe06</code></a> vendor: bump <code>`@​docker/actions-toolkit</code>` from 0.10.0 to 0.12.0</li>
<li><a href="f35e0d5a04"><code>f35e0d5</code></a> chore: update dev dependencies</li>
<li><a href="baeb468fb2"><code>baeb468</code></a> dev: remove unneeded binaries</li>
<li>Additional commits viewable in <a href="https://github.com/docker/setup-buildx-action/compare/v2...v3">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=docker/setup-buildx-action&package-manager=github_actions&previous-version=2&new-version=3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-03 17:14:34 +00:00
c5336af1c5 Bump docker/setup-buildx-action from 2 to 3
Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 2 to 3.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](https://github.com/docker/setup-buildx-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-03 15:41:06 +00:00
1567758a56 Merge #4099
4099: Bump docker/build-push-action from 4 to 5 r=curquiza a=dependabot[bot]

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 4 to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/docker/build-push-action/releases">docker/build-push-action's releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<ul>
<li>Node 20 as default runtime (requires <a href="https://github.com/actions/runner/releases/tag/v2.308.0">Actions Runner v2.308.0</a> or later) by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/build-push-action/pull/954">docker/build-push-action#954</a></li>
<li>Bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1 in <a href="https://redirect.github.com/docker/build-push-action/pull/959">docker/build-push-action#959</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v4.2.1...v5.0.0">https://github.com/docker/build-push-action/compare/v4.2.1...v5.0.0</a></p>
<h2>v4.2.1</h2>
<blockquote>
<p><strong>Note</strong></p>
<p>Buildx v0.10 enables support for a minimal <a href="https://slsa.dev/provenance/">SLSA Provenance</a> attestation, which requires support for <a href="https://github.com/opencontainers/image-spec">OCI-compliant</a> multi-platform images. This may introduce issues with registry and runtime support (e.g. <a href="https://redirect.github.com/docker/buildx/issues/1533">Google Cloud Run and AWS Lambda</a>). You can optionally disable the default provenance attestation functionality using <code>provenance: false</code>.</p>
</blockquote>
<ul>
<li>warn if docker config can't be parsed by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/build-push-action/pull/957">docker/build-push-action#957</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v4.2.0...v4.2.1">https://github.com/docker/build-push-action/compare/v4.2.0...v4.2.1</a></p>
<h2>v4.2.0</h2>
<blockquote>
<p><strong>Note</strong></p>
<p>Buildx v0.10 enables support for a minimal <a href="https://slsa.dev/provenance/">SLSA Provenance</a> attestation, which requires support for <a href="https://github.com/opencontainers/image-spec">OCI-compliant</a> multi-platform images. This may introduce issues with registry and runtime support (e.g. <a href="https://redirect.github.com/docker/buildx/issues/1533">Google Cloud Run and AWS Lambda</a>). You can optionally disable the default provenance attestation functionality using <code>provenance: false</code>.</p>
</blockquote>
<ul>
<li>display proxy configuration  by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/build-push-action/pull/872">docker/build-push-action#872</a></li>
<li>chore(deps): Bump <code>`@​docker/actions-toolkit</code>` from 0.6.0 to 0.8.0 in <a href="https://redirect.github.com/docker/build-push-action/pull/930">docker/build-push-action#930</a></li>
<li>chore(deps): Bump word-wrap from 1.2.3 to 1.2.5 in <a href="https://redirect.github.com/docker/build-push-action/pull/925">docker/build-push-action#925</a></li>
<li>chore(deps): Bump semver from 6.3.0 to 6.3.1 in <a href="https://redirect.github.com/docker/build-push-action/pull/902">docker/build-push-action#902</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v4.1.1...v4.2.0">https://github.com/docker/build-push-action/compare/v4.1.1...v4.2.0</a></p>
<h2>v4.1.1</h2>
<blockquote>
<p><strong>Note</strong></p>
<p>Buildx v0.10 enables support for a minimal <a href="https://slsa.dev/provenance/">SLSA Provenance</a> attestation, which requires support for <a href="https://github.com/opencontainers/image-spec">OCI-compliant</a> multi-platform images. This may introduce issues with registry and runtime support (e.g. <a href="https://redirect.github.com/docker/buildx/issues/1533">Google Cloud Run and AWS Lambda</a>). You can optionally disable the default provenance attestation functionality using <code>provenance: false</code>.</p>
</blockquote>
<ul>
<li>Bump <code>`@​docker/actions-toolkit</code>` from 0.3.0 to 0.5.0 by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/build-push-action/pull/880">docker/build-push-action#880</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v4.1.0...v4.1.1">https://github.com/docker/build-push-action/compare/v4.1.0...v4.1.1</a></p>
<h2>v4.1.0</h2>
<blockquote>
<p><strong>Note</strong></p>
<p>Buildx v0.10 enables support for a minimal <a href="https://slsa.dev/provenance/">SLSA Provenance</a> attestation, which requires support for <a href="https://github.com/opencontainers/image-spec">OCI-compliant</a> multi-platform images. This may introduce issues with registry and runtime support (e.g. <a href="https://redirect.github.com/docker/buildx/issues/1533">Google Cloud Run and AWS Lambda</a>). You can optionally disable the default provenance attestation functionality using <code>provenance: false</code>.</p>
</blockquote>
<ul>
<li>Switch to actions-toolkit implementation by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/build-push-action/pull/811">docker/build-push-action#811</a>  <a href="https://redirect.github.com/docker/build-push-action/pull/838">docker/build-push-action#838</a> <a href="https://redirect.github.com/docker/build-push-action/pull/855">docker/build-push-action#855</a> <a href="https://redirect.github.com/docker/build-push-action/pull/860">docker/build-push-action#860</a> <a href="https://redirect.github.com/docker/build-push-action/pull/875">docker/build-push-action#875</a></li>
<li>e2e: quay.io by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/build-push-action/pull/799">docker/build-push-action#799</a> <a href="https://redirect.github.com/docker/build-push-action/pull/805">docker/build-push-action#805</a></li>
<li>e2e: local harbor and nexus by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/build-push-action/pull/800">docker/build-push-action#800</a></li>
<li>e2e: add artifactory container registry to test against by <a href="https://github.com/jedevc"><code>`@​jedevc</code></a>` in <a href="https://redirect.github.com/docker/build-push-action/pull/804">docker/build-push-action#804</a></li>
<li>e2e: add distribution tests by <a href="https://github.com/jedevc"><code>`@​jedevc</code></a>` in <a href="https://redirect.github.com/docker/build-push-action/pull/814">docker/build-push-action#814</a> <a href="https://redirect.github.com/docker/build-push-action/pull/815">docker/build-push-action#815</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/build-push-action/compare/v4.0.0...v4.1.0">https://github.com/docker/build-push-action/compare/v4.0.0...v4.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="0565240e2d"><code>0565240</code></a> Merge pull request <a href="https://redirect.github.com/docker/build-push-action/issues/959">#959</a> from docker/dependabot/npm_and_yarn/actions/core-1.10.1</li>
<li><a href="3ab07f8801"><code>3ab07f8</code></a> chore: update generated content</li>
<li><a href="b9e7e4daec"><code>b9e7e4d</code></a> chore(deps): Bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1</li>
<li><a href="04d1a3b049"><code>04d1a3b</code></a> Merge pull request <a href="https://redirect.github.com/docker/build-push-action/issues/954">#954</a> from crazy-max/update-node20</li>
<li><a href="1a4d1a13fb"><code>1a4d1a1</code></a> chore: node 20 as default runtime</li>
<li><a href="675965c0e1"><code>675965c</code></a> chore: update generated content</li>
<li><a href="58ee34cb6b"><code>58ee34c</code></a> chore: fix author in package.json</li>
<li><a href="c97c4060bd"><code>c97c406</code></a> fix ProxyConfig type when checking length</li>
<li><a href="47d5369e0b"><code>47d5369</code></a> vendor: bump <code>`@​docker/actions-toolkit</code>` from 0.8.0 to 0.12.0</li>
<li><a href="8895c7468f"><code>8895c74</code></a> chore: update dev dependencies</li>
<li>Additional commits viewable in <a href="https://github.com/docker/build-push-action/compare/v4...v5">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=docker/build-push-action&package-manager=github_actions&previous-version=4&new-version=5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-03 11:55:31 +00:00
37953afe1a Bump docker/build-push-action from 4 to 5
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 4 to 5.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](https://github.com/docker/build-push-action/compare/v4...v5)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-03 11:53:53 +00:00
de3f992ae4 Merge #4095
4095: Bump docker/setup-qemu-action from 2 to 3 r=curquiza a=dependabot[bot]

Bumps [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) from 2 to 3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/docker/setup-qemu-action/releases">docker/setup-qemu-action's releases</a>.</em></p>
<blockquote>
<h2>v3.0.0</h2>
<ul>
<li>Node 20 as default runtime (requires <a href="https://github.com/actions/runner/releases/tag/v2.308.0">Actions Runner v2.308.0</a> or later) by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/setup-qemu-action/pull/102">docker/setup-qemu-action#102</a></li>
<li>Bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1 in <a href="https://redirect.github.com/docker/setup-qemu-action/pull/103">docker/setup-qemu-action#103</a></li>
<li>Bump semver from 6.3.0 to 6.3.1 in <a href="https://redirect.github.com/docker/setup-qemu-action/pull/89">docker/setup-qemu-action#89</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-qemu-action/compare/v2.2.0...v3.0.0">https://github.com/docker/setup-qemu-action/compare/v2.2.0...v3.0.0</a></p>
<h2>v2.2.0</h2>
<ul>
<li>Trim off spaces in <code>platforms</code> input by <a href="https://github.com/Chocobo1"><code>`@​Chocobo1</code></a>` in <a href="https://redirect.github.com/docker/setup-qemu-action/pull/64">docker/setup-qemu-action#64</a></li>
<li>Switch to actions-toolkit implementation by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` in <a href="https://redirect.github.com/docker/setup-qemu-action/pull/70">docker/setup-qemu-action#70</a> <a href="https://redirect.github.com/docker/setup-qemu-action/pull/80">docker/setup-qemu-action#80</a> <a href="https://redirect.github.com/docker/setup-qemu-action/pull/83">docker/setup-qemu-action#83</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-qemu-action/compare/v2.1.0...v2.2.0">https://github.com/docker/setup-qemu-action/compare/v2.1.0...v2.2.0</a></p>
<h2>v2.1.0</h2>
<ul>
<li>Use context for inputs by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` (<a href="https://redirect.github.com/docker/setup-qemu-action/issues/62">#62</a>)</li>
<li>Use built-in <code>getExecOutput</code> by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` (<a href="https://redirect.github.com/docker/setup-qemu-action/issues/61">#61</a>)</li>
<li>Remove workaround for <code>setOutput</code> by <a href="https://github.com/crazy-max"><code>`@​crazy-max</code></a>` (<a href="https://redirect.github.com/docker/setup-qemu-action/issues/63">#63</a>)</li>
<li>Bump <code>`@​actions/core</code>` from 1.6.0 to 1.10.0 (<a href="https://redirect.github.com/docker/setup-qemu-action/issues/54">#54</a> <a href="https://redirect.github.com/docker/setup-qemu-action/issues/58">#58</a> <a href="https://redirect.github.com/docker/setup-qemu-action/issues/59">#59</a>)</li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/docker/setup-qemu-action/compare/v2.0.0...v2.1.0">https://github.com/docker/setup-qemu-action/compare/v2.0.0...v2.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="68827325e0"><code>6882732</code></a> Merge pull request <a href="https://redirect.github.com/docker/setup-qemu-action/issues/103">#103</a> from docker/dependabot/npm_and_yarn/actions/core-1.10.1</li>
<li><a href="183f4af504"><code>183f4af</code></a> chore: update generated content</li>
<li><a href="f17493529e"><code>f174935</code></a> build(deps): bump <code>`@​actions/core</code>` from 1.10.0 to 1.10.1</li>
<li><a href="2e423eb500"><code>2e423eb</code></a> Merge pull request <a href="https://redirect.github.com/docker/setup-qemu-action/issues/89">#89</a> from docker/dependabot/npm_and_yarn/semver-6.3.1</li>
<li><a href="ecc406afa7"><code>ecc406a</code></a> Bump semver from 6.3.0 to 6.3.1</li>
<li><a href="12dec5e201"><code>12dec5e</code></a> Merge pull request <a href="https://redirect.github.com/docker/setup-qemu-action/issues/102">#102</a> from crazy-max/update-node20</li>
<li><a href="c29b312130"><code>c29b312</code></a> chore: node 20 as default runtime</li>
<li><a href="34ae628c8f"><code>34ae628</code></a> chore: update generated content</li>
<li><a href="1f3d2e1ac0"><code>1f3d2e1</code></a> chore: fix author in package.json</li>
<li><a href="277dbe8c9c"><code>277dbe8</code></a> vendor: bump <code>`@​docker/actions-toolkit</code>` from 0.3.0 to 0.12.0</li>
<li>Additional commits viewable in <a href="https://github.com/docker/setup-qemu-action/compare/v2...v3">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=docker/setup-qemu-action&package-manager=github_actions&previous-version=2&new-version=3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-03 08:20:50 +00:00
c668a29ed5 Bump webpki from 0.22.1 to 0.22.2
Bumps [webpki](https://github.com/briansmith/webpki) from 0.22.1 to 0.22.2.
- [Commits](https://github.com/briansmith/webpki/commits)

---
updated-dependencies:
- dependency-name: webpki
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-02 21:53:45 +00:00
98f0618065 Bump docker/setup-qemu-action from 2 to 3
Bumps [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) from 2 to 3.
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](https://github.com/docker/setup-qemu-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: docker/setup-qemu-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-01 17:18:20 +00:00
b10eeb0e41 Update .github/ISSUE_TEMPLATE/sprint_issue.md 2023-09-26 16:47:04 +02:00
4a8515e9fc Update sprint_issue.md 2023-09-26 16:46:18 +02:00
401 changed files with 22642 additions and 11647 deletions

2
.cargo/config.toml Normal file
View File

@ -0,0 +1,2 @@
[alias]
xtask = "run --release --package xtask --"

View File

@ -7,19 +7,17 @@ assignees: ''
---
Related product team resources: [roadmap card]() (_internal only_) and [PRD]() (_internal only_)
Related product team resources: [PRD]() (_internal only_)
Related product discussion:
Related spec: WIP
## Motivation
<!---Copy/paste the information in the roadmap resources or briefly detail the product motivation. Ask product team if any hesitation.-->
<!---Copy/paste the information in PRD or briefly detail the product motivation. Ask product team if any hesitation.-->
## Usage
<!---Write a quick description of the usage if the usage has already been defined-->
Refer to the final spec to know the details and the final decisions about the usage.
<!---Link to the public part of the PRD, or to the related product discussion for experimental features-->
## TODO
@ -29,6 +27,23 @@ Refer to the final spec to know the details and the final decisions about the us
- [ ] If prototype validated, merge changes into `main`
- [ ] Update the spec
### Reminders when modifying the Setting API
<!--- Special steps to remind when adding a new index setting -->
- [ ] Ensure the new setting route is at least tested by the [`test_setting_routes` macro](https://github.com/meilisearch/meilisearch/blob/5204c0b60b384cbc79621b6b2176fca086069e8e/meilisearch/tests/settings/get_settings.rs#L276)
- [ ] Ensure Analytics are fully implemented
- [ ] `/settings/my-new-setting` configurated in the [`make_setting_routes` macro](https://github.com/meilisearch/meilisearch/blob/5204c0b60b384cbc79621b6b2176fca086069e8e/meilisearch/src/routes/indexes/settings.rs#L141-L165)
- [ ] global `/settings` route configurated in the [`update_all` function](https://github.com/meilisearch/meilisearch/blob/5204c0b60b384cbc79621b6b2176fca086069e8e/meilisearch/src/routes/indexes/settings.rs#L655-L751)
- [ ] Ensure the dump serializing is consistent with the `/settings` route serializing, e.g., enums case can be different (`camelCase` in route and `PascalCase` in the dump)
#### Special cases when adding a setting for an experimental feature
- [ ] ⚠️ API stability: The setting does not appear on the main settings route when the feature has never been enabled (e.g. mark it `Unset` when returned from the index in this situation. See [an example](https://github.com/meilisearch/meilisearch/blob/7a89abd2a025606a42f8b219e539117eb2eb029f/meilisearch-types/src/settings.rs#L608))
- [ ] The setting cannot be set when the feature is disabled, either by the main settings route or the subroute (see [`validate_settings` function](https://github.com/meilisearch/meilisearch/blob/7a89abd2a025606a42f8b219e539117eb2eb029f/meilisearch/src/routes/indexes/settings.rs#L811))
- [ ] If possible, the setting is reset when the feature is disabled (hard if it requires reindexing)
## Impacted teams
<!---Ping the related teams. Ask for the engine manager if any hesitation-->
<!---@meilisearch/docs-team when there is any API change, e.g. settings addition-->

30
.github/workflows/bench-manual.yml vendored Normal file
View File

@ -0,0 +1,30 @@
name: Bench (manual)
on:
workflow_dispatch:
inputs:
workload:
description: 'The path to the workloads to execute (workloads/...)'
required: true
default: 'workloads/movies.json'
env:
WORKLOAD_NAME: ${{ github.event.inputs.workload }}
jobs:
benchmarks:
name: Run and upload benchmarks
runs-on: benchmarks
timeout-minutes: 180 # 3h
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
- name: Run benchmarks - workload ${WORKLOAD_NAME} - branch ${{ github.ref }} - commit ${{ github.sha }}
run: |
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Manual [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- ${WORKLOAD_NAME}

46
.github/workflows/bench-pr.yml vendored Normal file
View File

@ -0,0 +1,46 @@
name: Bench (PR)
on:
issue_comment:
types: [created]
permissions:
issues: write
env:
GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
jobs:
run-benchmarks-on-comment:
if: startsWith(github.event.comment.body, '/bench')
name: Run and upload benchmarks
runs-on: benchmarks
timeout-minutes: 180 # 3h
steps:
- name: Check for Command
id: command
uses: xt0rted/slash-command-action@v2
with:
command: bench
reaction-type: "rocket"
repo-token: ${{ env.GH_TOKEN }}
- uses: xt0rted/pull-request-comment-branch@v2
id: comment-branch
with:
repo_token: ${{ env.GH_TOKEN }}
- uses: actions/checkout@v3
if: success()
with:
fetch-depth: 0 # fetch full history to be able to get main commit sha
ref: ${{ steps.comment-branch.outputs.head_ref }}
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
- name: Run benchmarks on PR ${{ github.event.issue.id }}
run: |
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "[Comment](${{ github.event.comment.url }}) on [#${{github.event.issue.id}}](${{ github.event.issue.url }})" -- ${{ steps.command.outputs.command-arguments }}

View File

@ -0,0 +1,25 @@
name: Indexing bench (push)
on:
push:
branches:
- main
jobs:
benchmarks:
name: Run and upload benchmarks
runs-on: benchmarks
timeout-minutes: 180 # 3h
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
# Run benchmarks
- name: Run benchmarks - Dataset ${BENCH_NAME} - Branch main - Commit ${{ github.sha }}
run: |
cargo xtask bench --api-key "${{ secrets.BENCHMARK_API_KEY }}" --dashboard-url "${{ vars.BENCHMARK_DASHBOARD_URL }}" --reason "Push on `main` [Run #${{ github.run_id }}](https://github.com/meilisearch/meilisearch/actions/runs/${{ github.run_id }})" -- workloads/*.json

View File

@ -74,4 +74,4 @@ jobs:
echo "${{ steps.file.outputs.basename }}.json has just been pushed."
echo 'How to compare this benchmark with another one?'
echo ' - Check the available files with: ./benchmarks/scripts/list.sh'
echo " - Run the following command: ./benchmaks/scipts/compare.sh <file-to-compare-with> ${{ steps.file.outputs.basename }}.json"
echo " - Run the following command: ./benchmaks/scripts/compare.sh <file-to-compare-with> ${{ steps.file.outputs.basename }}.json"

98
.github/workflows/benchmarks-pr.yml vendored Normal file
View File

@ -0,0 +1,98 @@
name: Benchmarks (PR)
on: issue_comment
permissions:
issues: write
env:
GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
jobs:
run-benchmarks-on-comment:
if: startsWith(github.event.comment.body, '/benchmark')
name: Run and upload benchmarks
runs-on: benchmarks
timeout-minutes: 4320 # 72h
steps:
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
- name: Check for Command
id: command
uses: xt0rted/slash-command-action@v2
with:
command: benchmark
reaction-type: "eyes"
repo-token: ${{ env.GH_TOKEN }}
- uses: xt0rted/pull-request-comment-branch@v2
id: comment-branch
with:
repo_token: ${{ env.GH_TOKEN }}
- uses: actions/checkout@v3
if: success()
with:
fetch-depth: 0 # fetch full history to be able to get main commit sha
ref: ${{ steps.comment-branch.outputs.head_ref }}
# Set variables
- name: Set current branch name
shell: bash
run: echo "name=$(git rev-parse --abbrev-ref HEAD)" >> $GITHUB_OUTPUT
id: current_branch
- name: Set normalized current branch name # Replace `/` by `_` in branch name to avoid issues when pushing to S3
shell: bash
run: echo "name=$(git rev-parse --abbrev-ref HEAD | tr '/' '_')" >> $GITHUB_OUTPUT
id: normalized_current_branch
- name: Set shorter commit SHA
shell: bash
run: echo "short=$(echo $GITHUB_SHA | cut -c1-8)" >> $GITHUB_OUTPUT
id: commit_sha
- name: Set file basename with format "dataset_branch_commitSHA"
shell: bash
run: echo "basename=$(echo ${{ steps.command.outputs.command-arguments }}_${{ steps.normalized_current_branch.outputs.name }}_${{ steps.commit_sha.outputs.short }})" >> $GITHUB_OUTPUT
id: file
# Run benchmarks
- name: Run benchmarks - Dataset ${{ steps.command.outputs.command-arguments }} - Branch ${{ steps.current_branch.outputs.name }} - Commit ${{ steps.commit_sha.outputs.short }}
run: |
cd benchmarks
cargo bench --bench ${{ steps.command.outputs.command-arguments }} -- --save-baseline ${{ steps.file.outputs.basename }}
# Generate critcmp files
- name: Install critcmp
uses: taiki-e/install-action@v2
with:
tool: critcmp
- name: Export cripcmp file
run: |
critcmp --export ${{ steps.file.outputs.basename }} > ${{ steps.file.outputs.basename }}.json
# Upload benchmarks
- name: Upload ${{ steps.file.outputs.basename }}.json to DO Spaces # DigitalOcean Spaces = S3
uses: BetaHuhn/do-spaces-action@v2
with:
access_key: ${{ secrets.DO_SPACES_ACCESS_KEY }}
secret_key: ${{ secrets.DO_SPACES_SECRET_KEY }}
space_name: ${{ secrets.DO_SPACES_SPACE_NAME }}
space_region: ${{ secrets.DO_SPACES_SPACE_REGION }}
source: ${{ steps.file.outputs.basename }}.json
out_dir: critcmp_results
# Compute the diff of the benchmarks and send a message on the GitHub PR
- name: Compute and send a message in the PR
env:
GITHUB_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
run: |
set -x
export base_ref=$(git merge-base origin/main ${{ steps.comment-branch.outputs.head_ref }} | head -c8)
export base_filename=$(echo ${{ steps.command.outputs.command-arguments }}_main_${base_ref}.json)
export bench_name=$(echo ${{ steps.command.outputs.command-arguments }})
echo "Here are your $bench_name benchmarks diff 👊" >> body.txt
echo '```' >> body.txt
./benchmarks/scripts/compare.sh $base_filename ${{ steps.file.outputs.basename }}.json >> body.txt
echo '```' >> body.txt
gh pr comment ${{ steps.current_branch.outputs.name }} --body-file body.txt

View File

@ -50,7 +50,7 @@ jobs:
needs: check-version
steps:
- name: Create PR to Homebrew
uses: mislav/bump-homebrew-formula-action@v2
uses: mislav/bump-homebrew-formula-action@v3
with:
formula-name: meilisearch
formula-path: Formula/m/meilisearch.rb

View File

@ -57,20 +57,20 @@ jobs:
echo "date=$commit_date" >> $GITHUB_OUTPUT
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
uses: docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@v2
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@v4
uses: docker/metadata-action@v5
with:
images: getmeili/meilisearch
# Prevent `latest` to be updated for each new tag pushed.
@ -83,7 +83,7 @@ jobs:
type=raw,value=latest,enable=${{ steps.check-tag-format.outputs.stable == 'true' && steps.check-tag-format.outputs.latest == 'true' }}
- name: Build and push
uses: docker/build-push-action@v4
uses: docker/build-push-action@v5
with:
push: true
platforms: linux/amd64,linux/arm64
@ -97,7 +97,7 @@ jobs:
- name: Send CI information to Cloud team
# Do not send if nightly build (i.e. 'schedule' or 'workflow_dispatch' event)
if: github.event_name == 'push'
uses: peter-evans/repository-dispatch@v2
uses: peter-evans/repository-dispatch@v3
with:
token: ${{ secrets.MEILI_BOT_GH_PAT }}
repository: meilisearch/meilisearch-cloud

View File

@ -22,7 +22,7 @@ jobs:
outputs:
docker-image: ${{ steps.define-image.outputs.docker-image }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Define the Docker image we need to use
id: define-image
run: |
@ -46,11 +46,11 @@ jobs:
MEILISEARCH_VERSION: ${{ needs.define-docker-image.outputs.docker-image }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-dotnet
- name: Setup .NET Core
uses: actions/setup-dotnet@v3
uses: actions/setup-dotnet@v4
with:
dotnet-version: "6.0.x"
- name: Install dependencies
@ -75,12 +75,12 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-dart
- uses: dart-lang/setup-dart@v1
with:
sdk: 3.1.1
sdk: 'latest'
- name: Install dependencies
run: dart pub get
- name: Run integration tests
@ -100,10 +100,10 @@ jobs:
- '7700:7700'
steps:
- name: Set up Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: stable
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-go
- name: Get dependencies
@ -129,11 +129,11 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-java
- name: Set up Java
uses: actions/setup-java@v3
uses: actions/setup-java@v4
with:
java-version: 8
distribution: 'zulu'
@ -156,11 +156,11 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-js
- name: Setup node
uses: actions/setup-node@v3
uses: actions/setup-node@v4
with:
cache: 'yarn'
- name: Install dependencies
@ -191,7 +191,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-php
- name: Install PHP
@ -220,11 +220,11 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-python
- name: Set up Python
uses: actions/setup-python@v4
uses: actions/setup-python@v5
- name: Install pipenv
uses: dschep/install-pipenv-action@v1
- name: Install dependencies
@ -245,7 +245,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-ruby
- name: Set up Ruby 3
@ -270,7 +270,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-rust
- name: Build
@ -291,7 +291,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-swift
- name: Run tests
@ -314,11 +314,11 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-js-plugins
- name: Setup node
uses: actions/setup-node@v3
uses: actions/setup-node@v4
with:
cache: yarn
- name: Install dependencies
@ -345,7 +345,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-rails
- name: Set up Ruby 3
@ -369,7 +369,7 @@ jobs:
ports:
- '7700:7700'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
repository: meilisearch/meilisearch-symfony
- name: Install PHP

View File

@ -31,19 +31,12 @@ jobs:
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- name: Setup test with Rust stable
if: github.event_name != 'schedule'
uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- name: Setup test with Rust nightly
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
uses: actions-rs/toolchain@v1
with:
toolchain: nightly
override: true
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.6.2
uses: Swatinem/rust-cache@v2.7.1
- name: Run cargo check without any default features
uses: actions-rs/cargo@v1
with:
@ -65,7 +58,11 @@ jobs:
steps:
- uses: actions/checkout@v3
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.6.2
uses: Swatinem/rust-cache@v2.7.1
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- name: Run cargo check without any default features
uses: actions-rs/cargo@v1
with:
@ -78,7 +75,7 @@ jobs:
args: --locked --release --all
test-all-features:
name: Tests all features
name: Tests almost all features
runs-on: ubuntu-latest
container:
# Use ubuntu-18.04 to compile with glibc 2.27, which are the production expectations
@ -94,16 +91,12 @@ jobs:
with:
toolchain: stable
override: true
- name: Run cargo build with all features
uses: actions-rs/cargo@v1
with:
command: build
args: --workspace --locked --release --all-features
- name: Run cargo test with all features
uses: actions-rs/cargo@v1
with:
command: test
args: --workspace --locked --release --all-features
- name: Run cargo build with almost all features
run: |
cargo build --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda)"
- name: Run cargo test with almost all features
run: |
cargo test --workspace --locked --release --features "$(cargo xtask list-features --exclude-feature cuda)"
test-disabled-tokenization:
name: Test disabled tokenization
@ -149,7 +142,7 @@ jobs:
toolchain: stable
override: true
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.6.2
uses: Swatinem/rust-cache@v2.7.1
- name: Run tests in debug
uses: actions-rs/cargo@v1
with:
@ -164,11 +157,11 @@ jobs:
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: 1.71.1
toolchain: 1.75.0
override: true
components: clippy
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.6.2
uses: Swatinem/rust-cache@v2.7.1
- name: Run cargo clippy
uses: actions-rs/cargo@v1
with:
@ -187,7 +180,7 @@ jobs:
override: true
components: rustfmt
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.6.2
uses: Swatinem/rust-cache@v2.7.1
- name: Run cargo fmt
# Since we never ran the `build.rs` script in the benchmark directory we are missing one auto-generated import file.
# Since we want to trigger (and fail) this action as fast as possible, instead of building the benchmark crate

2
.gitignore vendored
View File

@ -9,6 +9,8 @@
/data.ms
/snapshots
/dumps
/bench
/_xtask_benchmark.ms
# Snapshots
## ... large

View File

@ -75,6 +75,12 @@ If you get a "Too many open files" error you might want to increase the open fil
ulimit -Sn 3000
```
#### Build tools
Meilisearch follows the [cargo xtask](https://github.com/matklad/cargo-xtask) workflow to provide some build tools.
Run `cargo xtask --help` from the root of the repository to find out what is available.
## Git Guidelines
### Git Branches

2550
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@ -2,6 +2,7 @@
resolver = "2"
members = [
"meilisearch",
"meilitool",
"meilisearch-types",
"meilisearch-auth",
"meili-snap",
@ -15,11 +16,16 @@ members = [
"json-depth-checker",
"benchmarks",
"fuzzers",
"tracing-trace",
"xtask", "build-info",
]
[workspace.package]
version = "1.4.0"
authors = ["Quentin de Quelen <quentin@dequelen.me>", "Clément Renault <clement@meilisearch.com>"]
version = "1.7.0"
authors = [
"Quentin de Quelen <quentin@dequelen.me>",
"Clément Renault <clement@meilisearch.com>",
]
description = "Meilisearch HTTP server"
homepage = "https://meilisearch.com"
readme = "README.md"

View File

@ -1,14 +1,14 @@
# Compile
FROM rust:alpine3.16 AS compiler
FROM rust:1.75.0-alpine3.18 AS compiler
RUN apk add -q --update-cache --no-cache build-base openssl-dev
WORKDIR /meilisearch
WORKDIR /
ARG COMMIT_SHA
ARG COMMIT_DATE
ARG GIT_TAG
ENV VERGEN_GIT_SHA=${COMMIT_SHA} VERGEN_GIT_COMMIT_TIMESTAMP=${COMMIT_DATE} VERGEN_GIT_SEMVER_LIGHTWEIGHT=${GIT_TAG}
ENV VERGEN_GIT_SHA=${COMMIT_SHA} VERGEN_GIT_COMMIT_TIMESTAMP=${COMMIT_DATE} VERGEN_GIT_DESCRIBE=${GIT_TAG}
ENV RUSTFLAGS="-C target-feature=-crt-static"
COPY . .
@ -17,7 +17,7 @@ RUN set -eux; \
if [ "$apkArch" = "aarch64" ]; then \
export JEMALLOC_SYS_WITH_LG_PAGE=16; \
fi && \
cargo build --release
cargo build --release -p meilisearch -p meilitool
# Run
FROM alpine:3.16
@ -28,9 +28,10 @@ ENV MEILI_SERVER_PROVIDER docker
RUN apk update --quiet \
&& apk add -q --no-cache libgcc tini curl
# add meilisearch to the `/bin` so you can run it from anywhere and it's easy
# to find.
COPY --from=compiler /meilisearch/target/release/meilisearch /bin/meilisearch
# add meilisearch and meilitool to the `/bin` so you can run it from anywhere
# and it's easy to find.
COPY --from=compiler /target/release/meilisearch /bin/meilisearch
COPY --from=compiler /target/release/meilitool /bin/meilitool
# To stay compatible with the older version of the container (pre v0.27.0) we're
# going to symlink the meilisearch binary in the path to `/meilisearch`
RUN ln -s /bin/meilisearch /meilisearch

View File

@ -1,6 +1,6 @@
MIT License
Copyright (c) 2019-2022 Meili SAS
Copyright (c) 2019-2024 Meili SAS
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

View File

@ -1,14 +1,14 @@
# Profiling Meilisearch
Search engine technologies are complex pieces of software that require thorough profiling tools. We chose to use [Puffin](https://github.com/EmbarkStudios/puffin), which the Rust gaming industry uses extensively. You can export and import the profiling reports using the top bar's _File_ menu options.
Search engine technologies are complex pieces of software that require thorough profiling tools. We chose to use [Puffin](https://github.com/EmbarkStudios/puffin), which the Rust gaming industry uses extensively. You can export and import the profiling reports using the top bar's _File_ menu options [in Puffin Viewer](https://github.com/embarkstudios/puffin#ui).
![An example profiling with Puffin viewer](assets/profiling-example.png)
## Profiling the Indexing Process
When you enable the `profile-with-puffin` feature of Meilisearch, a Puffin HTTP server will run on Meilisearch and listen on the default _0.0.0.0:8585_ address. This server will record a "frame" whenever it executes the `IndexScheduler::tick` method.
When you enable [the `exportPuffinReports` experimental feature](https://www.meilisearch.com/docs/learn/experimental/overview) of Meilisearch, Puffin reports with the `.puffin` extension will be automatically exported to disk. When this option is enabled, the engine will automatically create a "frame" whenever it executes the `IndexScheduler::tick` method.
Once your Meilisearch is running and awaits new indexation operations, you must [install and run the `puffin_viewer` tool](https://github.com/EmbarkStudios/puffin/tree/main/puffin_viewer) to see the profiling results. I advise you to run the viewer with the `RUST_LOG=puffin_http::client=debug` environment variable to see the client trying to connect to your server.
[Puffin Viewer](https://github.com/EmbarkStudios/puffin/tree/main/puffin_viewer) is used to analyze the reports. Those reports show areas where Meilisearch spent time during indexing.
Another piece of advice on the Puffin viewer UI interface is to consider the _Merge children with same ID_ option. It can hide the exact actual timings at which events were sent. Please turn it off when you see strange gaps on the Flamegraph. It can help.

View File

@ -41,10 +41,10 @@ Meilisearch helps you shape a delightful search experience in a snap, offering f
## ✨ Features
- **Search-as-you-type:** find search results in less than 50 milliseconds
- **[Typo tolerance](https://www.meilisearch.com/docs/learn/getting_started/customizing_relevancy?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features#typo-tolerance):** get relevant matches even when queries contain typos and misspellings
- **[Filtering](https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features) and [faceted search](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** enhance your user's search experience with custom filters and build a faceted search interface in a few lines of code
- **[Typo tolerance](https://www.meilisearch.com/docs/learn/configuration/typo_tolerance?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** get relevant matches even when queries contain typos and misspellings
- **[Filtering](https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features) and [faceted search](https://www.meilisearch.com/docs/learn/fine_tuning_results/faceted_search?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** enhance your users' search experience with custom filters and build a faceted search interface in a few lines of code
- **[Sorting](https://www.meilisearch.com/docs/learn/fine_tuning_results/sorting?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** sort results based on price, date, or pretty much anything else your users need
- **[Synonym support](https://www.meilisearch.com/docs/learn/getting_started/customizing_relevancy?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features#synonyms):** configure synonyms to include more relevant content in your search results
- **[Synonym support](https://www.meilisearch.com/docs/learn/configuration/synonyms?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** configure synonyms to include more relevant content in your search results
- **[Geosearch](https://www.meilisearch.com/docs/learn/fine_tuning_results/geosearch?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** filter and sort documents based on geographic data
- **[Extensive language support](https://www.meilisearch.com/docs/learn/what_is_meilisearch/language?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** search datasets in any language, with optimized support for Chinese, Japanese, Hebrew, and languages using the Latin alphabet
- **[Security management](https://www.meilisearch.com/docs/learn/security/master_api_keys?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=features):** control which users can access what data with API keys that allow fine-grained permissions handling
@ -61,8 +61,6 @@ You can consult Meilisearch's documentation at [https://www.meilisearch.com/docs
For basic instructions on how to set up Meilisearch, add documents to an index, and search for documents, take a look at our [Quick Start](https://www.meilisearch.com/docs/learn/getting_started/quick_start?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=get-started) guide.
You may also want to check out [Meilisearch 101](https://www.meilisearch.com/docs/learn/getting_started/filtering_and_sorting?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=get-started) for an introduction to some of Meilisearch's most popular features.
## ⚡ Supercharge your Meilisearch experience
Say goodbye to server deployment and manual updates with [Meilisearch Cloud](https://www.meilisearch.com/cloud?utm_campaign=oss&utm_source=github&utm_medium=meilisearch). No credit card required.
@ -101,7 +99,7 @@ Meilisearch is a search engine created by [Meili](https://www.welcometothejungle
- For feature requests, please visit our [product repository](https://github.com/meilisearch/product/discussions)
- Found a bug? Open an [issue](https://github.com/meilisearch/meilisearch/issues)!
- Want to be part of our Discord community? [Join us!](https://discord.gg/meilisearch)
- Want to be part of our Discord community? [Join us!](https://discord.meilisearch.com/?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=contact)
Thank you for your support!

View File

@ -106,7 +106,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "meilisearch_index_count{job=\"meilisearch\", instance=\"$instance\"}",
"expr": "meilisearch_index_count{job=\"$job\", instance=\"$instance\"}",
"interval": "",
"legendFormat": "",
"range": true,
@ -165,7 +165,7 @@
"type": "prometheus"
},
"editorMode": "builder",
"expr": "meilisearch_index_docs_count{job=\"meilisearch\", index=\"$Index\", instance=\"$instance\"}",
"expr": "meilisearch_index_docs_count{job=\"$job\", index=\"$Index\", instance=\"$instance\"}",
"hide": false,
"range": true,
"refId": "A"
@ -228,7 +228,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "round(increase(meilisearch_http_requests_total{method=\"POST\", path=\"/indexes/$Index/search\", job=\"meilisearch\"}[1h]))",
"expr": "round(increase(meilisearch_http_requests_total{method=\"POST\", path=\"/indexes/$Index/search\", job=\"$job\"}[1h]))",
"interval": "",
"legendFormat": "",
"range": true,
@ -288,7 +288,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "round(increase(meilisearch_http_requests_total{method=\"POST\", path=\"/indexes/$Index/search\", job=\"meilisearch\"}[24h]))",
"expr": "round(increase(meilisearch_http_requests_total{method=\"POST\", path=\"/indexes/$Index/search\", job=\"$job\"}[24h]))",
"interval": "",
"legendFormat": "",
"range": true,
@ -348,7 +348,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "round(increase(meilisearch_http_requests_total{method=\"POST\", path=\"/indexes/$Index/search\", job=\"meilisearch\"}[30d]))",
"expr": "round(increase(meilisearch_http_requests_total{method=\"POST\", path=\"/indexes/$Index/search\", job=\"$job\"}[30d]))",
"interval": "",
"legendFormat": "",
"range": true,
@ -447,7 +447,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "meilisearch_db_size_bytes{job=\"meilisearch\", instance=\"$instance\"}",
"expr": "meilisearch_db_size_bytes{job=\"$job\", instance=\"$instance\"}",
"interval": "",
"legendFormat": "Database size on disk",
"range": true,
@ -458,7 +458,7 @@
"type": "prometheus"
},
"editorMode": "builder",
"expr": "meilisearch_used_db_size_bytes{job=\"meilisearch\", instance=\"$instance\"}",
"expr": "meilisearch_used_db_size_bytes{job=\"$job\", instance=\"$instance\"}",
"hide": false,
"legendFormat": "Used bytes",
"range": true,
@ -553,7 +553,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "rate(meilisearch_http_response_time_seconds_sum{instance=\"$instance\", job=\"meilisearch\"}[5m]) / rate(meilisearch_http_response_time_seconds_count[5m])",
"expr": "rate(meilisearch_http_response_time_seconds_sum{instance=\"$instance\", job=\"$job\"}[5m]) / rate(meilisearch_http_response_time_seconds_count[5m])",
"interval": "",
"legendFormat": "{{method}} {{path}}",
"range": true,
@ -646,7 +646,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "rate(meilisearch_http_requests_total{instance=\"$instance\", job=\"meilisearch\"}[5m])",
"expr": "rate(meilisearch_http_requests_total{instance=\"$instance\", job=\"$job\"}[5m])",
"interval": "",
"legendFormat": "{{method}} {{path}}",
"range": true,
@ -744,7 +744,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "sum by(le) (increase(meilisearch_http_response_time_seconds_bucket{path=\"/indexes/$Index/search\", instance=\"$instance\", job=\"meilisearch\"}[30s]))",
"expr": "sum by(le) (increase(meilisearch_http_response_time_seconds_bucket{path=\"/indexes/$Index/search\", instance=\"$instance\", job=\"$job\"}[30s]))",
"format": "heatmap",
"interval": "",
"legendFormat": "{{le}}",
@ -854,7 +854,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "meilisearch_nb_tasks{instance=\"$instance\", job=\"meilisearch\", kind=\"statuses\"}",
"expr": "meilisearch_nb_tasks{instance=\"$instance\", job=\"$job\", kind=\"statuses\"}",
"interval": "",
"legendFormat": "{{value}} ",
"range": true,
@ -947,7 +947,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "meilisearch_nb_tasks{instance=\"$instance\", job=\"meilisearch\", kind=\"types\"}",
"expr": "meilisearch_nb_tasks{instance=\"$instance\", job=\"$job\", kind=\"types\"}",
"interval": "",
"legendFormat": "{{value}} ",
"range": true,
@ -1040,7 +1040,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "meilisearch_nb_tasks{instance=\"$instance\", job=\"meilisearch\", kind=\"indexes\"}",
"expr": "meilisearch_nb_tasks{instance=\"$instance\", job=\"$job\", kind=\"indexes\"}",
"interval": "",
"legendFormat": "{{value}} ",
"range": true,
@ -1161,7 +1161,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "rate(process_cpu_seconds_total{job=\"meilisearch\", instance=\"$instance\"}[1m])",
"expr": "rate(process_cpu_seconds_total{job=\"$job\", instance=\"$instance\"}[1m])",
"interval": "",
"legendFormat": "process",
"range": true,
@ -1264,7 +1264,7 @@
},
"editorMode": "builder",
"exemplar": true,
"expr": "process_resident_memory_bytes{job=\"meilisearch\", instance=\"$instance\"} / 1024 / 1024",
"expr": "process_resident_memory_bytes{job=\"$job\", instance=\"$instance\"} / 1024 / 1024",
"interval": "",
"legendFormat": "process",
"range": true,
@ -1342,6 +1342,33 @@
"skipUrlSync": false,
"sort": 0,
"type": "query"
},
{
"current": {
"selected": true,
"text": "meilisearch",
"value": "meilisearch"
},
"datasource": {
"type": "prometheus"
},
"definition": "label_values(job)",
"description": "Prometheus job_name from scrape config (default is meilisearch)",
"hide": 0,
"includeAll": false,
"label": "Job",
"multi": false,
"name": "job",
"options": [],
"query": {
"query": "label_values(job)",
"refId": "StandardVariableQuery"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
}
]
},

View File

@ -11,24 +11,24 @@ edition.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.70"
csv = "1.2.1"
anyhow = "1.0.79"
csv = "1.3.0"
milli = { path = "../milli" }
mimalloc = { version = "0.1.37", default-features = false }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
mimalloc = { version = "0.1.39", default-features = false }
serde_json = { version = "1.0.111", features = ["preserve_order"] }
[dev-dependencies]
criterion = { version = "0.5.1", features = ["html_reports"] }
rand = "0.8.5"
rand_chacha = "0.3.1"
roaring = "0.10.1"
roaring = "0.10.2"
[build-dependencies]
anyhow = "1.0.70"
bytes = "1.4.0"
anyhow = "1.0.79"
bytes = "1.5.0"
convert_case = "0.6.0"
flate2 = "1.0.25"
reqwest = { version = "0.11.16", features = ["blocking", "rustls-tls"], default-features = false }
flate2 = "1.0.28"
reqwest = { version = "0.11.23", features = ["blocking", "rustls-tls"], default-features = false }
[features]
default = ["milli/all-tokenizations"]

View File

@ -6,9 +6,7 @@ use std::path::Path;
use criterion::{criterion_group, criterion_main, Criterion};
use milli::heed::{EnvOpenOptions, RwTxn};
use milli::update::{
DeleteDocuments, IndexDocuments, IndexDocumentsConfig, IndexerConfig, Settings,
};
use milli::update::{IndexDocuments, IndexDocumentsConfig, IndexerConfig, Settings};
use milli::Index;
use rand::seq::SliceRandom;
use rand_chacha::rand_core::SeedableRng;
@ -38,7 +36,7 @@ fn setup_index() -> Index {
}
fn setup_settings<'t>(
wtxn: &mut RwTxn<'t, '_>,
wtxn: &mut RwTxn<'t>,
index: &'t Index,
primary_key: &str,
searchable_fields: &[&str],
@ -266,17 +264,7 @@ fn deleting_songs_in_batches_default(c: &mut Criterion) {
(index, document_ids_to_delete)
},
move |(index, document_ids_to_delete)| {
let mut wtxn = index.write_txn().unwrap();
for ids in document_ids_to_delete {
let mut builder = DeleteDocuments::new(&mut wtxn, &index).unwrap();
builder.delete_documents(&ids);
builder.execute().unwrap();
}
wtxn.commit().unwrap();
index.prepare_for_closing().wait();
delete_documents_from_ids(index, document_ids_to_delete)
},
)
});
@ -613,17 +601,7 @@ fn deleting_wiki_in_batches_default(c: &mut Criterion) {
(index, document_ids_to_delete)
},
move |(index, document_ids_to_delete)| {
let mut wtxn = index.write_txn().unwrap();
for ids in document_ids_to_delete {
let mut builder = DeleteDocuments::new(&mut wtxn, &index).unwrap();
builder.delete_documents(&ids);
builder.execute().unwrap();
}
wtxn.commit().unwrap();
index.prepare_for_closing().wait();
delete_documents_from_ids(index, document_ids_to_delete)
},
)
});
@ -875,22 +853,31 @@ fn deleting_movies_in_batches_default(c: &mut Criterion) {
(index, document_ids_to_delete)
},
move |(index, document_ids_to_delete)| {
let mut wtxn = index.write_txn().unwrap();
for ids in document_ids_to_delete {
let mut builder = DeleteDocuments::new(&mut wtxn, &index).unwrap();
builder.delete_documents(&ids);
builder.execute().unwrap();
}
wtxn.commit().unwrap();
index.prepare_for_closing().wait();
delete_documents_from_ids(index, document_ids_to_delete)
},
)
});
}
fn delete_documents_from_ids(index: Index, document_ids_to_delete: Vec<RoaringBitmap>) {
let mut wtxn = index.write_txn().unwrap();
let indexer_config = IndexerConfig::default();
for ids in document_ids_to_delete {
let config = IndexDocumentsConfig::default();
let mut builder =
IndexDocuments::new(&mut wtxn, &index, &indexer_config, config, |_| (), || false)
.unwrap();
(builder, _) = builder.remove_documents_from_db_no_batch(&ids).unwrap();
builder.execute().unwrap();
}
wtxn.commit().unwrap();
index.prepare_for_closing().wait();
}
fn indexing_movies_in_three_batches(c: &mut Criterion) {
let mut group = c.benchmark_group("indexing");
group.sample_size(BENCHMARK_ITERATION);
@ -1112,17 +1099,7 @@ fn deleting_nested_movies_in_batches_default(c: &mut Criterion) {
(index, document_ids_to_delete)
},
move |(index, document_ids_to_delete)| {
let mut wtxn = index.write_txn().unwrap();
for ids in document_ids_to_delete {
let mut builder = DeleteDocuments::new(&mut wtxn, &index).unwrap();
builder.delete_documents(&ids);
builder.execute().unwrap();
}
wtxn.commit().unwrap();
index.prepare_for_closing().wait();
delete_documents_from_ids(index, document_ids_to_delete)
},
)
});
@ -1338,17 +1315,7 @@ fn deleting_geo_in_batches_default(c: &mut Criterion) {
(index, document_ids_to_delete)
},
move |(index, document_ids_to_delete)| {
let mut wtxn = index.write_txn().unwrap();
for ids in document_ids_to_delete {
let mut builder = DeleteDocuments::new(&mut wtxn, &index).unwrap();
builder.delete_documents(&ids);
builder.execute().unwrap();
}
wtxn.commit().unwrap();
index.prepare_for_closing().wait();
delete_documents_from_ids(index, document_ids_to_delete)
},
)
});

18
build-info/Cargo.toml Normal file
View File

@ -0,0 +1,18 @@
[package]
name = "build-info"
version.workspace = true
authors.workspace = true
description.workspace = true
homepage.workspace = true
readme.workspace = true
edition.workspace = true
license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
time = { version = "0.3.34", features = ["parsing"] }
[build-dependencies]
anyhow = "1.0.80"
vergen-git2 = "1.0.0-beta.2"

22
build-info/build.rs Normal file
View File

@ -0,0 +1,22 @@
fn main() {
if let Err(err) = emit_git_variables() {
println!("cargo:warning=vergen: {}", err);
}
}
fn emit_git_variables() -> anyhow::Result<()> {
// Note: any code that needs VERGEN_ environment variables should take care to define them manually in the Dockerfile and pass them
// in the corresponding GitHub workflow (publish_docker.yml).
// This is due to the Dockerfile building the binary outside of the git directory.
let mut builder = vergen_git2::Git2Builder::default();
builder.branch(true);
builder.commit_timestamp(true);
builder.commit_message(true);
builder.describe(true, true, None);
builder.sha(false);
let git2 = builder.build()?;
vergen_git2::Emitter::default().fail_on_error().add_instructions(&git2)?.emit()
}

203
build-info/src/lib.rs Normal file
View File

@ -0,0 +1,203 @@
use time::format_description::well_known::Iso8601;
#[derive(Debug, Clone)]
pub struct BuildInfo {
pub branch: Option<&'static str>,
pub describe: Option<DescribeResult>,
pub commit_sha1: Option<&'static str>,
pub commit_msg: Option<&'static str>,
pub commit_timestamp: Option<time::OffsetDateTime>,
}
impl BuildInfo {
pub fn from_build() -> Self {
let branch: Option<&'static str> = option_env!("VERGEN_GIT_BRANCH");
let describe = DescribeResult::from_build();
let commit_sha1 = option_env!("VERGEN_GIT_SHA");
let commit_msg = option_env!("VERGEN_GIT_COMMIT_MESSAGE");
let commit_timestamp = option_env!("VERGEN_GIT_COMMIT_TIMESTAMP");
let commit_timestamp = commit_timestamp.and_then(|commit_timestamp| {
time::OffsetDateTime::parse(commit_timestamp, &Iso8601::DEFAULT).ok()
});
Self { branch, describe, commit_sha1, commit_msg, commit_timestamp }
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum DescribeResult {
Prototype { name: &'static str },
Release { version: &'static str, major: u64, minor: u64, patch: u64 },
Prerelease { version: &'static str, major: u64, minor: u64, patch: u64, rc: u64 },
NotATag { describe: &'static str },
}
impl DescribeResult {
pub fn new(describe: &'static str) -> Self {
if let Some(name) = prototype_name(describe) {
Self::Prototype { name }
} else if let Some(release) = release_version(describe) {
release
} else if let Some(prerelease) = prerelease_version(describe) {
prerelease
} else {
Self::NotATag { describe }
}
}
pub fn from_build() -> Option<Self> {
let describe: &'static str = option_env!("VERGEN_GIT_DESCRIBE")?;
Some(Self::new(describe))
}
pub fn as_tag(&self) -> Option<&'static str> {
match self {
DescribeResult::Prototype { name } => Some(name),
DescribeResult::Release { version, .. } => Some(version),
DescribeResult::Prerelease { version, .. } => Some(version),
DescribeResult::NotATag { describe: _ } => None,
}
}
pub fn as_prototype(&self) -> Option<&'static str> {
match self {
DescribeResult::Prototype { name } => Some(name),
DescribeResult::Release { .. }
| DescribeResult::Prerelease { .. }
| DescribeResult::NotATag { .. } => None,
}
}
}
/// Parses the input as a prototype name.
///
/// Returns `Some(prototype_name)` if the following conditions are met on this value:
///
/// 1. starts with `prototype-`,
/// 2. ends with `-<some_number>`,
/// 3. does not end with `<some_number>-<some_number>`.
///
/// Otherwise, returns `None`.
fn prototype_name(describe: &'static str) -> Option<&'static str> {
if !describe.starts_with("prototype-") {
return None;
}
let mut rsplit_prototype = describe.rsplit('-');
// last component MUST be a number
rsplit_prototype.next()?.parse::<u64>().ok()?;
// before than last component SHALL NOT be a number
rsplit_prototype.next()?.parse::<u64>().err()?;
Some(describe)
}
fn release_version(describe: &'static str) -> Option<DescribeResult> {
if !describe.starts_with('v') {
return None;
}
// full release version don't contain a `-`
if describe.contains('-') {
return None;
}
// full release version parse as vX.Y.Z, with X, Y, Z numbers.
let mut dots = describe[1..].split('.');
let major: u64 = dots.next()?.parse().ok()?;
let minor: u64 = dots.next()?.parse().ok()?;
let patch: u64 = dots.next()?.parse().ok()?;
if dots.next().is_some() {
return None;
}
Some(DescribeResult::Release { version: describe, major, minor, patch })
}
fn prerelease_version(describe: &'static str) -> Option<DescribeResult> {
// prerelease version is in the shape vM.N.P-rc.C
let mut hyphen = describe.rsplit('-');
let prerelease = hyphen.next()?;
if !prerelease.starts_with("rc.") {
return None;
}
let rc: u64 = prerelease[3..].parse().ok()?;
let release = hyphen.next()?;
let DescribeResult::Release { version: _, major, minor, patch } = release_version(release)?
else {
return None;
};
Some(DescribeResult::Prerelease { version: describe, major, minor, patch, rc })
}
#[cfg(test)]
mod test {
use super::DescribeResult;
fn assert_not_a_tag(describe: &'static str) {
assert_eq!(DescribeResult::NotATag { describe }, DescribeResult::new(describe))
}
fn assert_proto(describe: &'static str) {
assert_eq!(DescribeResult::Prototype { name: describe }, DescribeResult::new(describe))
}
fn assert_release(describe: &'static str, major: u64, minor: u64, patch: u64) {
assert_eq!(
DescribeResult::Release { version: describe, major, minor, patch },
DescribeResult::new(describe)
)
}
fn assert_prerelease(describe: &'static str, major: u64, minor: u64, patch: u64, rc: u64) {
assert_eq!(
DescribeResult::Prerelease { version: describe, major, minor, patch, rc },
DescribeResult::new(describe)
)
}
#[test]
fn not_a_tag() {
assert_not_a_tag("whatever-fuzzy");
assert_not_a_tag("whatever-fuzzy-5-ggg-dirty");
assert_not_a_tag("whatever-fuzzy-120-ggg-dirty");
// technically a tag, but not a proto nor a version, so not parsed as a tag
assert_not_a_tag("whatever");
// dirty version
assert_not_a_tag("v1.7.0-1-ggga-dirty");
assert_not_a_tag("v1.7.0-rc.1-1-ggga-dirty");
// after version
assert_not_a_tag("v1.7.0-1-ggga");
assert_not_a_tag("v1.7.0-rc.1-1-ggga");
// after proto
assert_not_a_tag("protoype-tag-0-1-ggga");
assert_not_a_tag("protoype-tag-0-1-ggga-dirty");
}
#[test]
fn prototype() {
assert_proto("prototype-tag-0");
assert_proto("prototype-tag-10");
assert_proto("prototype-long-name-tag-10");
}
#[test]
fn release() {
assert_release("v1.7.2", 1, 7, 2);
}
#[test]
fn prerelease() {
assert_prerelease("v1.7.2-rc.3", 1, 7, 2, 3);
}
}

View File

@ -129,3 +129,6 @@ experimental_enable_metrics = false
# Experimental RAM reduction during indexing, do not use in production, see: <https://github.com/meilisearch/product/discussions/652>
experimental_reduce_indexing_memory_usage = false
# Experimentally reduces the maximum number of tasks that will be processed at once, see: <https://github.com/orgs/meilisearch/discussions/713>
# experimental_max_number_of_batched_tasks = 100

View File

@ -11,22 +11,22 @@ readme.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.70"
flate2 = "1.0.25"
http = "0.2.9"
log = "0.4.17"
anyhow = "1.0.79"
flate2 = "1.0.28"
http = "0.2.11"
meilisearch-auth = { path = "../meilisearch-auth" }
meilisearch-types = { path = "../meilisearch-types" }
once_cell = "1.17.1"
regex = "1.7.3"
roaring = { version = "0.10.1", features = ["serde"] }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
tar = "0.4.38"
tempfile = "3.5.0"
thiserror = "1.0.40"
time = { version = "0.3.20", features = ["serde-well-known", "formatting", "parsing", "macros"] }
uuid = { version = "1.3.1", features = ["serde", "v4"] }
once_cell = "1.19.0"
regex = "1.10.2"
roaring = { version = "0.10.2", features = ["serde"] }
serde = { version = "1.0.195", features = ["derive"] }
serde_json = { version = "1.0.111", features = ["preserve_order"] }
tar = "0.4.40"
tempfile = "3.9.0"
thiserror = "1.0.56"
time = { version = "0.3.31", features = ["serde-well-known", "formatting", "parsing", "macros"] }
tracing = "0.1.40"
uuid = { version = "1.6.1", features = ["serde", "v4"] }
[dev-dependencies]
big_s = "1.0.2"

View File

@ -267,6 +267,7 @@ pub(crate) mod test {
dictionary: Setting::NotSet,
synonyms: Setting::NotSet,
distinct_attribute: Setting::NotSet,
proximity_precision: Setting::NotSet,
typo_tolerance: Setting::NotSet,
faceting: Setting::Set(FacetingSettings {
max_values_per_facet: Setting::Set(111),
@ -275,6 +276,7 @@ pub(crate) mod test {
),
}),
pagination: Setting::NotSet,
embedders: Setting::NotSet,
_kind: std::marker::PhantomData,
};
settings.check()

View File

@ -120,7 +120,7 @@ impl From<v1::settings::Settings> for v2::Settings<v2::Unchecked> {
criterion.as_ref().map(ToString::to_string)
}
Err(()) => {
log::warn!(
tracing::warn!(
"Could not import the following ranking rule: `{}`.",
ranking_rule
);
@ -152,11 +152,11 @@ impl From<v1::update::UpdateStatus> for Option<v2::updates::UpdateStatus> {
use v2::updates::UpdateStatus as UpdateStatusV2;
Some(match source {
UpdateStatusV1::Enqueued { content } => {
log::warn!(
tracing::warn!(
"Cannot import task {} (importing enqueued tasks from v1 dumps is unsupported)",
content.update_id
);
log::warn!("Task will be skipped in the queue of imported tasks.");
tracing::warn!("Task will be skipped in the queue of imported tasks.");
return None;
}
@ -229,7 +229,7 @@ impl From<v1::update::UpdateType> for Option<v2::updates::UpdateMeta> {
Some(match source {
v1::update::UpdateType::ClearAll => v2::updates::UpdateMeta::ClearDocuments,
v1::update::UpdateType::Customs => {
log::warn!("Ignoring task with type 'Customs' that is no longer supported");
tracing::warn!("Ignoring task with type 'Customs' that is no longer supported");
return None;
}
v1::update::UpdateType::DocumentsAddition { .. } => {
@ -296,7 +296,7 @@ impl From<v1::settings::RankingRule> for Option<v2::settings::Criterion> {
v1::settings::RankingRule::Proximity => Some(v2::settings::Criterion::Proximity),
v1::settings::RankingRule::Attribute => Some(v2::settings::Criterion::Attribute),
v1::settings::RankingRule::WordsPosition => {
log::warn!("Removing the 'WordsPosition' ranking rule that is no longer supported, please check the resulting ranking rules of your indexes");
tracing::warn!("Removing the 'WordsPosition' ranking rule that is no longer supported, please check the resulting ranking rules of your indexes");
None
}
v1::settings::RankingRule::Exactness => Some(v2::settings::Criterion::Exactness),

View File

@ -1,4 +1,3 @@
use std::convert::TryInto;
use std::str::FromStr;
use time::OffsetDateTime;
@ -146,8 +145,8 @@ impl From<v2::updates::UpdateStatus> for v3::updates::UpdateStatus {
started_processing_at: processing.started_processing_at,
}),
Err(e) => {
log::warn!("Error with task {}: {}", processing.from.update_id, e);
log::warn!("Task will be marked as `Failed`.");
tracing::warn!("Error with task {}: {}", processing.from.update_id, e);
tracing::warn!("Task will be marked as `Failed`.");
v3::updates::UpdateStatus::Failed(v3::updates::Failed {
from: v3::updates::Processing {
from: v3::updates::Enqueued {
@ -172,8 +171,8 @@ impl From<v2::updates::UpdateStatus> for v3::updates::UpdateStatus {
enqueued_at: enqueued.enqueued_at,
}),
Err(e) => {
log::warn!("Error with task {}: {}", enqueued.update_id, e);
log::warn!("Task will be marked as `Failed`.");
tracing::warn!("Error with task {}: {}", enqueued.update_id, e);
tracing::warn!("Task will be marked as `Failed`.");
v3::updates::UpdateStatus::Failed(v3::updates::Failed {
from: v3::updates::Processing {
from: v3::updates::Enqueued {
@ -353,7 +352,7 @@ impl From<String> for v3::Code {
"malformed_payload" => v3::Code::MalformedPayload,
"missing_payload" => v3::Code::MissingPayload,
other => {
log::warn!("Unknown error code {}", other);
tracing::warn!("Unknown error code {}", other);
v3::Code::UnretrievableErrorCode
}
}

View File

@ -76,20 +76,20 @@ impl CompatV3ToV4 {
let index_uid = match index_uid {
Some(uid) => uid,
None => {
log::warn!(
tracing::warn!(
"Error while importing the update {}.",
task.update.id()
);
log::warn!(
tracing::warn!(
"The index associated to the uuid `{}` could not be retrieved.",
task.uuid.to_string()
);
if task.update.is_finished() {
// we're fucking with his history but not his data, that's ok-ish.
log::warn!("The index-uuid will be set as `unknown`.");
tracing::warn!("The index-uuid will be set as `unknown`.");
String::from("unknown")
} else {
log::warn!("The task will be ignored.");
tracing::warn!("The task will be ignored.");
return None;
}
}

View File

@ -305,7 +305,7 @@ impl From<v4::ResponseError> for v5::ResponseError {
"invalid_api_key_expires_at" => v5::Code::InvalidApiKeyExpiresAt,
"invalid_api_key_description" => v5::Code::InvalidApiKeyDescription,
other => {
log::warn!("Unknown error code {}", other);
tracing::warn!("Unknown error code {}", other);
v5::Code::UnretrievableErrorCode
}
};

View File

@ -304,7 +304,7 @@ impl From<v5::ResponseError> for v6::ResponseError {
"immutable_field" => v6::Code::BadRequest,
"api_key_already_exists" => v6::Code::ApiKeyAlreadyExists,
other => {
log::warn!("Unknown error code {}", other);
tracing::warn!("Unknown error code {}", other);
v6::Code::UnretrievableErrorCode
}
};
@ -329,7 +329,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
new_ranking_rules.push(new_rule);
}
Err(_) => {
log::warn!("Error while importing settings. The ranking rule `{rule}` does not exist anymore.")
tracing::warn!("Error while importing settings. The ranking rule `{rule}` does not exist anymore.")
}
}
}
@ -345,6 +345,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
dictionary: v6::Setting::NotSet,
synonyms: settings.synonyms.into(),
distinct_attribute: settings.distinct_attribute.into(),
proximity_precision: v6::Setting::NotSet,
typo_tolerance: match settings.typo_tolerance {
v5::Setting::Set(typo) => v6::Setting::Set(v6::TypoTolerance {
enabled: typo.enabled.into(),
@ -377,6 +378,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet,
},
embedders: v6::Setting::NotSet,
_kind: std::marker::PhantomData,
}
}

View File

@ -13,12 +13,12 @@ use crate::{Result, Version};
mod compat;
pub(self) mod v1;
pub(self) mod v2;
pub(self) mod v3;
pub(self) mod v4;
pub(self) mod v5;
pub(self) mod v6;
mod v1;
mod v2;
mod v3;
mod v4;
mod v5;
mod v6;
pub type Document = serde_json::Map<String, serde_json::Value>;
pub type UpdateFile = dyn Iterator<Item = Result<Document>>;
@ -526,12 +526,12 @@ pub(crate) mod test {
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2022-10-09T20:27:22.688964637Z",
"updatedAt": "2022-10-09T20:27:23.951017769Z"
}
"###);
@ -541,12 +541,12 @@ pub(crate) mod test {
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2022-10-09T20:27:22.197788495Z",
"updatedAt": "2022-10-09T20:28:01.93111053Z"
}
"###);
@ -571,12 +571,12 @@ pub(crate) mod test {
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d751713988987e9331980363e24189ce");
// spells
insta::assert_json_snapshot!(spells.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2022-10-09T20:27:24.242683494Z",
"updatedAt": "2022-10-09T20:27:24.312809641Z"
}
"###);
@ -617,12 +617,12 @@ pub(crate) mod test {
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2023-01-30T16:25:56.595257Z",
"updatedAt": "2023-01-30T16:25:58.70348Z"
}
"###);
@ -632,12 +632,12 @@ pub(crate) mod test {
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2023-01-30T16:25:56.192178Z",
"updatedAt": "2023-01-30T16:25:56.455714Z"
}
"###);
@ -647,12 +647,12 @@ pub(crate) mod test {
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"0227598af846e574139ee0b80e03a720");
// spells
insta::assert_json_snapshot!(spells.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2023-01-30T16:25:58.876405Z",
"updatedAt": "2023-01-30T16:25:59.079906Z"
}
"###);

View File

@ -1,24 +0,0 @@
---
source: dump/src/reader/mod.rs
expression: spells.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View File

@ -1,38 +0,0 @@
---
source: dump/src/reader/mod.rs
expression: products.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [],
"sortableAttributes": [],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness"
],
"stopWords": [],
"synonyms": {
"android": [
"phone",
"smartphone"
],
"iphone": [
"phone",
"smartphone"
],
"phone": [
"android",
"iphone",
"smartphone"
]
},
"distinctAttribute": null
}

View File

@ -1,31 +0,0 @@
---
source: dump/src/reader/mod.rs
expression: movies.settings().unwrap()
---
{
"displayedAttributes": [
"*"
],
"searchableAttributes": [
"*"
],
"filterableAttributes": [
"genres",
"id"
],
"sortableAttributes": [
"genres",
"id"
],
"rankingRules": [
"typo",
"words",
"proximity",
"attribute",
"exactness",
"release_date:asc"
],
"stopWords": [],
"synonyms": {},
"distinctAttribute": null
}

View File

@ -56,8 +56,7 @@ pub enum RankingRule {
Desc(String),
}
static ASC_DESC_REGEX: Lazy<Regex> =
Lazy::new(|| Regex::new(r#"(asc|desc)\(([\w_-]+)\)"#).unwrap());
static ASC_DESC_REGEX: Lazy<Regex> = Lazy::new(|| Regex::new(r"(asc|desc)\(([\w_-]+)\)").unwrap());
impl FromStr for RankingRule {
type Err = ();

View File

@ -46,6 +46,7 @@ pub type Checked = settings::Checked;
pub type Unchecked = settings::Unchecked;
pub type Task = updates::UpdateEntry;
pub type Kind = updates::UpdateMeta;
// everything related to the errors
pub type ResponseError = errors::ResponseError;
@ -107,8 +108,11 @@ impl V2Reader {
pub fn indexes(&self) -> Result<impl Iterator<Item = Result<V2IndexReader>> + '_> {
Ok(self.index_uuid.iter().map(|index| -> Result<_> {
V2IndexReader::new(
index.uid.clone(),
&self.dump.path().join("indexes").join(format!("index-{}", index.uuid)),
index,
BufReader::new(
File::open(self.dump.path().join("updates").join("data.jsonl")).unwrap(),
),
)
}))
}
@ -143,16 +147,41 @@ pub struct V2IndexReader {
}
impl V2IndexReader {
pub fn new(name: String, path: &Path) -> Result<Self> {
pub fn new(path: &Path, index_uuid: &IndexUuid, tasks: BufReader<File>) -> Result<Self> {
let meta = File::open(path.join("meta.json"))?;
let meta: DumpMeta = serde_json::from_reader(meta)?;
let mut created_at = None;
let mut updated_at = None;
for line in tasks.lines() {
let task: Task = serde_json::from_str(&line?)?;
if !(task.uuid == index_uuid.uuid && task.is_finished()) {
continue;
}
let new_created_at = match task.update.meta() {
Kind::DocumentsAddition { .. } | Kind::Settings(_) => task.update.finished_at(),
_ => None,
};
let new_updated_at = task.update.finished_at();
if created_at.is_none() || created_at > new_created_at {
created_at = new_created_at;
}
if updated_at.is_none() || updated_at < new_updated_at {
updated_at = new_updated_at;
}
}
let current_time = OffsetDateTime::now_utc();
let metadata = IndexMetadata {
uid: name,
uid: index_uuid.uid.clone(),
primary_key: meta.primary_key,
// FIXME: Iterate over the whole task queue to find the creation and last update date.
created_at: OffsetDateTime::now_utc(),
updated_at: OffsetDateTime::now_utc(),
created_at: created_at.unwrap_or(current_time),
updated_at: updated_at.unwrap_or(current_time),
};
let ret = V2IndexReader {
@ -248,12 +277,12 @@ pub(crate) mod test {
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2022-10-09T20:27:22.688964637Z",
"updatedAt": "2022-10-09T20:27:23.951017769Z"
}
"###);
@ -263,12 +292,12 @@ pub(crate) mod test {
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2022-10-09T20:27:22.197788495Z",
"updatedAt": "2022-10-09T20:28:01.93111053Z"
}
"###);
@ -293,12 +322,12 @@ pub(crate) mod test {
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"d751713988987e9331980363e24189ce");
// spells
insta::assert_json_snapshot!(spells.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2022-10-09T20:27:24.242683494Z",
"updatedAt": "2022-10-09T20:27:24.312809641Z"
}
"###);
@ -340,12 +369,12 @@ pub(crate) mod test {
assert!(indexes.is_empty());
// products
insta::assert_json_snapshot!(products.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(products.metadata(), @r###"
{
"uid": "products",
"primaryKey": "sku",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2023-01-30T16:25:56.595257Z",
"updatedAt": "2023-01-30T16:25:58.70348Z"
}
"###);
@ -355,12 +384,12 @@ pub(crate) mod test {
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"548284a84de510f71e88e6cdea495cf5");
// movies
insta::assert_json_snapshot!(movies.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(movies.metadata(), @r###"
{
"uid": "movies",
"primaryKey": "id",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2023-01-30T16:25:56.192178Z",
"updatedAt": "2023-01-30T16:25:56.455714Z"
}
"###);
@ -370,12 +399,12 @@ pub(crate) mod test {
meili_snap::snapshot_hash!(format!("{:#?}", documents), @"0227598af846e574139ee0b80e03a720");
// spells
insta::assert_json_snapshot!(spells.metadata(), { ".createdAt" => "[now]", ".updatedAt" => "[now]" }, @r###"
insta::assert_json_snapshot!(spells.metadata(), @r###"
{
"uid": "dnd_spells",
"primaryKey": "index",
"createdAt": "[now]",
"updatedAt": "[now]"
"createdAt": "2023-01-30T16:25:58.876405Z",
"updatedAt": "2023-01-30T16:25:59.079906Z"
}
"###);

View File

@ -227,4 +227,14 @@ impl UpdateStatus {
_ => None,
}
}
pub fn finished_at(&self) -> Option<OffsetDateTime> {
match self {
UpdateStatus::Processing(_) => None,
UpdateStatus::Enqueued(_) => None,
UpdateStatus::Processed(u) => Some(u.processed_at),
UpdateStatus::Aborted(_) => None,
UpdateStatus::Failed(u) => Some(u.failed_at),
}
}
}

View File

@ -1,5 +1,6 @@
use serde::{Deserialize, Serialize};
#[allow(clippy::enum_variant_names)]
#[derive(Serialize, Deserialize, Debug, Clone, Copy)]
pub enum Code {
// index related error

View File

@ -95,6 +95,7 @@ impl fmt::Display for ErrorType {
}
}
#[allow(clippy::enum_variant_names)]
#[derive(Serialize, Deserialize, Debug, Clone, Copy)]
pub enum Code {
// index related error

View File

@ -31,6 +31,7 @@ impl ResponseError {
}
}
#[allow(clippy::enum_variant_names)]
#[derive(Deserialize, Debug, Clone, Copy)]
#[cfg_attr(test, derive(serde::Serialize))]
pub enum Code {

View File

@ -2,10 +2,10 @@ use std::fs::{self, File};
use std::io::{BufRead, BufReader, ErrorKind};
use std::path::Path;
use log::debug;
pub use meilisearch_types::milli;
use tempfile::TempDir;
use time::OffsetDateTime;
use tracing::debug;
use uuid::Uuid;
use super::Document;

View File

@ -11,9 +11,10 @@ edition.workspace = true
license.workspace = true
[dependencies]
tempfile = "3.5.0"
thiserror = "1.0.40"
uuid = { version = "1.3.1", features = ["serde", "v4"] }
tempfile = "3.9.0"
thiserror = "1.0.56"
tracing = "0.1.40"
uuid = { version = "1.6.1", features = ["serde", "v4"] }
[dev-dependencies]
faux = "0.1.9"
faux = "0.1.10"

View File

@ -1,5 +1,5 @@
use std::fs::File as StdFile;
use std::ops::{Deref, DerefMut};
use std::io::Write;
use std::path::{Path, PathBuf};
use std::str::FromStr;
@ -22,20 +22,6 @@ pub enum Error {
pub type Result<T> = std::result::Result<T, Error>;
impl Deref for File {
type Target = NamedTempFile;
fn deref(&self) -> &Self::Target {
&self.file
}
}
impl DerefMut for File {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.file
}
}
#[derive(Clone, Debug)]
pub struct FileStore {
path: PathBuf,
@ -56,7 +42,7 @@ impl FileStore {
let file = NamedTempFile::new_in(&self.path)?;
let uuid = Uuid::new_v4();
let path = self.path.join(uuid.to_string());
let update_file = File { file, path };
let update_file = File { file: Some(file), path };
Ok((uuid, update_file))
}
@ -67,7 +53,7 @@ impl FileStore {
let file = NamedTempFile::new_in(&self.path)?;
let uuid = Uuid::from_u128(uuid);
let path = self.path.join(uuid.to_string());
let update_file = File { file, path };
let update_file = File { file: Some(file), path };
Ok((uuid, update_file))
}
@ -75,7 +61,13 @@ impl FileStore {
/// Returns the file corresponding to the requested uuid.
pub fn get_update(&self, uuid: Uuid) -> Result<StdFile> {
let path = self.get_update_path(uuid);
let file = StdFile::open(path)?;
let file = match StdFile::open(path) {
Ok(file) => file,
Err(e) => {
tracing::error!("Can't access update file {uuid}: {e}");
return Err(e.into());
}
};
Ok(file)
}
@ -110,8 +102,12 @@ impl FileStore {
pub fn delete(&self, uuid: Uuid) -> Result<()> {
let path = self.path.join(uuid.to_string());
std::fs::remove_file(path)?;
Ok(())
if let Err(e) = std::fs::remove_file(path) {
tracing::error!("Can't delete file {uuid}: {e}");
Err(e.into())
} else {
Ok(())
}
}
/// List the Uuids of the files in the FileStore
@ -136,16 +132,40 @@ impl FileStore {
pub struct File {
path: PathBuf,
file: NamedTempFile,
file: Option<NamedTempFile>,
}
impl File {
pub fn dry_file() -> Result<Self> {
Ok(Self { path: PathBuf::new(), file: None })
}
pub fn persist(self) -> Result<()> {
self.file.persist(&self.path)?;
if let Some(file) = self.file {
file.persist(&self.path)?;
}
Ok(())
}
}
impl Write for File {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
if let Some(file) = self.file.as_mut() {
file.write(buf)
} else {
Ok(buf.len())
}
}
fn flush(&mut self) -> std::io::Result<()> {
if let Some(file) = self.file.as_mut() {
file.flush()
} else {
Ok(())
}
}
}
#[cfg(test)]
mod test {
use std::io::Write;

View File

@ -13,8 +13,8 @@ license.workspace = true
[dependencies]
nom = "7.1.3"
nom_locate = "4.1.0"
unescaper = "0.1.2"
nom_locate = "4.2.0"
unescaper = "0.1.3"
[dev-dependencies]
insta = "1.29.0"
insta = "1.34.0"

View File

@ -564,10 +564,10 @@ pub mod tests {
#[test]
fn parse_escaped() {
insta::assert_display_snapshot!(p(r#"title = 'foo\\'"#), @r#"{title} = {foo\}"#);
insta::assert_display_snapshot!(p(r#"title = 'foo\\\\'"#), @r#"{title} = {foo\\}"#);
insta::assert_display_snapshot!(p(r#"title = 'foo\\\\\\'"#), @r#"{title} = {foo\\\}"#);
insta::assert_display_snapshot!(p(r#"title = 'foo\\\\\\\\'"#), @r#"{title} = {foo\\\\}"#);
insta::assert_display_snapshot!(p(r"title = 'foo\\'"), @r#"{title} = {foo\}"#);
insta::assert_display_snapshot!(p(r"title = 'foo\\\\'"), @r#"{title} = {foo\\}"#);
insta::assert_display_snapshot!(p(r"title = 'foo\\\\\\'"), @r#"{title} = {foo\\\}"#);
insta::assert_display_snapshot!(p(r"title = 'foo\\\\\\\\'"), @r#"{title} = {foo\\\\}"#);
// but it also works with other sequencies
insta::assert_display_snapshot!(p(r#"title = 'foo\x20\n\t\"\'"'"#), @"{title} = {foo \n\t\"\'\"}");
}

View File

@ -270,8 +270,8 @@ pub mod test {
("aaaa", "", rtok("", "aaaa"), "aaaa"),
(r#"aa"aa"#, r#""aa"#, rtok("", "aa"), "aa"),
(r#"aa\"aa"#, r#""#, rtok("", r#"aa\"aa"#), r#"aa"aa"#),
(r#"aa\\\aa"#, r#""#, rtok("", r#"aa\\\aa"#), r#"aa\\\aa"#),
(r#"aa\\"\aa"#, r#""\aa"#, rtok("", r#"aa\\"#), r#"aa\\"#),
(r"aa\\\aa", r#""#, rtok("", r"aa\\\aa"), r"aa\\\aa"),
(r#"aa\\"\aa"#, r#""\aa"#, rtok("", r"aa\\"), r"aa\\"),
(r#"aa\\\"\aa"#, r#""#, rtok("", r#"aa\\\"\aa"#), r#"aa\\"\aa"#),
(r#"\"\""#, r#""#, rtok("", r#"\"\""#), r#""""#),
];
@ -301,12 +301,12 @@ pub mod test {
);
// simple quote
assert_eq!(
unescape(Span::new_extra(r#"Hello \'World\'"#, ""), '\''),
unescape(Span::new_extra(r"Hello \'World\'", ""), '\''),
r#"Hello 'World'"#.to_string()
);
assert_eq!(
unescape(Span::new_extra(r#"Hello \\\'World\\\'"#, ""), '\''),
r#"Hello \\'World\\'"#.to_string()
unescape(Span::new_extra(r"Hello \\\'World\\\'", ""), '\''),
r"Hello \\'World\\'".to_string()
);
}
@ -335,19 +335,19 @@ pub mod test {
("\"cha'nnel\"", "cha'nnel", false),
("I'm tamo", "I", false),
// escaped thing but not quote
(r#""\\""#, r#"\"#, true),
(r#""\\\\\\""#, r#"\\\"#, true),
(r#""aa\\aa""#, r#"aa\aa"#, true),
(r#""\\""#, r"\", true),
(r#""\\\\\\""#, r"\\\", true),
(r#""aa\\aa""#, r"aa\aa", true),
// with double quote
(r#""Hello \"world\"""#, r#"Hello "world""#, true),
(r#""Hello \\\"world\\\"""#, r#"Hello \"world\""#, true),
(r#""I'm \"super\" tamo""#, r#"I'm "super" tamo"#, true),
(r#""\"\"""#, r#""""#, true),
// with simple quote
(r#"'Hello \'world\''"#, r#"Hello 'world'"#, true),
(r#"'Hello \\\'world\\\''"#, r#"Hello \'world\'"#, true),
(r"'Hello \'world\''", r#"Hello 'world'"#, true),
(r"'Hello \\\'world\\\''", r"Hello \'world\'", true),
(r#"'I\'m "super" tamo'"#, r#"I'm "super" tamo"#, true),
(r#"'\'\''"#, r#"''"#, true),
(r"'\'\''", r#"''"#, true),
];
for (input, expected, escaped) in test_case {

View File

@ -11,10 +11,10 @@ edition.workspace = true
license.workspace = true
[dependencies]
arbitrary = { version = "1.3.0", features = ["derive"] }
clap = { version = "4.3.0", features = ["derive"] }
fastrand = "2.0.0"
arbitrary = { version = "1.3.2", features = ["derive"] }
clap = { version = "4.4.17", features = ["derive"] }
fastrand = "2.0.1"
milli = { path = "../milli" }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
tempfile = "3.5.0"
serde = { version = "1.0.195", features = ["derive"] }
serde_json = { version = "1.0.111", features = ["preserve_order"] }
tempfile = "3.9.0"

View File

@ -113,7 +113,7 @@ fn main() {
index.documents(&wtxn, res.documents_ids).unwrap();
progression.fetch_add(1, Ordering::Relaxed);
}
wtxn.abort().unwrap();
wtxn.abort();
});
if let err @ Err(_) = handle.join() {
stop.store(true, Ordering::Relaxed);

View File

@ -11,30 +11,37 @@ edition.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.70"
anyhow = "1.0.79"
bincode = "1.3.3"
csv = "1.2.1"
csv = "1.3.0"
derive_builder = "0.12.0"
dump = { path = "../dump" }
enum-iterator = "1.4.0"
enum-iterator = "1.5.0"
file-store = { path = "../file-store" }
log = "0.4.17"
flate2 = "1.0.28"
meilisearch-auth = { path = "../meilisearch-auth" }
meilisearch-types = { path = "../meilisearch-types" }
page_size = "0.5.0"
puffin = "0.16.0"
roaring = { version = "0.10.1", features = ["serde"] }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
puffin = { version = "0.16.0", features = ["serialization"] }
rayon = "1.8.1"
roaring = { version = "0.10.2", features = ["serde"] }
serde = { version = "1.0.195", features = ["derive"] }
serde_json = { version = "1.0.111", features = ["preserve_order"] }
synchronoise = "1.0.1"
tempfile = "3.5.0"
thiserror = "1.0.40"
time = { version = "0.3.20", features = ["serde-well-known", "formatting", "parsing", "macros"] }
uuid = { version = "1.3.1", features = ["serde", "v4"] }
tempfile = "3.9.0"
thiserror = "1.0.56"
time = { version = "0.3.31", features = [
"serde-well-known",
"formatting",
"parsing",
"macros",
] }
tracing = "0.1.40"
ureq = "2.9.1"
uuid = { version = "1.6.1", features = ["serde", "v4"] }
[dev-dependencies]
big_s = "1.0.2"
crossbeam = "0.8.2"
insta = { version = "1.29.0", features = ["json", "redactions"] }
crossbeam = "0.8.4"
insta = { version = "1.34.0", features = ["json", "redactions"] }
meili-snap = { path = "../meili-snap" }
nelson = { git = "https://github.com/meilisearch/nelson.git", rev = "675f13885548fb415ead8fbb447e9e6d9314000a"}

View File

@ -19,20 +19,19 @@ one indexing operation.
use std::collections::{BTreeSet, HashSet};
use std::ffi::OsStr;
use std::fmt;
use std::fs::{self, File};
use std::io::BufWriter;
use dump::IndexMetadata;
use log::{debug, error, info};
use meilisearch_types::error::Code;
use meilisearch_types::heed::{RoTxn, RwTxn};
use meilisearch_types::milli::documents::{obkv_to_object, DocumentsBatchReader};
use meilisearch_types::milli::heed::CompactionOption;
use meilisearch_types::milli::update::{
DeleteDocuments, DocumentDeletionResult, IndexDocumentsConfig, IndexDocumentsMethod,
Settings as MilliSettings,
IndexDocumentsConfig, IndexDocumentsMethod, IndexerConfig, Settings as MilliSettings,
};
use meilisearch_types::milli::{self, Filter, BEU32};
use meilisearch_types::milli::{self, Filter};
use meilisearch_types::settings::{apply_settings_to_builder, Settings, Unchecked};
use meilisearch_types::tasks::{Details, IndexSwap, Kind, KindWithContent, Status, Task};
use meilisearch_types::{compression, Index, VERSION_FILE_NAME};
@ -43,7 +42,7 @@ use uuid::Uuid;
use crate::autobatcher::{self, BatchKind};
use crate::utils::{self, swap_index_uid_in_task};
use crate::{Error, IndexScheduler, ProcessingTasks, Result, TaskId};
use crate::{Error, IndexScheduler, MustStopProcessing, ProcessingTasks, Result, TaskId};
/// Represents a combination of tasks that can all be processed at the same time.
///
@ -60,7 +59,7 @@ pub(crate) enum Batch {
/// The list of tasks that were processing when this task cancelation appeared.
previous_processing_tasks: RoaringBitmap,
},
TaskDeletion(Task),
TaskDeletions(Vec<Task>),
SnapshotCreation(Vec<Task>),
Dump(Task),
IndexOperation {
@ -104,12 +103,6 @@ pub(crate) enum IndexOperation {
operations: Vec<DocumentOperation>,
tasks: Vec<Task>,
},
DocumentDeletion {
index_uid: String,
// The vec associated with each document deletion tasks.
documents: Vec<Vec<String>>,
tasks: Vec<Task>,
},
IndexDocumentDeletionByFilter {
index_uid: String,
task: Task,
@ -149,24 +142,28 @@ pub(crate) enum IndexOperation {
impl Batch {
/// Return the task ids associated with this batch.
pub fn ids(&self) -> Vec<TaskId> {
pub fn ids(&self) -> RoaringBitmap {
match self {
Batch::TaskCancelation { task, .. }
| Batch::TaskDeletion(task)
| Batch::Dump(task)
| Batch::IndexCreation { task, .. }
| Batch::IndexUpdate { task, .. } => vec![task.uid],
Batch::SnapshotCreation(tasks) | Batch::IndexDeletion { tasks, .. } => {
tasks.iter().map(|task| task.uid).collect()
| Batch::IndexUpdate { task, .. } => {
RoaringBitmap::from_sorted_iter(std::iter::once(task.uid)).unwrap()
}
Batch::SnapshotCreation(tasks)
| Batch::TaskDeletions(tasks)
| Batch::IndexDeletion { tasks, .. } => {
RoaringBitmap::from_iter(tasks.iter().map(|task| task.uid))
}
Batch::IndexOperation { op, .. } => match op {
IndexOperation::DocumentOperation { tasks, .. }
| IndexOperation::DocumentDeletion { tasks, .. }
| IndexOperation::Settings { tasks, .. }
| IndexOperation::DocumentClear { tasks, .. } => {
tasks.iter().map(|task| task.uid).collect()
RoaringBitmap::from_iter(tasks.iter().map(|task| task.uid))
}
IndexOperation::IndexDocumentDeletionByFilter { task, .. } => {
RoaringBitmap::from_sorted_iter(std::iter::once(task.uid)).unwrap()
}
IndexOperation::IndexDocumentDeletionByFilter { task, .. } => vec![task.uid],
IndexOperation::SettingsAndDocumentOperation {
document_import_tasks: tasks,
settings_tasks: other,
@ -176,9 +173,11 @@ impl Batch {
cleared_tasks: tasks,
settings_tasks: other,
..
} => tasks.iter().chain(other).map(|task| task.uid).collect(),
} => RoaringBitmap::from_iter(tasks.iter().chain(other).map(|task| task.uid)),
},
Batch::IndexSwap { task } => vec![task.uid],
Batch::IndexSwap { task } => {
RoaringBitmap::from_sorted_iter(std::iter::once(task.uid)).unwrap()
}
}
}
@ -187,7 +186,7 @@ impl Batch {
use Batch::*;
match self {
TaskCancelation { .. }
| TaskDeletion(_)
| TaskDeletions(_)
| SnapshotCreation(_)
| Dump(_)
| IndexSwap { .. } => None,
@ -199,11 +198,33 @@ impl Batch {
}
}
impl fmt::Display for Batch {
/// A text used when we debug the profiling reports.
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let index_uid = self.index_uid();
let tasks = self.ids();
match self {
Batch::TaskCancelation { .. } => f.write_str("TaskCancelation")?,
Batch::TaskDeletions(_) => f.write_str("TaskDeletion")?,
Batch::SnapshotCreation(_) => f.write_str("SnapshotCreation")?,
Batch::Dump(_) => f.write_str("Dump")?,
Batch::IndexOperation { op, .. } => write!(f, "{op}")?,
Batch::IndexCreation { .. } => f.write_str("IndexCreation")?,
Batch::IndexUpdate { .. } => f.write_str("IndexUpdate")?,
Batch::IndexDeletion { .. } => f.write_str("IndexDeletion")?,
Batch::IndexSwap { .. } => f.write_str("IndexSwap")?,
};
match index_uid {
Some(name) => f.write_fmt(format_args!(" on {name:?} from tasks: {tasks:?}")),
None => f.write_fmt(format_args!(" from tasks: {tasks:?}")),
}
}
}
impl IndexOperation {
pub fn index_uid(&self) -> &str {
match self {
IndexOperation::DocumentOperation { index_uid, .. }
| IndexOperation::DocumentDeletion { index_uid, .. }
| IndexOperation::IndexDocumentDeletionByFilter { index_uid, .. }
| IndexOperation::DocumentClear { index_uid, .. }
| IndexOperation::Settings { index_uid, .. }
@ -213,6 +234,27 @@ impl IndexOperation {
}
}
impl fmt::Display for IndexOperation {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
IndexOperation::DocumentOperation { .. } => {
f.write_str("IndexOperation::DocumentOperation")
}
IndexOperation::IndexDocumentDeletionByFilter { .. } => {
f.write_str("IndexOperation::IndexDocumentDeletionByFilter")
}
IndexOperation::DocumentClear { .. } => f.write_str("IndexOperation::DocumentClear"),
IndexOperation::Settings { .. } => f.write_str("IndexOperation::Settings"),
IndexOperation::DocumentClearAndSetting { .. } => {
f.write_str("IndexOperation::DocumentClearAndSetting")
}
IndexOperation::SettingsAndDocumentOperation { .. } => {
f.write_str("IndexOperation::SettingsAndDocumentOperation")
}
}
}
}
impl IndexScheduler {
/// Convert an [`BatchKind`](crate::autobatcher::BatchKind) into a [`Batch`].
///
@ -300,18 +342,27 @@ impl IndexScheduler {
BatchKind::DocumentDeletion { deletion_ids } => {
let tasks = self.get_existing_tasks(rtxn, deletion_ids)?;
let mut documents = Vec::new();
let mut operations = Vec::with_capacity(tasks.len());
let mut documents_counts = Vec::with_capacity(tasks.len());
for task in &tasks {
match task.kind {
KindWithContent::DocumentDeletion { ref documents_ids, .. } => {
documents.push(documents_ids.clone())
operations.push(DocumentOperation::Delete(documents_ids.clone()));
documents_counts.push(documents_ids.len() as u64);
}
_ => unreachable!(),
}
}
Ok(Some(Batch::IndexOperation {
op: IndexOperation::DocumentDeletion { index_uid, documents, tasks },
op: IndexOperation::DocumentOperation {
index_uid,
primary_key: None,
method: IndexDocumentsMethod::ReplaceDocuments,
documents_counts,
operations,
tasks,
},
must_create_index,
}))
}
@ -470,6 +521,7 @@ impl IndexScheduler {
/// 3. We get the *next* snapshot to process.
/// 4. We get the *next* dump to process.
/// 5. We get the *next* tasks to process for a specific index.
#[tracing::instrument(level = "trace", skip(self, rtxn), target = "indexing::scheduler")]
pub(crate) fn create_next_batch(&self, rtxn: &RoTxn) -> Result<Option<Batch>> {
#[cfg(test)]
self.maybe_fail(crate::tests::FailureLocation::InsideCreateBatch)?;
@ -494,9 +546,9 @@ impl IndexScheduler {
// 2. we get the next task to delete
let to_delete = self.get_kind(rtxn, Kind::TaskDeletion)? & enqueued;
if let Some(task_id) = to_delete.min() {
let task = self.get_task(rtxn, task_id)?.ok_or(Error::CorruptedTaskQueue)?;
return Ok(Some(Batch::TaskDeletion(task)));
if !to_delete.is_empty() {
let tasks = self.get_existing_tasks(rtxn, to_delete)?;
return Ok(Some(Batch::TaskDeletions(tasks)));
}
// 3. we batch the snapshot.
@ -539,7 +591,9 @@ impl IndexScheduler {
let index_tasks = self.index_tasks(rtxn, index_name)? & enqueued;
// If autobatching is disabled we only take one task at a time.
let tasks_limit = if self.autobatching_enabled { usize::MAX } else { 1 };
// Otherwise, we take only a maximum of tasks to create batches.
let tasks_limit =
if self.autobatching_enabled { self.max_number_of_batched_tasks } else { 1 };
let enqueued = index_tasks
.into_iter()
@ -573,6 +627,7 @@ impl IndexScheduler {
/// The list of tasks that were processed. The metadata of each task in the returned
/// list is updated accordingly, with the exception of the its date fields
/// [`finished_at`](meilisearch_types::tasks::Task::finished_at) and [`started_at`](meilisearch_types::tasks::Task::started_at).
#[tracing::instrument(level = "trace", skip(self, batch), target = "indexing::scheduler", fields(batch=batch.to_string()))]
pub(crate) fn process_batch(&self, batch: Batch) -> Result<Vec<Task>> {
#[cfg(test)]
{
@ -581,7 +636,7 @@ impl IndexScheduler {
self.breakpoint(crate::Breakpoint::InsideProcessBatch);
}
puffin::profile_function!(format!("{:?}", batch));
puffin::profile_function!(batch.to_string());
match batch {
Batch::TaskCancelation { mut task, previous_started_at, previous_processing_tasks } => {
@ -622,9 +677,10 @@ impl IndexScheduler {
Ok(()) => {
for content_uuid in canceled_tasks_content_uuids {
if let Err(error) = self.delete_update_file(content_uuid) {
error!(
"We failed deleting the content file indentified as {}: {}",
content_uuid, error
tracing::error!(
file_content_uuid = %content_uuid,
%error,
"Failed deleting content file"
)
}
}
@ -634,31 +690,43 @@ impl IndexScheduler {
Ok(vec![task])
}
Batch::TaskDeletion(mut task) => {
Batch::TaskDeletions(mut tasks) => {
// 1. Retrieve the tasks that matched the query at enqueue-time.
let matched_tasks =
let mut matched_tasks = RoaringBitmap::new();
for task in tasks.iter() {
if let KindWithContent::TaskDeletion { tasks, query: _ } = &task.kind {
tasks
matched_tasks |= tasks;
} else {
unreachable!()
}
}
let mut wtxn = self.env.write_txn()?;
let mut deleted_tasks = self.delete_matched_tasks(&mut wtxn, &matched_tasks)?;
wtxn.commit()?;
for task in tasks.iter_mut() {
task.status = Status::Succeeded;
let KindWithContent::TaskDeletion { tasks, query: _ } = &task.kind else {
unreachable!()
};
let mut wtxn = self.env.write_txn()?;
let deleted_tasks_count = self.delete_matched_tasks(&mut wtxn, matched_tasks)?;
let deleted_tasks_count = deleted_tasks.intersection_len(tasks);
deleted_tasks -= tasks;
task.status = Status::Succeeded;
match &mut task.details {
Some(Details::TaskDeletion {
matched_tasks: _,
deleted_tasks,
original_filter: _,
}) => {
*deleted_tasks = Some(deleted_tasks_count);
match &mut task.details {
Some(Details::TaskDeletion {
matched_tasks: _,
deleted_tasks,
original_filter: _,
}) => {
*deleted_tasks = Some(deleted_tasks_count);
}
_ => unreachable!(),
}
_ => unreachable!(),
}
wtxn.commit()?;
Ok(vec![task])
Ok(tasks)
}
Batch::SnapshotCreation(mut tasks) => {
fs::create_dir_all(&self.snapshots_path)?;
@ -670,7 +738,7 @@ impl IndexScheduler {
// 2. Snapshot the index-scheduler LMDB env
//
// When we call copy_to_path, LMDB opens a read transaction by itself,
// When we call copy_to_file, LMDB opens a read transaction by itself,
// we can't provide our own. It is an issue as we would like to know
// the update files to copy but new ones can be enqueued between the copy
// of the env and the new transaction we open to retrieve the enqueued tasks.
@ -683,7 +751,7 @@ impl IndexScheduler {
// 2.1 First copy the LMDB env of the index-scheduler
let dst = temp_snapshot_dir.path().join("tasks");
fs::create_dir_all(&dst)?;
self.env.copy_to_path(dst.join("data.mdb"), CompactionOption::Enabled)?;
self.env.copy_to_file(dst.join("data.mdb"), CompactionOption::Enabled)?;
// 2.2 Create a read transaction on the index-scheduler
let rtxn = self.env.read_txn()?;
@ -708,7 +776,7 @@ impl IndexScheduler {
let index = self.index_mapper.index(&rtxn, name)?;
let dst = temp_snapshot_dir.path().join("indexes").join(uuid.to_string());
fs::create_dir_all(&dst)?;
index.copy_to_path(dst.join("data.mdb"), CompactionOption::Enabled)?;
index.copy_to_file(dst.join("data.mdb"), CompactionOption::Enabled)?;
}
drop(rtxn);
@ -721,7 +789,7 @@ impl IndexScheduler {
.map_size(1024 * 1024 * 1024) // 1 GiB
.max_dbs(2)
.open(&self.auth_path)?;
auth.copy_to_path(dst.join("data.mdb"), CompactionOption::Enabled)?;
auth.copy_to_file(dst.join("data.mdb"), CompactionOption::Enabled)?;
// 5. Copy and tarball the flat snapshot
// 5.1 Find the original name of the database
@ -777,6 +845,10 @@ impl IndexScheduler {
// 2. dump the tasks
let mut dump_tasks = dump.create_tasks_queue()?;
for ret in self.all_tasks.iter(&rtxn)? {
if self.must_stop_processing.get() {
return Err(Error::AbortedTask);
}
let (_, mut t) = ret?;
let status = t.status;
let content_file = t.content_uuid();
@ -797,6 +869,9 @@ impl IndexScheduler {
// 2.1. Dump the `content_file` associated with the task if there is one and the task is not finished yet.
if let Some(content_file) = content_file {
if self.must_stop_processing.get() {
return Err(Error::AbortedTask);
}
if status == Status::Enqueued {
let content_file = self.file_store.get_update(content_file)?;
@ -836,6 +911,9 @@ impl IndexScheduler {
// 3.1. Dump the documents
for ret in index.all_documents(&rtxn)? {
if self.must_stop_processing.get() {
return Err(Error::AbortedTask);
}
let (_id, doc) = ret?;
let document = milli::obkv_to_json(&all_fields, &fields_ids_map, doc)?;
index_dumper.push_document(&document)?;
@ -848,13 +926,16 @@ impl IndexScheduler {
})?;
// 4. Dump experimental feature settings
let features = self.features()?.runtime_features();
let features = self.features().runtime_features();
dump.create_experimental_features(features)?;
let dump_uid = started_at.format(format_description!(
"[year repr:full][month repr:numerical][day padding:zero]-[hour padding:zero][minute padding:zero][second padding:zero][subsecond digits:3]"
)).unwrap();
if self.must_stop_processing.get() {
return Err(Error::AbortedTask);
}
let path = self.dumps_path.join(format!("{}.dump", dump_uid));
let file = File::create(path)?;
dump.persist_to(BufWriter::new(file))?;
@ -875,6 +956,10 @@ impl IndexScheduler {
self.index_mapper.index(&rtxn, &index_uid)?
};
// the index operation can take a long time, so save this handle to make it available to the search for the duration of the tick
self.index_mapper
.set_currently_updating_index(Some((index_uid.clone(), index.clone())));
let mut index_wtxn = index.write_txn()?;
let tasks = self.apply_index_operation(&mut index_wtxn, &index, op)?;
index_wtxn.commit()?;
@ -894,7 +979,10 @@ impl IndexScheduler {
match res {
Ok(_) => (),
Err(e) => error!("Could not write the stats of the index {}", e),
Err(e) => tracing::error!(
error = &e as &dyn std::error::Error,
"Could not write the stats of the index"
),
}
Ok(tasks)
@ -922,7 +1010,7 @@ impl IndexScheduler {
builder.set_primary_key(primary_key);
let must_stop_processing = self.must_stop_processing.clone();
builder.execute(
|indexing_step| debug!("update: {:?}", indexing_step),
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.get(),
)?;
index_wtxn.commit()?;
@ -949,7 +1037,10 @@ impl IndexScheduler {
match res {
Ok(_) => (),
Err(e) => error!("Could not write the stats of the index {}", e),
Err(e) => tracing::error!(
error = &e as &dyn std::error::Error,
"Could not write the stats of the index"
),
}
Ok(vec![task])
@ -1044,7 +1135,7 @@ impl IndexScheduler {
for task_id in &index_lhs_task_ids | &index_rhs_task_ids {
let mut task = self.get_task(wtxn, task_id)?.ok_or(Error::CorruptedTaskQueue)?;
swap_index_uid_in_task(&mut task, (lhs, rhs));
self.all_tasks.put(wtxn, &BEU32::new(task_id), &task)?;
self.all_tasks.put(wtxn, &task_id, &task)?;
}
// 4. remove the task from indexuid = before_name
@ -1068,9 +1159,14 @@ impl IndexScheduler {
///
/// ## Return
/// The list of processed tasks.
#[tracing::instrument(
level = "trace",
skip(self, index_wtxn, index),
target = "indexing::scheduler"
)]
fn apply_index_operation<'i>(
&self,
index_wtxn: &mut RwTxn<'i, '_>,
index_wtxn: &mut RwTxn<'i>,
index: &'i Index,
operation: IndexOperation,
) -> Result<Vec<Task>> {
@ -1128,7 +1224,7 @@ impl IndexScheduler {
milli::update::Settings::new(index_wtxn, index, indexer_config);
builder.set_primary_key(primary_key);
builder.execute(
|indexing_step| debug!("update: {:?}", indexing_step),
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.clone().get(),
)?;
primary_key_has_been_set = true;
@ -1138,12 +1234,16 @@ impl IndexScheduler {
let config = IndexDocumentsConfig { update_method: method, ..Default::default() };
let embedder_configs = index.embedding_configs(index_wtxn)?;
// TODO: consider Arc'ing the map too (we only need read access + we'll be cloning it multiple times, so really makes sense)
let embedders = self.embedders(embedder_configs)?;
let mut builder = milli::update::IndexDocuments::new(
index_wtxn,
index,
indexer_config,
config,
|indexing_step| debug!("update: {:?}", indexing_step),
|indexing_step| tracing::trace!(?indexing_step, "Update"),
|| must_stop_processing.get(),
)?;
@ -1156,6 +1256,8 @@ impl IndexScheduler {
let (new_builder, user_result) = builder.add_documents(reader)?;
builder = new_builder;
builder = builder.with_embedders(embedders.clone());
let received_documents =
if let Some(Details::DocumentAdditionOrUpdate {
received_documents,
@ -1190,7 +1292,8 @@ impl IndexScheduler {
let (new_builder, user_result) =
builder.remove_documents(document_ids)?;
builder = new_builder;
// Uses Invariant: remove documents actually always returns Ok for the inner result
let count = user_result.unwrap();
let provided_ids =
if let Some(Details::DocumentDeletion { provided_ids, .. }) =
task.details
@ -1201,30 +1304,18 @@ impl IndexScheduler {
unreachable!();
};
match user_result {
Ok(count) => {
task.status = Status::Succeeded;
task.details = Some(Details::DocumentDeletion {
provided_ids,
deleted_documents: Some(count),
});
}
Err(e) => {
task.status = Status::Failed;
task.details = Some(Details::DocumentDeletion {
provided_ids,
deleted_documents: Some(0),
});
task.error = Some(milli::Error::from(e).into());
}
}
task.status = Status::Succeeded;
task.details = Some(Details::DocumentDeletion {
provided_ids,
deleted_documents: Some(count),
});
}
}
}
if !tasks.iter().all(|res| res.error.is_some()) {
let addition = builder.execute()?;
info!("document addition done: {:?}", addition);
tracing::info!(indexing_result = ?addition, "document indexing done");
} else if primary_key_has_been_set {
// Everything failed but we've set a primary key.
// We need to remove it.
@ -1232,31 +1323,13 @@ impl IndexScheduler {
milli::update::Settings::new(index_wtxn, index, indexer_config);
builder.reset_primary_key();
builder.execute(
|indexing_step| debug!("update: {:?}", indexing_step),
|indexing_step| tracing::trace!(update = ?indexing_step),
|| must_stop_processing.clone().get(),
)?;
}
Ok(tasks)
}
IndexOperation::DocumentDeletion { index_uid: _, documents, mut tasks } => {
let mut builder = milli::update::DeleteDocuments::new(index_wtxn, index)?;
documents.iter().flatten().for_each(|id| {
builder.delete_external_id(id);
});
let DocumentDeletionResult { deleted_documents, .. } = builder.execute()?;
for (task, documents) in tasks.iter_mut().zip(documents) {
task.status = Status::Succeeded;
task.details = Some(Details::DocumentDeletion {
provided_ids: documents.len(),
deleted_documents: Some(deleted_documents.min(documents.len() as u64)),
});
}
Ok(tasks)
}
IndexOperation::IndexDocumentDeletionByFilter { mut task, index_uid: _ } => {
let filter =
if let KindWithContent::DocumentDeletionByFilter { filter_expr, .. } =
@ -1266,7 +1339,13 @@ impl IndexScheduler {
} else {
unreachable!()
};
let deleted_documents = delete_document_by_filter(index_wtxn, filter, index);
let deleted_documents = delete_document_by_filter(
index_wtxn,
filter,
self.index_mapper.indexer_config(),
self.must_stop_processing.clone(),
index,
);
let original_filter = if let Some(Details::DocumentDeletionByFilter {
original_filter,
deleted_documents: _,
@ -1314,7 +1393,7 @@ impl IndexScheduler {
let must_stop_processing = self.must_stop_processing.clone();
builder.execute(
|indexing_step| debug!("update: {:?}", indexing_step),
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.get(),
)?;
@ -1388,7 +1467,11 @@ impl IndexScheduler {
/// Delete each given task from all the databases (if it is deleteable).
///
/// Return the number of tasks that were actually deleted.
fn delete_matched_tasks(&self, wtxn: &mut RwTxn, matched_tasks: &RoaringBitmap) -> Result<u64> {
fn delete_matched_tasks(
&self,
wtxn: &mut RwTxn,
matched_tasks: &RoaringBitmap,
) -> Result<RoaringBitmap> {
// 1. Remove from this list the tasks that we are not allowed to delete
let enqueued_tasks = self.get_status(wtxn, Status::Enqueued)?;
let processing_tasks = &self.processing_tasks.read().unwrap().processing.clone();
@ -1440,10 +1523,9 @@ impl IndexScheduler {
}
for task in to_delete_tasks.iter() {
self.all_tasks.delete(wtxn, &BEU32::new(task))?;
self.all_tasks.delete(wtxn, &task)?;
}
for canceled_by in affected_canceled_by {
let canceled_by = BEU32::new(canceled_by);
if let Some(mut tasks) = self.canceled_by.get(wtxn, &canceled_by)? {
tasks -= &to_delete_tasks;
if tasks.is_empty() {
@ -1454,7 +1536,7 @@ impl IndexScheduler {
}
}
Ok(to_delete_tasks.len())
Ok(to_delete_tasks)
}
/// Cancel each given task from all the databases (if it is cancelable).
@ -1491,15 +1573,17 @@ impl IndexScheduler {
task.details = task.details.map(|d| d.to_failed());
self.update_task(wtxn, &task)?;
}
self.canceled_by.put(wtxn, &BEU32::new(cancel_task_id), &tasks_to_cancel)?;
self.canceled_by.put(wtxn, &cancel_task_id, &tasks_to_cancel)?;
Ok(content_files_to_delete)
}
}
fn delete_document_by_filter<'a>(
wtxn: &mut RwTxn<'a, '_>,
wtxn: &mut RwTxn<'a>,
filter: &serde_json::Value,
indexer_config: &IndexerConfig,
must_stop_processing: MustStopProcessing,
index: &'a Index,
) -> Result<u64> {
let filter = Filter::from_json(filter)?;
@ -1510,9 +1594,26 @@ fn delete_document_by_filter<'a>(
}
e => e.into(),
})?;
let mut delete_operation = DeleteDocuments::new(wtxn, index)?;
delete_operation.delete_documents(&candidates);
delete_operation.execute().map(|result| result.deleted_documents)?
let config = IndexDocumentsConfig {
update_method: IndexDocumentsMethod::ReplaceDocuments,
..Default::default()
};
let mut builder = milli::update::IndexDocuments::new(
wtxn,
index,
indexer_config,
config,
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.get(),
)?;
let (new_builder, count) = builder.remove_documents_from_db_no_batch(&candidates)?;
builder = new_builder;
let _ = builder.execute()?;
count
} else {
0
})

View File

@ -48,6 +48,8 @@ impl From<DateField> for Code {
pub enum Error {
#[error("{1}")]
WithCustomErrorCode(Code, Box<Self>),
#[error("Received bad task id: {received} should be >= to {expected}.")]
BadTaskId { received: TaskId, expected: TaskId },
#[error("Index `{0}` not found.")]
IndexNotFound(String),
#[error("Index `{0}` already exists.")]
@ -108,6 +110,8 @@ pub enum Error {
TaskDeletionWithEmptyQuery,
#[error("Query parameters to filter the tasks to cancel are missing. Available query parameters are: `uids`, `indexUids`, `statuses`, `types`, `canceledBy`, `beforeEnqueuedAt`, `afterEnqueuedAt`, `beforeStartedAt`, `afterStartedAt`, `beforeFinishedAt`, `afterFinishedAt`.")]
TaskCancelationWithEmptyQuery,
#[error("Aborted task")]
AbortedTask,
#[error(transparent)]
Dump(#[from] dump::Error),
@ -159,6 +163,7 @@ impl Error {
match self {
Error::IndexNotFound(_)
| Error::WithCustomErrorCode(_, _)
| Error::BadTaskId { .. }
| Error::IndexAlreadyExists(_)
| Error::SwapDuplicateIndexFound(_)
| Error::SwapDuplicateIndexesFound(_)
@ -175,6 +180,7 @@ impl Error {
| Error::TaskNotFound(_)
| Error::TaskDeletionWithEmptyQuery
| Error::TaskCancelationWithEmptyQuery
| Error::AbortedTask
| Error::Dump(_)
| Error::Heed(_)
| Error::Milli(_)
@ -202,6 +208,7 @@ impl ErrorCode for Error {
fn error_code(&self) -> Code {
match self {
Error::WithCustomErrorCode(code, _) => *code,
Error::BadTaskId { .. } => Code::BadRequest,
Error::IndexNotFound(_) => Code::IndexNotFound,
Error::IndexAlreadyExists(_) => Code::IndexAlreadyExists,
Error::SwapDuplicateIndexesFound(_) => Code::InvalidSwapDuplicateIndexFound,
@ -236,6 +243,9 @@ impl ErrorCode for Error {
Error::TaskDatabaseUpdate(_) => Code::Internal,
Error::CreateBatch(_) => Code::Internal,
// This one should never be seen by the end user
Error::AbortedTask => Code::Internal,
#[cfg(test)]
Error::PlannedFailure => Code::Internal,
}

View File

@ -1,6 +1,8 @@
use std::sync::{Arc, RwLock};
use meilisearch_types::features::{InstanceTogglableFeatures, RuntimeTogglableFeatures};
use meilisearch_types::heed::types::{SerdeJson, Str};
use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
use meilisearch_types::heed::{Database, Env, RwTxn};
use crate::error::FeatureNotEnabledError;
use crate::Result;
@ -9,73 +11,94 @@ const EXPERIMENTAL_FEATURES: &str = "experimental-features";
#[derive(Clone)]
pub(crate) struct FeatureData {
runtime: Database<Str, SerdeJson<RuntimeTogglableFeatures>>,
instance: InstanceTogglableFeatures,
persisted: Database<Str, SerdeJson<RuntimeTogglableFeatures>>,
runtime: Arc<RwLock<RuntimeTogglableFeatures>>,
}
#[derive(Debug, Clone, Copy)]
pub struct RoFeatures {
runtime: RuntimeTogglableFeatures,
instance: InstanceTogglableFeatures,
}
impl RoFeatures {
fn new(txn: RoTxn<'_>, data: &FeatureData) -> Result<Self> {
let runtime = data.runtime_features(txn)?;
Ok(Self { runtime, instance: data.instance })
fn new(data: &FeatureData) -> Self {
let runtime = data.runtime_features();
Self { runtime }
}
pub fn runtime_features(&self) -> RuntimeTogglableFeatures {
self.runtime
}
pub fn check_score_details(&self) -> Result<()> {
if self.runtime.score_details {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Computing score details",
feature: "score details",
issue_link: "https://github.com/meilisearch/product/discussions/674",
}
.into())
}
}
pub fn check_metrics(&self) -> Result<()> {
if self.instance.metrics {
if self.runtime.metrics {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Getting metrics",
feature: "metrics",
issue_link: "https://github.com/meilisearch/meilisearch/discussions/3518",
issue_link: "https://github.com/meilisearch/product/discussions/625",
}
.into())
}
}
pub fn check_vector(&self) -> Result<()> {
pub fn check_logs_route(&self) -> Result<()> {
if self.runtime.logs_route {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Modifying logs through the `/logs/*` routes",
feature: "logs route",
issue_link: "https://github.com/orgs/meilisearch/discussions/721",
}
.into())
}
}
pub fn check_vector(&self, disabled_action: &'static str) -> Result<()> {
if self.runtime.vector_store {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Passing `vector` as a query parameter",
disabled_action,
feature: "vector store",
issue_link: "https://github.com/meilisearch/product/discussions/677",
}
.into())
}
}
pub fn check_puffin(&self) -> Result<()> {
if self.runtime.export_puffin_reports {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Outputting Puffin reports to disk",
feature: "export puffin reports",
issue_link: "https://github.com/meilisearch/product/discussions/693",
}
.into())
}
}
}
impl FeatureData {
pub fn new(env: &Env, instance_features: InstanceTogglableFeatures) -> Result<Self> {
let mut wtxn = env.write_txn()?;
let runtime_features = env.create_database(&mut wtxn, Some(EXPERIMENTAL_FEATURES))?;
let runtime_features_db = env.create_database(&mut wtxn, Some(EXPERIMENTAL_FEATURES))?;
wtxn.commit()?;
Ok(Self { runtime: runtime_features, instance: instance_features })
let txn = env.read_txn()?;
let persisted_features: RuntimeTogglableFeatures =
runtime_features_db.get(&txn, EXPERIMENTAL_FEATURES)?.unwrap_or_default();
let runtime = Arc::new(RwLock::new(RuntimeTogglableFeatures {
metrics: instance_features.metrics || persisted_features.metrics,
logs_route: instance_features.logs_route || persisted_features.logs_route,
..persisted_features
}));
Ok(Self { persisted: runtime_features_db, runtime })
}
pub fn put_runtime_features(
@ -83,16 +106,25 @@ impl FeatureData {
mut wtxn: RwTxn,
features: RuntimeTogglableFeatures,
) -> Result<()> {
self.runtime.put(&mut wtxn, EXPERIMENTAL_FEATURES, &features)?;
self.persisted.put(&mut wtxn, EXPERIMENTAL_FEATURES, &features)?;
wtxn.commit()?;
// safe to unwrap, the lock will only fail if:
// 1. requested by the same thread concurrently -> it is called and released in methods that don't call each other
// 2. there's a panic while the thread is held -> it is only used for an assignment here.
let mut toggled_features = self.runtime.write().unwrap();
*toggled_features = features;
Ok(())
}
fn runtime_features(&self, txn: RoTxn) -> Result<RuntimeTogglableFeatures> {
Ok(self.runtime.get(&txn, EXPERIMENTAL_FEATURES)?.unwrap_or_default())
fn runtime_features(&self) -> RuntimeTogglableFeatures {
// sound to unwrap, the lock will only fail if:
// 1. requested by the same thread concurrently -> it is called and released in methods that don't call each other
// 2. there's a panic while the thread is held -> it is only used for copying the data here
*self.runtime.read().unwrap()
}
pub fn features(&self, txn: RoTxn) -> Result<RoFeatures> {
RoFeatures::new(txn, self)
pub fn features(&self) -> RoFeatures {
RoFeatures::new(self)
}
}

View File

@ -1,12 +1,8 @@
/// the map size to use when we don't succeed in reading it in indexes.
const DEFAULT_MAP_SIZE: usize = 10 * 1024 * 1024 * 1024; // 10 GiB
use std::collections::BTreeMap;
use std::path::Path;
use std::time::Duration;
use meilisearch_types::heed::flags::Flags;
use meilisearch_types::heed::{EnvClosingEvent, EnvOpenOptions};
use meilisearch_types::heed::{EnvClosingEvent, EnvFlags, EnvOpenOptions};
use meilisearch_types::milli::Index;
use time::OffsetDateTime;
use uuid::Uuid;
@ -236,7 +232,7 @@ impl IndexMap {
enable_mdb_writemap: bool,
map_size_growth: usize,
) {
let map_size = index.map_size().unwrap_or(DEFAULT_MAP_SIZE) + map_size_growth;
let map_size = index.map_size() + map_size_growth;
let closing_event = index.prepare_for_closing();
let generation = self.next_generation();
self.unavailable.insert(
@ -309,7 +305,7 @@ fn create_or_open_index(
options.map_size(clamp_to_page_size(map_size));
options.max_readers(1024);
if enable_mdb_writemap {
unsafe { options.flag(Flags::MdbWriteMap) };
unsafe { options.flags(EnvFlags::WRITE_MAP) };
}
if let Some((created, updated)) = date {
@ -388,7 +384,7 @@ mod tests {
fn assert_index_size(index: Index, expected: usize) {
let expected = clamp_to_page_size(expected);
let index_map_size = index.map_size().unwrap();
let index_map_size = index.map_size();
assert_eq!(index_map_size, expected);
}
}

View File

@ -3,13 +3,13 @@ use std::sync::{Arc, RwLock};
use std::time::Duration;
use std::{fs, thread};
use log::error;
use meilisearch_types::heed::types::{SerdeJson, Str};
use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
use meilisearch_types::milli::update::IndexerConfig;
use meilisearch_types::milli::{FieldDistribution, Index};
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;
use tracing::error;
use uuid::Uuid;
use self::index_map::IndexMap;
@ -69,6 +69,10 @@ pub struct IndexMapper {
/// Whether we open a meilisearch index with the MDB_WRITEMAP option or not.
enable_mdb_writemap: bool,
pub indexer_config: Arc<IndexerConfig>,
/// A few types of long running batches of tasks that act on a single index set this field
/// so that a handle to the index is available from other threads (search) in an optimized manner.
currently_updating_index: Arc<RwLock<Option<(String, Index)>>>,
}
/// Whether the index is available for use or is forbidden to be inserted back in the index map
@ -151,6 +155,7 @@ impl IndexMapper {
index_growth_amount,
enable_mdb_writemap,
indexer_config: Arc::new(indexer_config),
currently_updating_index: Default::default(),
})
}
@ -303,6 +308,14 @@ impl IndexMapper {
/// Return an index, may open it if it wasn't already opened.
pub fn index(&self, rtxn: &RoTxn, name: &str) -> Result<Index> {
if let Some((current_name, current_index)) =
self.currently_updating_index.read().unwrap().as_ref()
{
if current_name == name {
return Ok(current_index.clone());
}
}
let uuid = self
.index_mapping
.get(rtxn, name)?
@ -474,4 +487,8 @@ impl IndexMapper {
pub fn indexer_config(&self) -> &IndexerConfig {
&self.indexer_config
}
pub fn set_currently_updating_index(&self, index: Option<(String, Index)>) {
*self.currently_updating_index.write().unwrap() = index;
}
}

View File

@ -1,7 +1,7 @@
use std::collections::BTreeSet;
use std::fmt::Write;
use meilisearch_types::heed::types::{OwnedType, SerdeBincode, SerdeJson, Str};
use meilisearch_types::heed::types::{SerdeBincode, SerdeJson, Str};
use meilisearch_types::heed::{Database, RoTxn};
use meilisearch_types::milli::{CboRoaringBitmapCodec, RoaringBitmapCodec, BEU32};
use meilisearch_types::tasks::{Details, Task};
@ -15,6 +15,7 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
let IndexScheduler {
autobatching_enabled,
cleanup_enabled: _,
must_stop_processing: _,
processing_tasks,
file_store,
@ -30,14 +31,19 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
index_mapper,
features: _,
max_number_of_tasks: _,
max_number_of_batched_tasks: _,
puffin_frame: _,
wake_up: _,
dumps_path: _,
snapshots_path: _,
auth_path: _,
version_file_path: _,
webhook_url: _,
webhook_authorization_header: _,
test_breakpoint_sdr: _,
planned_failures: _,
run_loop_iteration: _,
embedders: _,
} = scheduler;
let rtxn = env.read_txn().unwrap();
@ -113,7 +119,7 @@ pub fn snapshot_bitmap(r: &RoaringBitmap) -> String {
snap
}
pub fn snapshot_all_tasks(rtxn: &RoTxn, db: Database<OwnedType<BEU32>, SerdeJson<Task>>) -> String {
pub fn snapshot_all_tasks(rtxn: &RoTxn, db: Database<BEU32, SerdeJson<Task>>) -> String {
let mut snap = String::new();
let iter = db.iter(rtxn).unwrap();
for next in iter {
@ -123,10 +129,7 @@ pub fn snapshot_all_tasks(rtxn: &RoTxn, db: Database<OwnedType<BEU32>, SerdeJson
snap
}
pub fn snapshot_date_db(
rtxn: &RoTxn,
db: Database<OwnedType<BEI128>, CboRoaringBitmapCodec>,
) -> String {
pub fn snapshot_date_db(rtxn: &RoTxn, db: Database<BEI128, CboRoaringBitmapCodec>) -> String {
let mut snap = String::new();
let iter = db.iter(rtxn).unwrap();
for next in iter {
@ -246,10 +249,7 @@ pub fn snapshot_index_tasks(rtxn: &RoTxn, db: Database<Str, RoaringBitmapCodec>)
}
snap
}
pub fn snapshot_canceled_by(
rtxn: &RoTxn,
db: Database<OwnedType<BEU32>, RoaringBitmapCodec>,
) -> String {
pub fn snapshot_canceled_by(rtxn: &RoTxn, db: Database<BEU32, RoaringBitmapCodec>) -> String {
let mut snap = String::new();
let iter = db.iter(rtxn).unwrap();
for next in iter {

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,35 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { dump_uid: None }, kind: DumpCreation { keys: [], instance_uid: None }}
----------------------------------------------------------------------
### Status:
enqueued [0,]
----------------------------------------------------------------------
### Kind:
"dumpCreation" [0,]
----------------------------------------------------------------------
### Index Tasks:
----------------------------------------------------------------------
### Index Mapper:
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
----------------------------------------------------------------------
### Started At:
----------------------------------------------------------------------
### Finished At:
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------

View File

@ -0,0 +1,45 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: canceled, canceled_by: 1, details: { dump_uid: None }, kind: DumpCreation { keys: [], instance_uid: None }}
1 {uid: 1, status: succeeded, details: { matched_tasks: 1, canceled_tasks: Some(0), original_filter: "cancel dump" }, kind: TaskCancelation { query: "cancel dump", tasks: RoaringBitmap<[0]> }}
----------------------------------------------------------------------
### Status:
enqueued []
succeeded [1,]
canceled [0,]
----------------------------------------------------------------------
### Kind:
"taskCancelation" [1,]
"dumpCreation" [0,]
----------------------------------------------------------------------
### Index Tasks:
----------------------------------------------------------------------
### Index Mapper:
----------------------------------------------------------------------
### Canceled By:
1 [0,]
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Started At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------

View File

@ -0,0 +1,38 @@
---
source: index-scheduler/src/lib.rs
---
### Autobatching Enabled = true
### Processing Tasks:
[0,]
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { dump_uid: None }, kind: DumpCreation { keys: [], instance_uid: None }}
1 {uid: 1, status: enqueued, details: { matched_tasks: 1, canceled_tasks: None, original_filter: "cancel dump" }, kind: TaskCancelation { query: "cancel dump", tasks: RoaringBitmap<[0]> }}
----------------------------------------------------------------------
### Status:
enqueued [0,1,]
----------------------------------------------------------------------
### Kind:
"taskCancelation" [1,]
"dumpCreation" [0,]
----------------------------------------------------------------------
### Index Tasks:
----------------------------------------------------------------------
### Index Mapper:
----------------------------------------------------------------------
### Canceled By:
----------------------------------------------------------------------
### Enqueued At:
[timestamp] [0,]
[timestamp] [1,]
----------------------------------------------------------------------
### Started At:
----------------------------------------------------------------------
### Finished At:
----------------------------------------------------------------------
### File Store:
----------------------------------------------------------------------

View File

@ -34,12 +34,10 @@ catto: { number_of_documents: 1, field_distribution: {"id": 1} }
[timestamp] [3,]
----------------------------------------------------------------------
### Started At:
[timestamp] [2,]
[timestamp] [3,]
[timestamp] [2,3,]
----------------------------------------------------------------------
### Finished At:
[timestamp] [2,]
[timestamp] [3,]
[timestamp] [2,3,]
----------------------------------------------------------------------
### File Store:
00000000-0000-0000-0000-000000000001

View File

@ -0,0 +1,90 @@
---
source: index-scheduler/src/lib.rs
---
[
{
"uid": 0,
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]",
"error": null,
"canceledBy": null,
"details": {
"IndexInfo": {
"primary_key": null
}
},
"status": "succeeded",
"kind": {
"indexCreation": {
"index_uid": "doggo",
"primary_key": null
}
}
},
{
"uid": 1,
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]",
"error": {
"message": "Index `doggo` already exists.",
"code": "index_already_exists",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#index_already_exists"
},
"canceledBy": null,
"details": {
"IndexInfo": {
"primary_key": null
}
},
"status": "failed",
"kind": {
"indexCreation": {
"index_uid": "doggo",
"primary_key": null
}
}
},
{
"uid": 2,
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]",
"error": null,
"canceledBy": null,
"details": {
"IndexInfo": {
"primary_key": null
}
},
"status": "enqueued",
"kind": {
"indexCreation": {
"index_uid": "doggo",
"primary_key": null
}
}
},
{
"uid": 3,
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]",
"error": null,
"canceledBy": null,
"details": {
"IndexInfo": {
"primary_key": null
}
},
"status": "enqueued",
"kind": {
"indexCreation": {
"index_uid": "doggo",
"primary_key": null
}
}
}
]

View File

@ -0,0 +1,90 @@
---
source: index-scheduler/src/lib.rs
---
[
{
"uid": 0,
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]",
"error": null,
"canceledBy": null,
"details": {
"IndexInfo": {
"primary_key": null
}
},
"status": "succeeded",
"kind": {
"indexCreation": {
"index_uid": "doggo",
"primary_key": null
}
}
},
{
"uid": 1,
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]",
"error": {
"message": "Index `doggo` already exists.",
"code": "index_already_exists",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#index_already_exists"
},
"canceledBy": null,
"details": {
"IndexInfo": {
"primary_key": null
}
},
"status": "failed",
"kind": {
"indexCreation": {
"index_uid": "doggo",
"primary_key": null
}
}
},
{
"uid": 2,
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]",
"error": null,
"canceledBy": null,
"details": {
"IndexInfo": {
"primary_key": null
}
},
"status": "enqueued",
"kind": {
"indexCreation": {
"index_uid": "doggo",
"primary_key": null
}
}
},
{
"uid": 3,
"enqueuedAt": "[date]",
"startedAt": "[date]",
"finishedAt": "[date]",
"error": null,
"canceledBy": null,
"details": {
"IndexInfo": {
"primary_key": null
}
},
"status": "enqueued",
"kind": {
"indexCreation": {
"index_uid": "doggo",
"primary_key": null
}
}
}
]

View File

@ -3,9 +3,9 @@
use std::collections::{BTreeSet, HashSet};
use std::ops::Bound;
use meilisearch_types::heed::types::{DecodeIgnore, OwnedType};
use meilisearch_types::heed::types::DecodeIgnore;
use meilisearch_types::heed::{Database, RoTxn, RwTxn};
use meilisearch_types::milli::{CboRoaringBitmapCodec, BEU32};
use meilisearch_types::milli::CboRoaringBitmapCodec;
use meilisearch_types::tasks::{Details, IndexSwap, Kind, KindWithContent, Status};
use roaring::{MultiOps, RoaringBitmap};
use time::OffsetDateTime;
@ -18,7 +18,7 @@ impl IndexScheduler {
}
pub(crate) fn last_task_id(&self, rtxn: &RoTxn) -> Result<Option<TaskId>> {
Ok(self.all_tasks.remap_data_type::<DecodeIgnore>().last(rtxn)?.map(|(k, _)| k.get() + 1))
Ok(self.all_tasks.remap_data_type::<DecodeIgnore>().last(rtxn)?.map(|(k, _)| k + 1))
}
pub(crate) fn next_task_id(&self, rtxn: &RoTxn) -> Result<TaskId> {
@ -26,7 +26,7 @@ impl IndexScheduler {
}
pub(crate) fn get_task(&self, rtxn: &RoTxn, task_id: TaskId) -> Result<Option<Task>> {
Ok(self.all_tasks.get(rtxn, &BEU32::new(task_id))?)
Ok(self.all_tasks.get(rtxn, &task_id)?)
}
/// Convert an iterator to a `Vec` of tasks. The tasks MUST exist or a
@ -88,7 +88,7 @@ impl IndexScheduler {
}
}
self.all_tasks.put(wtxn, &BEU32::new(task.uid), task)?;
self.all_tasks.put(wtxn, &task.uid, task)?;
Ok(())
}
@ -169,11 +169,11 @@ impl IndexScheduler {
pub(crate) fn insert_task_datetime(
wtxn: &mut RwTxn,
database: Database<OwnedType<BEI128>, CboRoaringBitmapCodec>,
database: Database<BEI128, CboRoaringBitmapCodec>,
time: OffsetDateTime,
task_id: TaskId,
) -> Result<()> {
let timestamp = BEI128::new(time.unix_timestamp_nanos());
let timestamp = time.unix_timestamp_nanos();
let mut task_ids = database.get(wtxn, &timestamp)?.unwrap_or_default();
task_ids.insert(task_id);
database.put(wtxn, &timestamp, &RoaringBitmap::from_iter(task_ids))?;
@ -182,11 +182,11 @@ pub(crate) fn insert_task_datetime(
pub(crate) fn remove_task_datetime(
wtxn: &mut RwTxn,
database: Database<OwnedType<BEI128>, CboRoaringBitmapCodec>,
database: Database<BEI128, CboRoaringBitmapCodec>,
time: OffsetDateTime,
task_id: TaskId,
) -> Result<()> {
let timestamp = BEI128::new(time.unix_timestamp_nanos());
let timestamp = time.unix_timestamp_nanos();
if let Some(mut existing) = database.get(wtxn, &timestamp)? {
existing.remove(task_id);
if existing.is_empty() {
@ -202,7 +202,7 @@ pub(crate) fn remove_task_datetime(
pub(crate) fn keep_tasks_within_datetimes(
rtxn: &RoTxn,
tasks: &mut RoaringBitmap,
database: Database<OwnedType<BEI128>, CboRoaringBitmapCodec>,
database: Database<BEI128, CboRoaringBitmapCodec>,
after: Option<OffsetDateTime>,
before: Option<OffsetDateTime>,
) -> Result<()> {
@ -213,8 +213,8 @@ pub(crate) fn keep_tasks_within_datetimes(
(Some(after), Some(before)) => (Bound::Excluded(*after), Bound::Excluded(*before)),
};
let mut collected_task_ids = RoaringBitmap::new();
let start = map_bound(start, |b| BEI128::new(b.unix_timestamp_nanos()));
let end = map_bound(end, |b| BEI128::new(b.unix_timestamp_nanos()));
let start = map_bound(start, |b| b.unix_timestamp_nanos());
let end = map_bound(end, |b| b.unix_timestamp_nanos());
let iter = database.range(rtxn, &(start, end))?;
for r in iter {
let (_timestamp, task_ids) = r?;
@ -337,8 +337,6 @@ impl IndexScheduler {
let rtxn = self.env.read_txn().unwrap();
for task in self.all_tasks.iter(&rtxn).unwrap() {
let (task_id, task) = task.unwrap();
let task_id = task_id.get();
let task_index_uid = task.index_uid().map(ToOwned::to_owned);
let Task {
@ -361,16 +359,13 @@ impl IndexScheduler {
.unwrap()
.contains(task.uid));
}
let db_enqueued_at = self
.enqueued_at
.get(&rtxn, &BEI128::new(enqueued_at.unix_timestamp_nanos()))
.unwrap()
.unwrap();
let db_enqueued_at =
self.enqueued_at.get(&rtxn, &enqueued_at.unix_timestamp_nanos()).unwrap().unwrap();
assert!(db_enqueued_at.contains(task_id));
if let Some(started_at) = started_at {
let db_started_at = self
.started_at
.get(&rtxn, &BEI128::new(started_at.unix_timestamp_nanos()))
.get(&rtxn, &started_at.unix_timestamp_nanos())
.unwrap()
.unwrap();
assert!(db_started_at.contains(task_id));
@ -378,7 +373,7 @@ impl IndexScheduler {
if let Some(finished_at) = finished_at {
let db_finished_at = self
.finished_at
.get(&rtxn, &BEI128::new(finished_at.unix_timestamp_nanos()))
.get(&rtxn, &finished_at.unix_timestamp_nanos())
.unwrap()
.unwrap();
assert!(db_finished_at.contains(task_id));

View File

@ -1,7 +1,6 @@
use std::borrow::Cow;
use std::convert::TryInto;
use meilisearch_types::heed::{BytesDecode, BytesEncode};
use meilisearch_types::heed::{BoxedError, BytesDecode, BytesEncode};
use uuid::Uuid;
/// A heed codec for value of struct Uuid.
@ -10,15 +9,15 @@ pub struct UuidCodec;
impl<'a> BytesDecode<'a> for UuidCodec {
type DItem = Uuid;
fn bytes_decode(bytes: &'a [u8]) -> Option<Self::DItem> {
bytes.try_into().ok().map(Uuid::from_bytes)
fn bytes_decode(bytes: &'a [u8]) -> Result<Self::DItem, BoxedError> {
bytes.try_into().map(Uuid::from_bytes).map_err(Into::into)
}
}
impl BytesEncode<'_> for UuidCodec {
type EItem = Uuid;
fn bytes_encode(item: &Self::EItem) -> Option<Cow<[u8]>> {
Some(Cow::Borrowed(item.as_bytes()))
fn bytes_encode(item: &Self::EItem) -> Result<Cow<[u8]>, BoxedError> {
Ok(Cow::Borrowed(item.as_bytes()))
}
}

View File

@ -11,6 +11,6 @@ edition.workspace = true
license.workspace = true
[dependencies]
insta = { version = "^1.29.0", features = ["json", "redactions"] }
insta = { version = "^1.34.0", features = ["json", "redactions"] }
md5 = "0.7.0"
once_cell = "1.17"
once_cell = "1.19"

View File

@ -11,16 +11,16 @@ edition.workspace = true
license.workspace = true
[dependencies]
base64 = "0.21.0"
enum-iterator = "1.4.0"
base64 = "0.21.7"
enum-iterator = "1.5.0"
hmac = "0.12.1"
maplit = "1.0.2"
meilisearch-types = { path = "../meilisearch-types" }
rand = "0.8.5"
roaring = { version = "0.10.1", features = ["serde"] }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
sha2 = "0.10.6"
thiserror = "1.0.40"
time = { version = "0.3.20", features = ["serde-well-known", "formatting", "parsing", "macros"] }
uuid = { version = "1.3.1", features = ["serde", "v4"] }
roaring = { version = "0.10.2", features = ["serde"] }
serde = { version = "1.0.195", features = ["derive"] }
serde_json = { version = "1.0.111", features = ["preserve_order"] }
sha2 = "0.10.8"
thiserror = "1.0.56"
time = { version = "0.3.31", features = ["serde-well-known", "formatting", "parsing", "macros"] }
uuid = { version = "1.6.1", features = ["serde", "v4"] }

View File

@ -1,20 +1,22 @@
use std::borrow::Cow;
use std::cmp::Reverse;
use std::collections::HashSet;
use std::convert::{TryFrom, TryInto};
use std::fs::create_dir_all;
use std::path::Path;
use std::result::Result as StdResult;
use std::str;
use std::str::FromStr;
use std::sync::Arc;
use hmac::{Hmac, Mac};
use meilisearch_types::heed::BoxedError;
use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::keys::KeyId;
use meilisearch_types::milli;
use meilisearch_types::milli::heed::types::{ByteSlice, DecodeIgnore, SerdeJson};
use meilisearch_types::milli::heed::types::{Bytes, DecodeIgnore, SerdeJson};
use meilisearch_types::milli::heed::{Database, Env, EnvOpenOptions, RwTxn};
use sha2::Sha256;
use thiserror::Error;
use time::OffsetDateTime;
use uuid::fmt::Hyphenated;
use uuid::Uuid;
@ -30,7 +32,7 @@ const KEY_ID_ACTION_INDEX_EXPIRATION_DB_NAME: &str = "keyid-action-index-expirat
#[derive(Clone)]
pub struct HeedAuthStore {
env: Arc<Env>,
keys: Database<ByteSlice, SerdeJson<Key>>,
keys: Database<Bytes, SerdeJson<Key>>,
action_keyid_index_expiration: Database<KeyIdActionCodec, SerdeJson<Option<OffsetDateTime>>>,
should_close_on_drop: bool,
}
@ -276,7 +278,7 @@ impl HeedAuthStore {
fn delete_key_from_inverted_db(&self, wtxn: &mut RwTxn, key: &KeyId) -> Result<()> {
let mut iter = self
.action_keyid_index_expiration
.remap_types::<ByteSlice, DecodeIgnore>()
.remap_types::<Bytes, DecodeIgnore>()
.prefix_iter_mut(wtxn, key.as_bytes())?;
while iter.next().transpose()?.is_some() {
// safety: we don't keep references from inside the LMDB database.
@ -294,23 +296,24 @@ pub struct KeyIdActionCodec;
impl<'a> milli::heed::BytesDecode<'a> for KeyIdActionCodec {
type DItem = (KeyId, Action, Option<&'a [u8]>);
fn bytes_decode(bytes: &'a [u8]) -> Option<Self::DItem> {
let (key_id_bytes, action_bytes) = try_split_array_at(bytes)?;
let (action_bytes, index) = match try_split_array_at(action_bytes)? {
(action, []) => (action, None),
(action, index) => (action, Some(index)),
};
fn bytes_decode(bytes: &'a [u8]) -> StdResult<Self::DItem, BoxedError> {
let (key_id_bytes, action_bytes) = try_split_array_at(bytes).ok_or(SliceTooShortError)?;
let (&action_byte, index) =
match try_split_array_at(action_bytes).ok_or(SliceTooShortError)? {
([action], []) => (action, None),
([action], index) => (action, Some(index)),
};
let key_id = Uuid::from_bytes(*key_id_bytes);
let action = Action::from_repr(u8::from_be_bytes(*action_bytes))?;
let action = Action::from_repr(action_byte).ok_or(InvalidActionError { action_byte })?;
Some((key_id, action, index))
Ok((key_id, action, index))
}
}
impl<'a> milli::heed::BytesEncode<'a> for KeyIdActionCodec {
type EItem = (&'a KeyId, &'a Action, Option<&'a [u8]>);
fn bytes_encode((key_id, action, index): &Self::EItem) -> Option<Cow<[u8]>> {
fn bytes_encode((key_id, action, index): &Self::EItem) -> StdResult<Cow<[u8]>, BoxedError> {
let mut bytes = Vec::new();
bytes.extend_from_slice(key_id.as_bytes());
@ -320,10 +323,20 @@ impl<'a> milli::heed::BytesEncode<'a> for KeyIdActionCodec {
bytes.extend_from_slice(index);
}
Some(Cow::Owned(bytes))
Ok(Cow::Owned(bytes))
}
}
#[derive(Error, Debug)]
#[error("the slice is too short")]
pub struct SliceTooShortError;
#[derive(Error, Debug)]
#[error("cannot construct a valid Action from {action_byte}")]
pub struct InvalidActionError {
pub action_byte: u8,
}
pub fn generate_key_as_hexa(uid: Uuid, master_key: &[u8]) -> String {
// format uid as hyphenated allowing user to generate their own keys.
let mut uid_buffer = [0; Hyphenated::LENGTH];

View File

@ -11,31 +11,31 @@ edition.workspace = true
license.workspace = true
[dependencies]
actix-web = { version = "4.3.1", default-features = false }
anyhow = "1.0.70"
actix-web = { version = "4.4.1", default-features = false }
anyhow = "1.0.79"
convert_case = "0.6.0"
csv = "1.2.1"
deserr = { version = "0.6.0", features = ["actix-web"]}
either = { version = "1.8.1", features = ["serde"] }
enum-iterator = "1.4.0"
csv = "1.3.0"
deserr = { version = "0.6.1", features = ["actix-web"] }
either = { version = "1.9.0", features = ["serde"] }
enum-iterator = "1.5.0"
file-store = { path = "../file-store" }
flate2 = "1.0.25"
flate2 = "1.0.28"
fst = "0.4.7"
memmap2 = "0.7.1"
milli = { path = "../milli" }
roaring = { version = "0.10.1", features = ["serde"] }
serde = { version = "1.0.160", features = ["derive"] }
roaring = { version = "0.10.2", features = ["serde"] }
serde = { version = "1.0.195", features = ["derive"] }
serde-cs = "0.2.4"
serde_json = "1.0.95"
tar = "0.4.38"
tempfile = "3.5.0"
thiserror = "1.0.40"
time = { version = "0.3.20", features = ["serde-well-known", "formatting", "parsing", "macros"] }
tokio = "1.27"
uuid = { version = "1.3.1", features = ["serde", "v4"] }
serde_json = "1.0.111"
tar = "0.4.40"
tempfile = "3.9.0"
thiserror = "1.0.56"
time = { version = "0.3.31", features = ["serde-well-known", "formatting", "parsing", "macros"] }
tokio = "1.35"
uuid = { version = "1.6.1", features = ["serde", "v4"] }
[dev-dependencies]
insta = "1.29.0"
insta = "1.34.0"
meili-snap = { path = "../meili-snap" }
[features]
@ -50,6 +50,9 @@ hebrew = ["milli/hebrew"]
japanese = ["milli/japanese"]
# thai specialized tokenization
thai = ["milli/thai"]
# allow greek specialized tokenization
greek = ["milli/greek"]
# allow khmer specialized tokenization
khmer = ["milli/khmer"]
# allow vietnamese specialized tokenization
vietnamese = ["milli/vietnamese"]

View File

@ -188,3 +188,4 @@ merge_with_error_impl_take_error_message!(ParseOffsetDateTimeError);
merge_with_error_impl_take_error_message!(ParseTaskKindError);
merge_with_error_impl_take_error_message!(ParseTaskStatusError);
merge_with_error_impl_take_error_message!(IndexUidFormatError);
merge_with_error_impl_take_error_message!(InvalidSearchSemanticRatio);

View File

@ -1,6 +1,6 @@
use std::fmt::{self, Debug, Display};
use std::fs::File;
use std::io::{self, Seek, Write};
use std::io::{self, BufWriter, Write};
use std::marker::PhantomData;
use memmap2::MmapOptions;
@ -104,8 +104,8 @@ impl ErrorCode for DocumentFormatError {
}
/// Reads CSV from input and write an obkv batch to writer.
pub fn read_csv(file: &File, writer: impl Write + Seek, delimiter: u8) -> Result<u64> {
let mut builder = DocumentsBatchBuilder::new(writer);
pub fn read_csv(file: &File, writer: impl Write, delimiter: u8) -> Result<u64> {
let mut builder = DocumentsBatchBuilder::new(BufWriter::new(writer));
let mmap = unsafe { MmapOptions::new().map(file)? };
let csv = csv::ReaderBuilder::new().delimiter(delimiter).from_reader(mmap.as_ref());
builder.append_csv(csv).map_err(|e| (PayloadType::Csv { delimiter }, e))?;
@ -116,9 +116,9 @@ pub fn read_csv(file: &File, writer: impl Write + Seek, delimiter: u8) -> Result
Ok(count as u64)
}
/// Reads JSON from temporary file and write an obkv batch to writer.
pub fn read_json(file: &File, writer: impl Write + Seek) -> Result<u64> {
let mut builder = DocumentsBatchBuilder::new(writer);
/// Reads JSON from temporary file and write an obkv batch to writer.
pub fn read_json(file: &File, writer: impl Write) -> Result<u64> {
let mut builder = DocumentsBatchBuilder::new(BufWriter::new(writer));
let mmap = unsafe { MmapOptions::new().map(file)? };
let mut deserializer = serde_json::Deserializer::from_slice(&mmap);
@ -151,8 +151,8 @@ pub fn read_json(file: &File, writer: impl Write + Seek) -> Result<u64> {
}
/// Reads JSON from temporary file and write an obkv batch to writer.
pub fn read_ndjson(file: &File, writer: impl Write + Seek) -> Result<u64> {
let mut builder = DocumentsBatchBuilder::new(writer);
pub fn read_ndjson(file: &File, writer: impl Write) -> Result<u64> {
let mut builder = DocumentsBatchBuilder::new(BufWriter::new(writer));
let mmap = unsafe { MmapOptions::new().map(file)? };
for result in serde_json::Deserializer::from_slice(&mmap).into_iter() {

View File

@ -222,6 +222,8 @@ InvalidVectorsType , InvalidRequest , BAD_REQUEST ;
InvalidDocumentId , InvalidRequest , BAD_REQUEST ;
InvalidDocumentLimit , InvalidRequest , BAD_REQUEST ;
InvalidDocumentOffset , InvalidRequest , BAD_REQUEST ;
InvalidEmbedder , InvalidRequest , BAD_REQUEST ;
InvalidHybridQuery , InvalidRequest , BAD_REQUEST ;
InvalidIndexLimit , InvalidRequest , BAD_REQUEST ;
InvalidIndexOffset , InvalidRequest , BAD_REQUEST ;
InvalidIndexPrimaryKey , InvalidRequest , BAD_REQUEST ;
@ -233,6 +235,7 @@ InvalidSearchAttributesToRetrieve , InvalidRequest , BAD_REQUEST ;
InvalidSearchCropLength , InvalidRequest , BAD_REQUEST ;
InvalidSearchCropMarker , InvalidRequest , BAD_REQUEST ;
InvalidSearchFacets , InvalidRequest , BAD_REQUEST ;
InvalidSearchSemanticRatio , InvalidRequest , BAD_REQUEST ;
InvalidFacetSearchFacetName , InvalidRequest , BAD_REQUEST ;
InvalidSearchFilter , InvalidRequest , BAD_REQUEST ;
InvalidSearchHighlightPostTag , InvalidRequest , BAD_REQUEST ;
@ -252,9 +255,11 @@ InvalidSearchShowRankingScoreDetails , InvalidRequest , BAD_REQUEST ;
InvalidSearchSort , InvalidRequest , BAD_REQUEST ;
InvalidSettingsDisplayedAttributes , InvalidRequest , BAD_REQUEST ;
InvalidSettingsDistinctAttribute , InvalidRequest , BAD_REQUEST ;
InvalidSettingsProximityPrecision , InvalidRequest , BAD_REQUEST ;
InvalidSettingsFaceting , InvalidRequest , BAD_REQUEST ;
InvalidSettingsFilterableAttributes , InvalidRequest , BAD_REQUEST ;
InvalidSettingsPagination , InvalidRequest , BAD_REQUEST ;
InvalidSettingsEmbedders , InvalidRequest , BAD_REQUEST ;
InvalidSettingsRankingRules , InvalidRequest , BAD_REQUEST ;
InvalidSettingsSearchableAttributes , InvalidRequest , BAD_REQUEST ;
InvalidSettingsSortableAttributes , InvalidRequest , BAD_REQUEST ;
@ -294,15 +299,20 @@ MissingFacetSearchFacetName , InvalidRequest , BAD_REQUEST ;
MissingIndexUid , InvalidRequest , BAD_REQUEST ;
MissingMasterKey , Auth , UNAUTHORIZED ;
MissingPayload , InvalidRequest , BAD_REQUEST ;
MissingSearchHybrid , InvalidRequest , BAD_REQUEST ;
MissingSwapIndexes , InvalidRequest , BAD_REQUEST ;
MissingTaskFilters , InvalidRequest , BAD_REQUEST ;
NoSpaceLeftOnDevice , System , UNPROCESSABLE_ENTITY;
PayloadTooLarge , InvalidRequest , PAYLOAD_TOO_LARGE ;
TaskNotFound , InvalidRequest , NOT_FOUND ;
TooManyOpenFiles , System , UNPROCESSABLE_ENTITY ;
TooManyVectors , InvalidRequest , BAD_REQUEST ;
UnretrievableDocument , Internal , BAD_REQUEST ;
UnretrievableErrorCode , InvalidRequest , BAD_REQUEST ;
UnsupportedMediaType , InvalidRequest , UNSUPPORTED_MEDIA_TYPE
UnsupportedMediaType , InvalidRequest , UNSUPPORTED_MEDIA_TYPE ;
// Experimental features
VectorEmbeddingError , InvalidRequest , BAD_REQUEST
}
impl ErrorCode for JoinError {
@ -324,7 +334,6 @@ impl ErrorCode for milli::Error {
UserError::SerdeJson(_)
| UserError::InvalidLmdbOpenOptions
| UserError::DocumentLimitReached
| UserError::AccessingSoftDeletedDocument { .. }
| UserError::UnknownInternalDocumentId { .. } => Code::Internal,
UserError::InvalidStoreFile => Code::InvalidStoreFile,
UserError::NoSpaceLeftOnDevice => Code::NoSpaceLeftOnDevice,
@ -336,6 +345,16 @@ impl ErrorCode for milli::Error {
UserError::InvalidDocumentId { .. } | UserError::TooManyDocumentIds { .. } => {
Code::InvalidDocumentId
}
UserError::MissingDocumentField(_) => Code::InvalidDocumentFields,
UserError::InvalidFieldForSource { .. }
| UserError::MissingFieldForSource { .. }
| UserError::InvalidOpenAiModel { .. }
| UserError::InvalidOpenAiModelDimensions { .. }
| UserError::InvalidOpenAiModelDimensionsMax { .. }
| UserError::InvalidSettingsDimensions { .. }
| UserError::InvalidPrompt(_) => Code::InvalidSettingsEmbedders,
UserError::TooManyEmbedders(_) => Code::InvalidSettingsEmbedders,
UserError::InvalidPromptForEmbeddings(..) => Code::InvalidSettingsEmbedders,
UserError::NoPrimaryKeyCandidateFound => Code::IndexPrimaryKeyNoCandidateFound,
UserError::MultiplePrimaryKeyCandidatesFound { .. } => {
Code::IndexPrimaryKeyMultipleCandidatesFound
@ -353,11 +372,15 @@ impl ErrorCode for milli::Error {
UserError::CriterionError(_) => Code::InvalidSettingsRankingRules,
UserError::InvalidGeoField { .. } => Code::InvalidDocumentGeoField,
UserError::InvalidVectorDimensions { .. } => Code::InvalidVectorDimensions,
UserError::InvalidVectorsMapType { .. } => Code::InvalidVectorsType,
UserError::InvalidVectorsType { .. } => Code::InvalidVectorsType,
UserError::TooManyVectors(_, _) => Code::TooManyVectors,
UserError::SortError(_) => Code::InvalidSearchSort,
UserError::InvalidMinTypoWordLenSetting(_, _) => {
Code::InvalidSettingsTypoTolerance
}
UserError::InvalidEmbedder(_) => Code::InvalidEmbedder,
UserError::VectorEmbeddingError(_) => Code::VectorEmbeddingError,
}
}
}
@ -387,11 +410,11 @@ impl ErrorCode for HeedError {
HeedError::Mdb(MdbError::Invalid) => Code::InvalidStoreFile,
HeedError::Io(e) => e.error_code(),
HeedError::Mdb(_)
| HeedError::Encoding
| HeedError::Decoding
| HeedError::Encoding(_)
| HeedError::Decoding(_)
| HeedError::InvalidDatabaseTyping
| HeedError::DatabaseClosing
| HeedError::BadOpenOptions => Code::Internal,
| HeedError::BadOpenOptions { .. } => Code::Internal,
}
}
}
@ -445,6 +468,15 @@ impl fmt::Display for DeserrParseIntError {
}
}
impl fmt::Display for deserr_codes::InvalidSearchSemanticRatio {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(
f,
"the value of `semanticRatio` is invalid, expected a float between `0.0` and `1.0`."
)
}
}
#[macro_export]
macro_rules! internal_error {
($target:ty : $($other:path), *) => {

View File

@ -3,11 +3,14 @@ use serde::{Deserialize, Serialize};
#[derive(Serialize, Deserialize, Debug, Clone, Copy, Default, PartialEq, Eq)]
#[serde(rename_all = "camelCase", default)]
pub struct RuntimeTogglableFeatures {
pub score_details: bool,
pub vector_store: bool,
pub metrics: bool,
pub logs_route: bool,
pub export_puffin_reports: bool,
}
#[derive(Default, Debug, Clone, Copy)]
pub struct InstanceTogglableFeatures {
pub metrics: bool,
pub logs_route: bool,
}

View File

@ -9,6 +9,7 @@ pub mod index_uid_pattern;
pub mod keys;
pub mod settings;
pub mod star_or;
pub mod task_view;
pub mod tasks;
pub mod versioning;
pub use milli::{heed, Index};

View File

@ -8,6 +8,7 @@ use std::str::FromStr;
use deserr::{DeserializeError, Deserr, ErrorKind, MergeWithError, ValuePointerRef};
use fst::IntoStreamer;
use milli::proximity::ProximityPrecision;
use milli::update::Setting;
use milli::{Criterion, CriterionError, Index, DEFAULT_VALUES_PER_FACET};
use serde::{Deserialize, Serialize, Serializer};
@ -186,6 +187,9 @@ pub struct Settings<T> {
#[deserr(default, error = DeserrJsonError<InvalidSettingsDistinctAttribute>)]
pub distinct_attribute: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsProximityPrecision>)]
pub proximity_precision: Setting<ProximityPrecisionView>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsTypoTolerance>)]
pub typo_tolerance: Setting<TypoSettings>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
@ -195,6 +199,10 @@ pub struct Settings<T> {
#[deserr(default, error = DeserrJsonError<InvalidSettingsPagination>)]
pub pagination: Setting<PaginationSettings>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsEmbedders>)]
pub embedders: Setting<BTreeMap<String, Setting<milli::vector::settings::EmbeddingSettings>>>,
#[serde(skip)]
#[deserr(skip)]
pub _kind: PhantomData<T>,
@ -214,9 +222,11 @@ impl Settings<Checked> {
separator_tokens: Setting::Reset,
dictionary: Setting::Reset,
distinct_attribute: Setting::Reset,
proximity_precision: Setting::Reset,
typo_tolerance: Setting::Reset,
faceting: Setting::Reset,
pagination: Setting::Reset,
embedders: Setting::Reset,
_kind: PhantomData,
}
}
@ -234,9 +244,11 @@ impl Settings<Checked> {
dictionary,
synonyms,
distinct_attribute,
proximity_precision,
typo_tolerance,
faceting,
pagination,
embedders,
..
} = self;
@ -252,9 +264,11 @@ impl Settings<Checked> {
dictionary,
synonyms,
distinct_attribute,
proximity_precision,
typo_tolerance,
faceting,
pagination,
embedders,
_kind: PhantomData,
}
}
@ -296,12 +310,29 @@ impl Settings<Unchecked> {
separator_tokens: self.separator_tokens,
dictionary: self.dictionary,
distinct_attribute: self.distinct_attribute,
proximity_precision: self.proximity_precision,
typo_tolerance: self.typo_tolerance,
faceting: self.faceting,
pagination: self.pagination,
embedders: self.embedders,
_kind: PhantomData,
}
}
pub fn validate(self) -> Result<Self, milli::Error> {
self.validate_embedding_settings()
}
fn validate_embedding_settings(mut self) -> Result<Self, milli::Error> {
let Setting::Set(mut configs) = self.embedders else { return Ok(self) };
for (name, config) in configs.iter_mut() {
let config_to_check = std::mem::take(config);
let checked_config = milli::update::validate_embedding_settings(config_to_check, name)?;
*config = checked_config
}
self.embedders = Setting::Set(configs);
Ok(self)
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
@ -390,6 +421,12 @@ pub fn apply_settings_to_builder(
Setting::NotSet => (),
}
match settings.proximity_precision {
Setting::Set(ref precision) => builder.set_proximity_precision((*precision).into()),
Setting::Reset => builder.reset_proximity_precision(),
Setting::NotSet => (),
}
match settings.typo_tolerance {
Setting::Set(ref value) => {
match value.enabled {
@ -476,6 +513,12 @@ pub fn apply_settings_to_builder(
Setting::Reset => builder.reset_pagination_max_total_hits(),
Setting::NotSet => (),
}
match settings.embedders.clone() {
Setting::Set(value) => builder.set_embedder_settings(value),
Setting::Reset => builder.reset_embedder_settings(),
Setting::NotSet => (),
}
}
pub fn settings(
@ -509,6 +552,8 @@ pub fn settings(
let distinct_field = index.distinct_field(rtxn)?.map(String::from);
let proximity_precision = index.proximity_precision(rtxn)?.map(ProximityPrecisionView::from);
let synonyms = index.user_defined_synonyms(rtxn)?;
let min_typo_word_len = MinWordSizeTyposSetting {
@ -532,7 +577,10 @@ pub fn settings(
let faceting = FacetingSettings {
max_values_per_facet: Setting::Set(
index.max_values_per_facet(rtxn)?.unwrap_or(DEFAULT_VALUES_PER_FACET),
index
.max_values_per_facet(rtxn)?
.map(|x| x as usize)
.unwrap_or(DEFAULT_VALUES_PER_FACET),
),
sort_facet_values_by: Setting::Set(
index
@ -545,10 +593,20 @@ pub fn settings(
let pagination = PaginationSettings {
max_total_hits: Setting::Set(
index.pagination_max_total_hits(rtxn)?.unwrap_or(DEFAULT_PAGINATION_MAX_TOTAL_HITS),
index
.pagination_max_total_hits(rtxn)?
.map(|x| x as usize)
.unwrap_or(DEFAULT_PAGINATION_MAX_TOTAL_HITS),
),
};
let embedders: BTreeMap<_, _> = index
.embedding_configs(rtxn)?
.into_iter()
.map(|(name, config)| (name, Setting::Set(config.into())))
.collect();
let embedders = if embedders.is_empty() { Setting::NotSet } else { Setting::Set(embedders) };
Ok(Settings {
displayed_attributes: match displayed_attributes {
Some(attrs) => Setting::Set(attrs),
@ -569,10 +627,12 @@ pub fn settings(
Some(field) => Setting::Set(field),
None => Setting::Reset,
},
proximity_precision: Setting::Set(proximity_precision.unwrap_or_default()),
synonyms: Setting::Set(synonyms),
typo_tolerance: Setting::Set(typo_tolerance),
faceting: Setting::Set(faceting),
pagination: Setting::Set(pagination),
embedders,
_kind: PhantomData,
})
}
@ -673,6 +733,32 @@ impl From<RankingRuleView> for Criterion {
}
}
#[derive(Default, Debug, Clone, Copy, PartialEq, Eq, Deserr, Serialize, Deserialize)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
#[deserr(error = DeserrJsonError<InvalidSettingsProximityPrecision>, rename_all = camelCase, deny_unknown_fields)]
pub enum ProximityPrecisionView {
#[default]
ByWord,
ByAttribute,
}
impl From<ProximityPrecision> for ProximityPrecisionView {
fn from(value: ProximityPrecision) -> Self {
match value {
ProximityPrecision::ByWord => ProximityPrecisionView::ByWord,
ProximityPrecision::ByAttribute => ProximityPrecisionView::ByAttribute,
}
}
}
impl From<ProximityPrecisionView> for ProximityPrecision {
fn from(value: ProximityPrecisionView) -> Self {
match value {
ProximityPrecisionView::ByWord => ProximityPrecision::ByWord,
ProximityPrecisionView::ByAttribute => ProximityPrecision::ByAttribute,
}
}
}
#[cfg(test)]
pub(crate) mod test {
use super::*;
@ -692,9 +778,11 @@ pub(crate) mod test {
dictionary: Setting::NotSet,
synonyms: Setting::NotSet,
distinct_attribute: Setting::NotSet,
proximity_precision: Setting::NotSet,
typo_tolerance: Setting::NotSet,
faceting: Setting::NotSet,
pagination: Setting::NotSet,
embedders: Setting::NotSet,
_kind: PhantomData::<Unchecked>,
};
@ -716,9 +804,11 @@ pub(crate) mod test {
dictionary: Setting::NotSet,
synonyms: Setting::NotSet,
distinct_attribute: Setting::NotSet,
proximity_precision: Setting::NotSet,
typo_tolerance: Setting::NotSet,
faceting: Setting::NotSet,
pagination: Setting::NotSet,
embedders: Setting::NotSet,
_kind: PhantomData::<Unchecked>,
};

View File

@ -0,0 +1,139 @@
use serde::Serialize;
use time::{Duration, OffsetDateTime};
use crate::error::ResponseError;
use crate::settings::{Settings, Unchecked};
use crate::tasks::{serialize_duration, Details, IndexSwap, Kind, Status, Task, TaskId};
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct TaskView {
pub uid: TaskId,
#[serde(default)]
pub index_uid: Option<String>,
pub status: Status,
#[serde(rename = "type")]
pub kind: Kind,
pub canceled_by: Option<TaskId>,
#[serde(skip_serializing_if = "Option::is_none")]
pub details: Option<DetailsView>,
pub error: Option<ResponseError>,
#[serde(serialize_with = "serialize_duration", default)]
pub duration: Option<Duration>,
#[serde(with = "time::serde::rfc3339")]
pub enqueued_at: OffsetDateTime,
#[serde(with = "time::serde::rfc3339::option", default)]
pub started_at: Option<OffsetDateTime>,
#[serde(with = "time::serde::rfc3339::option", default)]
pub finished_at: Option<OffsetDateTime>,
}
impl TaskView {
pub fn from_task(task: &Task) -> TaskView {
TaskView {
uid: task.uid,
index_uid: task.index_uid().map(ToOwned::to_owned),
status: task.status,
kind: task.kind.as_kind(),
canceled_by: task.canceled_by,
details: task.details.clone().map(DetailsView::from),
error: task.error.clone(),
duration: task.started_at.zip(task.finished_at).map(|(start, end)| end - start),
enqueued_at: task.enqueued_at,
started_at: task.started_at,
finished_at: task.finished_at,
}
}
}
#[derive(Default, Debug, PartialEq, Eq, Clone, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct DetailsView {
#[serde(skip_serializing_if = "Option::is_none")]
pub received_documents: Option<u64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub indexed_documents: Option<Option<u64>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub primary_key: Option<Option<String>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub provided_ids: Option<usize>,
#[serde(skip_serializing_if = "Option::is_none")]
pub deleted_documents: Option<Option<u64>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub matched_tasks: Option<u64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub canceled_tasks: Option<Option<u64>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub deleted_tasks: Option<Option<u64>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub original_filter: Option<Option<String>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub dump_uid: Option<Option<String>>,
#[serde(skip_serializing_if = "Option::is_none")]
#[serde(flatten)]
pub settings: Option<Box<Settings<Unchecked>>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub swaps: Option<Vec<IndexSwap>>,
}
impl From<Details> for DetailsView {
fn from(details: Details) -> Self {
match details {
Details::DocumentAdditionOrUpdate { received_documents, indexed_documents } => {
DetailsView {
received_documents: Some(received_documents),
indexed_documents: Some(indexed_documents),
..DetailsView::default()
}
}
Details::SettingsUpdate { settings } => {
DetailsView { settings: Some(settings), ..DetailsView::default() }
}
Details::IndexInfo { primary_key } => {
DetailsView { primary_key: Some(primary_key), ..DetailsView::default() }
}
Details::DocumentDeletion {
provided_ids: received_document_ids,
deleted_documents,
} => DetailsView {
provided_ids: Some(received_document_ids),
deleted_documents: Some(deleted_documents),
original_filter: Some(None),
..DetailsView::default()
},
Details::DocumentDeletionByFilter { original_filter, deleted_documents } => {
DetailsView {
provided_ids: Some(0),
original_filter: Some(Some(original_filter)),
deleted_documents: Some(deleted_documents),
..DetailsView::default()
}
}
Details::ClearAll { deleted_documents } => {
DetailsView { deleted_documents: Some(deleted_documents), ..DetailsView::default() }
}
Details::TaskCancelation { matched_tasks, canceled_tasks, original_filter } => {
DetailsView {
matched_tasks: Some(matched_tasks),
canceled_tasks: Some(canceled_tasks),
original_filter: Some(Some(original_filter)),
..DetailsView::default()
}
}
Details::TaskDeletion { matched_tasks, deleted_tasks, original_filter } => {
DetailsView {
matched_tasks: Some(matched_tasks),
deleted_tasks: Some(deleted_tasks),
original_filter: Some(Some(original_filter)),
..DetailsView::default()
}
}
Details::Dump { dump_uid } => {
DetailsView { dump_uid: Some(dump_uid), ..DetailsView::default() }
}
Details::IndexSwap { swaps } => {
DetailsView { swaps: Some(swaps), ..Default::default() }
}
}
}
}

View File

@ -13,14 +13,14 @@ license.workspace = true
default-run = "meilisearch"
[dependencies]
actix-cors = "0.6.4"
actix-http = { version = "3.3.1", default-features = false, features = [
actix-cors = "0.7.0"
actix-http = { version = "3.5.1", default-features = false, features = [
"compress-brotli",
"compress-gzip",
"rustls",
] }
actix-utils = "3.0.1"
actix-web = { version = "4.3.1", default-features = false, features = [
actix-web = { version = "4.4.1", default-features = false, features = [
"macros",
"compress-brotli",
"compress-gzip",
@ -28,114 +28,115 @@ actix-web = { version = "4.3.1", default-features = false, features = [
"rustls",
] }
actix-web-static-files = { git = "https://github.com/kilork/actix-web-static-files.git", rev = "2d3b6160", optional = true }
anyhow = { version = "1.0.70", features = ["backtrace"] }
anyhow = { version = "1.0.79", features = ["backtrace"] }
async-stream = "0.3.5"
async-trait = "0.1.68"
bstr = "1.4.0"
async-trait = "0.1.77"
bstr = "1.9.0"
byte-unit = { version = "4.0.19", default-features = false, features = [
"std",
"serde",
] }
bytes = "1.4.0"
clap = { version = "4.2.1", features = ["derive", "env"] }
crossbeam-channel = "0.5.8"
deserr = { version = "0.6.0", features = ["actix-web"]}
bytes = "1.5.0"
clap = { version = "4.4.17", features = ["derive", "env"] }
crossbeam-channel = "0.5.11"
deserr = { version = "0.6.1", features = ["actix-web"] }
dump = { path = "../dump" }
either = "1.8.1"
env_logger = "0.10.0"
either = "1.9.0"
file-store = { path = "../file-store" }
flate2 = "1.0.25"
flate2 = "1.0.28"
fst = "0.4.7"
futures = "0.3.28"
futures-util = "0.3.28"
http = "0.2.9"
futures = "0.3.30"
futures-util = "0.3.30"
http = "0.2.11"
index-scheduler = { path = "../index-scheduler" }
indexmap = { version = "2.0.0", features = ["serde"] }
is-terminal = "0.4.8"
indexmap = { version = "2.1.0", features = ["serde"] }
is-terminal = "0.4.10"
itertools = "0.11.0"
jsonwebtoken = "8.3.0"
lazy_static = "1.4.0"
log = "0.4.17"
meilisearch-auth = { path = "../meilisearch-auth" }
meilisearch-types = { path = "../meilisearch-types" }
mimalloc = { version = "0.1.37", default-features = false }
mimalloc = { version = "0.1.39", default-features = false }
mime = "0.3.17"
num_cpus = "1.15.0"
obkv = "0.2.0"
once_cell = "1.17.1"
ordered-float = "3.7.0"
num_cpus = "1.16.0"
obkv = "0.2.1"
once_cell = "1.19.0"
ordered-float = "4.2.0"
parking_lot = "0.12.1"
permissive-json-pointer = { path = "../permissive-json-pointer" }
pin-project-lite = "0.2.9"
pin-project-lite = "0.2.13"
platform-dirs = "0.3.0"
prometheus = { version = "0.13.3", features = ["process"] }
puffin = "0.16.0"
puffin_http = { version = "0.13.0", optional = true }
puffin = { version = "0.16.0", features = ["serialization"] }
rand = "0.8.5"
rayon = "1.7.0"
regex = "1.7.3"
reqwest = { version = "0.11.16", features = [
rayon = "1.8.0"
regex = "1.10.2"
reqwest = { version = "0.11.23", features = [
"rustls-tls",
"json",
], default-features = false }
rustls = "0.20.8"
rustls-pemfile = "1.0.2"
segment = { version = "0.2.2", optional = true }
serde = { version = "1.0.160", features = ["derive"] }
serde_json = { version = "1.0.95", features = ["preserve_order"] }
sha2 = "0.10.6"
siphasher = "0.3.10"
slice-group-by = "0.3.0"
segment = { version = "0.2.3", optional = true }
serde = { version = "1.0.195", features = ["derive"] }
serde_json = { version = "1.0.111", features = ["preserve_order"] }
sha2 = "0.10.8"
siphasher = "1.0.0"
slice-group-by = "0.3.1"
static-files = { version = "0.2.3", optional = true }
sysinfo = "0.29.7"
tar = "0.4.38"
tempfile = "3.5.0"
thiserror = "1.0.40"
time = { version = "0.3.20", features = [
sysinfo = "0.30.5"
tar = "0.4.40"
tempfile = "3.9.0"
thiserror = "1.0.56"
time = { version = "0.3.31", features = [
"serde-well-known",
"formatting",
"parsing",
"macros",
] }
tokio = { version = "1.27.0", features = ["full"] }
tokio-stream = "0.1.12"
toml = "0.7.3"
uuid = { version = "1.3.1", features = ["serde", "v4"] }
walkdir = "2.3.3"
tokio = { version = "1.35.1", features = ["full"] }
tokio-stream = "0.1.14"
toml = "0.8.8"
uuid = { version = "1.6.1", features = ["serde", "v4"] }
walkdir = "2.4.0"
yaup = "0.2.1"
serde_urlencoded = "0.7.1"
termcolor = "1.2.0"
termcolor = "1.4.1"
url = { version = "2.5.0", features = ["serde"] }
tracing = "0.1.40"
tracing-subscriber = { version = "0.3.18", features = ["json"] }
tracing-trace = { version = "0.1.0", path = "../tracing-trace" }
tracing-actix-web = "0.7.9"
build-info = { version = "1.7.0", path = "../build-info" }
[dev-dependencies]
actix-rt = "2.8.0"
actix-rt = "2.9.0"
assert-json-diff = "2.0.2"
brotli = "3.3.4"
insta = "1.29.0"
manifest-dir-macros = "0.1.16"
brotli = "3.4.0"
insta = "1.34.0"
manifest-dir-macros = "0.1.18"
maplit = "1.0.2"
meili-snap = { path = "../meili-snap" }
temp-env = "0.3.3"
urlencoding = "2.1.2"
temp-env = "0.3.6"
urlencoding = "2.1.3"
yaup = "0.2.1"
[build-dependencies]
anyhow = { version = "1.0.70", optional = true }
cargo_toml = { version = "0.15.2", optional = true }
anyhow = { version = "1.0.79", optional = true }
cargo_toml = { version = "0.18.0", optional = true }
hex = { version = "0.4.3", optional = true }
reqwest = { version = "0.11.16", features = [
reqwest = { version = "0.11.23", features = [
"blocking",
"rustls-tls",
], default-features = false, optional = true }
sha-1 = { version = "0.10.1", optional = true }
static-files = { version = "0.2.3", optional = true }
tempfile = { version = "3.5.0", optional = true }
vergen = { version = "7.5.1", default-features = false, features = ["git"] }
zip = { version = "0.6.4", optional = true }
tempfile = { version = "3.9.0", optional = true }
zip = { version = "0.6.6", optional = true }
[features]
default = ["analytics", "meilisearch-types/all-tokenizations", "mini-dashboard"]
analytics = ["segment"]
profile-with-puffin = ["dep:puffin_http"]
mini-dashboard = [
"actix-web-static-files",
"static-files",
@ -152,7 +153,9 @@ hebrew = ["meilisearch-types/hebrew"]
japanese = ["meilisearch-types/japanese"]
thai = ["meilisearch-types/thai"]
greek = ["meilisearch-types/greek"]
khmer = ["meilisearch-types/khmer"]
vietnamese = ["meilisearch-types/vietnamese"]
[package.metadata.mini-dashboard]
assets-url = "https://github.com/meilisearch/mini-dashboard/releases/download/v0.2.11/build.zip"
sha1 = "83cd44ed1e5f97ecb581dc9f958a63f4ccc982d9"
assets-url = "https://github.com/meilisearch/mini-dashboard/releases/download/v0.2.13/build.zip"
sha1 = "e20cc9b390003c6c844f4b8bcc5c5013191a77ff"

View File

@ -1,17 +1,4 @@
use vergen::{vergen, Config, SemverKind};
fn main() {
// Note: any code that needs VERGEN_ environment variables should take care to define them manually in the Dockerfile and pass them
// in the corresponding GitHub workflow (publish_docker.yml).
// This is due to the Dockerfile building the binary outside of the git directory.
let mut config = Config::default();
// allow using non-annotated tags
*config.git_mut().semver_kind_mut() = SemverKind::Lightweight;
if let Err(e) = vergen(config) {
println!("cargo:warning=vergen: {}", e);
}
#[cfg(feature = "mini-dashboard")]
mini_dashboard::setup_mini_dashboard().expect("Could not load the mini-dashboard assets");
}

View File

@ -18,7 +18,7 @@ use segment::message::{Identify, Track, User};
use segment::{AutoBatcher, Batcher, HttpClient};
use serde::Serialize;
use serde_json::{json, Value};
use sysinfo::{DiskExt, System, SystemExt};
use sysinfo::{Disks, System};
use time::OffsetDateTime;
use tokio::select;
use tokio::sync::mpsc::{self, Receiver, Sender};
@ -28,7 +28,9 @@ use super::{
config_user_id_path, DocumentDeletionKind, DocumentFetchKind, MEILISEARCH_CONFIG_PATH,
};
use crate::analytics::Analytics;
use crate::option::{default_http_addr, IndexerOpts, MaxMemory, MaxThreads, ScheduleSnapshot};
use crate::option::{
default_http_addr, IndexerOpts, LogMode, MaxMemory, MaxThreads, ScheduleSnapshot,
};
use crate::routes::indexes::documents::UpdateDocumentsQuery;
use crate::routes::indexes::facet_search::FacetSearchQuery;
use crate::routes::tasks::TasksFilterQuery;
@ -36,7 +38,7 @@ use crate::routes::{create_all_stats, Stats};
use crate::search::{
FacetSearchResult, MatchingStrategy, SearchQuery, SearchQueryWithIndex, SearchResult,
DEFAULT_CROP_LENGTH, DEFAULT_CROP_MARKER, DEFAULT_HIGHLIGHT_POST_TAG,
DEFAULT_HIGHLIGHT_PRE_TAG, DEFAULT_SEARCH_LIMIT,
DEFAULT_HIGHLIGHT_PRE_TAG, DEFAULT_SEARCH_LIMIT, DEFAULT_SEMANTIC_RATIO,
};
use crate::Opt;
@ -250,7 +252,12 @@ impl super::Analytics for SegmentAnalytics {
struct Infos {
env: String,
experimental_enable_metrics: bool,
experimental_logs_mode: LogMode,
experimental_replication_parameters: bool,
experimental_enable_logs_route: bool,
experimental_reduce_indexing_memory_usage: bool,
experimental_max_number_of_batched_tasks: usize,
gpu_enabled: bool,
db_path: bool,
import_dump: bool,
dump_dir: bool,
@ -263,6 +270,8 @@ struct Infos {
ignore_snapshot_if_db_exists: bool,
http_addr: bool,
http_payload_size_limit: Byte,
task_queue_webhook: bool,
task_webhook_authorization_header: bool,
log_level: String,
max_indexing_memory: MaxMemory,
max_indexing_threads: MaxThreads,
@ -284,10 +293,16 @@ impl From<Opt> for Infos {
let Opt {
db_path,
experimental_enable_metrics,
experimental_logs_mode,
experimental_replication_parameters,
experimental_enable_logs_route,
experimental_reduce_indexing_memory_usage,
experimental_max_number_of_batched_tasks,
http_addr,
master_key: _,
env,
task_webhook_url,
task_webhook_authorization_header,
max_index_size: _,
max_task_db_size: _,
http_payload_size_limit,
@ -327,7 +342,11 @@ impl From<Opt> for Infos {
Self {
env,
experimental_enable_metrics,
experimental_logs_mode,
experimental_replication_parameters,
experimental_enable_logs_route,
experimental_reduce_indexing_memory_usage,
gpu_enabled: meilisearch_types::milli::vector::is_cuda_enabled(),
db_path: db_path != PathBuf::from("./data.ms"),
import_dump: import_dump.is_some(),
dump_dir: dump_dir != PathBuf::from("dumps/"),
@ -340,6 +359,9 @@ impl From<Opt> for Infos {
ignore_snapshot_if_db_exists,
http_addr: http_addr != default_http_addr(),
http_payload_size_limit,
experimental_max_number_of_batched_tasks,
task_queue_webhook: task_webhook_url.is_some(),
task_webhook_authorization_header: task_webhook_authorization_header.is_some(),
log_level: log_level.to_string(),
max_indexing_memory,
max_indexing_threads,
@ -377,16 +399,17 @@ impl Segment {
fn compute_traits(opt: &Opt, stats: Stats) -> Value {
static FIRST_START_TIMESTAMP: Lazy<Instant> = Lazy::new(Instant::now);
static SYSTEM: Lazy<Value> = Lazy::new(|| {
let disks = Disks::new_with_refreshed_list();
let mut sys = System::new_all();
sys.refresh_all();
let kernel_version =
sys.kernel_version().and_then(|k| k.split_once('-').map(|(k, _)| k.to_string()));
let kernel_version = System::kernel_version()
.and_then(|k| k.split_once('-').map(|(k, _)| k.to_string()));
json!({
"distribution": sys.name(),
"distribution": System::name(),
"kernel_version": kernel_version,
"cores": sys.cpus().len(),
"ram_size": sys.total_memory(),
"disk_size": sys.disks().iter().map(|disk| disk.total_space()).max(),
"disk_size": disks.iter().map(|disk| disk.total_space()).max(),
"server_provider": std::env::var("MEILI_SERVER_PROVIDER").ok(),
})
});
@ -450,7 +473,9 @@ impl Segment {
create_all_stats(index_scheduler.into(), auth_controller.into(), &AuthFilter::default())
{
// Replace the version number with the prototype name if any.
let version = if let Some(prototype) = crate::prototype_name() {
let version = if let Some(prototype) = build_info::DescribeResult::from_build()
.and_then(|describe| describe.as_prototype())
{
prototype
} else {
env!("CARGO_PKG_VERSION")
@ -583,6 +608,11 @@ pub struct SearchAggregator {
// vector
// The maximum number of floats in a vector request
max_vector_size: usize,
// Whether the semantic ratio passed to a hybrid search equals the default ratio.
semantic_ratio: bool,
// Whether a non-default embedder was specified
embedder: bool,
hybrid: bool,
// every time a search is done, we increment the counter linked to the used settings
matching_strategy: HashMap<String, usize>,
@ -636,6 +666,7 @@ impl SearchAggregator {
crop_marker,
matching_strategy,
attributes_to_search_on,
hybrid,
} = query;
let mut ret = Self::default();
@ -709,6 +740,12 @@ impl SearchAggregator {
ret.show_ranking_score = *show_ranking_score;
ret.show_ranking_score_details = *show_ranking_score_details;
if let Some(hybrid) = hybrid {
ret.semantic_ratio = hybrid.semantic_ratio != DEFAULT_SEMANTIC_RATIO();
ret.embedder = hybrid.embedder.is_some();
ret.hybrid = true;
}
ret
}
@ -762,6 +799,9 @@ impl SearchAggregator {
facets_total_number_of_facets,
show_ranking_score,
show_ranking_score_details,
semantic_ratio,
embedder,
hybrid,
} = other;
if self.timestamp.is_none() {
@ -807,6 +847,9 @@ impl SearchAggregator {
// vector
self.max_vector_size = self.max_vector_size.max(max_vector_size);
self.semantic_ratio |= semantic_ratio;
self.hybrid |= hybrid;
self.embedder |= embedder;
// pagination
self.max_limit = self.max_limit.max(max_limit);
@ -875,6 +918,9 @@ impl SearchAggregator {
facets_total_number_of_facets,
show_ranking_score,
show_ranking_score_details,
semantic_ratio,
embedder,
hybrid,
} = self;
if total_received == 0 {
@ -914,6 +960,11 @@ impl SearchAggregator {
"vector": {
"max_vector_size": max_vector_size,
},
"hybrid": {
"enabled": hybrid,
"semantic_ratio": semantic_ratio,
"embedder": embedder,
},
"pagination": {
"max_limit": max_limit,
"max_offset": max_offset,
@ -1009,6 +1060,7 @@ impl MultiSearchAggregator {
crop_marker: _,
matching_strategy: _,
attributes_to_search_on: _,
hybrid: _,
} = query;
index_uid.as_str()
@ -1155,6 +1207,7 @@ impl FacetSearchAggregator {
filter,
matching_strategy,
attributes_to_search_on,
hybrid,
} = query;
let mut ret = Self::default();
@ -1168,7 +1221,8 @@ impl FacetSearchAggregator {
|| vector.is_some()
|| filter.is_some()
|| *matching_strategy != MatchingStrategy::default()
|| attributes_to_search_on.is_some();
|| attributes_to_search_on.is_some()
|| hybrid.is_some();
ret
}

View File

@ -12,6 +12,8 @@ pub enum MeilisearchHttpError {
#[error("A Content-Type header is missing. Accepted values for the Content-Type header are: {}",
.0.iter().map(|s| format!("`{}`", s)).collect::<Vec<_>>().join(", "))]
MissingContentType(Vec<String>),
#[error("The `/logs/stream` route is currently in use by someone else.")]
AlreadyUsedLogRoute,
#[error("The Content-Type `{0}` does not support the use of a csv delimiter. The csv delimiter can only be used with the Content-Type `text/csv`.")]
CsvDelimiterWithWrongContentType(String),
#[error(
@ -51,12 +53,15 @@ pub enum MeilisearchHttpError {
DocumentFormat(#[from] DocumentFormatError),
#[error(transparent)]
Join(#[from] JoinError),
#[error("Invalid request: missing `hybrid` parameter when both `q` and `vector` are present.")]
MissingSearchHybrid,
}
impl ErrorCode for MeilisearchHttpError {
fn error_code(&self) -> Code {
match self {
MeilisearchHttpError::MissingContentType(_) => Code::MissingContentType,
MeilisearchHttpError::AlreadyUsedLogRoute => Code::BadRequest,
MeilisearchHttpError::CsvDelimiterWithWrongContentType(_) => Code::InvalidContentType,
MeilisearchHttpError::MissingPayload(_) => Code::MissingPayload,
MeilisearchHttpError::InvalidContentType(_, _) => Code::InvalidContentType,
@ -74,6 +79,7 @@ impl ErrorCode for MeilisearchHttpError {
MeilisearchHttpError::FileStore(_) => Code::Internal,
MeilisearchHttpError::DocumentFormat(e) => e.error_code(),
MeilisearchHttpError::Join(_) => Code::Internal,
MeilisearchHttpError::MissingSearchHybrid => Code::MissingSearchHybrid,
}
}
}

View File

@ -131,6 +131,7 @@ gen_seq! { SeqFromRequestFut3; A B C }
gen_seq! { SeqFromRequestFut4; A B C D }
gen_seq! { SeqFromRequestFut5; A B C D E }
gen_seq! { SeqFromRequestFut6; A B C D E F }
gen_seq! { SeqFromRequestFut7; A B C D E F G }
pin_project! {
#[project = ExtractProj]

View File

@ -29,7 +29,6 @@ use error::PayloadError;
use extractors::payload::PayloadConfig;
use http::header::CONTENT_TYPE;
use index_scheduler::{IndexScheduler, IndexSchedulerOptions};
use log::error;
use meilisearch_auth::AuthController;
use meilisearch_types::milli::documents::{DocumentsBatchBuilder, DocumentsBatchReader};
use meilisearch_types::milli::update::{IndexDocumentsConfig, IndexDocumentsMethod};
@ -39,6 +38,8 @@ use meilisearch_types::versioning::{check_version_file, create_version_file};
use meilisearch_types::{compression, milli, VERSION_FILE_NAME};
pub use option::Opt;
use option::ScheduleSnapshot;
use tracing::{error, info_span};
use tracing_subscriber::filter::Targets;
use crate::error::MeilisearchHttpError;
@ -86,10 +87,35 @@ fn is_empty_db(db_path: impl AsRef<Path>) -> bool {
}
}
/// The handle used to update the logs at runtime. Must be accessible from the `main.rs` and the `route/logs.rs`.
pub type LogRouteHandle =
tracing_subscriber::reload::Handle<LogRouteType, tracing_subscriber::Registry>;
pub type LogRouteType = tracing_subscriber::filter::Filtered<
Option<Box<dyn tracing_subscriber::Layer<tracing_subscriber::Registry> + Send + Sync>>,
Targets,
tracing_subscriber::Registry,
>;
pub type SubscriberForSecondLayer = tracing_subscriber::layer::Layered<
tracing_subscriber::reload::Layer<LogRouteType, tracing_subscriber::Registry>,
tracing_subscriber::Registry,
>;
pub type LogStderrHandle =
tracing_subscriber::reload::Handle<LogStderrType, SubscriberForSecondLayer>;
pub type LogStderrType = tracing_subscriber::filter::Filtered<
Box<dyn tracing_subscriber::Layer<SubscriberForSecondLayer> + Send + Sync>,
Targets,
SubscriberForSecondLayer,
>;
pub fn create_app(
index_scheduler: Data<IndexScheduler>,
auth_controller: Data<AuthController>,
opt: Opt,
logs: (LogRouteHandle, LogStderrHandle),
analytics: Arc<dyn Analytics>,
enable_dashboard: bool,
) -> actix_web::App<
@ -108,16 +134,14 @@ pub fn create_app(
index_scheduler.clone(),
auth_controller.clone(),
&opt,
logs,
analytics.clone(),
)
})
.configure(routes::configure)
.configure(|s| dashboard(s, enable_dashboard));
let app = app.wrap(actix_web::middleware::Condition::new(
opt.experimental_enable_metrics,
middleware::RouteMetrics,
));
let app = app.wrap(middleware::RouteMetrics);
app.wrap(
Cors::default()
.send_wildcard()
@ -126,11 +150,49 @@ pub fn create_app(
.allow_any_method()
.max_age(86_400), // 24h
)
.wrap(actix_web::middleware::Logger::default())
.wrap(tracing_actix_web::TracingLogger::<AwebTracingLogger>::new())
.wrap(actix_web::middleware::Compress::default())
.wrap(actix_web::middleware::NormalizePath::new(actix_web::middleware::TrailingSlash::Trim))
}
struct AwebTracingLogger;
impl tracing_actix_web::RootSpanBuilder for AwebTracingLogger {
fn on_request_start(request: &actix_web::dev::ServiceRequest) -> tracing::Span {
use tracing::field::Empty;
let conn_info = request.connection_info();
let headers = request.headers();
let user_agent = headers
.get(http::header::USER_AGENT)
.map(|value| String::from_utf8_lossy(value.as_bytes()).into_owned())
.unwrap_or_default();
info_span!("HTTP request", method = %request.method(), host = conn_info.host(), route = %request.path(), query_parameters = %request.query_string(), %user_agent, status_code = Empty, error = Empty)
}
fn on_request_end<B: MessageBody>(
span: tracing::Span,
outcome: &Result<ServiceResponse<B>, actix_web::Error>,
) {
match &outcome {
Ok(response) => {
let code: i32 = response.response().status().as_u16().into();
span.record("status_code", code);
if let Some(error) = response.response().error() {
// use the status code already constructed for the outgoing HTTP response
span.record("error", &tracing::field::display(error.as_response_error()));
}
}
Err(error) => {
let code: i32 = error.error_response().status().as_u16().into();
span.record("status_code", code);
span.record("error", &tracing::field::display(error.as_response_error()));
}
};
}
}
enum OnFailure {
RemoveDb,
KeepDb,
@ -203,7 +265,9 @@ pub fn setup_meilisearch(opt: &Opt) -> anyhow::Result<(Arc<IndexScheduler>, Arc<
.name(String::from("register-snapshot-tasks"))
.spawn(move || loop {
thread::sleep(snapshot_delay);
if let Err(e) = index_scheduler.register(KindWithContent::SnapshotCreation) {
if let Err(e) =
index_scheduler.register(KindWithContent::SnapshotCreation, None, false)
{
error!("Error while registering snapshot: {}", e);
}
})
@ -231,12 +295,16 @@ fn open_or_create_database_unchecked(
indexes_path: opt.db_path.join("indexes"),
snapshots_path: opt.snapshot_dir.clone(),
dumps_path: opt.dump_dir.clone(),
webhook_url: opt.task_webhook_url.as_ref().map(|url| url.to_string()),
webhook_authorization_header: opt.task_webhook_authorization_header.clone(),
task_db_size: opt.max_task_db_size.get_bytes() as usize,
index_base_map_size: opt.max_index_size.get_bytes() as usize,
enable_mdb_writemap: opt.experimental_reduce_indexing_memory_usage,
indexer_config: (&opt.indexer_options).try_into()?,
autobatching_enabled: true,
cleanup_enabled: !opt.experimental_replication_parameters,
max_number_of_tasks: 1_000_000,
max_number_of_batched_tasks: opt.experimental_max_number_of_batched_tasks,
index_growth_amount: byte_unit::Byte::from_str("10GiB").unwrap().get_bytes() as usize,
index_count: DEFAULT_INDEX_COUNT,
instance_features,
@ -280,15 +348,15 @@ fn import_dump(
let mut dump_reader = dump::DumpReader::open(reader)?;
if let Some(date) = dump_reader.date() {
log::info!(
"Importing a dump of meilisearch `{:?}` from the {}",
dump_reader.version(), // TODO: get the meilisearch version instead of the dump version
date
tracing::info!(
version = ?dump_reader.version(), // TODO: get the meilisearch version instead of the dump version
%date,
"Importing a dump of meilisearch"
);
} else {
log::info!(
"Importing a dump of meilisearch `{:?}`",
dump_reader.version(), // TODO: get the meilisearch version instead of the dump version
tracing::info!(
version = ?dump_reader.version(), // TODO: get the meilisearch version instead of the dump version
"Importing a dump of meilisearch",
);
}
@ -322,7 +390,7 @@ fn import_dump(
for index_reader in dump_reader.indexes()? {
let mut index_reader = index_reader?;
let metadata = index_reader.metadata();
log::info!("Importing index `{}`.", metadata.uid);
tracing::info!("Importing index `{}`.", metadata.uid);
let date = Some((metadata.created_at, metadata.updated_at));
let index = index_scheduler.create_raw_index(&metadata.uid, date)?;
@ -336,14 +404,15 @@ fn import_dump(
}
// 4.2 Import the settings.
log::info!("Importing the settings.");
tracing::info!("Importing the settings.");
let settings = index_reader.settings()?;
apply_settings_to_builder(&settings, &mut builder);
builder.execute(|indexing_step| log::debug!("update: {:?}", indexing_step), || false)?;
builder
.execute(|indexing_step| tracing::debug!("update: {:?}", indexing_step), || false)?;
// 4.3 Import the documents.
// 4.3.1 We need to recreate the grenad+obkv format accepted by the index.
log::info!("Importing the documents.");
tracing::info!("Importing the documents.");
let file = tempfile::tempfile()?;
let mut builder = DocumentsBatchBuilder::new(BufWriter::new(file));
for document in index_reader.documents()? {
@ -357,6 +426,9 @@ fn import_dump(
let reader = BufReader::new(file);
let reader = DocumentsBatchReader::from_reader(reader)?;
let embedder_configs = index.embedding_configs(&wtxn)?;
let embedders = index_scheduler.embedders(embedder_configs)?;
let builder = milli::update::IndexDocuments::new(
&mut wtxn,
&index,
@ -365,15 +437,18 @@ fn import_dump(
update_method: IndexDocumentsMethod::ReplaceDocuments,
..Default::default()
},
|indexing_step| log::debug!("update: {:?}", indexing_step),
|indexing_step| tracing::trace!("update: {:?}", indexing_step),
|| false,
)?;
let builder = builder.with_embedders(embedders);
let (builder, user_result) = builder.add_documents(reader)?;
log::info!("{} documents found.", user_result?);
let user_result = user_result?;
tracing::info!(documents_found = user_result, "{} documents found.", user_result);
builder.execute()?;
wtxn.commit()?;
log::info!("All documents successfully imported.");
tracing::info!("All documents successfully imported.");
}
let mut index_scheduler_dump = index_scheduler.register_dumped_task()?;
@ -391,6 +466,7 @@ pub fn configure_data(
index_scheduler: Data<IndexScheduler>,
auth: Data<AuthController>,
opt: &Opt,
(logs_route, logs_stderr): (LogRouteHandle, LogStderrHandle),
analytics: Arc<dyn Analytics>,
) {
let http_payload_size_limit = opt.http_payload_size_limit.get_bytes() as usize;
@ -398,8 +474,12 @@ pub fn configure_data(
.app_data(index_scheduler)
.app_data(auth)
.app_data(web::Data::from(analytics))
.app_data(web::Data::new(logs_route))
.app_data(web::Data::new(logs_stderr))
.app_data(web::Data::new(opt.clone()))
.app_data(
web::JsonConfig::default()
.limit(http_payload_size_limit)
.content_type(|mime| mime == mime::APPLICATION_JSON)
.error_handler(|err, req: &HttpRequest| match err {
JsonPayloadError::ContentType => match req.headers().get(CONTENT_TYPE) {
@ -456,30 +536,3 @@ pub fn dashboard(config: &mut web::ServiceConfig, enable_frontend: bool) {
pub fn dashboard(config: &mut web::ServiceConfig, _enable_frontend: bool) {
config.service(web::resource("/").route(web::get().to(routes::running)));
}
/// Parses the output of
/// [`VERGEN_GIT_SEMVER_LIGHTWEIGHT`](https://docs.rs/vergen/latest/vergen/struct.Git.html#instructions)
/// as a prototype name.
///
/// Returns `Some(prototype_name)` if the following conditions are met on this value:
///
/// 1. starts with `prototype-`,
/// 2. ends with `-<some_number>`,
/// 3. does not end with `<some_number>-<some_number>`.
///
/// Otherwise, returns `None`.
pub fn prototype_name() -> Option<&'static str> {
let prototype: &'static str = option_env!("VERGEN_GIT_SEMVER_LIGHTWEIGHT")?;
if !prototype.starts_with("prototype-") {
return None;
}
let mut rsplit_prototype = prototype.rsplit('-');
// last component MUST be a number
rsplit_prototype.next()?.parse::<u64>().ok()?;
// before than last component SHALL NOT be a number
rsplit_prototype.next()?.parse::<u64>().err()?;
Some(prototype)
}

View File

@ -1,6 +1,7 @@
use std::env;
use std::io::{stderr, Write};
use std::io::{stderr, LineWriter, Write};
use std::path::PathBuf;
use std::str::FromStr;
use std::sync::Arc;
use actix_web::http::KeepAlive;
@ -9,37 +10,78 @@ use actix_web::HttpServer;
use index_scheduler::IndexScheduler;
use is_terminal::IsTerminal;
use meilisearch::analytics::Analytics;
use meilisearch::{analytics, create_app, prototype_name, setup_meilisearch, Opt};
use meilisearch::option::LogMode;
use meilisearch::{
analytics, create_app, setup_meilisearch, LogRouteHandle, LogRouteType, LogStderrHandle,
LogStderrType, Opt, SubscriberForSecondLayer,
};
use meilisearch_auth::{generate_master_key, AuthController, MASTER_KEY_MIN_SIZE};
use mimalloc::MiMalloc;
use termcolor::{Color, ColorChoice, ColorSpec, StandardStream, WriteColor};
use tracing::level_filters::LevelFilter;
use tracing_subscriber::layer::SubscriberExt as _;
use tracing_subscriber::Layer;
#[global_allocator]
static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
static ALLOC: MiMalloc = MiMalloc;
fn default_log_route_layer() -> LogRouteType {
None.with_filter(tracing_subscriber::filter::Targets::new().with_target("", LevelFilter::OFF))
}
fn default_log_stderr_layer(opt: &Opt) -> LogStderrType {
let layer = tracing_subscriber::fmt::layer()
.with_writer(|| LineWriter::new(std::io::stderr()))
.with_span_events(tracing_subscriber::fmt::format::FmtSpan::CLOSE);
let layer = match opt.experimental_logs_mode {
LogMode::Human => Box::new(layer)
as Box<dyn tracing_subscriber::Layer<SubscriberForSecondLayer> + Send + Sync>,
LogMode::Json => Box::new(layer.json())
as Box<dyn tracing_subscriber::Layer<SubscriberForSecondLayer> + Send + Sync>,
};
layer.with_filter(
tracing_subscriber::filter::Targets::new()
.with_target("", LevelFilter::from_str(&opt.log_level.to_string()).unwrap()),
)
}
/// does all the setup before meilisearch is launched
fn setup(opt: &Opt) -> anyhow::Result<()> {
let mut log_builder = env_logger::Builder::new();
log_builder.parse_filters(&opt.log_level.to_string());
fn setup(opt: &Opt) -> anyhow::Result<(LogRouteHandle, LogStderrHandle)> {
let (route_layer, route_layer_handle) =
tracing_subscriber::reload::Layer::new(default_log_route_layer());
let route_layer: tracing_subscriber::reload::Layer<_, _> = route_layer;
log_builder.init();
let (stderr_layer, stderr_layer_handle) =
tracing_subscriber::reload::Layer::new(default_log_stderr_layer(opt));
let route_layer: tracing_subscriber::reload::Layer<_, _> = route_layer;
Ok(())
let subscriber = tracing_subscriber::registry().with(route_layer).with(stderr_layer);
// set the subscriber as the default for the application
tracing::subscriber::set_global_default(subscriber).unwrap();
Ok((route_layer_handle, stderr_layer_handle))
}
fn on_panic(info: &std::panic::PanicInfo) {
let info = info.to_string().replace('\n', " ");
tracing::error!(%info);
}
#[actix_web::main]
async fn main() -> anyhow::Result<()> {
let (opt, config_read_from) = Opt::try_build()?;
#[cfg(feature = "profile-with-puffin")]
let _server = puffin_http::Server::new(&format!("0.0.0.0:{}", puffin_http::DEFAULT_PORT))?;
puffin::set_scopes_on(cfg!(feature = "profile-with-puffin"));
std::panic::set_hook(Box::new(on_panic));
anyhow::ensure!(
!(cfg!(windows) && opt.experimental_reduce_indexing_memory_usage),
"The `experimental-reduce-indexing-memory-usage` flag is not supported on Windows"
);
setup(&opt)?;
let log_handle = setup(&opt)?;
match (opt.env.as_ref(), &opt.master_key) {
("production", Some(master_key)) if master_key.len() < MASTER_KEY_MIN_SIZE => {
@ -77,7 +119,7 @@ async fn main() -> anyhow::Result<()> {
print_launch_resume(&opt, analytics.clone(), config_read_from);
run_http(index_scheduler, auth_controller, opt, analytics).await?;
run_http(index_scheduler, auth_controller, opt, log_handle, analytics).await?;
Ok(())
}
@ -86,6 +128,7 @@ async fn run_http(
index_scheduler: Arc<IndexScheduler>,
auth_controller: Arc<AuthController>,
opt: Opt,
logs: (LogRouteHandle, LogStderrHandle),
analytics: Arc<dyn Analytics>,
) -> anyhow::Result<()> {
let enable_dashboard = &opt.env == "development";
@ -98,6 +141,7 @@ async fn run_http(
index_scheduler.clone(),
auth_controller.clone(),
opt.clone(),
logs.clone(),
analytics.clone(),
enable_dashboard,
)
@ -119,8 +163,8 @@ pub fn print_launch_resume(
analytics: Arc<dyn Analytics>,
config_read_from: Option<PathBuf>,
) {
let commit_sha = option_env!("VERGEN_GIT_SHA").unwrap_or("unknown");
let commit_date = option_env!("VERGEN_GIT_COMMIT_TIMESTAMP").unwrap_or("unknown");
let build_info = build_info::BuildInfo::from_build();
let protocol =
if opt.ssl_cert_path.is_some() && opt.ssl_key_path.is_some() { "https" } else { "http" };
let ascii_name = r#"
@ -145,10 +189,18 @@ pub fn print_launch_resume(
eprintln!("Database path:\t\t{:?}", opt.db_path);
eprintln!("Server listening on:\t\"{}://{}\"", protocol, opt.http_addr);
eprintln!("Environment:\t\t{:?}", opt.env);
eprintln!("Commit SHA:\t\t{:?}", commit_sha.to_string());
eprintln!("Commit date:\t\t{:?}", commit_date.to_string());
eprintln!("Commit SHA:\t\t{:?}", build_info.commit_sha1.unwrap_or("unknown"));
eprintln!(
"Commit date:\t\t{:?}",
build_info
.commit_timestamp
.and_then(|commit_timestamp| commit_timestamp
.format(&time::format_description::well_known::Rfc3339)
.ok())
.unwrap_or("unknown".into())
);
eprintln!("Package version:\t{:?}", env!("CARGO_PKG_VERSION").to_string());
if let Some(prototype) = prototype_name() {
if let Some(prototype) = build_info.describe.and_then(|describe| describe.as_prototype()) {
eprintln!("Prototype:\t\t{:?}", prototype);
}

View File

@ -3,8 +3,10 @@
use std::future::{ready, Ready};
use actix_web::dev::{self, Service, ServiceRequest, ServiceResponse, Transform};
use actix_web::web::Data;
use actix_web::Error;
use futures_util::future::LocalBoxFuture;
use index_scheduler::IndexScheduler;
use prometheus::HistogramTimer;
pub struct RouteMetrics;
@ -47,19 +49,27 @@ where
fn call(&self, req: ServiceRequest) -> Self::Future {
let mut histogram_timer: Option<HistogramTimer> = None;
let request_path = req.path();
let is_registered_resource = req.resource_map().has_resource(request_path);
if is_registered_resource {
let request_method = req.method().to_string();
histogram_timer = Some(
crate::metrics::MEILISEARCH_HTTP_RESPONSE_TIME_SECONDS
// calling unwrap here is safe because index scheduler is added to app data while creating actix app.
// also, the tests will fail if this is not present.
let index_scheduler = req.app_data::<Data<IndexScheduler>>().unwrap();
let features = index_scheduler.features();
if features.check_metrics().is_ok() {
let request_path = req.path();
let is_registered_resource = req.resource_map().has_resource(request_path);
if is_registered_resource {
let request_method = req.method().to_string();
histogram_timer = Some(
crate::metrics::MEILISEARCH_HTTP_RESPONSE_TIME_SECONDS
.with_label_values(&[&request_method, request_path])
.start_timer(),
);
crate::metrics::MEILISEARCH_HTTP_REQUESTS_TOTAL
.with_label_values(&[&request_method, request_path])
.start_timer(),
);
crate::metrics::MEILISEARCH_HTTP_REQUESTS_TOTAL
.with_label_values(&[&request_method, request_path])
.inc();
}
.inc();
}
};
let fut = self.service.call(req);

View File

@ -1,4 +1,3 @@
use std::convert::TryFrom;
use std::env::VarError;
use std::ffi::OsStr;
use std::fmt::Display;
@ -20,7 +19,8 @@ use rustls::server::{
use rustls::RootCertStore;
use rustls_pemfile::{certs, pkcs8_private_keys, rsa_private_keys};
use serde::{Deserialize, Serialize};
use sysinfo::{RefreshKind, System, SystemExt};
use sysinfo::{MemoryRefreshKind, RefreshKind, System};
use url::Url;
const POSSIBLE_ENV: [&str; 2] = ["development", "production"];
@ -28,6 +28,8 @@ const MEILI_DB_PATH: &str = "MEILI_DB_PATH";
const MEILI_HTTP_ADDR: &str = "MEILI_HTTP_ADDR";
const MEILI_MASTER_KEY: &str = "MEILI_MASTER_KEY";
const MEILI_ENV: &str = "MEILI_ENV";
const MEILI_TASK_WEBHOOK_URL: &str = "MEILI_TASK_WEBHOOK_URL";
const MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER: &str = "MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER";
#[cfg(feature = "analytics")]
const MEILI_NO_ANALYTICS: &str = "MEILI_NO_ANALYTICS";
const MEILI_HTTP_PAYLOAD_SIZE_LIMIT: &str = "MEILI_HTTP_PAYLOAD_SIZE_LIMIT";
@ -48,9 +50,14 @@ const MEILI_IGNORE_MISSING_DUMP: &str = "MEILI_IGNORE_MISSING_DUMP";
const MEILI_IGNORE_DUMP_IF_DB_EXISTS: &str = "MEILI_IGNORE_DUMP_IF_DB_EXISTS";
const MEILI_DUMP_DIR: &str = "MEILI_DUMP_DIR";
const MEILI_LOG_LEVEL: &str = "MEILI_LOG_LEVEL";
const MEILI_EXPERIMENTAL_LOGS_MODE: &str = "MEILI_EXPERIMENTAL_LOGS_MODE";
const MEILI_EXPERIMENTAL_REPLICATION_PARAMETERS: &str = "MEILI_EXPERIMENTAL_REPLICATION_PARAMETERS";
const MEILI_EXPERIMENTAL_ENABLE_LOGS_ROUTE: &str = "MEILI_EXPERIMENTAL_ENABLE_LOGS_ROUTE";
const MEILI_EXPERIMENTAL_ENABLE_METRICS: &str = "MEILI_EXPERIMENTAL_ENABLE_METRICS";
const MEILI_EXPERIMENTAL_REDUCE_INDEXING_MEMORY_USAGE: &str =
"MEILI_EXPERIMENTAL_REDUCE_INDEXING_MEMORY_USAGE";
const MEILI_EXPERIMENTAL_MAX_NUMBER_OF_BATCHED_TASKS: &str =
"MEILI_EXPERIMENTAL_MAX_NUMBER_OF_BATCHED_TASKS";
const DEFAULT_CONFIG_FILE_PATH: &str = "./config.toml";
const DEFAULT_DB_PATH: &str = "./data.ms";
@ -73,6 +80,39 @@ const DEFAULT_LOG_EVERY_N: usize = 100_000;
pub const INDEX_SIZE: u64 = 2 * 1024 * 1024 * 1024 * 1024; // 2 TiB
pub const TASK_DB_SIZE: u64 = 20 * 1024 * 1024 * 1024; // 20 GiB
#[derive(Debug, Default, Clone, Copy, Serialize, Deserialize)]
#[serde(rename_all = "UPPERCASE")]
pub enum LogMode {
#[default]
Human,
Json,
}
impl Display for LogMode {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
LogMode::Human => Display::fmt("HUMAN", f),
LogMode::Json => Display::fmt("JSON", f),
}
}
}
impl FromStr for LogMode {
type Err = LogModeError;
fn from_str(s: &str) -> Result<Self, Self::Err> {
match s.trim().to_lowercase().as_str() {
"human" => Ok(LogMode::Human),
"json" => Ok(LogMode::Json),
_ => Err(LogModeError(s.to_owned())),
}
}
}
#[derive(Debug, thiserror::Error)]
#[error("Unsupported log mode level `{0}`. Supported values are `HUMAN` and `JSON`.")]
pub struct LogModeError(String);
#[derive(Debug, Default, Clone, Copy, Serialize, Deserialize)]
#[serde(rename_all = "UPPERCASE")]
pub enum LogLevel {
@ -154,6 +194,14 @@ pub struct Opt {
#[serde(default = "default_env")]
pub env: String,
/// Called whenever a task finishes so a third party can be notified.
#[clap(long, env = MEILI_TASK_WEBHOOK_URL)]
pub task_webhook_url: Option<Url>,
/// The Authorization header to send on the webhook URL whenever a task finishes so a third party can be notified.
#[clap(long, env = MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER)]
pub task_webhook_authorization_header: Option<String>,
/// Deactivates Meilisearch's built-in telemetry when provided.
///
/// Meilisearch automatically collects data from all instances that do not opt out using this flag.
@ -296,11 +344,40 @@ pub struct Opt {
#[serde(default)]
pub experimental_enable_metrics: bool,
/// Experimental logs mode feature. For more information, see: <https://github.com/orgs/meilisearch/discussions/723>
///
/// Change the mode of the logs on the console.
#[clap(long, env = MEILI_EXPERIMENTAL_LOGS_MODE, default_value_t)]
#[serde(default)]
pub experimental_logs_mode: LogMode,
/// Experimental logs route feature. For more information, see: <https://github.com/orgs/meilisearch/discussions/721>
///
/// Enables the log routes on the `POST /logs/stream`, `POST /logs/stderr` endpoints, and the `DELETE /logs/stream` to stop receiving logs.
#[clap(long, env = MEILI_EXPERIMENTAL_ENABLE_LOGS_ROUTE)]
#[serde(default)]
pub experimental_enable_logs_route: bool,
/// Enable multiple features that helps you to run meilisearch in a replicated context.
/// For more information, see: <https://github.com/orgs/meilisearch/discussions/725>
///
/// - /!\ Disable the automatic clean up of old processed tasks, you're in charge of that now
/// - Lets you specify a custom task ID upon registering a task
/// - Lets you execute dry-register a task (get an answer from the route but nothing is actually registered in meilisearch and it won't be processed)
#[clap(long, env = MEILI_EXPERIMENTAL_REPLICATION_PARAMETERS)]
#[serde(default)]
pub experimental_replication_parameters: bool,
/// Experimental RAM reduction during indexing, do not use in production, see: <https://github.com/meilisearch/product/discussions/652>
#[clap(long, env = MEILI_EXPERIMENTAL_REDUCE_INDEXING_MEMORY_USAGE)]
#[serde(default)]
pub experimental_reduce_indexing_memory_usage: bool,
/// Experimentally reduces the maximum number of tasks that will be processed at once, see: <https://github.com/orgs/meilisearch/discussions/713>
#[clap(long, env = MEILI_EXPERIMENTAL_MAX_NUMBER_OF_BATCHED_TASKS, default_value_t = default_limit_batched_tasks())]
#[serde(default = "default_limit_batched_tasks")]
pub experimental_max_number_of_batched_tasks: usize,
#[serde(flatten)]
#[clap(flatten)]
pub indexer_options: IndexerOpts,
@ -368,9 +445,12 @@ impl Opt {
http_addr,
master_key,
env,
task_webhook_url,
task_webhook_authorization_header,
max_index_size: _,
max_task_db_size: _,
http_payload_size_limit,
experimental_max_number_of_batched_tasks,
ssl_cert_path,
ssl_key_path,
ssl_auth_path,
@ -392,8 +472,11 @@ impl Opt {
config_file_path: _,
#[cfg(feature = "analytics")]
no_analytics,
experimental_enable_metrics: enable_metrics_route,
experimental_reduce_indexing_memory_usage: reduce_indexing_memory_usage,
experimental_enable_metrics,
experimental_logs_mode,
experimental_enable_logs_route,
experimental_replication_parameters,
experimental_reduce_indexing_memory_usage,
} = self;
export_to_env_if_not_present(MEILI_DB_PATH, db_path);
export_to_env_if_not_present(MEILI_HTTP_ADDR, http_addr);
@ -401,6 +484,16 @@ impl Opt {
export_to_env_if_not_present(MEILI_MASTER_KEY, master_key);
}
export_to_env_if_not_present(MEILI_ENV, env);
if let Some(task_webhook_url) = task_webhook_url {
export_to_env_if_not_present(MEILI_TASK_WEBHOOK_URL, task_webhook_url.to_string());
}
if let Some(task_webhook_authorization_header) = task_webhook_authorization_header {
export_to_env_if_not_present(
MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER,
task_webhook_authorization_header,
);
}
#[cfg(feature = "analytics")]
{
export_to_env_if_not_present(MEILI_NO_ANALYTICS, no_analytics.to_string());
@ -409,6 +502,10 @@ impl Opt {
MEILI_HTTP_PAYLOAD_SIZE_LIMIT,
http_payload_size_limit.to_string(),
);
export_to_env_if_not_present(
MEILI_EXPERIMENTAL_MAX_NUMBER_OF_BATCHED_TASKS,
experimental_max_number_of_batched_tasks.to_string(),
);
if let Some(ssl_cert_path) = ssl_cert_path {
export_to_env_if_not_present(MEILI_SSL_CERT_PATH, ssl_cert_path);
}
@ -433,11 +530,23 @@ impl Opt {
export_to_env_if_not_present(MEILI_LOG_LEVEL, log_level.to_string());
export_to_env_if_not_present(
MEILI_EXPERIMENTAL_ENABLE_METRICS,
enable_metrics_route.to_string(),
experimental_enable_metrics.to_string(),
);
export_to_env_if_not_present(
MEILI_EXPERIMENTAL_LOGS_MODE,
experimental_logs_mode.to_string(),
);
export_to_env_if_not_present(
MEILI_EXPERIMENTAL_REPLICATION_PARAMETERS,
experimental_replication_parameters.to_string(),
);
export_to_env_if_not_present(
MEILI_EXPERIMENTAL_ENABLE_LOGS_ROUTE,
experimental_enable_logs_route.to_string(),
);
export_to_env_if_not_present(
MEILI_EXPERIMENTAL_REDUCE_INDEXING_MEMORY_USAGE,
reduce_indexing_memory_usage.to_string(),
experimental_reduce_indexing_memory_usage.to_string(),
);
indexer_options.export_to_env();
}
@ -489,7 +598,10 @@ impl Opt {
}
pub(crate) fn to_instance_features(&self) -> InstanceTogglableFeatures {
InstanceTogglableFeatures { metrics: self.experimental_enable_metrics }
InstanceTogglableFeatures {
metrics: self.experimental_enable_metrics,
logs_route: self.experimental_enable_logs_route,
}
}
}
@ -598,8 +710,8 @@ impl MaxMemory {
/// Returns the total amount of bytes available or `None` if this system isn't supported.
fn total_memory_bytes() -> Option<u64> {
if System::IS_SUPPORTED {
let memory_kind = RefreshKind::new().with_memory();
if sysinfo::IS_SUPPORTED_SYSTEM {
let memory_kind = RefreshKind::new().with_memory(MemoryRefreshKind::new().with_ram());
let mut system = System::new_with_specifics(memory_kind);
system.refresh_memory();
Some(system.total_memory())
@ -727,6 +839,10 @@ fn default_http_payload_size_limit() -> Byte {
Byte::from_str(DEFAULT_HTTP_PAYLOAD_SIZE_LIMIT).unwrap()
}
fn default_limit_batched_tasks() -> usize {
usize::MAX
}
fn default_snapshot_dir() -> PathBuf {
PathBuf::from(DEFAULT_SNAPSHOT_DIR)
}

View File

@ -10,7 +10,7 @@ use meilisearch_types::deserr::query_params::Param;
use meilisearch_types::deserr::{DeserrJsonError, DeserrQueryParamError};
use meilisearch_types::error::deserr_codes::*;
use meilisearch_types::error::{Code, ResponseError};
use meilisearch_types::keys::{Action, CreateApiKey, Key, PatchApiKey};
use meilisearch_types::keys::{CreateApiKey, Key, PatchApiKey};
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;
use uuid::Uuid;

View File

@ -1,17 +1,18 @@
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_auth::AuthController;
use meilisearch_types::error::ResponseError;
use meilisearch_types::tasks::KindWithContent;
use serde_json::json;
use tracing::debug;
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::routes::SummarizedTaskView;
use crate::routes::{get_task_id, is_dry_run, SummarizedTaskView};
use crate::Opt;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("").route(web::post().to(SeqHandler(create_dump))));
@ -21,6 +22,7 @@ pub async fn create_dump(
index_scheduler: GuardedData<ActionPolicy<{ actions::DUMPS_CREATE }>, Data<IndexScheduler>>,
auth_controller: GuardedData<ActionPolicy<{ actions::DUMPS_CREATE }>, Data<AuthController>>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
analytics.publish("Dump Created".to_string(), json!({}), Some(&req));
@ -29,9 +31,13 @@ pub async fn create_dump(
keys: auth_controller.list_keys()?,
instance_uid: analytics.instance_uid().cloned(),
};
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Create dump");
Ok(HttpResponse::Accepted().json(task))
}

View File

@ -3,11 +3,11 @@ use actix_web::{HttpRequest, HttpResponse};
use deserr::actix_web::AwebJson;
use deserr::Deserr;
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::deserr::DeserrJsonError;
use meilisearch_types::error::ResponseError;
use meilisearch_types::keys::actions;
use serde_json::json;
use tracing::debug;
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::ActionPolicy;
@ -29,21 +29,26 @@ async fn get_features(
>,
req: HttpRequest,
analytics: Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let features = index_scheduler.features()?;
) -> HttpResponse {
let features = index_scheduler.features();
analytics.publish("Experimental features Seen".to_string(), json!(null), Some(&req));
debug!("returns: {:?}", features.runtime_features());
Ok(HttpResponse::Ok().json(features.runtime_features()))
let features = features.runtime_features();
debug!(returns = ?features, "Get features");
HttpResponse::Ok().json(features)
}
#[derive(Debug, Deserr)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields)]
pub struct RuntimeTogglableFeatures {
#[deserr(default)]
pub score_details: Option<bool>,
#[deserr(default)]
pub vector_store: Option<bool>,
#[deserr(default)]
pub metrics: Option<bool>,
#[deserr(default)]
pub logs_route: Option<bool>,
#[deserr(default)]
pub export_puffin_reports: Option<bool>,
}
async fn patch_features(
@ -55,29 +60,41 @@ async fn patch_features(
req: HttpRequest,
analytics: Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let features = index_scheduler.features()?;
let features = index_scheduler.features();
debug!(parameters = ?new_features, "Patch features");
let old_features = features.runtime_features();
let new_features = meilisearch_types::features::RuntimeTogglableFeatures {
score_details: new_features.0.score_details.unwrap_or(old_features.score_details),
vector_store: new_features.0.vector_store.unwrap_or(old_features.vector_store),
metrics: new_features.0.metrics.unwrap_or(old_features.metrics),
logs_route: new_features.0.logs_route.unwrap_or(old_features.logs_route),
export_puffin_reports: new_features
.0
.export_puffin_reports
.unwrap_or(old_features.export_puffin_reports),
};
// explicitly destructure for analytics rather than using the `Serialize` implementation, because
// the it renames to camelCase, which we don't want for analytics.
// **Do not** ignore fields with `..` or `_` here, because we want to add them in the future.
let meilisearch_types::features::RuntimeTogglableFeatures { score_details, vector_store } =
new_features;
let meilisearch_types::features::RuntimeTogglableFeatures {
vector_store,
metrics,
logs_route,
export_puffin_reports,
} = new_features;
analytics.publish(
"Experimental features Updated".to_string(),
json!({
"score_details": score_details,
"vector_store": vector_store,
"metrics": metrics,
"logs_route": logs_route,
"export_puffin_reports": export_puffin_reports,
}),
Some(&req),
);
index_scheduler.put_runtime_features(new_features)?;
debug!(returns = ?new_features, "Patch features");
Ok(HttpResponse::Ok().json(new_features))
}

View File

@ -3,12 +3,11 @@ use std::io::ErrorKind;
use actix_web::http::header::CONTENT_TYPE;
use actix_web::web::Data;
use actix_web::{web, HttpMessage, HttpRequest, HttpResponse};
use bstr::ByteSlice;
use bstr::ByteSlice as _;
use deserr::actix_web::{AwebJson, AwebQueryParameter};
use deserr::Deserr;
use futures::StreamExt;
use index_scheduler::IndexScheduler;
use log::debug;
use index_scheduler::{IndexScheduler, TaskId};
use meilisearch_types::deserr::query_params::Param;
use meilisearch_types::deserr::{DeserrJsonError, DeserrQueryParamError};
use meilisearch_types::document_formats::{read_csv, read_json, read_ndjson, PayloadType};
@ -28,6 +27,7 @@ use serde_json::Value;
use tempfile::tempfile;
use tokio::fs::File;
use tokio::io::{AsyncSeekExt, AsyncWriteExt, BufWriter};
use tracing::debug;
use crate::analytics::{Analytics, DocumentDeletionKind, DocumentFetchKind};
use crate::error::MeilisearchHttpError;
@ -36,8 +36,11 @@ use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::payload::Payload;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::routes::{PaginationView, SummarizedTaskView, PAGINATION_DEFAULT_LIMIT};
use crate::routes::{
get_task_id, is_dry_run, PaginationView, SummarizedTaskView, PAGINATION_DEFAULT_LIMIT,
};
use crate::search::parse_filter;
use crate::Opt;
static ACCEPTED_CONTENT_TYPE: Lazy<Vec<String>> = Lazy::new(|| {
vec!["application/json".to_string(), "application/x-ndjson".to_string(), "text/csv".to_string()]
@ -101,6 +104,7 @@ pub async fn get_document(
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let DocumentParam { index_uid, document_id } = document_param.into_inner();
debug!(parameters = ?params, "Get document");
let index_uid = IndexUid::try_from(index_uid)?;
analytics.get_fetch_documents(&DocumentFetchKind::PerDocumentId, &req);
@ -110,7 +114,7 @@ pub async fn get_document(
let index = index_scheduler.index(&index_uid)?;
let document = retrieve_document(&index, &document_id, attributes_to_retrieve)?;
debug!("returns: {:?}", document);
debug!(returns = ?document, "Get document");
Ok(HttpResponse::Ok().json(document))
}
@ -118,6 +122,7 @@ pub async fn delete_document(
index_scheduler: GuardedData<ActionPolicy<{ actions::DOCUMENTS_DELETE }>, Data<IndexScheduler>>,
path: web::Path<DocumentParam>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let DocumentParam { index_uid, document_id } = path.into_inner();
@ -129,8 +134,12 @@ pub async fn delete_document(
index_uid: index_uid.to_string(),
documents_ids: vec![document_id],
};
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
@ -168,9 +177,8 @@ pub async fn documents_by_query_post(
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with body: {:?}", body);
let body = body.into_inner();
debug!(parameters = ?body, "Get documents POST");
analytics.post_fetch_documents(
&DocumentFetchKind::Normal {
@ -191,7 +199,7 @@ pub async fn get_documents(
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
debug!(parameters = ?params, "Get documents GET");
let BrowseQueryGet { limit, offset, fields, filter } = params.into_inner();
@ -235,7 +243,7 @@ fn documents_by_query(
let ret = PaginationView::new(offset, limit, total as usize, documents);
debug!("returns: {:?}", ret);
debug!(returns = ?ret, "Get documents");
Ok(HttpResponse::Ok().json(ret))
}
@ -267,16 +275,19 @@ pub async fn replace_documents(
params: AwebQueryParameter<UpdateDocumentsQuery, DeserrQueryParamError>,
body: Payload,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
debug!("called with params: {:?}", params);
debug!(parameters = ?params, "Replace documents");
let params = params.into_inner();
analytics.add_documents(&params, index_scheduler.index(&index_uid).is_err(), &req);
let allow_index_creation = index_scheduler.filters().allow_index_creation(&index_uid);
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task = document_addition(
extract_mime_type(&req)?,
index_scheduler,
@ -285,9 +296,12 @@ pub async fn replace_documents(
params.csv_delimiter,
body,
IndexDocumentsMethod::ReplaceDocuments,
uid,
dry_run,
allow_index_creation,
)
.await?;
debug!(returns = ?task, "Replace documents");
Ok(HttpResponse::Accepted().json(task))
}
@ -298,16 +312,19 @@ pub async fn update_documents(
params: AwebQueryParameter<UpdateDocumentsQuery, DeserrQueryParamError>,
body: Payload,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
debug!("called with params: {:?}", params);
let params = params.into_inner();
debug!(parameters = ?params, "Update documents");
analytics.update_documents(&params, index_scheduler.index(&index_uid).is_err(), &req);
let allow_index_creation = index_scheduler.filters().allow_index_creation(&index_uid);
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task = document_addition(
extract_mime_type(&req)?,
index_scheduler,
@ -316,9 +333,12 @@ pub async fn update_documents(
params.csv_delimiter,
body,
IndexDocumentsMethod::UpdateDocuments,
uid,
dry_run,
allow_index_creation,
)
.await?;
debug!(returns = ?task, "Update documents");
Ok(HttpResponse::Accepted().json(task))
}
@ -332,6 +352,8 @@ async fn document_addition(
csv_delimiter: Option<u8>,
mut body: Payload,
method: IndexDocumentsMethod,
task_id: Option<TaskId>,
dry_run: bool,
allow_index_creation: bool,
) -> Result<SummarizedTaskView, MeilisearchHttpError> {
let format = match (
@ -364,7 +386,7 @@ async fn document_addition(
}
};
let (uuid, mut update_file) = index_scheduler.create_update_file()?;
let (uuid, mut update_file) = index_scheduler.create_update_file(dry_run)?;
let temp_file = match tempfile() {
Ok(file) => file,
@ -403,11 +425,9 @@ async fn document_addition(
let read_file = buffer.into_inner().into_std().await;
let documents_count = tokio::task::spawn_blocking(move || {
let documents_count = match format {
PayloadType::Json => read_json(&read_file, update_file.as_file_mut())?,
PayloadType::Csv { delimiter } => {
read_csv(&read_file, update_file.as_file_mut(), delimiter)?
}
PayloadType::Ndjson => read_ndjson(&read_file, update_file.as_file_mut())?,
PayloadType::Json => read_json(&read_file, &mut update_file)?,
PayloadType::Csv { delimiter } => read_csv(&read_file, &mut update_file, delimiter)?,
PayloadType::Ndjson => read_ndjson(&read_file, &mut update_file)?,
};
// we NEED to persist the file here because we moved the `udpate_file` in another task.
update_file.persist()?;
@ -427,7 +447,10 @@ async fn document_addition(
Err(index_scheduler::Error::FileStore(file_store::Error::IoError(e)))
if e.kind() == ErrorKind::NotFound => {}
Err(e) => {
log::warn!("Unknown error happened while deleting a malformed update file with uuid {uuid}: {e}");
tracing::warn!(
index_uuid = %uuid,
"Unknown error happened while deleting a malformed update file: {e}"
);
}
}
// We still want to return the original error to the end user.
@ -445,7 +468,9 @@ async fn document_addition(
};
let scheduler = index_scheduler.clone();
let task = match tokio::task::spawn_blocking(move || scheduler.register(task)).await? {
let task = match tokio::task::spawn_blocking(move || scheduler.register(task, task_id, dry_run))
.await?
{
Ok(task) => task,
Err(e) => {
index_scheduler.delete_update_file(uuid)?;
@ -453,7 +478,6 @@ async fn document_addition(
}
};
debug!("returns: {:?}", task);
Ok(task.into())
}
@ -462,9 +486,10 @@ pub async fn delete_documents_batch(
index_uid: web::Path<String>,
body: web::Json<Vec<Value>>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", body);
debug!(parameters = ?body, "Delete documents by batch");
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
analytics.delete_documents(DocumentDeletionKind::PerBatch, &req);
@ -476,10 +501,14 @@ pub async fn delete_documents_batch(
let task =
KindWithContent::DocumentDeletion { index_uid: index_uid.to_string(), documents_ids: ids };
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Delete documents by batch");
Ok(HttpResponse::Accepted().json(task))
}
@ -495,9 +524,10 @@ pub async fn delete_documents_by_filter(
index_uid: web::Path<String>,
body: AwebJson<DocumentDeletionByFilter, DeserrJsonError>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", body);
debug!(parameters = ?body, "Delete documents by filter");
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
let index_uid = index_uid.into_inner();
let filter = body.into_inner().filter;
@ -512,10 +542,14 @@ pub async fn delete_documents_by_filter(
.map_err(|err| ResponseError::from_msg(err.message, Code::InvalidDocumentFilter))?;
let task = KindWithContent::DocumentDeletionByFilter { index_uid, filter_expr: filter };
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Delete documents by filter");
Ok(HttpResponse::Accepted().json(task))
}
@ -523,16 +557,21 @@ pub async fn clear_all_documents(
index_scheduler: GuardedData<ActionPolicy<{ actions::DOCUMENTS_DELETE }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
analytics.delete_documents(DocumentDeletionKind::ClearAll, &req);
let task = KindWithContent::DocumentClear { index_uid: index_uid.to_string() };
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Delete all documents");
Ok(HttpResponse::Accepted().json(task))
}
@ -612,8 +651,8 @@ fn retrieve_document<S: AsRef<str>>(
let all_fields: Vec<_> = fields_ids_map.iter().map(|(id, _)| id).collect();
let internal_id = index
.external_documents_ids(&txn)?
.get(doc_id.as_bytes())
.external_documents_ids()
.get(&txn, doc_id)?
.ok_or_else(|| MeilisearchHttpError::DocumentNotFound(doc_id.to_string()))?;
let document = index

View File

@ -2,20 +2,20 @@ use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use deserr::actix_web::AwebJson;
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::deserr::DeserrJsonError;
use meilisearch_types::error::deserr_codes::*;
use meilisearch_types::error::ResponseError;
use meilisearch_types::index_uid::IndexUid;
use serde_json::Value;
use tracing::debug;
use crate::analytics::{Analytics, FacetSearchAggregator};
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::search::{
add_search_rules, perform_facet_search, MatchingStrategy, SearchQuery, DEFAULT_CROP_LENGTH,
DEFAULT_CROP_MARKER, DEFAULT_HIGHLIGHT_POST_TAG, DEFAULT_HIGHLIGHT_PRE_TAG,
DEFAULT_SEARCH_LIMIT, DEFAULT_SEARCH_OFFSET,
add_search_rules, perform_facet_search, HybridQuery, MatchingStrategy, SearchQuery,
DEFAULT_CROP_LENGTH, DEFAULT_CROP_MARKER, DEFAULT_HIGHLIGHT_POST_TAG,
DEFAULT_HIGHLIGHT_PRE_TAG, DEFAULT_SEARCH_LIMIT, DEFAULT_SEARCH_OFFSET,
};
pub fn configure(cfg: &mut web::ServiceConfig) {
@ -36,6 +36,8 @@ pub struct FacetSearchQuery {
pub q: Option<String>,
#[deserr(default, error = DeserrJsonError<InvalidSearchVector>)]
pub vector: Option<Vec<f32>>,
#[deserr(default, error = DeserrJsonError<InvalidHybridQuery>)]
pub hybrid: Option<HybridQuery>,
#[deserr(default, error = DeserrJsonError<InvalidSearchFilter>)]
pub filter: Option<Value>,
#[deserr(default, error = DeserrJsonError<InvalidSearchMatchingStrategy>, default)]
@ -54,7 +56,7 @@ pub async fn search(
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
let query = params.into_inner();
debug!("facet search called with params: {:?}", query);
debug!(parameters = ?query, "Facet search");
let mut aggregate = FacetSearchAggregator::from_query(&query, &req);
@ -68,7 +70,7 @@ pub async fn search(
}
let index = index_scheduler.index(&index_uid)?;
let features = index_scheduler.features()?;
let features = index_scheduler.features();
let search_result = tokio::task::spawn_blocking(move || {
perform_facet_search(&index, search_query, facet_query, facet_name, features)
})
@ -81,7 +83,7 @@ pub async fn search(
let search_result = search_result?;
debug!("returns: {:?}", search_result);
debug!(returns = ?search_result, "Facet search");
Ok(HttpResponse::Ok().json(search_result))
}
@ -95,6 +97,7 @@ impl From<FacetSearchQuery> for SearchQuery {
filter,
matching_strategy,
attributes_to_search_on,
hybrid,
} = value;
SearchQuery {
@ -119,6 +122,7 @@ impl From<FacetSearchQuery> for SearchQuery {
matching_strategy,
vector,
attributes_to_search_on,
hybrid,
}
}
}

View File

@ -5,7 +5,6 @@ use actix_web::{web, HttpRequest, HttpResponse};
use deserr::actix_web::{AwebJson, AwebQueryParameter};
use deserr::{DeserializeError, Deserr, ValuePointerRef};
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::deserr::query_params::Param;
use meilisearch_types::deserr::{immutable_field_error, DeserrJsonError, DeserrQueryParamError};
use meilisearch_types::error::deserr_codes::*;
@ -16,12 +15,15 @@ use meilisearch_types::tasks::KindWithContent;
use serde::Serialize;
use serde_json::json;
use time::OffsetDateTime;
use tracing::debug;
use super::{Pagination, SummarizedTaskView, PAGINATION_DEFAULT_LIMIT};
use super::{get_task_id, Pagination, SummarizedTaskView, PAGINATION_DEFAULT_LIMIT};
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::{AuthenticationError, GuardedData};
use crate::extractors::sequential_extractor::SeqHandler;
use crate::routes::is_dry_run;
use crate::Opt;
pub mod documents;
pub mod facet_search;
@ -93,6 +95,7 @@ pub async fn list_indexes(
index_scheduler: GuardedData<ActionPolicy<{ actions::INDEXES_GET }>, Data<IndexScheduler>>,
paginate: AwebQueryParameter<ListIndexes, DeserrQueryParamError>,
) -> Result<HttpResponse, ResponseError> {
debug!(parameters = ?paginate, "List indexes");
let filters = index_scheduler.filters();
let indexes: Vec<Option<IndexView>> =
index_scheduler.try_for_each_index(|uid, index| -> Result<Option<IndexView>, _> {
@ -105,7 +108,7 @@ pub async fn list_indexes(
let indexes: Vec<IndexView> = indexes.into_iter().flatten().collect();
let ret = paginate.as_pagination().auto_paginate_sized(indexes.into_iter());
debug!("returns: {:?}", ret);
debug!(returns = ?ret, "List indexes");
Ok(HttpResponse::Ok().json(ret))
}
@ -122,8 +125,10 @@ pub async fn create_index(
index_scheduler: GuardedData<ActionPolicy<{ actions::INDEXES_CREATE }>, Data<IndexScheduler>>,
body: AwebJson<IndexCreateRequest, DeserrJsonError>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!(parameters = ?body, "Create index");
let IndexCreateRequest { primary_key, uid } = body.into_inner();
let allow_index_creation = index_scheduler.filters().allow_index_creation(&uid);
@ -135,8 +140,13 @@ pub async fn create_index(
);
let task = KindWithContent::IndexCreation { index_uid: uid.to_string(), primary_key };
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!(returns = ?task, "Create index");
Ok(HttpResponse::Accepted().json(task))
} else {
@ -177,7 +187,7 @@ pub async fn get_index(
let index = index_scheduler.index(&index_uid)?;
let index_view = IndexView::new(index_uid.into_inner(), &index)?;
debug!("returns: {:?}", index_view);
debug!(returns = ?index_view, "Get index");
Ok(HttpResponse::Ok().json(index_view))
}
@ -187,9 +197,10 @@ pub async fn update_index(
index_uid: web::Path<String>,
body: AwebJson<UpdateIndexRequest, DeserrJsonError>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", body);
debug!(parameters = ?body, "Update index");
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
let body = body.into_inner();
analytics.publish(
@ -203,21 +214,32 @@ pub async fn update_index(
primary_key: body.primary_key,
};
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Update index");
Ok(HttpResponse::Accepted().json(task))
}
pub async fn delete_index(
index_scheduler: GuardedData<ActionPolicy<{ actions::INDEXES_DELETE }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
req: HttpRequest,
opt: web::Data<Opt>,
) -> Result<HttpResponse, ResponseError> {
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
let task = KindWithContent::IndexDeletion { index_uid: index_uid.into_inner() };
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!(returns = ?task, "Delete index");
Ok(HttpResponse::Accepted().json(task))
}
@ -255,6 +277,6 @@ pub async fn get_index_stats(
let stats = IndexStats::from(index_scheduler.index_stats(&index_uid)?);
debug!("returns: {:?}", stats);
debug!(returns = ?stats, "Get index stats");
Ok(HttpResponse::Ok().json(stats))
}

View File

@ -2,23 +2,25 @@ use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use deserr::actix_web::{AwebJson, AwebQueryParameter};
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::deserr::query_params::Param;
use meilisearch_types::deserr::{DeserrJsonError, DeserrQueryParamError};
use meilisearch_types::error::deserr_codes::*;
use meilisearch_types::error::ResponseError;
use meilisearch_types::index_uid::IndexUid;
use meilisearch_types::milli;
use meilisearch_types::milli::vector::DistributionShift;
use meilisearch_types::serde_cs::vec::CS;
use serde_json::Value;
use tracing::{debug, warn};
use crate::analytics::{Analytics, SearchAggregator};
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::search::{
add_search_rules, perform_search, MatchingStrategy, SearchQuery, DEFAULT_CROP_LENGTH,
DEFAULT_CROP_MARKER, DEFAULT_HIGHLIGHT_POST_TAG, DEFAULT_HIGHLIGHT_PRE_TAG,
DEFAULT_SEARCH_LIMIT, DEFAULT_SEARCH_OFFSET,
add_search_rules, perform_search, HybridQuery, MatchingStrategy, SearchQuery, SemanticRatio,
DEFAULT_CROP_LENGTH, DEFAULT_CROP_MARKER, DEFAULT_HIGHLIGHT_POST_TAG,
DEFAULT_HIGHLIGHT_PRE_TAG, DEFAULT_SEARCH_LIMIT, DEFAULT_SEARCH_OFFSET, DEFAULT_SEMANTIC_RATIO,
};
pub fn configure(cfg: &mut web::ServiceConfig) {
@ -74,6 +76,31 @@ pub struct SearchQueryGet {
matching_strategy: MatchingStrategy,
#[deserr(default, error = DeserrQueryParamError<InvalidSearchAttributesToSearchOn>)]
pub attributes_to_search_on: Option<CS<String>>,
#[deserr(default, error = DeserrQueryParamError<InvalidEmbedder>)]
pub hybrid_embedder: Option<String>,
#[deserr(default, error = DeserrQueryParamError<InvalidSearchSemanticRatio>)]
pub hybrid_semantic_ratio: Option<SemanticRatioGet>,
}
#[derive(Debug, Clone, Copy, Default, PartialEq, deserr::Deserr)]
#[deserr(try_from(String) = TryFrom::try_from -> InvalidSearchSemanticRatio)]
pub struct SemanticRatioGet(SemanticRatio);
impl std::convert::TryFrom<String> for SemanticRatioGet {
type Error = InvalidSearchSemanticRatio;
fn try_from(s: String) -> Result<Self, Self::Error> {
let f: f32 = s.parse().map_err(|_| InvalidSearchSemanticRatio)?;
Ok(SemanticRatioGet(SemanticRatio::try_from(f)?))
}
}
impl std::ops::Deref for SemanticRatioGet {
type Target = SemanticRatio;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl From<SearchQueryGet> for SearchQuery {
@ -86,6 +113,20 @@ impl From<SearchQueryGet> for SearchQuery {
None => None,
};
let hybrid = match (other.hybrid_embedder, other.hybrid_semantic_ratio) {
(None, None) => None,
(None, Some(semantic_ratio)) => {
Some(HybridQuery { semantic_ratio: *semantic_ratio, embedder: None })
}
(Some(embedder), None) => Some(HybridQuery {
semantic_ratio: DEFAULT_SEMANTIC_RATIO(),
embedder: Some(embedder),
}),
(Some(embedder), Some(semantic_ratio)) => {
Some(HybridQuery { semantic_ratio: *semantic_ratio, embedder: Some(embedder) })
}
};
Self {
q: other.q,
vector: other.vector.map(CS::into_inner),
@ -108,6 +149,7 @@ impl From<SearchQueryGet> for SearchQuery {
crop_marker: other.crop_marker,
matching_strategy: other.matching_strategy,
attributes_to_search_on: other.attributes_to_search_on.map(|o| o.into_iter().collect()),
hybrid,
}
}
}
@ -144,7 +186,7 @@ pub async fn search_with_url_query(
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
debug!(parameters = ?params, "Search get");
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
let mut query: SearchQuery = params.into_inner().into();
@ -157,9 +199,13 @@ pub async fn search_with_url_query(
let mut aggregate = SearchAggregator::from_query(&query, &req);
let index = index_scheduler.index(&index_uid)?;
let features = index_scheduler.features()?;
let features = index_scheduler.features();
let distribution = embed(&mut query, index_scheduler.get_ref(), &index).await?;
let search_result =
tokio::task::spawn_blocking(move || perform_search(&index, query, features)).await?;
tokio::task::spawn_blocking(move || perform_search(&index, query, features, distribution))
.await?;
if let Ok(ref search_result) = search_result {
aggregate.succeed(search_result);
}
@ -167,7 +213,7 @@ pub async fn search_with_url_query(
let search_result = search_result?;
debug!("returns: {:?}", search_result);
debug!(returns = ?search_result, "Search get");
Ok(HttpResponse::Ok().json(search_result))
}
@ -181,7 +227,7 @@ pub async fn search_with_post(
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
let mut query = params.into_inner();
debug!("search called with params: {:?}", query);
debug!(parameters = ?query, "Search post");
// Tenant token search_rules.
if let Some(search_rules) = index_scheduler.filters().get_index_search_rules(&index_uid) {
@ -192,9 +238,13 @@ pub async fn search_with_post(
let index = index_scheduler.index(&index_uid)?;
let features = index_scheduler.features()?;
let features = index_scheduler.features();
let distribution = embed(&mut query, index_scheduler.get_ref(), &index).await?;
let search_result =
tokio::task::spawn_blocking(move || perform_search(&index, query, features)).await?;
tokio::task::spawn_blocking(move || perform_search(&index, query, features, distribution))
.await?;
if let Ok(ref search_result) = search_result {
aggregate.succeed(search_result);
}
@ -202,10 +252,84 @@ pub async fn search_with_post(
let search_result = search_result?;
debug!("returns: {:?}", search_result);
debug!(returns = ?search_result, "Search post");
Ok(HttpResponse::Ok().json(search_result))
}
pub async fn embed(
query: &mut SearchQuery,
index_scheduler: &IndexScheduler,
index: &milli::Index,
) -> Result<Option<DistributionShift>, ResponseError> {
match (&query.hybrid, &query.vector, &query.q) {
(Some(HybridQuery { semantic_ratio: _, embedder }), None, Some(q))
if !q.trim().is_empty() =>
{
let embedder_configs = index.embedding_configs(&index.read_txn()?)?;
let embedders = index_scheduler.embedders(embedder_configs)?;
let embedder = if let Some(embedder_name) = embedder {
embedders.get(embedder_name)
} else {
embedders.get_default()
};
let embedder = embedder
.ok_or(milli::UserError::InvalidEmbedder("default".to_owned()))
.map_err(milli::Error::from)?
.0;
let distribution = embedder.distribution();
let embeddings = embedder
.embed(vec![q.to_owned()])
.await
.map_err(milli::vector::Error::from)
.map_err(milli::Error::from)?
.pop()
.expect("No vector returned from embedding");
if embeddings.iter().nth(1).is_some() {
warn!("Ignoring embeddings past the first one in long search query");
query.vector = Some(embeddings.iter().next().unwrap().to_vec());
} else {
query.vector = Some(embeddings.into_inner());
}
Ok(distribution)
}
(Some(hybrid), vector, _) => {
let embedder_configs = index.embedding_configs(&index.read_txn()?)?;
let embedders = index_scheduler.embedders(embedder_configs)?;
let embedder = if let Some(embedder_name) = &hybrid.embedder {
embedders.get(embedder_name)
} else {
embedders.get_default()
};
let embedder = embedder
.ok_or(milli::UserError::InvalidEmbedder("default".to_owned()))
.map_err(milli::Error::from)?
.0;
if let Some(vector) = vector {
if vector.len() != embedder.dimensions() {
return Err(meilisearch_types::milli::Error::UserError(
meilisearch_types::milli::UserError::InvalidVectorDimensions {
expected: embedder.dimensions(),
found: vector.len(),
},
)
.into());
}
}
Ok(embedder.distribution())
}
_ => Ok(None),
}
}
#[cfg(test)]
mod test {
use super::*;

View File

@ -2,19 +2,21 @@ use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use deserr::actix_web::AwebJson;
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::deserr::DeserrJsonError;
use meilisearch_types::error::ResponseError;
use meilisearch_types::facet_values_sort::FacetValuesSort;
use meilisearch_types::index_uid::IndexUid;
use meilisearch_types::milli::update::Setting;
use meilisearch_types::settings::{settings, RankingRuleView, Settings, Unchecked};
use meilisearch_types::tasks::KindWithContent;
use serde_json::json;
use tracing::debug;
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::routes::SummarizedTaskView;
use crate::routes::{get_task_id, is_dry_run, SummarizedTaskView};
use crate::Opt;
#[macro_export]
macro_rules! make_setting_route {
@ -23,17 +25,18 @@ macro_rules! make_setting_route {
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse, Resource};
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::error::ResponseError;
use meilisearch_types::index_uid::IndexUid;
use meilisearch_types::milli::update::Setting;
use meilisearch_types::settings::{settings, Settings};
use meilisearch_types::tasks::KindWithContent;
use tracing::debug;
use $crate::analytics::Analytics;
use $crate::extractors::authentication::policies::*;
use $crate::extractors::authentication::GuardedData;
use $crate::extractors::sequential_extractor::SeqHandler;
use $crate::routes::SummarizedTaskView;
use $crate::Opt;
use $crate::routes::{is_dry_run, get_task_id, SummarizedTaskView};
pub async fn delete(
index_scheduler: GuardedData<
@ -41,6 +44,8 @@ macro_rules! make_setting_route {
Data<IndexScheduler>,
>,
index_uid: web::Path<String>,
req: HttpRequest,
opt: web::Data<Opt>,
) -> Result<HttpResponse, ResponseError> {
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
@ -55,12 +60,14 @@ macro_rules! make_setting_route {
is_deletion: true,
allow_index_creation,
};
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task))
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Delete settings");
Ok(HttpResponse::Accepted().json(task))
}
@ -72,12 +79,15 @@ macro_rules! make_setting_route {
index_uid: actix_web::web::Path<String>,
body: deserr::actix_web::AwebJson<Option<$type>, $err_ty>,
req: HttpRequest,
opt: web::Data<Opt>,
$analytics_var: web::Data<dyn Analytics>,
) -> std::result::Result<HttpResponse, ResponseError> {
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
let body = body.into_inner();
debug!(parameters = ?body, "Update settings");
#[allow(clippy::redundant_closure_call)]
$analytics(&body, &req);
let new_settings = Settings {
@ -88,6 +98,11 @@ macro_rules! make_setting_route {
..Default::default()
};
let new_settings = $crate::routes::indexes::settings::validate_settings(
new_settings,
&index_scheduler,
)?;
let allow_index_creation =
index_scheduler.filters().allow_index_creation(&index_uid);
@ -97,12 +112,14 @@ macro_rules! make_setting_route {
is_deletion: false,
allow_index_creation,
};
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task))
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Update settings");
Ok(HttpResponse::Accepted().json(task))
}
@ -119,7 +136,7 @@ macro_rules! make_setting_route {
let rtxn = index.read_txn()?;
let settings = settings(&index, &rtxn)?;
debug!("returns: {:?}", settings);
debug!(returns = ?settings, "Update settings");
let mut json = serde_json::json!(&settings);
let val = json[$camelcase_attr].take();
@ -434,6 +451,31 @@ make_setting_route!(
}
);
make_setting_route!(
"/proximity-precision",
put,
meilisearch_types::settings::ProximityPrecisionView,
meilisearch_types::deserr::DeserrJsonError<
meilisearch_types::error::deserr_codes::InvalidSettingsProximityPrecision,
>,
proximity_precision,
"proximityPrecision",
analytics,
|precision: &Option<meilisearch_types::settings::ProximityPrecisionView>, req: &HttpRequest| {
use serde_json::json;
analytics.publish(
"ProximityPrecision Updated".to_string(),
json!({
"proximity_precision": {
"set": precision.is_some(),
"value": precision.unwrap_or_default(),
}
}),
Some(req),
);
}
);
make_setting_route!(
"/ranking-rules",
put,
@ -520,6 +562,67 @@ make_setting_route!(
}
);
make_setting_route!(
"/embedders",
patch,
std::collections::BTreeMap<String, Setting<meilisearch_types::milli::vector::settings::EmbeddingSettings>>,
meilisearch_types::deserr::DeserrJsonError<
meilisearch_types::error::deserr_codes::InvalidSettingsEmbedders,
>,
embedders,
"embedders",
analytics,
|setting: &Option<std::collections::BTreeMap<String, Setting<meilisearch_types::milli::vector::settings::EmbeddingSettings>>>, req: &HttpRequest| {
analytics.publish(
"Embedders Updated".to_string(),
serde_json::json!({"embedders": crate::routes::indexes::settings::embedder_analytics(setting.as_ref())}),
Some(req),
);
}
);
fn embedder_analytics(
setting: Option<
&std::collections::BTreeMap<
String,
Setting<meilisearch_types::milli::vector::settings::EmbeddingSettings>,
>,
>,
) -> serde_json::Value {
let mut sources = std::collections::HashSet::new();
if let Some(s) = &setting {
for source in s
.values()
.filter_map(|config| config.clone().set())
.filter_map(|config| config.source.set())
{
use meilisearch_types::milli::vector::settings::EmbedderSource;
match source {
EmbedderSource::OpenAi => sources.insert("openAi"),
EmbedderSource::HuggingFace => sources.insert("huggingFace"),
EmbedderSource::UserProvided => sources.insert("userProvided"),
};
}
};
let document_template_used = setting.as_ref().map(|map| {
map.values()
.filter_map(|config| config.clone().set())
.any(|config| config.document_template.set().is_some())
});
json!(
{
"total": setting.as_ref().map(|s| s.len()),
"sources": sources,
"document_template_used": document_template_used,
}
)
}
macro_rules! generate_configure {
($($mod:ident),*) => {
pub fn configure(cfg: &mut web::ServiceConfig) {
@ -540,6 +643,7 @@ generate_configure!(
displayed_attributes,
searchable_attributes,
distinct_attribute,
proximity_precision,
stop_words,
separator_tokens,
non_separator_tokens,
@ -548,7 +652,8 @@ generate_configure!(
ranking_rules,
typo_tolerance,
pagination,
faceting
faceting,
embedders
);
pub async fn update_all(
@ -556,11 +661,14 @@ pub async fn update_all(
index_uid: web::Path<String>,
body: AwebJson<Settings<Unchecked>, DeserrJsonError>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
let new_settings = body.into_inner();
debug!(parameters = ?new_settings, "Update all settings");
let new_settings = validate_settings(new_settings, &index_scheduler)?;
analytics.publish(
"Settings Updated".to_string(),
@ -593,6 +701,10 @@ pub async fn update_all(
"distinct_attribute": {
"set": new_settings.distinct_attribute.as_ref().set().is_some()
},
"proximity_precision": {
"set": new_settings.proximity_precision.as_ref().set().is_some(),
"value": new_settings.proximity_precision.as_ref().set().copied().unwrap_or_default()
},
"typo_tolerance": {
"enabled": new_settings.typo_tolerance
.as_ref()
@ -652,6 +764,7 @@ pub async fn update_all(
"synonyms": {
"total": new_settings.synonyms.as_ref().set().map(|synonyms| synonyms.len()),
},
"embedders": crate::routes::indexes::settings::embedder_analytics(new_settings.embedders.as_ref().set())
}),
Some(&req),
);
@ -664,10 +777,14 @@ pub async fn update_all(
is_deletion: false,
allow_index_creation,
};
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Update all settings");
Ok(HttpResponse::Accepted().json(task))
}
@ -680,13 +797,15 @@ pub async fn get_all(
let index = index_scheduler.index(&index_uid)?;
let rtxn = index.read_txn()?;
let new_settings = settings(&index, &rtxn)?;
debug!("returns: {:?}", new_settings);
debug!(returns = ?new_settings, "Get all settings");
Ok(HttpResponse::Ok().json(new_settings))
}
pub async fn delete_all(
index_scheduler: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
req: HttpRequest,
opt: web::Data<Opt>,
) -> Result<HttpResponse, ResponseError> {
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
@ -700,9 +819,23 @@ pub async fn delete_all(
is_deletion: true,
allow_index_creation,
};
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Delete all settings");
Ok(HttpResponse::Accepted().json(task))
}
fn validate_settings(
settings: Settings<Unchecked>,
index_scheduler: &IndexScheduler,
) -> Result<Settings<Unchecked>, ResponseError> {
if matches!(settings.embedders, Setting::Set(_)) {
index_scheduler.features().check_vector("Passing `embedders` in settings")?
}
Ok(settings.validate()?)
}

View File

@ -0,0 +1,318 @@
use std::convert::Infallible;
use std::io::Write;
use std::ops::ControlFlow;
use std::pin::Pin;
use std::str::FromStr;
use std::sync::Arc;
use actix_web::web::{Bytes, Data};
use actix_web::{web, HttpResponse};
use deserr::actix_web::AwebJson;
use deserr::{DeserializeError, Deserr, ErrorKind, MergeWithError, ValuePointerRef};
use futures_util::Stream;
use index_scheduler::IndexScheduler;
use meilisearch_types::deserr::DeserrJsonError;
use meilisearch_types::error::deserr_codes::*;
use meilisearch_types::error::{Code, ResponseError};
use tokio::sync::mpsc;
use tracing_subscriber::filter::Targets;
use tracing_subscriber::Layer;
use crate::error::MeilisearchHttpError;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::{LogRouteHandle, LogStderrHandle};
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
web::resource("stream")
.route(web::post().to(SeqHandler(get_logs)))
.route(web::delete().to(SeqHandler(cancel_logs))),
)
.service(web::resource("stderr").route(web::post().to(SeqHandler(update_stderr_target))));
}
#[derive(Debug, Default, Clone, Copy, Deserr, PartialEq, Eq)]
#[deserr(rename_all = camelCase)]
pub enum LogMode {
#[default]
Human,
Json,
Profile,
}
/// Simple wrapper around the `Targets` from `tracing_subscriber` to implement `MergeWithError` on it.
#[derive(Clone, Debug)]
struct MyTargets(Targets);
/// Simple wrapper around the `ParseError` from `tracing_subscriber` to implement `MergeWithError` on it.
#[derive(Debug, thiserror::Error)]
enum MyParseError {
#[error(transparent)]
ParseError(#[from] tracing_subscriber::filter::ParseError),
#[error(
"Empty string is not a valid target. If you want to get no logs use `OFF`. Usage: `info`, `meilisearch=info`, or you can write multiple filters in one target: `index_scheduler=info,milli=trace`"
)]
Example,
}
impl FromStr for MyTargets {
type Err = MyParseError;
fn from_str(s: &str) -> Result<Self, Self::Err> {
if s.is_empty() {
Err(MyParseError::Example)
} else {
Ok(MyTargets(Targets::from_str(s).map_err(MyParseError::ParseError)?))
}
}
}
impl MergeWithError<MyParseError> for DeserrJsonError<BadRequest> {
fn merge(
_self_: Option<Self>,
other: MyParseError,
merge_location: ValuePointerRef,
) -> ControlFlow<Self, Self> {
Self::error::<Infallible>(
None,
ErrorKind::Unexpected { msg: other.to_string() },
merge_location,
)
}
}
#[derive(Debug, Deserr)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields, validate = validate_get_logs -> DeserrJsonError<InvalidSettingsTypoTolerance>)]
pub struct GetLogs {
#[deserr(default = "info".parse().unwrap(), try_from(&String) = MyTargets::from_str -> DeserrJsonError<BadRequest>)]
target: MyTargets,
#[deserr(default, error = DeserrJsonError<BadRequest>)]
mode: LogMode,
#[deserr(default = false, error = DeserrJsonError<BadRequest>)]
profile_memory: bool,
}
fn validate_get_logs<E: DeserializeError>(
logs: GetLogs,
location: ValuePointerRef,
) -> Result<GetLogs, E> {
if logs.profile_memory && logs.mode != LogMode::Profile {
Err(deserr::take_cf_content(E::error::<Infallible>(
None,
ErrorKind::Unexpected {
msg: format!("`profile_memory` can only be used while profiling code and is not compatible with the {:?} mode.", logs.mode),
},
location,
)))
} else {
Ok(logs)
}
}
struct LogWriter {
sender: mpsc::UnboundedSender<Vec<u8>>,
}
impl Write for LogWriter {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
self.sender.send(buf.to_vec()).map_err(std::io::Error::other)?;
Ok(buf.len())
}
fn flush(&mut self) -> std::io::Result<()> {
Ok(())
}
}
struct HandleGuard {
/// We need to keep an handle on the logs to make it available again when the streamer is dropped
logs: Arc<LogRouteHandle>,
}
impl Drop for HandleGuard {
fn drop(&mut self) {
if let Err(e) = self.logs.modify(|layer| *layer.inner_mut() = None) {
tracing::error!("Could not free the logs route: {e}");
}
}
}
fn byte_stream(
receiver: mpsc::UnboundedReceiver<Vec<u8>>,
guard: HandleGuard,
) -> impl futures_util::Stream<Item = Result<Bytes, ResponseError>> {
futures_util::stream::unfold((receiver, guard), move |(mut receiver, guard)| async move {
let vec = receiver.recv().await;
vec.map(From::from).map(Ok).map(|a| (a, (receiver, guard)))
})
}
type PinnedByteStream = Pin<Box<dyn Stream<Item = Result<Bytes, ResponseError>>>>;
fn make_layer<
S: tracing::Subscriber + for<'span> tracing_subscriber::registry::LookupSpan<'span>,
>(
opt: &GetLogs,
logs: Data<LogRouteHandle>,
) -> (Box<dyn Layer<S> + Send + Sync>, PinnedByteStream) {
let guard = HandleGuard { logs: logs.into_inner() };
match opt.mode {
LogMode::Human => {
let (sender, receiver) = tokio::sync::mpsc::unbounded_channel();
let fmt_layer = tracing_subscriber::fmt::layer()
.with_writer(move || LogWriter { sender: sender.clone() })
.with_span_events(tracing_subscriber::fmt::format::FmtSpan::CLOSE);
let stream = byte_stream(receiver, guard);
(Box::new(fmt_layer) as Box<dyn Layer<S> + Send + Sync>, Box::pin(stream))
}
LogMode::Json => {
let (sender, receiver) = tokio::sync::mpsc::unbounded_channel();
let fmt_layer = tracing_subscriber::fmt::layer()
.with_writer(move || LogWriter { sender: sender.clone() })
.json()
.with_span_events(tracing_subscriber::fmt::format::FmtSpan::CLOSE);
let stream = byte_stream(receiver, guard);
(Box::new(fmt_layer) as Box<dyn Layer<S> + Send + Sync>, Box::pin(stream))
}
LogMode::Profile => {
let (trace, layer) = tracing_trace::Trace::new(opt.profile_memory);
let stream = entry_stream(trace, guard);
(Box::new(layer) as Box<dyn Layer<S> + Send + Sync>, Box::pin(stream))
}
}
}
fn entry_stream(
trace: tracing_trace::Trace,
guard: HandleGuard,
) -> impl Stream<Item = Result<Bytes, ResponseError>> {
let receiver = trace.into_receiver();
let entry_buf = Vec::new();
futures_util::stream::unfold(
(receiver, entry_buf, guard),
move |(mut receiver, mut entry_buf, guard)| async move {
let mut bytes = Vec::new();
while bytes.len() < 8192 {
entry_buf.clear();
let Ok(count) = tokio::time::timeout(
std::time::Duration::from_secs(1),
receiver.recv_many(&mut entry_buf, 100),
)
.await
else {
break;
};
if count == 0 {
if !bytes.is_empty() {
break;
}
// channel closed, exit
return None;
}
for entry in &entry_buf {
if let Err(error) = serde_json::to_writer(&mut bytes, entry) {
tracing::error!(
error = &error as &dyn std::error::Error,
"deserializing entry"
);
return Some((
Err(ResponseError::from_msg(
format!("error deserializing entry: {error}"),
Code::Internal,
)),
(receiver, entry_buf, guard),
));
}
}
}
Some((Ok(bytes.into()), (receiver, entry_buf, guard)))
},
)
}
pub async fn get_logs(
index_scheduler: GuardedData<ActionPolicy<{ actions::METRICS_GET }>, Data<IndexScheduler>>,
logs: Data<LogRouteHandle>,
body: AwebJson<GetLogs, DeserrJsonError>,
) -> Result<HttpResponse, ResponseError> {
index_scheduler.features().check_logs_route()?;
let opt = body.into_inner();
let mut stream = None;
logs.modify(|layer| match layer.inner_mut() {
None => {
// there is no one getting logs
*layer.filter_mut() = opt.target.0.clone();
let (new_layer, new_stream) = make_layer(&opt, logs.clone());
*layer.inner_mut() = Some(new_layer);
stream = Some(new_stream);
}
Some(_) => {
// there is already someone getting logs
}
})
.unwrap();
if let Some(stream) = stream {
Ok(HttpResponse::Ok().streaming(stream))
} else {
Err(MeilisearchHttpError::AlreadyUsedLogRoute.into())
}
}
pub async fn cancel_logs(
index_scheduler: GuardedData<ActionPolicy<{ actions::METRICS_GET }>, Data<IndexScheduler>>,
logs: Data<LogRouteHandle>,
) -> Result<HttpResponse, ResponseError> {
index_scheduler.features().check_logs_route()?;
if let Err(e) = logs.modify(|layer| *layer.inner_mut() = None) {
tracing::error!("Could not free the logs route: {e}");
}
Ok(HttpResponse::NoContent().finish())
}
#[derive(Debug, Deserr)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields)]
pub struct UpdateStderrLogs {
#[deserr(default = "info".parse().unwrap(), try_from(&String) = MyTargets::from_str -> DeserrJsonError<BadRequest>)]
target: MyTargets,
}
pub async fn update_stderr_target(
index_scheduler: GuardedData<ActionPolicy<{ actions::METRICS_GET }>, Data<IndexScheduler>>,
logs: Data<LogStderrHandle>,
body: AwebJson<UpdateStderrLogs, DeserrJsonError>,
) -> Result<HttpResponse, ResponseError> {
index_scheduler.features().check_logs_route()?;
let opt = body.into_inner();
logs.modify(|layer| {
*layer.filter_mut() = opt.target.0.clone();
})
.unwrap();
Ok(HttpResponse::NoContent().finish())
}

View File

@ -19,7 +19,7 @@ pub async fn get_metrics(
index_scheduler: GuardedData<ActionPolicy<{ actions::METRICS_GET }>, Data<IndexScheduler>>,
auth_controller: Data<AuthController>,
) -> Result<HttpResponse, ResponseError> {
index_scheduler.features()?.check_metrics()?;
index_scheduler.features().check_metrics()?;
let auth_filters = index_scheduler.filters();
if !auth_filters.all_indexes_authorized() {
let mut error = ResponseError::from(AuthenticationError::InvalidToken);

View File

@ -3,18 +3,19 @@ use std::collections::BTreeMap;
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_auth::AuthController;
use meilisearch_types::error::ResponseError;
use meilisearch_types::error::{Code, ResponseError};
use meilisearch_types::settings::{Settings, Unchecked};
use meilisearch_types::tasks::{Kind, Status, Task, TaskId};
use serde::{Deserialize, Serialize};
use serde_json::json;
use time::OffsetDateTime;
use tracing::debug;
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::Opt;
const PAGINATION_DEFAULT_LIMIT: usize = 20;
@ -22,6 +23,7 @@ mod api_key;
mod dump;
pub mod features;
pub mod indexes;
mod logs;
mod metrics;
mod multi_search;
mod snapshot;
@ -31,6 +33,7 @@ pub mod tasks;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::scope("/tasks").configure(tasks::configure))
.service(web::resource("/health").route(web::get().to(get_health)))
.service(web::scope("/logs").configure(logs::configure))
.service(web::scope("/keys").configure(api_key::configure))
.service(web::scope("/dumps").configure(dump::configure))
.service(web::scope("/snapshots").configure(snapshot::configure))
@ -43,6 +46,56 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
.service(web::scope("/experimental-features").configure(features::configure));
}
pub fn get_task_id(req: &HttpRequest, opt: &Opt) -> Result<Option<TaskId>, ResponseError> {
if !opt.experimental_replication_parameters {
return Ok(None);
}
let task_id = req
.headers()
.get("TaskId")
.map(|header| {
header.to_str().map_err(|e| {
ResponseError::from_msg(
format!("TaskId is not a valid utf-8 string: {e}"),
Code::BadRequest,
)
})
})
.transpose()?
.map(|s| {
s.parse::<TaskId>().map_err(|e| {
ResponseError::from_msg(
format!(
"Could not parse the TaskId as a {}: {e}",
std::any::type_name::<TaskId>(),
),
Code::BadRequest,
)
})
})
.transpose()?;
Ok(task_id)
}
pub fn is_dry_run(req: &HttpRequest, opt: &Opt) -> Result<bool, ResponseError> {
if !opt.experimental_replication_parameters {
return Ok(false);
}
Ok(req
.headers()
.get("DryRun")
.map(|header| {
header.to_str().map_err(|e| {
ResponseError::from_msg(
format!("DryRun is not a valid utf-8 string: {e}"),
Code::BadRequest,
)
})
})
.transpose()?
.map_or(false, |s| s.to_lowercase() == "true"))
}
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct SummarizedTaskView {
@ -250,7 +303,7 @@ async fn get_stats(
let stats = create_all_stats((*index_scheduler).clone(), (*auth_controller).clone(), filters)?;
debug!("returns: {:?}", stats);
debug!(returns = ?stats, "Get stats");
Ok(HttpResponse::Ok().json(stats))
}
@ -306,12 +359,18 @@ async fn get_version(
) -> HttpResponse {
analytics.publish("Version Seen".to_string(), json!(null), Some(&req));
let commit_sha = option_env!("VERGEN_GIT_SHA").unwrap_or("unknown");
let commit_date = option_env!("VERGEN_GIT_COMMIT_TIMESTAMP").unwrap_or("unknown");
let build_info = build_info::BuildInfo::from_build();
HttpResponse::Ok().json(VersionResponse {
commit_sha: commit_sha.to_string(),
commit_date: commit_date.to_string(),
commit_sha: build_info.commit_sha1.unwrap_or("unknown").to_string(),
commit_date: build_info
.commit_timestamp
.and_then(|commit_timestamp| {
commit_timestamp
.format(&time::format_description::well_known::Iso8601::DEFAULT)
.ok()
})
.unwrap_or("unknown".into()),
pkg_version: env!("CARGO_PKG_VERSION").to_string(),
})
}

View File

@ -3,16 +3,17 @@ use actix_web::web::{self, Data};
use actix_web::{HttpRequest, HttpResponse};
use deserr::actix_web::AwebJson;
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::deserr::DeserrJsonError;
use meilisearch_types::error::ResponseError;
use meilisearch_types::keys::actions;
use serde::Serialize;
use tracing::debug;
use crate::analytics::{Analytics, MultiSearchAggregator};
use crate::extractors::authentication::policies::ActionPolicy;
use crate::extractors::authentication::{AuthenticationError, GuardedData};
use crate::extractors::sequential_extractor::SeqHandler;
use crate::routes::indexes::search::embed;
use crate::search::{
add_search_rules, perform_search, SearchQueryWithIndex, SearchResultWithIndex,
};
@ -41,54 +42,56 @@ pub async fn multi_search_with_post(
let queries = params.into_inner().queries;
let mut multi_aggregate = MultiSearchAggregator::from_queries(&queries, &req);
let features = index_scheduler.features()?;
let features = index_scheduler.features();
// Explicitly expect a `(ResponseError, usize)` for the error type rather than `ResponseError` only,
// so that `?` doesn't work if it doesn't use `with_index`, ensuring that it is not forgotten in case of code
// changes.
let search_results: Result<_, (ResponseError, usize)> = (|| {
async {
let mut search_results = Vec::with_capacity(queries.len());
for (query_index, (index_uid, mut query)) in
queries.into_iter().map(SearchQueryWithIndex::into_index_query).enumerate()
{
debug!("multi-search #{query_index}: called with params: {:?}", query);
let search_results: Result<_, (ResponseError, usize)> = async {
let mut search_results = Vec::with_capacity(queries.len());
for (query_index, (index_uid, mut query)) in
queries.into_iter().map(SearchQueryWithIndex::into_index_query).enumerate()
{
debug!(on_index = query_index, parameters = ?query, "Multi-search");
// Check index from API key
if !index_scheduler.filters().is_index_authorized(&index_uid) {
return Err(AuthenticationError::InvalidToken).with_index(query_index);
}
// Apply search rules from tenant token
if let Some(search_rules) =
index_scheduler.filters().get_index_search_rules(&index_uid)
{
add_search_rules(&mut query, search_rules);
}
let index = index_scheduler
.index(&index_uid)
.map_err(|err| {
let mut err = ResponseError::from(err);
// Patch the HTTP status code to 400 as it defaults to 404 for `index_not_found`, but
// here the resource not found is not part of the URL.
err.code = StatusCode::BAD_REQUEST;
err
})
.with_index(query_index)?;
let search_result =
tokio::task::spawn_blocking(move || perform_search(&index, query, features))
.await
.with_index(query_index)?;
search_results.push(SearchResultWithIndex {
index_uid: index_uid.into_inner(),
result: search_result.with_index(query_index)?,
});
// Check index from API key
if !index_scheduler.filters().is_index_authorized(&index_uid) {
return Err(AuthenticationError::InvalidToken).with_index(query_index);
}
Ok(search_results)
// Apply search rules from tenant token
if let Some(search_rules) = index_scheduler.filters().get_index_search_rules(&index_uid)
{
add_search_rules(&mut query, search_rules);
}
let index = index_scheduler
.index(&index_uid)
.map_err(|err| {
let mut err = ResponseError::from(err);
// Patch the HTTP status code to 400 as it defaults to 404 for `index_not_found`, but
// here the resource not found is not part of the URL.
err.code = StatusCode::BAD_REQUEST;
err
})
.with_index(query_index)?;
let distribution = embed(&mut query, index_scheduler.get_ref(), &index)
.await
.with_index(query_index)?;
let search_result = tokio::task::spawn_blocking(move || {
perform_search(&index, query, features, distribution)
})
.await
.with_index(query_index)?;
search_results.push(SearchResultWithIndex {
index_uid: index_uid.into_inner(),
result: search_result.with_index(query_index)?,
});
}
})()
Ok(search_results)
}
.await;
if search_results.is_ok() {
@ -104,7 +107,7 @@ pub async fn multi_search_with_post(
err
})?;
debug!("returns: {:?}", search_results);
debug!(returns = ?search_results, "Multi-search");
Ok(HttpResponse::Ok().json(SearchResults { results: search_results }))
}

View File

@ -1,16 +1,17 @@
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::error::ResponseError;
use meilisearch_types::tasks::KindWithContent;
use serde_json::json;
use tracing::debug;
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::routes::SummarizedTaskView;
use crate::routes::{get_task_id, is_dry_run, SummarizedTaskView};
use crate::Opt;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("").route(web::post().to(SeqHandler(create_snapshot))));
@ -19,14 +20,19 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
pub async fn create_snapshot(
index_scheduler: GuardedData<ActionPolicy<{ actions::SNAPSHOTS_CREATE }>, Data<IndexScheduler>>,
req: HttpRequest,
opt: web::Data<Opt>,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
analytics.publish("Snapshot Created".to_string(), json!({}), Some(&req));
let task = KindWithContent::SnapshotCreation;
let uid = get_task_id(&req, &opt)?;
let dry_run = is_dry_run(&req, &opt)?;
let task: SummarizedTaskView =
tokio::task::spawn_blocking(move || index_scheduler.register(task)).await??.into();
tokio::task::spawn_blocking(move || index_scheduler.register(task, uid, dry_run))
.await??
.into();
debug!("returns: {:?}", task);
debug!(returns = ?task, "Create snapshot");
Ok(HttpResponse::Accepted().json(task))
}

Some files were not shown because too many files have changed in this diff Show More