Compare commits

...

69 Commits

Author SHA1 Message Date
Louis Dureuil
11fc9059cf Change base index size to 2TiB 2023-02-16 14:31:08 +01:00
Louis Dureuil
be01a33cea Skip computing index budget in tests 2023-02-16 14:31:08 +01:00
Louis Dureuil
d93a60c005 Compute budget 2023-02-16 10:53:57 +01:00
Louis Dureuil
babfbd9c87 Add dichotomic search to utils 2023-02-16 10:53:57 +01:00
Louis Dureuil
5daaf7cd5f Fix doc comment
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-02-16 10:53:57 +01:00
Louis Dureuil
82ae5df706 Document generation and let it wrap-around as is it safe to do. 2023-02-16 10:53:57 +01:00
Louis Dureuil
d3920a7e8a Retry in case of timeout while reopening 2023-02-16 10:53:57 +01:00
Louis Dureuil
b07614932e Default to 100 concurrent indexes in unixes, 10 in Windows 2023-02-16 10:53:57 +01:00
Louis Dureuil
89e74dcc31 Change default index map size to 10GiB 2023-02-16 10:53:57 +01:00
Louis Dureuil
a4d0a56fc4 Add basic tests for index eviction and resize 2023-02-16 10:53:57 +01:00
Louis Dureuil
06fa22203b Fix TODO 2023-02-16 10:53:57 +01:00
Louis Dureuil
52bf260f73 Rewrite where evicted indexes are added to the set 2023-02-16 10:53:57 +01:00
Louis Dureuil
f3c3ccc4b3 WIP: evict indexes in unavailable 2023-02-16 10:53:57 +01:00
Louis Dureuil
e6cd7a68cc Parameterize growth factor and index count 2023-02-16 10:53:57 +01:00
Louis Dureuil
de771a8bd7 Use LRU cache 2023-02-16 10:53:56 +01:00
Louis Dureuil
9107ac86f1 Add LruMap 2023-02-16 10:53:56 +01:00
Louis Dureuil
3908fdec29 Make sure we don't leave the in memory hashmap in an inconsistent state 2023-02-16 10:53:41 +01:00
Louis Dureuil
081dfb82ce Resize indexes when they're full 2023-02-16 10:53:41 +01:00
Louis Dureuil
52bdccee77 Add IndexMapper::resize_index fn 2023-02-16 10:53:41 +01:00
Louis Dureuil
380a2bec04 Add IndexStatus::BeingResized 2023-02-16 10:53:41 +01:00
Louis Dureuil
7b30f3e4de IndexScheduler::tick returns a TickOutcome 2023-02-16 10:53:41 +01:00
Louis Dureuil
4a792e6e98 create_or_open_index takes a map_size argument 2023-02-16 10:53:41 +01:00
Louis Dureuil
ad70f461b5 Add Batch::index_uid 2023-02-16 10:53:41 +01:00
Louis Dureuil
49e18da23e Do not escape tag name
$() syntax is not interpreted by the Dockerfile
2023-02-16 10:53:14 +01:00
Louis Dureuil
54240db495 Add note in code so one does not forget next time 2023-02-16 10:53:14 +01:00
Louis Dureuil
e1ed4bc750 Change Dockerfile to also pass the VERGEN_GIT_SEMVER_LIGHTWEIGHT when building 2023-02-16 10:53:14 +01:00
Louis Dureuil
9bd1cfb3a3 Ignore -dirty flag 2023-02-16 10:53:14 +01:00
Louis Dureuil
a341c94871 Update contributing.md 2023-02-16 10:53:14 +01:00
Louis Dureuil
f46cf46b8c Add prototype to analytics if any 2023-02-16 10:53:14 +01:00
Louis Dureuil
c3a30a5a91 If using a prototype, display its name at Meilisearch startup 2023-02-16 10:53:14 +01:00
bors[bot]
143e3cf948 Merge #3490
3490: Fix attributes set candidates r=curquiza a=ManyTheFish

# Pull Request

Fix attributes set candidates for v1.1.0

## details

The attribute criterion was not returning the remaining candidates when its internal algorithm was been exhausted.
We had a loss of candidates by the attribute criterion leading to the bug reported in the issue linked below.
After some investigation, it seems that it was the only criterion that had this behavior.

We are now returning the remaining candidates instead of an empty bitmap.

## Related issue

Fixes #3483
PR on milli for v1.0.1: https://github.com/meilisearch/milli/pull/777


Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-02-15 17:38:07 +00:00
bors[bot]
91ce8a5e67 Merge #3492
3492: Bump deserr r=Kerollmops a=irevoire

Bump deserr to the latest version;
- We now use the default actix-web extractors that deserr provides (which were copy/pasted from meilisearch)
- We also use the default `JsonError` message provided by deserr instead of defining our own in meilisearch
- Finally, we get the new `did you mean?` error message. Fix #3493

Co-authored-by: Tamo <tamo@meilisearch.com>
2023-02-15 10:05:05 +00:00
bors[bot]
fd7ae1883b Merge #3495
3495: Add tests with rust nightly in CI r=curquiza a=ztkmkoo

# Pull Request

## Related issue
Fixes #3402 

## What does this PR do?
- add ci test with rust nightly
- make test with rust stable not run on schedule event

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Kebron <ztkmkoo@gmail.com>
2023-02-15 07:53:17 +00:00
Tamo
42a3cdca66 get rids of the unwrap_any function in favor of take_cf_content 2023-02-14 20:06:31 +01:00
Tamo
a43765d454 use the pre-defined deserr extractors 2023-02-14 20:05:30 +01:00
Tamo
769576fd94 get rids of the whole error_message module since it has been integrated into the last version of deserr 2023-02-14 20:05:27 +01:00
Tamo
8fb7b1d10f bump deserr 2023-02-14 20:04:30 +01:00
bors[bot]
d494c29768 Merge #3479
3479: Unify "Bad latitude" & "Bad longitude" errors r=irevoire a=cymruu

# Pull Request

## Related issue
Fix part of #3006

## What does this PR do?
- Moved out `BadGeoLat`, `BadGeoLng`, `BadGeoBoundingBoxTopIsBelowBottom` from `FilterError` into newly introduced error type `ParseGeoError`. 
- Renamed `BadGeo` error  to `ReservedGeo`
- Used new `ParseGeoError` type in `FilterError` and `AscDescError`

Screenshot: 
![image](https://user-images.githubusercontent.com/2981598/217927231-fe23b6a3-2ea8-4145-98af-38eb61c4ff16.png)

I ran `cargo test --package milli -- --test-threads 1` and tests passed.
`--test-threads` was set to 1 because my OS complained about too many opened files.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?


Co-authored-by: Filip Bachul <filipbachul@gmail.com>
Co-authored-by: filip <filipbachul@gmail.com>
2023-02-14 18:35:51 +00:00
bors[bot]
f3b54337f9 Merge #3174
3174: Allow wildcards at the end of index names for API Keys and Tenant tokens r=irevoire a=Kerollmops

This PR introduces the wildcards at the end of the index names when identifying indexes in the API Keys and tenant tokens. It fixes #2788 and fixes #2908. This PR is based on `@akhildevelops'` work.

Note that when a tenant token filter is chosen to restrict a search, it is always the most restrictive pattern that is chosen. If we have an index pattern _prod*_ that defines _filter1_ and _p*_ that defines _filter2_, the engine will choose _filter1_ over _filter2_ as it is defined for a most restrictive pattern, _prod*_. This restrictiveness is defined by 1. is it exact, without _*_ 2. the length of the pattern.

It is a continuation of work that has already started and should close #2869.

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-02-14 16:12:01 +00:00
Clément Renault
7f3ae40204 Remove a useless comment regarding the index pattern error code 2023-02-14 17:09:20 +01:00
Filip Bachul
a53536836b fmt 2023-02-14 17:04:22 +01:00
Kebron
b095325bf8 Add tests with rust nightly in CI 2023-02-14 15:33:12 +00:00
Filip Bachul
d7ad39ad77 fix: clippy error 2023-02-14 00:15:35 +01:00
Filip Bachul
849de089d2 add thiserror for AscDescError 2023-02-14 00:15:35 +01:00
filip
7f25007d31 Update milli/src/asc_desc.rs
Co-authored-by: Tamo <irevoire@protonmail.ch>
2023-02-14 00:15:35 +01:00
Filip Bachul
c810af3ebf implement From<ParseGeoError> for AscDescError 2023-02-14 00:15:35 +01:00
Filip Bachul
c0b77773ba fmt asc_desc 2023-02-14 00:15:35 +01:00
Filip Bachul
7481559e8b move BadGeo to FilterError 2023-02-14 00:15:35 +01:00
Filip Bachul
83c765ce6c implement From<ParseGeoError> for FilterError 2023-02-14 00:15:35 +01:00
Filip Bachul
4c91037602 use ParseGeoError in sort parser 2023-02-14 00:15:35 +01:00
Filip Bachul
825923f6fc export ParseGeoError 2023-02-14 00:15:35 +01:00
Filip Bachul
e405702733 chore: introduce new error ParseGeoError type 2023-02-14 00:15:35 +01:00
ManyTheFish
6fa877efb0 Fix attributes set candidates 2023-02-13 17:49:52 +01:00
Kerollmops
4b1cd10653 Return an internal error when index pattern should be valid 2023-02-13 17:49:42 +01:00
Clément Renault
47748395dc Update an authentication comment
Co-authored-by: Many the fish <many@meilisearch.com>
2023-02-13 17:20:08 +01:00
bors[bot]
ff595156d7 Merge #3480
3480: Gitignore vscode & jetbrains IDE folders r=curquiza a=AymanHamdoun

# Pull Request

## Related issue
There is no issue for it, and i couldn't find an appropriate category to make an issue for it.

## What does this PR do?
- Its just a gitignore edit so people who use vscode and jetbrains IDEs (like IntelliJ) dont have to deal with committing the folder the IDE generates to store local project configs by mistake. (I honestly wanted to fork the repo to add something else but this bothered me enough to make a PR for it first) 

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ✔️ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [✔️  ] Have you read the contributing guidelines?
- [✔️  ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Ayman <ayman.s.hamdoun@gmail.com>
2023-02-13 10:47:25 +00:00
Ayman
8770088df3 remove idea folder 2023-02-10 11:45:02 +04:00
Ayman
827c1c8447 edit gitignore to ignore .idea and .vscode folders 2023-02-10 11:42:19 +04:00
Clément Renault
764df24b7d Make clippy happy (again) 2023-02-09 13:21:20 +01:00
Clément Renault
4570d5bf3a Merge remote-tracking branch 'origin/main' into temp-wildcard 2023-02-09 13:14:05 +01:00
Kerollmops
c690c4fec4 Added and modified the current API Key and Tenant Token tests 2023-02-09 11:17:30 +01:00
Kerollmops
7b4b57ecc8 Fix the current tests 2023-02-08 14:54:05 +01:00
Kerollmops
a36b1dbd70 Fix the tasks with the new patterns 2023-02-01 18:21:45 +01:00
Kerollmops
d563ed8a39 Making it work with index uid patterns 2023-02-01 17:51:30 +01:00
Kerollmops
ec7de4bae7 Make it work for any all routes including stats and index swaps 2023-01-25 16:12:40 +01:00
Kerollmops
184b8afd9e Make it work in the CreateApiKey struct 2023-01-25 15:01:50 +01:00
Kerollmops
29961b8c6b Make it work with the dumps 2023-01-25 14:41:36 +01:00
Clément Renault
0b08413c98 Introduce the IndexUidPattern type 2023-01-25 14:22:17 +01:00
Clément Renault
474d4ec498 Add tests for the index patterns 2023-01-25 14:22:16 +01:00
60 changed files with 1925 additions and 892 deletions

View File

@@ -7,7 +7,8 @@ WORKDIR /meilisearch
ARG COMMIT_SHA
ARG COMMIT_DATE
ENV COMMIT_SHA=${COMMIT_SHA} COMMIT_DATE=${COMMIT_DATE}
ARG GIT_TAG
ENV COMMIT_SHA=${COMMIT_SHA} COMMIT_DATE=${COMMIT_DATE} VERGEN_GIT_SEMVER_LIGHTWEIGHT=${GIT_TAG}
ENV RUSTFLAGS="-C target-feature=-crt-static"
COPY . .

View File

@@ -92,6 +92,7 @@ jobs:
build-args: |
COMMIT_SHA=${{ github.sha }}
COMMIT_DATE=${{ steps.build-metadata.outputs.date }}
GIT_TAG=${{ github.ref_name }}
# /!\ Don't touch this without checking with Cloud team
- name: Send CI information to Cloud team

View File

@@ -2,6 +2,9 @@ name: Rust
on:
workflow_dispatch:
schedule:
# Everyday at 5:00am
- cron: '0 5 * * *'
pull_request:
push:
# trying and staging branches are for Bors config
@@ -27,10 +30,18 @@ jobs:
run: |
apt-get update && apt-get install -y curl
apt-get install build-essential -y
- uses: actions-rs/toolchain@v1
- name: Run test with Rust stable
if: github.event_name != 'schedule'
uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- name: Run test with Rust nightly
if: github.event_name == 'schedule'
uses: actions-rs/toolchain@v1
with:
toolchain: nightly
override: true
# Disable cache due to disk space issues with Windows workers in CI
# - name: Cache dependencies
# uses: Swatinem/rust-cache@v2.2.0

2
.gitignore vendored
View File

@@ -1,3 +1,5 @@
.idea/
.vscode/
/target
**/*.csv
**/*.json_lines

View File

@@ -121,15 +121,19 @@ The full Meilisearch release process is described in [this guide](https://github
Depending on the developed feature, you might need to provide a prototyped version of Meilisearch to make it easier to test by the users.
The prototype name must follow this convention: `prototype-X-Y` where
- `X` is the feature name formatted in `kebab-case`
- `X` is the feature name formatted in `kebab-case`. It should not end with a single number.
- `Y` is the version of the prototype, starting from `0`.
Example: `prototype-auto-resize-0`.
Example: `prototype-auto-resize-0`. </br>
❌ Bad example: `auto-resize-0`: lacks the `prototype` prefix. </br>
❌ Bad example: `prototype-auto-resize`: lacks the version suffix. </br>
❌ Bad example: `prototype-auto-resize-0-0`: feature name ends with a single number.
Steps to create a prototype:
1. In your terminal, go to the last commit of your branch (the one you want to provide as a prototype).
2. Create a tag following the convention: `git tag prototype-X-Y`
3. Run Meilisearch and check that its launch summary features a line: `Prototype: prototype-X-Y` (you may need to switch branches and back after tagging for this to work).
3. Push the tag: `git push origin prototype-X-Y`
4. Check the [Docker CI](https://github.com/meilisearch/meilisearch/actions/workflows/publish-docker-images.yml) is now running.
@@ -138,7 +142,7 @@ More information about [how to run Meilisearch with Docker](https://docs.meilise
⚙️ However, no binaries will be created. If the users do not use Docker, they can go to the `prototype-X-Y` tag in the Meilisearch repository and compile from the source code.
⚠️ When sharing a prototype with users, prevent them from using it in production. Prototypes are only for test purposes.
⚠️ When sharing a prototype with users, remind them to not use it in production. Prototypes are solely for test purposes.
### Release assets

57
Cargo.lock generated
View File

@@ -36,9 +36,9 @@ dependencies = [
[[package]]
name = "actix-http"
version = "3.2.2"
version = "3.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0c83abf9903e1f0ad9973cc4f7b9767fd5a03a583f51a5b7a339e07987cd2724"
checksum = "0070905b2c4a98d184c4e81025253cb192aa8a73827553f38e9410801ceb35bb"
dependencies = [
"actix-codec",
"actix-rt",
@@ -46,7 +46,7 @@ dependencies = [
"actix-tls",
"actix-utils",
"ahash",
"base64 0.13.1",
"base64 0.21.0",
"bitflags",
"brotli",
"bytes",
@@ -68,7 +68,10 @@ dependencies = [
"rand",
"sha1",
"smallvec",
"tokio",
"tokio-util",
"tracing",
"zstd 0.12.3+zstd.1.5.2",
]
[[package]]
@@ -164,9 +167,9 @@ dependencies = [
[[package]]
name = "actix-web"
version = "4.2.1"
version = "4.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d48f7b6534e06c7bfc72ee91db7917d4af6afe23e7d223b51e68fffbb21e96b9"
checksum = "464e0fddc668ede5f26ec1f9557a8d44eda948732f40c6b0ad79126930eb775f"
dependencies = [
"actix-codec",
"actix-http",
@@ -1110,20 +1113,26 @@ dependencies = [
[[package]]
name = "deserr"
version = "0.3.0"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "28380303ca15ec07e1d5b079baf19cf849b09edad5cab219c1c51b2bd07523de"
checksum = "6eee2844f21cf7fb5693aae1fb8f1658127acfdb2fc072167d68a9152584ae64"
dependencies = [
"actix-http",
"actix-utils",
"actix-web",
"deserr-internal",
"futures",
"serde-cs",
"serde_json",
"serde_urlencoded",
"strsim",
]
[[package]]
name = "deserr-internal"
version = "0.3.0"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "860928cd8af78d223a3d70dd581f21d7c3de8aa2eecd938e0c0a399ded7c1451"
checksum = "c27246f8ca9eeba9dd70d614b664dc43b529251ed7bd9e633131010d340da4b9"
dependencies = [
"convert_case 0.5.0",
"proc-macro2",
@@ -2527,6 +2536,7 @@ dependencies = [
"base64 0.13.1",
"enum-iterator",
"hmac",
"maplit",
"meilisearch-types",
"rand",
"roaring",
@@ -4422,7 +4432,7 @@ dependencies = [
"pbkdf2",
"sha1",
"time",
"zstd",
"zstd 0.11.2+zstd.1.5.2",
]
[[package]]
@@ -4431,7 +4441,16 @@ version = "0.11.2+zstd.1.5.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "20cc960326ece64f010d2d2107537f26dc589a6573a316bd5b1dba685fa5fde4"
dependencies = [
"zstd-safe",
"zstd-safe 5.0.2+zstd.1.5.2",
]
[[package]]
name = "zstd"
version = "0.12.3+zstd.1.5.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "76eea132fb024e0e13fd9c2f5d5d595d8a967aa72382ac2f9d39fcc95afd0806"
dependencies = [
"zstd-safe 6.0.4+zstd.1.5.4",
]
[[package]]
@@ -4445,10 +4464,20 @@ dependencies = [
]
[[package]]
name = "zstd-sys"
version = "2.0.5+zstd.1.5.2"
name = "zstd-safe"
version = "6.0.4+zstd.1.5.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "edc50ffce891ad571e9f9afe5039c4837bede781ac4bb13052ed7ae695518596"
checksum = "7afb4b54b8910cf5447638cb54bf4e8a65cbedd783af98b98c62ffe91f185543"
dependencies = [
"libc",
"zstd-sys",
]
[[package]]
name = "zstd-sys"
version = "2.0.7+zstd.1.5.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "94509c3ba2fe55294d752b79842c530ccfab760192521df74a081a78d2b3c7f5"
dependencies = [
"cc",
"libc",

View File

@@ -7,7 +7,8 @@ WORKDIR /meilisearch
ARG COMMIT_SHA
ARG COMMIT_DATE
ENV VERGEN_GIT_SHA=${COMMIT_SHA} VERGEN_GIT_COMMIT_TIMESTAMP=${COMMIT_DATE}
ARG GIT_TAG
ENV VERGEN_GIT_SHA=${COMMIT_SHA} VERGEN_GIT_COMMIT_TIMESTAMP=${COMMIT_DATE} VERGEN_GIT_SEMVER_LIGHTWEIGHT=${GIT_TAG}
ENV RUSTFLAGS="-C target-feature=-crt-static"
COPY . .

View File

@@ -203,12 +203,11 @@ pub(crate) mod test {
use big_s::S;
use maplit::btreeset;
use meilisearch_types::index_uid::IndexUid;
use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::keys::{Action, Key};
use meilisearch_types::milli::update::Setting;
use meilisearch_types::milli::{self};
use meilisearch_types::settings::{Checked, Settings};
use meilisearch_types::star_or::StarOr;
use meilisearch_types::tasks::{Details, Status};
use serde_json::{json, Map, Value};
use time::macros::datetime;
@@ -341,7 +340,7 @@ pub(crate) mod test {
name: Some(S("doggos_key")),
uid: Uuid::from_str("9f8a34da-b6b2-42f0-939b-dbd4c3448655").unwrap(),
actions: vec![Action::DocumentsAll],
indexes: vec![StarOr::Other(IndexUid::from_str("doggos").unwrap())],
indexes: vec![IndexUidPattern::from_str("doggos").unwrap()],
expires_at: Some(datetime!(4130-03-14 12:21 UTC)),
created_at: datetime!(1960-11-15 0:00 UTC),
updated_at: datetime!(2022-11-10 0:00 UTC),
@@ -351,7 +350,7 @@ pub(crate) mod test {
name: Some(S("master_key")),
uid: Uuid::from_str("4622f717-1c00-47bb-a494-39d76a49b591").unwrap(),
actions: vec![Action::All],
indexes: vec![StarOr::Star],
indexes: vec![IndexUidPattern::all()],
expires_at: None,
created_at: datetime!(0000-01-01 00:01 UTC),
updated_at: datetime!(1964-05-04 17:25 UTC),

View File

@@ -181,10 +181,8 @@ impl CompatV5ToV6 {
.indexes
.into_iter()
.map(|index| match index {
v5::StarOr::Star => v6::StarOr::Star,
v5::StarOr::Other(uid) => {
v6::StarOr::Other(v6::IndexUid::new_unchecked(uid.as_str()))
}
v5::StarOr::Star => v6::IndexUidPattern::all(),
v5::StarOr::Other(uid) => v6::IndexUidPattern::new_unchecked(uid.as_str()),
})
.collect(),
expires_at: key.expires_at,

View File

@@ -34,8 +34,7 @@ pub type PaginationSettings = meilisearch_types::settings::PaginationSettings;
// everything related to the api keys
pub type Action = meilisearch_types::keys::Action;
pub type StarOr<T> = meilisearch_types::star_or::StarOr<T>;
pub type IndexUid = meilisearch_types::index_uid::IndexUid;
pub type IndexUidPattern = meilisearch_types::index_uid_pattern::IndexUidPattern;
// everything related to the errors
pub type ResponseError = meilisearch_types::error::ResponseError;

View File

@@ -169,6 +169,22 @@ impl Batch {
Batch::IndexSwap { task } => vec![task.uid],
}
}
/// Return the index UID associated with this batch
pub fn index_uid(&self) -> Option<&str> {
use Batch::*;
match self {
TaskCancelation { .. }
| TaskDeletion(_)
| SnapshotCreation(_)
| Dump(_)
| IndexSwap { .. } => None,
IndexOperation { op, .. } => Some(op.index_uid()),
IndexCreation { index_uid, .. }
| IndexUpdate { index_uid, .. }
| IndexDeletion { index_uid, .. } => Some(index_uid),
}
}
}
impl IndexOperation {

View File

@@ -1,20 +1,20 @@
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::path::PathBuf;
use std::sync::{Arc, RwLock};
use std::time::Duration;
use std::{fs, thread};
use log::error;
use meilisearch_types::heed::types::Str;
use meilisearch_types::heed::{Database, Env, EnvOpenOptions, RoTxn, RwTxn};
use meilisearch_types::heed::{Database, Env, RoTxn, RwTxn};
use meilisearch_types::milli::update::IndexerConfig;
use meilisearch_types::milli::Index;
use time::OffsetDateTime;
use uuid::Uuid;
use self::IndexStatus::{Available, BeingDeleted};
use self::index_map::IndexMap;
use self::IndexStatus::{Available, BeingDeleted, Closing, Missing};
use crate::uuid_codec::UuidCodec;
use crate::{clamp_to_page_size, Error, Result};
use crate::{Error, Result};
const INDEX_MAPPING: &str = "index-mapping";
@@ -25,26 +25,423 @@ const INDEX_MAPPING: &str = "index-mapping";
/// 2. Opening indexes and storing references to these opened indexes
/// 3. Accessing indexes through their uuid
/// 4. Mapping a user-defined name to each index uuid.
///
/// # Implementation notes
///
/// An index exists as 3 bits of data:
/// 1. The index data on disk, that can exist in 3 states: Missing, Present, or BeingDeleted.
/// 2. The persistent database containing the association between the index' name and its UUID,
/// that can exist in 2 states: Missing or Present.
/// 3. The state of the index in the in-memory `IndexMap`, that can exist in multiple states:
/// - Missing
/// - Available
/// - Closing (because an index needs resizing or was evicted from the cache)
/// - BeingDeleted
///
/// All of this data should be kept consistent between index operations, which is achieved by the `IndexMapper`
/// with the use of the following primitives:
/// - A RwLock on the `IndexMap`.
/// - Transactions on the association database.
/// - ClosingEvent signals emitted when closing an environment.
#[derive(Clone)]
pub struct IndexMapper {
/// Keep track of the opened indexes. Used mainly by the index resolver.
index_map: Arc<RwLock<HashMap<Uuid, IndexStatus>>>,
index_map: Arc<RwLock<IndexMap>>,
/// Map an index name with an index uuid currently available on disk.
pub(crate) index_mapping: Database<Str, UuidCodec>,
/// Path to the folder where the LMDB environments of each index are.
base_path: PathBuf,
index_size: usize,
/// The map size an index is opened with on the first time.
index_base_map_size: usize,
/// The quantity by which the map size of an index is incremented upon reopening, in bytes.
index_growth_amount: usize,
pub indexer_config: Arc<IndexerConfig>,
}
mod index_map {
/// the map size to use when we don't succeed in reading it in indexes.
const DEFAULT_MAP_SIZE: usize = 10 * 1024 * 1024 * 1024; // 10 GiB
use std::collections::BTreeMap;
use std::path::Path;
use std::time::Duration;
use meilisearch_types::heed::{EnvClosingEvent, EnvOpenOptions};
use meilisearch_types::milli::Index;
use time::OffsetDateTime;
use uuid::Uuid;
use super::IndexStatus::{self, Available, BeingDeleted, Closing, Missing};
use crate::lru::{InsertionOutcome, LruMap};
use crate::{clamp_to_page_size, Result};
/// Keep an internally consistent view of the open indexes in memory.
///
/// This view is made of an LRU cache that will evict the least frequently used indexes when new indexes are opened.
/// Indexes that are being closed (for resizing or due to cache eviction) or deleted cannot be evicted from the cache and
/// are stored separately.
///
/// This view provides operations to change the state of the index as it is known in memory:
/// open an index (making it available for queries), close an index (specifying the new size it should be opened with),
/// delete an index.
///
/// External consistency with the other bits of data of an index is provided by the `IndexMapper` parent structure.
pub struct IndexMap {
/// A LRU map of indexes that are in the open state and available for queries.
available: LruMap<Uuid, Index>,
/// A map of indexes that are not available for queries, either because they are being deleted
/// or because they are being closed.
///
/// If they are being deleted, the UUID points to `None`.
unavailable: BTreeMap<Uuid, Option<ClosingIndex>>,
/// A monotonically increasing generation number, used to differentiate between multiple successive index closing requests.
///
/// Because multiple readers could be waiting on an index to close, the following could theoretically happen:
///
/// 1. Multiple readers wait for the index closing to occur.
/// 2. One of them "wins the race", takes the lock and then removes the index that finished closing from the map.
/// 3. The index is reopened, but must be closed again (such as being resized again).
/// 4. One reader that "lost the race" in (2) wakes up and tries to take the lock and remove the index from the map.
///
/// In that situation, the index may or may not have finished closing. The `generation` field allows to remember which
/// closing request was made, so the reader that "lost the race" has the old generation and will need to wait again for the index
/// to close.
generation: usize,
}
#[derive(Clone)]
pub struct ClosingIndex {
uuid: Uuid,
closing_event: EnvClosingEvent,
map_size: usize,
generation: usize,
}
impl ClosingIndex {
/// Waits for the index to be definitely closed.
///
/// To avoid blocking, users should relinquish their locks to the IndexMap before calling this function.
///
/// After the index is physically closed, the in memory map must still be updated to take this into account.
/// To do so, a `ReopenableIndex` is returned, that can be used to either definitely close or definitely open
/// the index without waiting anymore.
pub fn wait_timeout(self, timeout: Duration) -> Option<ReopenableIndex> {
self.closing_event.wait_timeout(timeout).then_some(ReopenableIndex {
uuid: self.uuid,
map_size: self.map_size,
generation: self.generation,
})
}
}
pub struct ReopenableIndex {
uuid: Uuid,
map_size: usize,
generation: usize,
}
impl ReopenableIndex {
/// Attempts to reopen the index, which can result in the index being reopened again or not
/// (e.g. if another thread already opened and closed the index again).
///
/// Use get again on the IndexMap to get the updated status.
///
/// Fails if the underlying index creation fails.
///
/// # Status table
///
/// | Previous Status | New Status |
/// |-----------------|------------|
/// | Missing | Missing |
/// | BeingDeleted | BeingDeleted |
/// | Closing | Available or Closing depending on generation |
/// | Available | Available |
///
pub fn reopen(self, map: &mut IndexMap, path: &Path) -> Result<()> {
if let Closing(reopen) = map.get(&self.uuid) {
if reopen.generation != self.generation {
return Ok(());
}
map.unavailable.remove(&self.uuid);
map.create(&self.uuid, path, None, self.map_size)?;
}
Ok(())
}
/// Attempts to close the index, which may or may not result in the index being closed
/// (e.g. if another thread already reopened the index again).
///
/// Use get again on the IndexMap to get the updated status.
///
/// # Status table
///
/// | Previous Status | New Status |
/// |-----------------|------------|
/// | Missing | Missing |
/// | BeingDeleted | BeingDeleted |
/// | Closing | Missing or Closing depending on generation |
/// | Available | Available |
pub fn close(self, map: &mut IndexMap) {
if let Closing(reopen) = map.get(&self.uuid) {
if reopen.generation != self.generation {
return;
}
map.unavailable.remove(&self.uuid);
}
}
}
impl IndexMap {
pub fn new(cap: usize) -> IndexMap {
Self { unavailable: Default::default(), available: LruMap::new(cap), generation: 0 }
}
/// Gets the current status of an index in the map.
///
/// If the index is available it can be accessed from the returned status.
pub fn get(&self, uuid: &Uuid) -> IndexStatus {
self.available
.get(uuid)
.map(|index| Available(index.clone()))
.unwrap_or_else(|| self.get_unavailable(uuid))
}
fn get_unavailable(&self, uuid: &Uuid) -> IndexStatus {
match self.unavailable.get(uuid) {
Some(Some(reopen)) => Closing(reopen.clone()),
Some(None) => BeingDeleted,
None => Missing,
}
}
/// Attempts to create a new index that wasn't existing before.
///
/// # Status table
///
/// | Previous Status | New Status |
/// |-----------------|------------|
/// | Missing | Available |
/// | BeingDeleted | panics |
/// | Closing | panics |
/// | Available | panics |
///
pub fn create(
&mut self,
uuid: &Uuid,
path: &Path,
date: Option<(OffsetDateTime, OffsetDateTime)>,
map_size: usize,
) -> Result<Index> {
if !matches!(self.get_unavailable(uuid), Missing) {
panic!("Attempt to open an index that was unavailable");
}
let index = create_or_open_index(path, date, map_size)?;
match self.available.insert(*uuid, index.clone()) {
InsertionOutcome::InsertedNew => (),
InsertionOutcome::Evicted(evicted_uuid, evicted_index) => {
self.close(evicted_uuid, evicted_index, 0);
}
InsertionOutcome::Replaced(_) => {
panic!("Attempt to open an index that was already opened")
}
}
Ok(index)
}
/// Increases the current generation. See documentation for this field.
///
/// In the unlikely event that the 2^64 generations would have been exhausted, we simply wrap-around.
///
/// For this to cause an issue, one should be able to stop a reader in time after it got a `ReopenableIndex` and before it takes the lock
/// to remove it from the unavailable map, and keep the reader in this frozen state for 2^64 closing of other indexes.
///
/// This seems overwhelmingly impossible to achieve in practice.
fn next_generation(&mut self) -> usize {
self.generation = self.generation.wrapping_add(1);
self.generation
}
/// Attempts to close an index.
///
/// # Status table
///
/// | Previous Status | New Status |
/// |-----------------|------------|
/// | Missing | Missing |
/// | BeingDeleted | BeingDeleted |
/// | Closing | Closing |
/// | Available | Closing |
///
pub fn close_for_resize(&mut self, uuid: &Uuid, map_size_growth: usize) {
let Some(index) = self.available.remove(uuid) else { return; };
self.close(*uuid, index, map_size_growth);
}
fn close(&mut self, uuid: Uuid, index: Index, map_size_growth: usize) {
let map_size = index.map_size().unwrap_or(DEFAULT_MAP_SIZE) + map_size_growth;
let closing_event = index.prepare_for_closing();
let generation = self.next_generation();
self.unavailable
.insert(uuid, Some(ClosingIndex { uuid, closing_event, map_size, generation }));
}
/// Attempts to delete and index.
///
/// `end_deletion` must be called just after.
///
/// # Status table
///
/// | Previous Status | New Status | Return value |
/// |-----------------|------------|--------------|
/// | Missing | BeingDeleted | Ok(None) |
/// | BeingDeleted | BeingDeleted | Err(None) |
/// | Closing | Closing | Err(Some(reopen)) |
/// | Available | BeingDeleted | Ok(Some(env_closing_event)) |
pub fn start_deletion(
&mut self,
uuid: &Uuid,
) -> std::result::Result<Option<EnvClosingEvent>, Option<ClosingIndex>> {
if let Some(index) = self.available.remove(uuid) {
return Ok(Some(index.prepare_for_closing()));
}
match self.unavailable.remove(uuid) {
Some(Some(reopen)) => Err(Some(reopen)),
Some(None) => Err(None),
None => Ok(None),
}
}
/// Marks that an index deletion finished.
///
/// Must be used after calling `start_deletion`.
///
/// # Status table
///
/// | Previous Status | New Status |
/// |-----------------|------------|
/// | Missing | Missing |
/// | BeingDeleted | Missing |
/// | Closing | panics |
/// | Available | panics |
pub fn end_deletion(&mut self, uuid: &Uuid) {
assert!(
self.available.get(uuid).is_none(),
"Attempt to finish deletion of an index that was not being deleted"
);
// Do not panic if the index was Missing or BeingDeleted
assert!(
!matches!(self.unavailable.remove(uuid), Some(Some(_))),
"Attempt to finish deletion of an index that was being closed"
);
}
}
/// Create or open an index in the specified path.
/// The path *must* exist or an error will be thrown.
fn create_or_open_index(
path: &Path,
date: Option<(OffsetDateTime, OffsetDateTime)>,
map_size: usize,
) -> Result<Index> {
let mut options = EnvOpenOptions::new();
options.map_size(clamp_to_page_size(map_size));
options.max_readers(1024);
if let Some((created, updated)) = date {
Ok(Index::new_with_creation_dates(options, path, created, updated)?)
} else {
Ok(Index::new(options, path)?)
}
}
/// Putting the tests of the LRU down there so we have access to the cache's private members
#[cfg(test)]
mod tests {
use meilisearch_types::heed::Env;
use meilisearch_types::Index;
use uuid::Uuid;
use super::super::IndexMapper;
use crate::tests::IndexSchedulerHandle;
use crate::utils::clamp_to_page_size;
use crate::IndexScheduler;
impl IndexMapper {
fn test() -> (Self, Env, IndexSchedulerHandle) {
let (index_scheduler, handle) = IndexScheduler::test(true, vec![]);
(index_scheduler.index_mapper, index_scheduler.env, handle)
}
}
fn check_first_unavailable(mapper: &IndexMapper, expected_uuid: Uuid, is_closing: bool) {
let index_map = mapper.index_map.read().unwrap();
let (uuid, state) = index_map.unavailable.first_key_value().unwrap();
assert_eq!(uuid, &expected_uuid);
assert_eq!(state.is_some(), is_closing);
}
#[test]
fn evict_indexes() {
let (mapper, env, _handle) = IndexMapper::test();
let mut uuids = vec![];
// LRU cap + 1
for i in 0..(5 + 1) {
let index_name = format!("index-{i}");
let wtxn = env.write_txn().unwrap();
mapper.create_index(wtxn, &index_name, None).unwrap();
let txn = env.read_txn().unwrap();
uuids.push(mapper.index_mapping.get(&txn, &index_name).unwrap().unwrap());
}
// index-0 was evicted
check_first_unavailable(&mapper, uuids[0], true);
// get back the evicted index
let wtxn = env.write_txn().unwrap();
mapper.create_index(wtxn, "index-0", None).unwrap();
// Least recently used is now index-1
check_first_unavailable(&mapper, uuids[1], true);
}
#[test]
fn resize_index() {
let (mapper, env, _handle) = IndexMapper::test();
let index = mapper.create_index(env.write_txn().unwrap(), "index", None).unwrap();
assert_index_size(index, mapper.index_base_map_size);
mapper.resize_index(&env.read_txn().unwrap(), "index").unwrap();
let index = mapper.create_index(env.write_txn().unwrap(), "index", None).unwrap();
assert_index_size(index, mapper.index_base_map_size + mapper.index_growth_amount);
mapper.resize_index(&env.read_txn().unwrap(), "index").unwrap();
let index = mapper.create_index(env.write_txn().unwrap(), "index", None).unwrap();
assert_index_size(index, mapper.index_base_map_size + mapper.index_growth_amount * 2);
}
fn assert_index_size(index: Index, expected: usize) {
let expected = clamp_to_page_size(expected);
let index_map_size = index.map_size().unwrap();
assert_eq!(index_map_size, expected);
}
}
}
/// Whether the index is available for use or is forbidden to be inserted back in the index map
#[allow(clippy::large_enum_variant)]
#[derive(Clone)]
pub enum IndexStatus {
/// Not currently in the index map.
Missing,
/// Do not insert it back in the index map as it is currently being deleted.
BeingDeleted,
/// Temporarily do not insert the index in the index map as it is currently being resized/evicted from the map.
Closing(index_map::ClosingIndex),
/// You can use the index without worrying about anything.
Available(Index),
}
@@ -53,36 +450,21 @@ impl IndexMapper {
pub fn new(
env: &Env,
base_path: PathBuf,
index_size: usize,
index_base_map_size: usize,
index_growth_amount: usize,
index_count: usize,
indexer_config: IndexerConfig,
) -> Result<Self> {
Ok(Self {
index_map: Arc::default(),
index_map: Arc::new(RwLock::new(IndexMap::new(index_count))),
index_mapping: env.create_database(Some(INDEX_MAPPING))?,
base_path,
index_size,
index_base_map_size,
index_growth_amount,
indexer_config: Arc::new(indexer_config),
})
}
/// Create or open an index in the specified path.
/// The path *must* exists or an error will be thrown.
fn create_or_open_index(
&self,
path: &Path,
date: Option<(OffsetDateTime, OffsetDateTime)>,
) -> Result<Index> {
let mut options = EnvOpenOptions::new();
options.map_size(clamp_to_page_size(self.index_size));
options.max_readers(1024);
if let Some((created, updated)) = date {
Ok(Index::new_with_creation_dates(options, path, created, updated)?)
} else {
Ok(Index::new(options, path)?)
}
}
/// Get or create the index.
pub fn create_index(
&self,
@@ -102,15 +484,17 @@ impl IndexMapper {
let index_path = self.base_path.join(uuid.to_string());
fs::create_dir_all(&index_path)?;
let index = self.create_or_open_index(&index_path, date)?;
// Error if the UUIDv4 somehow already exists in the map, since it should be fresh.
// This is very unlikely to happen in practice.
// TODO: it would be better to lazily create the index. But we need an Index::open function for milli.
let index = self.index_map.write().unwrap().create(
&uuid,
&index_path,
date,
self.index_base_map_size,
)?;
wtxn.commit()?;
// TODO: it would be better to lazily create the index. But we need an Index::open function for milli.
if let Some(BeingDeleted) =
self.index_map.write().unwrap().insert(uuid, Available(index.clone()))
{
panic!("Uuid v4 conflict.");
}
Ok(index)
}
@@ -130,14 +514,31 @@ impl IndexMapper {
assert!(self.index_mapping.delete(&mut wtxn, name)?);
wtxn.commit()?;
// We remove the index from the in-memory index map.
let mut lock = self.index_map.write().unwrap();
let closing_event = match lock.insert(uuid, BeingDeleted) {
Some(Available(index)) => Some(index.prepare_for_closing()),
_ => None,
};
drop(lock);
let mut tries = 0;
// We remove the index from the in-memory index map.
let closing_event = loop {
let mut lock = self.index_map.write().unwrap();
match lock.start_deletion(&uuid) {
Ok(env_closing) => break env_closing,
Err(Some(reopen)) => {
// drop the lock here so that we don't synchronously wait for the index to close.
drop(lock);
tries += 1;
if tries >= 100 {
panic!("Too many attempts to close index {name} prior to deletion.")
}
let reopen = if let Some(reopen) = reopen.wait_timeout(Duration::from_secs(6)) {
reopen
} else {
continue;
};
reopen.close(&mut self.index_map.write().unwrap());
continue;
}
Err(None) => return Ok(()),
}
};
let index_map = self.index_map.clone();
let index_path = self.base_path.join(uuid.to_string());
@@ -146,7 +547,7 @@ impl IndexMapper {
.name(String::from("index_deleter"))
.spawn(move || {
// We first wait to be sure that the previously opened index is effectively closed.
// This can take a lot of time, this is why we do that in a seperate thread.
// This can take a lot of time, this is why we do that in a separate thread.
if let Some(closing_event) = closing_event {
closing_event.wait();
}
@@ -160,7 +561,7 @@ impl IndexMapper {
}
// Finally we remove the entry from the index map.
assert!(matches!(index_map.write().unwrap().remove(&uuid), Some(BeingDeleted)));
index_map.write().unwrap().end_deletion(&uuid);
})
.unwrap();
@@ -171,6 +572,26 @@ impl IndexMapper {
Ok(self.index_mapping.get(rtxn, name)?.is_some())
}
/// Resizes the maximum size of the specified index to the double of its current maximum size.
///
/// This operation involves closing the underlying environment and so can take a long time to complete.
///
/// # Panics
///
/// - If the Index corresponding to the passed name is concurrently being deleted/resized or cannot be found in the
/// in memory hash map.
pub fn resize_index(&self, rtxn: &RoTxn, name: &str) -> Result<()> {
let uuid = self
.index_mapping
.get(rtxn, name)?
.ok_or_else(|| Error::IndexNotFound(name.to_string()))?;
// We remove the index from the in-memory index map.
self.index_map.write().unwrap().close_for_resize(&uuid, self.index_growth_amount);
Ok(())
}
/// Return an index, may open it if it wasn't already opened.
pub fn index(&self, rtxn: &RoTxn, name: &str) -> Result<Index> {
let uuid = self
@@ -179,31 +600,54 @@ impl IndexMapper {
.ok_or_else(|| Error::IndexNotFound(name.to_string()))?;
// we clone here to drop the lock before entering the match
let index = self.index_map.read().unwrap().get(&uuid).cloned();
let index = match index {
Some(Available(index)) => index,
Some(BeingDeleted) => return Err(Error::IndexNotFound(name.to_string())),
// since we're lazy, it's possible that the index has not been opened yet.
None => {
let mut index_map = self.index_map.write().unwrap();
// between the read lock and the write lock it's not impossible
// that someone already opened the index (eg if two search happens
// at the same time), thus before opening it we check a second time
// if it's not already there.
// Since there is a good chance it's not already there we can use
// the entry method.
match index_map.entry(uuid) {
Entry::Vacant(entry) => {
let index_path = self.base_path.join(uuid.to_string());
let mut tries = 0;
let index = loop {
tries += 1;
if tries > 100 {
panic!("Too many spurious wake ups while the index is being resized");
}
let index = self.index_map.read().unwrap().get(&uuid);
let index = self.create_or_open_index(&index_path, None)?;
entry.insert(Available(index.clone()));
index
}
Entry::Occupied(entry) => match entry.get() {
Available(index) => index.clone(),
match index {
Available(index) => break index,
Closing(reopen) => {
// Avoiding deadlocks: no lock taken while doing this operation.
let reopen = if let Some(reopen) = reopen.wait_timeout(Duration::from_secs(6)) {
reopen
} else {
continue;
};
let index_path = self.base_path.join(uuid.to_string());
// take the lock to reopen the environment.
reopen.reopen(&mut self.index_map.write().unwrap(), &index_path)?;
continue;
}
BeingDeleted => return Err(Error::IndexNotFound(name.to_string())),
// since we're lazy, it's possible that the index has not been opened yet.
Missing => {
let mut index_map = self.index_map.write().unwrap();
// between the read lock and the write lock it's not impossible
// that someone already opened the index (eg if two searches happen
// at the same time), thus before opening it we check a second time
// if it's not already there.
match index_map.get(&uuid) {
Missing => {
let index_path = self.base_path.join(uuid.to_string());
break index_map.create(
&uuid,
&index_path,
None,
self.index_base_map_size,
)?;
}
Available(index) => break index,
Closing(_) => {
// the reopening will be handled in the next loop operation
continue;
}
BeingDeleted => return Err(Error::IndexNotFound(name.to_string())),
},
}
}
}
};

View File

@@ -24,6 +24,7 @@ pub mod error;
mod index_mapper;
#[cfg(test)]
mod insta_snapshot;
mod lru;
mod utils;
mod uuid_codec;
@@ -31,7 +32,7 @@ pub type Result<T> = std::result::Result<T, Error>;
pub type TaskId = u32;
use std::ops::{Bound, RangeBounds};
use std::path::PathBuf;
use std::path::{Path, PathBuf};
use std::sync::atomic::AtomicBool;
use std::sync::atomic::Ordering::Relaxed;
use std::sync::{Arc, RwLock};
@@ -43,6 +44,7 @@ use file_store::FileStore;
use meilisearch_types::error::ResponseError;
use meilisearch_types::heed::types::{OwnedType, SerdeBincode, SerdeJson, Str};
use meilisearch_types::heed::{self, Database, Env, RoTxn};
use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::milli;
use meilisearch_types::milli::documents::DocumentsBatchBuilder;
use meilisearch_types::milli::update::IndexerConfig;
@@ -229,8 +231,12 @@ pub struct IndexSchedulerOptions {
pub dumps_path: PathBuf,
/// The maximum size, in bytes, of the task index.
pub task_db_size: usize,
/// The maximum size, in bytes, of each meilisearch index.
pub index_size: usize,
/// The size, in bytes, with which a meilisearch index is opened the first time of each meilisearch index.
pub index_base_map_size: usize,
/// The size, in bytes, by which the map size of an index is increased when it resized due to being full.
pub index_growth_amount: usize,
/// The number of indexes that can be concurrently opened in memory.
pub index_count: usize,
/// Configuration used during indexing for each meilisearch index.
pub indexer_config: IndexerConfig,
/// Set to `true` iff the index scheduler is allowed to automatically
@@ -360,9 +366,25 @@ impl IndexScheduler {
std::fs::create_dir_all(&options.indexes_path)?;
std::fs::create_dir_all(&options.dumps_path)?;
let task_db_size = clamp_to_page_size(options.task_db_size);
let budget = if options.indexer_config.skip_index_budget {
IndexBudget {
map_size: options.index_base_map_size,
index_count: options.index_count,
task_db_size,
}
} else {
Self::index_budget(
&options.tasks_path,
options.index_base_map_size,
task_db_size,
options.index_count,
)
};
let env = heed::EnvOpenOptions::new()
.max_dbs(10)
.map_size(clamp_to_page_size(options.task_db_size))
.map_size(budget.task_db_size)
.open(options.tasks_path)?;
let file_store = FileStore::new(&options.update_file_path)?;
@@ -382,7 +404,9 @@ impl IndexScheduler {
index_mapper: IndexMapper::new(
&env,
options.indexes_path,
options.index_size,
budget.map_size,
options.index_growth_amount,
budget.index_count,
options.indexer_config,
)?,
env,
@@ -406,6 +430,60 @@ impl IndexScheduler {
Ok(this)
}
fn index_budget(
tasks_path: &Path,
base_map_size: usize,
mut task_db_size: usize,
max_index_count: usize,
) -> IndexBudget {
let budget = utils::dichotomic_search(base_map_size, |map_size| {
Self::is_good_heed(tasks_path, map_size)
});
log::info!("memmap budget: {budget}B");
let mut budget = budget / 2;
if task_db_size > (budget / 2) {
task_db_size = clamp_to_page_size(budget * 2 / 5);
log::warn!(
"Decreasing max size of task DB to {task_db_size}B due to constrained memory space"
);
}
budget -= task_db_size;
// won't be mutated again
let budget = budget;
let task_db_size = task_db_size;
log::info!("index budget: {budget}B");
let mut index_count = budget / base_map_size;
if index_count < 2 {
// take a bit less than half than the budget to make sure we can always afford to open an index
let map_size = (budget * 2) / 5;
// single index of max budget
log::warn!("1 index of {map_size}B can be opened simultaneously.");
return IndexBudget { map_size, index_count: 1, task_db_size };
}
// give us some space for an additional index when the cache is already full
// decrement is OK because index_count >= 2.
index_count -= 1;
if index_count > max_index_count {
index_count = max_index_count;
}
log::info!("Up to {index_count} indexes of {base_map_size}B opened simultaneously.");
IndexBudget { map_size: base_map_size, index_count, task_db_size }
}
fn is_good_heed(tasks_path: &Path, map_size: usize) -> bool {
if let Ok(env) =
heed::EnvOpenOptions::new().map_size(clamp_to_page_size(map_size)).open(tasks_path)
{
env.prepare_for_closing().wait();
true
} else {
false
}
}
pub fn read_txn(&self) -> Result<RoTxn> {
self.env.read_txn().map_err(|e| e.into())
}
@@ -422,12 +500,12 @@ impl IndexScheduler {
#[cfg(test)]
run.breakpoint(Breakpoint::Init);
loop {
run.wake_up.wait();
run.wake_up.wait();
loop {
match run.tick() {
Ok(0) => (),
Ok(_) => run.wake_up.signal(),
Ok(TickOutcome::TickAgain(_)) => (),
Ok(TickOutcome::WaitForSignal) => run.wake_up.wait(),
Err(e) => {
log::error!("{}", e);
// Wait one second when an irrecoverable error occurs.
@@ -440,7 +518,6 @@ impl IndexScheduler {
) {
std::thread::sleep(Duration::from_secs(1));
}
run.wake_up.signal();
}
}
}
@@ -630,7 +707,7 @@ impl IndexScheduler {
&self,
rtxn: &RoTxn,
query: &Query,
authorized_indexes: &Option<Vec<String>>,
authorized_indexes: &Option<Vec<IndexUidPattern>>,
) -> Result<RoaringBitmap> {
let mut tasks = self.get_task_ids(rtxn, query)?;
@@ -648,7 +725,7 @@ impl IndexScheduler {
let all_indexes_iter = self.index_tasks.iter(rtxn)?;
for result in all_indexes_iter {
let (index, index_tasks) = result?;
if !authorized_indexes.contains(&index.to_owned()) {
if !authorized_indexes.iter().any(|p| p.matches_str(index)) {
tasks -= index_tasks;
}
}
@@ -668,7 +745,7 @@ impl IndexScheduler {
pub fn get_tasks_from_authorized_indexes(
&self,
query: Query,
authorized_indexes: Option<Vec<String>>,
authorized_indexes: Option<Vec<IndexUidPattern>>,
) -> Result<Vec<Task>> {
let rtxn = self.env.read_txn()?;
@@ -764,8 +841,8 @@ impl IndexScheduler {
Ok(task)
}
/// Register a new task comming from a dump in the scheduler.
/// By takinig a mutable ref we're pretty sure no one will ever import a dump while actix is running.
/// Register a new task coming from a dump in the scheduler.
/// By taking a mutable ref we're pretty sure no one will ever import a dump while actix is running.
pub fn register_dumped_task(
&mut self,
task: TaskDump,
@@ -926,7 +1003,7 @@ impl IndexScheduler {
/// 5. Reset the in-memory list of processed tasks.
///
/// Returns the number of processed tasks.
fn tick(&self) -> Result<usize> {
fn tick(&self) -> Result<TickOutcome> {
#[cfg(test)]
{
*self.run_loop_iteration.write().unwrap() += 1;
@@ -937,8 +1014,9 @@ impl IndexScheduler {
let batch =
match self.create_next_batch(&rtxn).map_err(|e| Error::CreateBatch(Box::new(e)))? {
Some(batch) => batch,
None => return Ok(0),
None => return Ok(TickOutcome::WaitForSignal),
};
let index_uid = batch.index_uid().map(ToOwned::to_owned);
drop(rtxn);
// 1. store the starting date with the bitmap of processing tasks.
@@ -1009,7 +1087,23 @@ impl IndexScheduler {
// the `started_at` date times and `processings` of the current processing tasks.
// This date time is used by the task cancelation to store the right `started_at`
// date in the task on disk.
return Ok(0);
return Ok(TickOutcome::TickAgain(0));
}
// If an index said it was full, we need to:
// 1. identify which index is full
// 2. close the associated environment
// 3. resize it
// 4. re-schedule tasks
Err(Error::Milli(milli::Error::UserError(
milli::UserError::MaxDatabaseSizeReached,
))) if index_uid.is_some() => {
// fixme: add index_uid to match to avoid the unwrap
let index_uid = index_uid.unwrap();
// fixme: handle error more gracefully? not sure when this could happen
self.index_mapper.resize_index(&wtxn, &index_uid)?;
wtxn.abort().map_err(Error::HeedTransaction)?;
return Ok(TickOutcome::TickAgain(0));
}
// In case of a failure we must get back and patch all the tasks with the error.
Err(err) => {
@@ -1049,7 +1143,7 @@ impl IndexScheduler {
#[cfg(test)]
self.breakpoint(Breakpoint::AfterProcessing);
Ok(processed_tasks)
Ok(TickOutcome::TickAgain(processed_tasks))
}
pub(crate) fn delete_persisted_task_data(&self, task: &Task) -> Result<()> {
@@ -1084,6 +1178,26 @@ impl IndexScheduler {
}
}
/// The outcome of calling the [`IndexScheduler::tick`] function.
pub enum TickOutcome {
/// The scheduler should immediately attempt another `tick`.
///
/// The `usize` field contains the number of processed tasks.
TickAgain(usize),
/// The scheduler should wait for an external signal before attempting another `tick`.
WaitForSignal,
}
/// How many indexes we can afford to have open simultaneously.
struct IndexBudget {
/// Map size of an index.
map_size: usize,
/// Index count of an index.
index_count: usize,
/// For very constrained systems we might need to reduce the base task_db_size so we can accept at least one index.
task_db_size: usize,
}
#[cfg(test)]
mod tests {
use std::io::{BufWriter, Seek, Write};
@@ -1127,6 +1241,8 @@ mod tests {
let tempdir = TempDir::new().unwrap();
let (sender, receiver) = crossbeam::channel::bounded(0);
let indexer_config = IndexerConfig { skip_index_budget: true, ..Default::default() };
let options = IndexSchedulerOptions {
version_file_path: tempdir.path().join(VERSION_FILE_NAME),
auth_path: tempdir.path().join("auth"),
@@ -1136,8 +1252,10 @@ mod tests {
snapshots_path: tempdir.path().join("snapshots"),
dumps_path: tempdir.path().join("dumps"),
task_db_size: 1000 * 1000, // 1 MB, we don't use MiB on purpose.
index_size: 1000 * 1000, // 1 MB, we don't use MiB on purpose.
indexer_config: IndexerConfig::default(),
index_base_map_size: 1000 * 1000, // 1 MB, we don't use MiB on purpose.
index_growth_amount: 1000 * 1000, // 1 MB
index_count: 5,
indexer_config,
autobatching_enabled,
};
@@ -2521,7 +2639,11 @@ mod tests {
let query = Query { index_uids: Some(vec!["catto".to_owned()]), ..Default::default() };
let tasks = index_scheduler
.get_task_ids_from_authorized_indexes(&rtxn, &query, &Some(vec!["doggo".to_owned()]))
.get_task_ids_from_authorized_indexes(
&rtxn,
&query,
&Some(vec![IndexUidPattern::new_unchecked("doggo")]),
)
.unwrap();
// we have asked for only the tasks associated with catto, but are only authorized to retrieve the tasks
// associated with doggo -> empty result
@@ -2529,7 +2651,11 @@ mod tests {
let query = Query::default();
let tasks = index_scheduler
.get_task_ids_from_authorized_indexes(&rtxn, &query, &Some(vec!["doggo".to_owned()]))
.get_task_ids_from_authorized_indexes(
&rtxn,
&query,
&Some(vec![IndexUidPattern::new_unchecked("doggo")]),
)
.unwrap();
// we asked for all the tasks, but we are only authorized to retrieve the doggo tasks
// -> only the index creation of doggo should be returned
@@ -2540,7 +2666,10 @@ mod tests {
.get_task_ids_from_authorized_indexes(
&rtxn,
&query,
&Some(vec!["catto".to_owned(), "doggo".to_owned()]),
&Some(vec![
IndexUidPattern::new_unchecked("catto"),
IndexUidPattern::new_unchecked("doggo"),
]),
)
.unwrap();
// we asked for all the tasks, but we are only authorized to retrieve the doggo and catto tasks
@@ -2588,7 +2717,11 @@ mod tests {
let query = Query { canceled_by: Some(vec![task_cancelation.uid]), ..Query::default() };
let tasks = index_scheduler
.get_task_ids_from_authorized_indexes(&rtxn, &query, &Some(vec!["doggo".to_string()]))
.get_task_ids_from_authorized_indexes(
&rtxn,
&query,
&Some(vec![IndexUidPattern::new_unchecked("doggo")]),
)
.unwrap();
// Return only 1 because the user is not authorized to see task 2
snapshot!(snapshot_bitmap(&tasks), @"[1,]");

202
index-scheduler/src/lru.rs Normal file
View File

@@ -0,0 +1,202 @@
//! Thread-safe `Vec`-backend LRU cache using [`std::sync::atomic::AtomicU64`] for synchronization.
use std::sync::atomic::{AtomicU64, Ordering};
/// Thread-safe `Vec`-backend LRU cache
#[derive(Debug)]
pub struct Lru<T> {
data: Vec<(AtomicU64, T)>,
generation: AtomicU64,
cap: usize,
}
impl<T> Lru<T> {
/// Creates a new LRU cache with the specified capacity.
///
/// The capacity is allocated up-front, and will never change through a [`Self::put`] operation.
///
/// # Panics
///
/// - If the capacity is 0.
/// - If the capacity exceeds `isize::MAX` bytes.
pub fn new(cap: usize) -> Self {
assert_ne!(cap, 0, "The capacity of a cache cannot be 0");
Self {
// Note: since the element of the vector contains an AtomicU64, it is definitely not zero-sized so cap will never be usize::MAX.
data: Vec::with_capacity(cap),
generation: AtomicU64::new(0),
cap,
}
}
/// The capacity of this LRU cache, that is the maximum number of elements it can hold before evicting elements from the cache.
///
/// The cache will contain at most this number of elements at any given time.
pub fn capacity(&self) -> usize {
self.cap
}
fn next_generation(&self) -> u64 {
// Acquire so this "happens-before" any potential store to a data cell (with Release ordering)
let generation = self.generation.fetch_add(1, Ordering::Acquire);
generation + 1
}
fn next_generation_mut(&mut self) -> u64 {
let generation = self.generation.get_mut();
*generation += 1;
*generation
}
/// Add a value in the cache, evicting an older value if necessary.
///
/// If a value was evicted from the cache, it is returned.
///
/// # Complexity
///
/// - If the cache is full, then linear in the capacity.
/// - Otherwise constant.
pub fn put(&mut self, value: T) -> Option<T> {
// no need for a memory fence: we assume that whichever mechanism provides us synchronization
// (very probably, a RwLock) takes care of fencing for us.
let next_generation = self.next_generation_mut();
let evicted = if self.is_full() { self.pop() } else { None };
self.data.push((AtomicU64::new(next_generation), value));
evicted
}
/// Evict the oldest value from the cache.
///
/// If the cache is empty, `None` will be returned.
///
/// # Complexity
///
/// - Linear in the capacity of the cache.
pub fn pop(&mut self) -> Option<T> {
// Iterator::min_by_key provides shared references to its elements,
// but we need (and can afford!) an exclusive one, so let's make an explicit loop
let mut min_generation_index = None;
for (index, (generation, _)) in self.data.iter_mut().enumerate() {
let generation = *generation.get_mut();
if let Some((_, min_generation)) = min_generation_index {
if min_generation > generation {
min_generation_index = Some((index, generation));
}
} else {
min_generation_index = Some((index, generation))
}
}
min_generation_index.map(|(min_index, _)| self.data.swap_remove(min_index).1)
}
/// The current number of elements in the cache.
///
/// This value is guaranteed to be less than or equal to [`Self::capacity`].
pub fn len(&self) -> usize {
self.data.len()
}
/// Returns `true` if putting any additional element in the cache would cause the eviction of an element.
pub fn is_full(&self) -> bool {
self.len() == self.capacity()
}
}
pub struct LruMap<K, V>(Lru<(K, V)>);
impl<K, V> LruMap<K, V>
where
K: Eq,
{
/// Creates a new LRU cache map with the specified capacity.
///
/// The capacity is allocated up-front, and will never change through a [`Self::insert`] operation.
///
/// # Panics
///
/// - If the capacity is 0.
/// - If the capacity exceeds `isize::MAX` bytes.
pub fn new(cap: usize) -> Self {
Self(Lru::new(cap))
}
/// Gets a value in the cache map by its key.
///
/// If no value matches, `None` will be returned.
///
/// # Complexity
///
/// - Linear in the capacity of the cache.
pub fn get(&self, key: &K) -> Option<&V> {
for (generation, (candidate, value)) in self.0.data.iter() {
if key == candidate {
generation.store(self.0.next_generation(), Ordering::Release);
return Some(value);
}
}
None
}
/// Gets a value in the cache map by its key.
///
/// If no value matches, `None` will be returned.
///
/// # Complexity
///
/// - Linear in the capacity of the cache.
pub fn get_mut(&mut self, key: &K) -> Option<&mut V> {
let next_generation = self.0.next_generation_mut();
for (generation, (candidate, value)) in self.0.data.iter_mut() {
if key == candidate {
*generation.get_mut() = next_generation;
return Some(value);
}
}
None
}
/// Inserts a value in the cache map by its key, replacing any existing value and returning any evicted value.
///
/// # Complexity
///
/// - Linear in the capacity of the cache.
pub fn insert(&mut self, key: K, mut value: V) -> InsertionOutcome<K, V> {
match self.get_mut(&key) {
Some(old_value) => {
std::mem::swap(old_value, &mut value);
InsertionOutcome::Replaced(value)
}
None => match self.0.put((key, value)) {
Some((key, value)) => InsertionOutcome::Evicted(key, value),
None => InsertionOutcome::InsertedNew,
},
}
}
/// Removes an element from the cache map by its key, returning its value.
///
/// Returns `None` if there was no element with this key in the cache.
///
/// # Complexity
///
/// - Linear in the capacity of the cache.
pub fn remove(&mut self, key: &K) -> Option<V> {
for (index, (_, (candidate, _))) in self.0.data.iter_mut().enumerate() {
if key == candidate {
return Some(self.0.data.swap_remove(index).1 .1);
}
}
None
}
}
/// The result of an insertion in a LRU map.
pub enum InsertionOutcome<K, V> {
/// The key was not in the cache, the key-value pair has been inserted.
InsertedNew,
/// The key was not in the cache and an old key-value pair was evicted from the cache to make room for its insertions.
Evicted(K, V),
/// The key was already in the cache map, its value has been updated.
Replaced(V),
}

View File

@@ -529,3 +529,37 @@ impl IndexScheduler {
}
}
}
pub fn dichotomic_search(start_point: usize, mut is_good: impl FnMut(usize) -> bool) -> usize {
let mut biggest_good = None;
let mut smallest_bad = None;
let mut current = start_point;
loop {
let is_good = is_good(current);
(biggest_good, smallest_bad, current) = match (biggest_good, smallest_bad, is_good) {
(None, None, false) => (None, Some(current), current / 2),
(None, None, true) => (Some(current), None, current * 2),
(None, Some(smallest_bad), true) => {
(Some(current), Some(smallest_bad), (current + smallest_bad) / 2)
}
(None, Some(_), false) => (None, Some(current), current / 2),
(Some(_), None, true) => (Some(current), None, current * 2),
(Some(biggest_good), None, false) => {
(Some(biggest_good), Some(current), (biggest_good + current) / 2)
}
(Some(_), Some(smallest_bad), true) => {
(Some(current), Some(smallest_bad), (smallest_bad + current) / 2)
}
(Some(biggest_good), Some(_), false) => {
(Some(biggest_good), Some(current), (biggest_good + current) / 2)
}
};
if current == 0 {
return current;
}
if smallest_bad.is_some() && biggest_good.is_some() && biggest_good >= Some(current) {
return current;
}
}
}

View File

@@ -7,6 +7,7 @@ edition = "2021"
base64 = "0.13.1"
enum-iterator = "1.1.3"
hmac = "0.12.1"
maplit = "1.0.2"
meilisearch-types = { path = "../meilisearch-types" }
rand = "0.8.5"
roaring = { version = "0.10.0", features = ["serde"] }

View File

@@ -7,9 +7,10 @@ use std::path::Path;
use std::sync::Arc;
use error::{AuthControllerError, Result};
use maplit::hashset;
use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::keys::{Action, CreateApiKey, Key, PatchApiKey};
use meilisearch_types::milli::update::Setting;
use meilisearch_types::star_or::StarOr;
use serde::{Deserialize, Serialize};
pub use store::open_auth_store_env;
use store::{generate_key_as_hexa, HeedAuthStore};
@@ -85,29 +86,12 @@ impl AuthController {
search_rules: Option<SearchRules>,
) -> Result<AuthFilter> {
let mut filters = AuthFilter::default();
let key = self
.store
.get_api_key(uid)?
.ok_or_else(|| AuthControllerError::ApiKeyNotFound(uid.to_string()))?;
let key = self.get_key(uid)?;
if !key.indexes.iter().any(|i| i == &StarOr::Star) {
filters.search_rules = match search_rules {
// Intersect search_rules with parent key authorized indexes.
Some(search_rules) => SearchRules::Map(
key.indexes
.into_iter()
.filter_map(|index| {
search_rules.get_index_search_rules(&format!("{index}")).map(
|index_search_rules| (index.to_string(), Some(index_search_rules)),
)
})
.collect(),
),
None => SearchRules::Set(key.indexes.into_iter().map(|x| x.to_string()).collect()),
};
} else if let Some(search_rules) = search_rules {
filters.search_rules = search_rules;
}
filters.search_rules = match search_rules {
Some(search_rules) => search_rules,
None => SearchRules::Set(key.indexes.into_iter().collect()),
};
filters.allow_index_creation = self.is_key_authorized(uid, Action::IndexesAdd, None)?;
@@ -150,9 +134,7 @@ impl AuthController {
.get_expiration_date(uid, action, None)?
.or(match index {
// else check if the key has access to the requested index.
Some(index) => {
self.store.get_expiration_date(uid, action, Some(index.as_bytes()))?
}
Some(index) => self.store.get_expiration_date(uid, action, Some(index))?,
// or to any index if no index has been requested.
None => self.store.prefix_first_expiration_date(uid, action)?,
}) {
@@ -192,42 +174,54 @@ impl Default for AuthFilter {
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(untagged)]
pub enum SearchRules {
Set(HashSet<String>),
Map(HashMap<String, Option<IndexSearchRules>>),
Set(HashSet<IndexUidPattern>),
Map(HashMap<IndexUidPattern, Option<IndexSearchRules>>),
}
impl Default for SearchRules {
fn default() -> Self {
Self::Set(Some("*".to_string()).into_iter().collect())
Self::Set(hashset! { IndexUidPattern::all() })
}
}
impl SearchRules {
pub fn is_index_authorized(&self, index: &str) -> bool {
match self {
Self::Set(set) => set.contains("*") || set.contains(index),
Self::Map(map) => map.contains_key("*") || map.contains_key(index),
Self::Set(set) => {
set.contains("*")
|| set.contains(index)
|| set.iter().any(|pattern| pattern.matches_str(index))
}
Self::Map(map) => {
map.contains_key("*")
|| map.contains_key(index)
|| map.keys().any(|pattern| pattern.matches_str(index))
}
}
}
pub fn get_index_search_rules(&self, index: &str) -> Option<IndexSearchRules> {
match self {
Self::Set(set) => {
if set.contains("*") || set.contains(index) {
Self::Set(_) => {
if self.is_index_authorized(index) {
Some(IndexSearchRules::default())
} else {
None
}
}
Self::Map(map) => {
map.get(index).or_else(|| map.get("*")).map(|isr| isr.clone().unwrap_or_default())
// We must take the most retrictive rule of this index uid patterns set of rules.
map.iter()
.filter(|(pattern, _)| pattern.matches_str(index))
.max_by_key(|(pattern, _)| (pattern.is_exact(), pattern.len()))
.and_then(|(_, rule)| rule.clone())
}
}
}
/// Return the list of indexes such that `self.is_index_authorized(index) == true`,
/// or `None` if all indexes satisfy this condition.
pub fn authorized_indexes(&self) -> Option<Vec<String>> {
pub fn authorized_indexes(&self) -> Option<Vec<IndexUidPattern>> {
match self {
SearchRules::Set(set) => {
if set.contains("*") {
@@ -248,7 +242,7 @@ impl SearchRules {
}
impl IntoIterator for SearchRules {
type Item = (String, IndexSearchRules);
type Item = (IndexUidPattern, IndexSearchRules);
type IntoIter = Box<dyn Iterator<Item = Self::Item>>;
fn into_iter(self) -> Self::IntoIter {

View File

@@ -5,20 +5,21 @@ use std::convert::{TryFrom, TryInto};
use std::fs::create_dir_all;
use std::path::Path;
use std::str;
use std::str::FromStr;
use std::sync::Arc;
use hmac::{Hmac, Mac};
use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::keys::KeyId;
use meilisearch_types::milli;
use meilisearch_types::milli::heed::types::{ByteSlice, DecodeIgnore, SerdeJson};
use meilisearch_types::milli::heed::{Database, Env, EnvOpenOptions, RwTxn};
use meilisearch_types::star_or::StarOr;
use sha2::Sha256;
use time::OffsetDateTime;
use uuid::fmt::Hyphenated;
use uuid::Uuid;
use super::error::Result;
use super::error::{AuthControllerError, Result};
use super::{Action, Key};
const AUTH_STORE_SIZE: usize = 1_073_741_824; //1GiB
@@ -129,7 +130,7 @@ impl HeedAuthStore {
}
}
let no_index_restriction = key.indexes.contains(&StarOr::Star);
let no_index_restriction = key.indexes.iter().any(|p| p.matches_all());
for action in actions {
if no_index_restriction {
// If there is no index restriction we put None.
@@ -214,11 +215,28 @@ impl HeedAuthStore {
&self,
uid: Uuid,
action: Action,
index: Option<&[u8]>,
index: Option<&str>,
) -> Result<Option<Option<OffsetDateTime>>> {
let rtxn = self.env.read_txn()?;
let tuple = (&uid, &action, index);
Ok(self.action_keyid_index_expiration.get(&rtxn, &tuple)?)
let tuple = (&uid, &action, index.map(|s| s.as_bytes()));
match self.action_keyid_index_expiration.get(&rtxn, &tuple)? {
Some(expiration) => Ok(Some(expiration)),
None => {
let tuple = (&uid, &action, None);
for result in self.action_keyid_index_expiration.prefix_iter(&rtxn, &tuple)? {
let ((_, _, index_uid_pattern), expiration) = result?;
if let Some((pattern, index)) = index_uid_pattern.zip(index) {
let index_uid_pattern = str::from_utf8(pattern)?;
let pattern = IndexUidPattern::from_str(index_uid_pattern)
.map_err(|e| AuthControllerError::Internal(Box::new(e)))?;
if pattern.matches_str(index) {
return Ok(Some(expiration));
}
}
}
Ok(None)
}
}
}
pub fn prefix_first_expiration_date(

View File

@@ -9,7 +9,7 @@ actix-web = { version = "4.2.1", default-features = false }
anyhow = "1.0.65"
convert_case = "0.6.0"
csv = "1.1.6"
deserr = "0.3.0"
deserr = "0.4.1"
either = { version = "1.6.1", features = ["serde"] }
enum-iterator = "1.1.3"
file-store = { path = "../file-store" }

View File

@@ -1,328 +0,0 @@
/*!
This module implements the error messages of deserialization errors.
We try to:
1. Give a human-readable description of where the error originated.
2. Use the correct terms depending on the format of the request (json/query param)
3. Categorise the type of the error (e.g. missing field, wrong value type, unexpected error, etc.)
*/
use deserr::{ErrorKind, IntoValue, ValueKind, ValuePointerRef};
use super::{DeserrJsonError, DeserrQueryParamError};
use crate::error::{Code, ErrorCode};
/// Return a description of the given location in a Json, preceded by the given article.
/// e.g. `at .key1[8].key2`. If the location is the origin, the given article will not be
/// included in the description.
pub fn location_json_description(location: ValuePointerRef, article: &str) -> String {
fn rec(location: ValuePointerRef) -> String {
match location {
ValuePointerRef::Origin => String::new(),
ValuePointerRef::Key { key, prev } => rec(*prev) + "." + key,
ValuePointerRef::Index { index, prev } => format!("{}[{index}]", rec(*prev)),
}
}
match location {
ValuePointerRef::Origin => String::new(),
_ => {
format!("{article} `{}`", rec(location))
}
}
}
/// Return a description of the list of value kinds for a Json payload.
fn value_kinds_description_json(kinds: &[ValueKind]) -> String {
// Rank each value kind so that they can be sorted (and deduplicated)
// Having a predictable order helps with pattern matching
fn order(kind: &ValueKind) -> u8 {
match kind {
ValueKind::Null => 0,
ValueKind::Boolean => 1,
ValueKind::Integer => 2,
ValueKind::NegativeInteger => 3,
ValueKind::Float => 4,
ValueKind::String => 5,
ValueKind::Sequence => 6,
ValueKind::Map => 7,
}
}
// Return a description of a single value kind, preceded by an article
fn single_description(kind: &ValueKind) -> &'static str {
match kind {
ValueKind::Null => "null",
ValueKind::Boolean => "a boolean",
ValueKind::Integer => "a positive integer",
ValueKind::NegativeInteger => "a negative integer",
ValueKind::Float => "a number",
ValueKind::String => "a string",
ValueKind::Sequence => "an array",
ValueKind::Map => "an object",
}
}
fn description_rec(kinds: &[ValueKind], count_items: &mut usize, message: &mut String) {
let (msg_part, rest): (_, &[ValueKind]) = match kinds {
[] => (String::new(), &[]),
[ValueKind::Integer | ValueKind::NegativeInteger, ValueKind::Float, rest @ ..] => {
("a number".to_owned(), rest)
}
[ValueKind::Integer, ValueKind::NegativeInteger, ValueKind::Float, rest @ ..] => {
("a number".to_owned(), rest)
}
[ValueKind::Integer, ValueKind::NegativeInteger, rest @ ..] => {
("an integer".to_owned(), rest)
}
[a] => (single_description(a).to_owned(), &[]),
[a, rest @ ..] => (single_description(a).to_owned(), rest),
};
if rest.is_empty() {
if *count_items == 0 {
message.push_str(&msg_part);
} else if *count_items == 1 {
message.push_str(&format!(" or {msg_part}"));
} else {
message.push_str(&format!(", or {msg_part}"));
}
} else {
if *count_items == 0 {
message.push_str(&msg_part);
} else {
message.push_str(&format!(", {msg_part}"));
}
*count_items += 1;
description_rec(rest, count_items, message);
}
}
let mut kinds = kinds.to_owned();
kinds.sort_by_key(order);
kinds.dedup();
if kinds.is_empty() {
// Should not happen ideally
"a different value".to_owned()
} else {
let mut message = String::new();
description_rec(kinds.as_slice(), &mut 0, &mut message);
message
}
}
/// Return the JSON string of the value preceded by a description of its kind
fn value_description_with_kind_json(v: &serde_json::Value) -> String {
match v.kind() {
ValueKind::Null => "null".to_owned(),
kind => {
format!(
"{}: `{}`",
value_kinds_description_json(&[kind]),
serde_json::to_string(v).unwrap()
)
}
}
}
impl<C: Default + ErrorCode> deserr::DeserializeError for DeserrJsonError<C> {
fn error<V: IntoValue>(
_self_: Option<Self>,
error: deserr::ErrorKind<V>,
location: ValuePointerRef,
) -> Result<Self, Self> {
let mut message = String::new();
message.push_str(&match error {
ErrorKind::IncorrectValueKind { actual, accepted } => {
let expected = value_kinds_description_json(accepted);
let received = value_description_with_kind_json(&serde_json::Value::from(actual));
let location = location_json_description(location, " at");
format!("Invalid value type{location}: expected {expected}, but found {received}")
}
ErrorKind::MissingField { field } => {
let location = location_json_description(location, " inside");
format!("Missing field `{field}`{location}")
}
ErrorKind::UnknownKey { key, accepted } => {
let location = location_json_description(location, " inside");
format!(
"Unknown field `{}`{location}: expected one of {}",
key,
accepted
.iter()
.map(|accepted| format!("`{}`", accepted))
.collect::<Vec<String>>()
.join(", ")
)
}
ErrorKind::UnknownValue { value, accepted } => {
let location = location_json_description(location, " at");
format!(
"Unknown value `{}`{location}: expected one of {}",
value,
accepted
.iter()
.map(|accepted| format!("`{}`", accepted))
.collect::<Vec<String>>()
.join(", "),
)
}
ErrorKind::Unexpected { msg } => {
let location = location_json_description(location, " at");
format!("Invalid value{location}: {msg}")
}
});
Err(DeserrJsonError::new(message, C::default().error_code()))
}
}
pub fn immutable_field_error(field: &str, accepted: &[&str], code: Code) -> DeserrJsonError {
let msg = format!(
"Immutable field `{field}`: expected one of {}",
accepted
.iter()
.map(|accepted| format!("`{}`", accepted))
.collect::<Vec<String>>()
.join(", ")
);
DeserrJsonError::new(msg, code)
}
/// Return a description of the given location in query parameters, preceded by the
/// given article. e.g. `at key5[2]`. If the location is the origin, the given article
/// will not be included in the description.
pub fn location_query_param_description(location: ValuePointerRef, article: &str) -> String {
fn rec(location: ValuePointerRef) -> String {
match location {
ValuePointerRef::Origin => String::new(),
ValuePointerRef::Key { key, prev } => {
if matches!(prev, ValuePointerRef::Origin) {
key.to_owned()
} else {
rec(*prev) + "." + key
}
}
ValuePointerRef::Index { index, prev } => format!("{}[{index}]", rec(*prev)),
}
}
match location {
ValuePointerRef::Origin => String::new(),
_ => {
format!("{article} `{}`", rec(location))
}
}
}
impl<C: Default + ErrorCode> deserr::DeserializeError for DeserrQueryParamError<C> {
fn error<V: IntoValue>(
_self_: Option<Self>,
error: deserr::ErrorKind<V>,
location: ValuePointerRef,
) -> Result<Self, Self> {
let mut message = String::new();
message.push_str(&match error {
ErrorKind::IncorrectValueKind { actual, accepted } => {
let expected = value_kinds_description_query_param(accepted);
let received = value_description_with_kind_query_param(actual);
let location = location_query_param_description(location, " for parameter");
format!("Invalid value type{location}: expected {expected}, but found {received}")
}
ErrorKind::MissingField { field } => {
let location = location_query_param_description(location, " inside");
format!("Missing parameter `{field}`{location}")
}
ErrorKind::UnknownKey { key, accepted } => {
let location = location_query_param_description(location, " inside");
format!(
"Unknown parameter `{}`{location}: expected one of {}",
key,
accepted
.iter()
.map(|accepted| format!("`{}`", accepted))
.collect::<Vec<String>>()
.join(", ")
)
}
ErrorKind::UnknownValue { value, accepted } => {
let location = location_query_param_description(location, " for parameter");
format!(
"Unknown value `{}`{location}: expected one of {}",
value,
accepted
.iter()
.map(|accepted| format!("`{}`", accepted))
.collect::<Vec<String>>()
.join(", "),
)
}
ErrorKind::Unexpected { msg } => {
let location = location_query_param_description(location, " in parameter");
format!("Invalid value{location}: {msg}")
}
});
Err(DeserrQueryParamError::new(message, C::default().error_code()))
}
}
/// Return a description of the list of value kinds for query parameters
/// Since query parameters are always treated as strings, we always return
/// "a string" for now.
fn value_kinds_description_query_param(_accepted: &[ValueKind]) -> String {
"a string".to_owned()
}
fn value_description_with_kind_query_param<V: IntoValue>(actual: deserr::Value<V>) -> String {
match actual {
deserr::Value::Null => "null".to_owned(),
deserr::Value::Boolean(x) => format!("a boolean: `{x}`"),
deserr::Value::Integer(x) => format!("an integer: `{x}`"),
deserr::Value::NegativeInteger(x) => {
format!("an integer: `{x}`")
}
deserr::Value::Float(x) => {
format!("a number: `{x}`")
}
deserr::Value::String(x) => {
format!("a string: `{x}`")
}
deserr::Value::Sequence(_) => "multiple values".to_owned(),
deserr::Value::Map(_) => "multiple parameters".to_owned(),
}
}
#[cfg(test)]
mod tests {
use deserr::ValueKind;
use crate::deserr::error_messages::value_kinds_description_json;
#[test]
fn test_value_kinds_description_json() {
insta::assert_display_snapshot!(value_kinds_description_json(&[]), @"a different value");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Boolean]), @"a boolean");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Integer]), @"a positive integer");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::NegativeInteger]), @"a negative integer");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Integer]), @"a positive integer");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::String]), @"a string");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Sequence]), @"an array");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Map]), @"an object");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Integer, ValueKind::Boolean]), @"a boolean or a positive integer");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Null, ValueKind::Integer]), @"null or a positive integer");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Sequence, ValueKind::NegativeInteger]), @"a negative integer or an array");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Integer, ValueKind::Float]), @"a number");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Integer, ValueKind::Float, ValueKind::NegativeInteger]), @"a number");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Integer, ValueKind::Float, ValueKind::NegativeInteger, ValueKind::Null]), @"null or a number");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Boolean, ValueKind::Integer, ValueKind::Float, ValueKind::NegativeInteger, ValueKind::Null]), @"null, a boolean, or a number");
insta::assert_display_snapshot!(value_kinds_description_json(&[ValueKind::Null, ValueKind::Boolean, ValueKind::Integer, ValueKind::Float, ValueKind::NegativeInteger, ValueKind::Null]), @"null, a boolean, or a number");
}
}

View File

@@ -1,18 +1,19 @@
use std::convert::Infallible;
use std::fmt;
use std::marker::PhantomData;
use std::ops::ControlFlow;
use deserr::{DeserializeError, MergeWithError, ValuePointerRef};
use deserr::errors::{JsonError, QueryParamError};
use deserr::{take_cf_content, DeserializeError, IntoValue, MergeWithError, ValuePointerRef};
use crate::error::deserr_codes::{self, *};
use crate::error::deserr_codes::*;
use crate::error::{
unwrap_any, Code, DeserrParseBoolError, DeserrParseIntError, ErrorCode, InvalidTaskDateError,
Code, DeserrParseBoolError, DeserrParseIntError, ErrorCode, InvalidTaskDateError,
ParseOffsetDateTimeError,
};
use crate::index_uid::IndexUidFormatError;
use crate::tasks::{ParseTaskKindError, ParseTaskStatusError};
pub mod error_messages;
pub mod query_params;
/// Marker type for the Json format
@@ -20,8 +21,8 @@ pub struct DeserrJson;
/// Marker type for the Query Parameter format
pub struct DeserrQueryParam;
pub type DeserrJsonError<C = deserr_codes::BadRequest> = DeserrError<DeserrJson, C>;
pub type DeserrQueryParamError<C = deserr_codes::BadRequest> = DeserrError<DeserrQueryParam, C>;
pub type DeserrJsonError<C = BadRequest> = DeserrError<DeserrJson, C>;
pub type DeserrQueryParamError<C = BadRequest> = DeserrError<DeserrQueryParam, C>;
/// A request deserialization error.
///
@@ -37,6 +38,7 @@ impl<Format, C: Default + ErrorCode> DeserrError<Format, C> {
Self { msg, code, _phantom: PhantomData }
}
}
impl<Format, C: Default + ErrorCode> std::fmt::Debug for DeserrError<Format, C> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("DeserrError").field("msg", &self.msg).field("code", &self.code).finish()
@@ -49,6 +51,16 @@ impl<Format, C: Default + ErrorCode> std::fmt::Display for DeserrError<Format, C
}
}
impl<F, C: Default + ErrorCode> actix_web::ResponseError for DeserrError<F, C> {
fn status_code(&self) -> actix_web::http::StatusCode {
self.code.http()
}
fn error_response(&self) -> actix_web::HttpResponse<actix_web::body::BoxBody> {
crate::error::ResponseError::from_msg(self.msg.to_string(), self.code).error_response()
}
}
impl<Format, C: Default + ErrorCode> std::error::Error for DeserrError<Format, C> {}
impl<Format, C: Default + ErrorCode> ErrorCode for DeserrError<Format, C> {
fn error_code(&self) -> Code {
@@ -64,8 +76,8 @@ impl<Format, C1: Default + ErrorCode, C2: Default + ErrorCode>
_self_: Option<Self>,
other: DeserrError<Format, C2>,
_merge_location: ValuePointerRef,
) -> Result<Self, Self> {
Err(DeserrError { msg: other.msg, code: other.code, _phantom: PhantomData })
) -> ControlFlow<Self, Self> {
ControlFlow::Break(DeserrError { msg: other.msg, code: other.code, _phantom: PhantomData })
}
}
@@ -74,17 +86,56 @@ impl<Format, C: Default + ErrorCode> MergeWithError<Infallible> for DeserrError<
_self_: Option<Self>,
_other: Infallible,
_merge_location: ValuePointerRef,
) -> Result<Self, Self> {
) -> ControlFlow<Self, Self> {
unreachable!()
}
}
impl<C: Default + ErrorCode> DeserializeError for DeserrJsonError<C> {
fn error<V: IntoValue>(
_self_: Option<Self>,
error: deserr::ErrorKind<V>,
location: ValuePointerRef,
) -> ControlFlow<Self, Self> {
ControlFlow::Break(DeserrJsonError::new(
take_cf_content(JsonError::error(None, error, location)).to_string(),
C::default().error_code(),
))
}
}
impl<C: Default + ErrorCode> DeserializeError for DeserrQueryParamError<C> {
fn error<V: IntoValue>(
_self_: Option<Self>,
error: deserr::ErrorKind<V>,
location: ValuePointerRef,
) -> ControlFlow<Self, Self> {
ControlFlow::Break(DeserrQueryParamError::new(
take_cf_content(QueryParamError::error(None, error, location)).to_string(),
C::default().error_code(),
))
}
}
pub fn immutable_field_error(field: &str, accepted: &[&str], code: Code) -> DeserrJsonError {
let msg = format!(
"Immutable field `{field}`: expected one of {}",
accepted
.iter()
.map(|accepted| format!("`{}`", accepted))
.collect::<Vec<String>>()
.join(", ")
);
DeserrJsonError::new(msg, code)
}
// Implement a convenience function to build a `missing_field` error
macro_rules! make_missing_field_convenience_builder {
($err_code:ident, $fn_name:ident) => {
impl DeserrJsonError<$err_code> {
pub fn $fn_name(field: &str, location: ValuePointerRef) -> Self {
let x = unwrap_any(Self::error::<Infallible>(
let x = deserr::take_cf_content(Self::error::<Infallible>(
None,
deserr::ErrorKind::MissingField { field },
location,
@@ -112,7 +163,7 @@ macro_rules! merge_with_error_impl_take_error_message {
_self_: Option<Self>,
other: $err_type,
merge_location: ValuePointerRef,
) -> Result<Self, Self> {
) -> ControlFlow<Self, Self> {
DeserrError::<Format, C>::error::<Infallible>(
None,
deserr::ErrorKind::Unexpected { msg: other.to_string() },

View File

@@ -15,10 +15,9 @@ use std::convert::Infallible;
use std::ops::Deref;
use std::str::FromStr;
use deserr::{DeserializeError, DeserializeFromValue, MergeWithError, ValueKind};
use deserr::{DeserializeError, Deserr, MergeWithError, ValueKind};
use super::{DeserrParseBoolError, DeserrParseIntError};
use crate::error::unwrap_any;
use crate::index_uid::IndexUid;
use crate::tasks::{Kind, Status};
@@ -38,7 +37,7 @@ impl<T> Deref for Param<T> {
}
}
impl<T, E> DeserializeFromValue<E> for Param<T>
impl<T, E> Deserr<E> for Param<T>
where
E: DeserializeError + MergeWithError<T::Err>,
T: FromQueryParameter,
@@ -50,9 +49,9 @@ where
match value {
deserr::Value::String(s) => match T::from_query_param(&s) {
Ok(x) => Ok(Param(x)),
Err(e) => Err(unwrap_any(E::merge(None, e, location))),
Err(e) => Err(deserr::take_cf_content(E::merge(None, e, location))),
},
_ => Err(unwrap_any(E::error(
_ => Err(deserr::take_cf_content(E::error(
None,
deserr::ErrorKind::IncorrectValueKind {
actual: value,

View File

@@ -127,7 +127,7 @@ macro_rules! make_error_codes {
}
impl Code {
/// return the HTTP status code associated with the `Code`
fn http(&self) -> StatusCode {
pub fn http(&self) -> StatusCode {
match self {
$(
Code::$code_ident => StatusCode::$status
@@ -381,14 +381,6 @@ impl ErrorCode for io::Error {
}
}
/// Unwrap a result, either its Ok or Err value.
pub fn unwrap_any<T>(any: Result<T, T>) -> T {
match any {
Ok(any) => any,
Err(any) => any,
}
}
/// Deserialization when `deserr` cannot parse an API key date.
#[derive(Debug)]
pub struct ParseOffsetDateTimeError(pub String);

View File

@@ -2,14 +2,14 @@ use std::error::Error;
use std::fmt;
use std::str::FromStr;
use deserr::DeserializeFromValue;
use deserr::Deserr;
use crate::error::{Code, ErrorCode};
/// An index uid is composed of only ascii alphanumeric characters, - and _, between 1 and 400
/// bytes long
#[derive(Debug, Clone, PartialEq, Eq, DeserializeFromValue)]
#[deserr(from(String) = IndexUid::try_from -> IndexUidFormatError)]
#[derive(Debug, Clone, PartialEq, Eq, Deserr)]
#[deserr(try_from(String) = IndexUid::try_from -> IndexUidFormatError)]
pub struct IndexUid(String);
impl IndexUid {

View File

@@ -0,0 +1,124 @@
use std::borrow::Borrow;
use std::error::Error;
use std::fmt;
use std::ops::Deref;
use std::str::FromStr;
use deserr::Deserr;
use serde::{Deserialize, Serialize};
use crate::error::{Code, ErrorCode};
use crate::index_uid::{IndexUid, IndexUidFormatError};
/// An index uid pattern is composed of only ascii alphanumeric characters, - and _, between 1 and 400
/// bytes long and optionally ending with a *.
#[derive(Serialize, Deserialize, Deserr, Debug, Clone, PartialEq, Eq, Hash)]
#[deserr(try_from(&String) = FromStr::from_str -> IndexUidPatternFormatError)]
pub struct IndexUidPattern(String);
impl IndexUidPattern {
pub fn new_unchecked(s: impl AsRef<str>) -> Self {
Self(s.as_ref().to_string())
}
/// Matches any index name.
pub fn all() -> Self {
IndexUidPattern::from_str("*").unwrap()
}
/// Returns `true` if it matches any index.
pub fn matches_all(&self) -> bool {
self.0 == "*"
}
/// Returns `true` if the pattern matches a specific index name.
pub fn is_exact(&self) -> bool {
!self.0.ends_with('*')
}
/// Returns wether this index uid matches this index uid pattern.
pub fn matches(&self, uid: &IndexUid) -> bool {
self.matches_str(uid.as_str())
}
/// Returns wether this string matches this index uid pattern.
pub fn matches_str(&self, uid: &str) -> bool {
match self.0.strip_suffix('*') {
Some(prefix) => uid.starts_with(prefix),
None => self.0 == uid,
}
}
}
impl Deref for IndexUidPattern {
type Target = str;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl Borrow<str> for IndexUidPattern {
fn borrow(&self) -> &str {
&self.0
}
}
impl TryFrom<String> for IndexUidPattern {
type Error = IndexUidPatternFormatError;
fn try_from(uid: String) -> Result<Self, Self::Error> {
let result = match uid.strip_suffix('*') {
Some("") => Ok(IndexUidPattern(uid)),
Some(prefix) => IndexUid::from_str(prefix).map(|_| IndexUidPattern(uid)),
None => IndexUid::try_from(uid).map(IndexUid::into_inner).map(IndexUidPattern),
};
match result {
Ok(index_uid_pattern) => Ok(index_uid_pattern),
Err(IndexUidFormatError { invalid_uid }) => {
Err(IndexUidPatternFormatError { invalid_uid })
}
}
}
}
impl FromStr for IndexUidPattern {
type Err = IndexUidPatternFormatError;
fn from_str(uid: &str) -> Result<IndexUidPattern, IndexUidPatternFormatError> {
uid.to_string().try_into()
}
}
impl From<IndexUidPattern> for String {
fn from(IndexUidPattern(uid): IndexUidPattern) -> Self {
uid
}
}
#[derive(Debug)]
pub struct IndexUidPatternFormatError {
pub invalid_uid: String,
}
impl fmt::Display for IndexUidPatternFormatError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(
f,
"`{}` is not a valid index uid pattern. Index uid patterns \
can be an integer or a string containing only alphanumeric \
characters, hyphens (-), underscores (_), and \
optionally end with a star (*).",
self.invalid_uid,
)
}
}
impl Error for IndexUidPatternFormatError {}
impl ErrorCode for IndexUidPatternFormatError {
fn error_code(&self) -> Code {
Code::InvalidIndexUid
}
}

View File

@@ -2,7 +2,7 @@ use std::convert::Infallible;
use std::hash::Hash;
use std::str::FromStr;
use deserr::{DeserializeError, DeserializeFromValue, ValuePointerRef};
use deserr::{DeserializeError, Deserr, MergeWithError, ValuePointerRef};
use enum_iterator::Sequence;
use milli::update::Setting;
use serde::{Deserialize, Serialize};
@@ -11,31 +11,44 @@ use time::macros::{format_description, time};
use time::{Date, OffsetDateTime, PrimitiveDateTime};
use uuid::Uuid;
use crate::deserr::error_messages::immutable_field_error;
use crate::deserr::DeserrJsonError;
use crate::deserr::{immutable_field_error, DeserrError, DeserrJsonError};
use crate::error::deserr_codes::*;
use crate::error::{unwrap_any, Code, ParseOffsetDateTimeError};
use crate::index_uid::IndexUid;
use crate::star_or::StarOr;
use crate::error::{Code, ErrorCode, ParseOffsetDateTimeError};
use crate::index_uid_pattern::{IndexUidPattern, IndexUidPatternFormatError};
pub type KeyId = Uuid;
#[derive(Debug, DeserializeFromValue)]
impl<C: Default + ErrorCode> MergeWithError<IndexUidPatternFormatError> for DeserrJsonError<C> {
fn merge(
_self_: Option<Self>,
other: IndexUidPatternFormatError,
merge_location: deserr::ValuePointerRef,
) -> std::ops::ControlFlow<Self, Self> {
DeserrError::error::<Infallible>(
None,
deserr::ErrorKind::Unexpected { msg: other.to_string() },
merge_location,
)
}
}
#[derive(Debug, Deserr)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields)]
pub struct CreateApiKey {
#[deserr(default, error = DeserrJsonError<InvalidApiKeyDescription>)]
pub description: Option<String>,
#[deserr(default, error = DeserrJsonError<InvalidApiKeyName>)]
pub name: Option<String>,
#[deserr(default = Uuid::new_v4(), error = DeserrJsonError<InvalidApiKeyUid>, from(&String) = Uuid::from_str -> uuid::Error)]
#[deserr(default = Uuid::new_v4(), error = DeserrJsonError<InvalidApiKeyUid>, try_from(&String) = Uuid::from_str -> uuid::Error)]
pub uid: KeyId,
#[deserr(error = DeserrJsonError<InvalidApiKeyActions>, missing_field_error = DeserrJsonError::missing_api_key_actions)]
pub actions: Vec<Action>,
#[deserr(error = DeserrJsonError<InvalidApiKeyIndexes>, missing_field_error = DeserrJsonError::missing_api_key_indexes)]
pub indexes: Vec<StarOr<IndexUid>>,
#[deserr(error = DeserrJsonError<InvalidApiKeyExpiresAt>, from(Option<String>) = parse_expiration_date -> ParseOffsetDateTimeError, missing_field_error = DeserrJsonError::missing_api_key_expires_at)]
pub indexes: Vec<IndexUidPattern>,
#[deserr(error = DeserrJsonError<InvalidApiKeyExpiresAt>, try_from(Option<String>) = parse_expiration_date -> ParseOffsetDateTimeError, missing_field_error = DeserrJsonError::missing_api_key_expires_at)]
pub expires_at: Option<OffsetDateTime>,
}
impl CreateApiKey {
pub fn to_key(self) -> Key {
let CreateApiKey { description, name, uid, actions, indexes, expires_at } = self;
@@ -65,7 +78,7 @@ fn deny_immutable_fields_api_key(
"expiresAt" => immutable_field_error(field, accepted, Code::ImmutableApiKeyExpiresAt),
"createdAt" => immutable_field_error(field, accepted, Code::ImmutableApiKeyCreatedAt),
"updatedAt" => immutable_field_error(field, accepted, Code::ImmutableApiKeyUpdatedAt),
_ => unwrap_any(DeserrJsonError::<BadRequest>::error::<Infallible>(
_ => deserr::take_cf_content(DeserrJsonError::<BadRequest>::error::<Infallible>(
None,
deserr::ErrorKind::UnknownKey { key: field, accepted },
location,
@@ -73,7 +86,7 @@ fn deny_immutable_fields_api_key(
}
}
#[derive(Debug, DeserializeFromValue)]
#[derive(Debug, Deserr)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields = deny_immutable_fields_api_key)]
pub struct PatchApiKey {
#[deserr(default, error = DeserrJsonError<InvalidApiKeyDescription>)]
@@ -90,7 +103,7 @@ pub struct Key {
pub name: Option<String>,
pub uid: KeyId,
pub actions: Vec<Action>,
pub indexes: Vec<StarOr<IndexUid>>,
pub indexes: Vec<IndexUidPattern>,
#[serde(with = "time::serde::rfc3339::option")]
pub expires_at: Option<OffsetDateTime>,
#[serde(with = "time::serde::rfc3339")]
@@ -108,7 +121,7 @@ impl Key {
description: Some("Use it for anything that is not a search operation. Caution! Do not expose it on a public frontend".to_string()),
uid,
actions: vec![Action::All],
indexes: vec![StarOr::Star],
indexes: vec![IndexUidPattern::all()],
expires_at: None,
created_at: now,
updated_at: now,
@@ -123,7 +136,7 @@ impl Key {
description: Some("Use it to search from the frontend".to_string()),
uid,
actions: vec![Action::Search],
indexes: vec![StarOr::Star],
indexes: vec![IndexUidPattern::all()],
expires_at: None,
created_at: now,
updated_at: now,
@@ -168,9 +181,7 @@ fn parse_expiration_date(
}
}
#[derive(
Copy, Clone, Serialize, Deserialize, Debug, Eq, PartialEq, Hash, Sequence, DeserializeFromValue,
)]
#[derive(Copy, Clone, Serialize, Deserialize, Debug, Eq, PartialEq, Hash, Sequence, Deserr)]
#[repr(u8)]
pub enum Action {
#[serde(rename = "*")]

View File

@@ -3,6 +3,7 @@ pub mod deserr;
pub mod document_formats;
pub mod error;
pub mod index_uid;
pub mod index_uid_pattern;
pub mod keys;
pub mod settings;
pub mod star_or;

View File

@@ -3,9 +3,10 @@ use std::convert::Infallible;
use std::fmt;
use std::marker::PhantomData;
use std::num::NonZeroUsize;
use std::ops::ControlFlow;
use std::str::FromStr;
use deserr::{DeserializeError, DeserializeFromValue, ErrorKind, MergeWithError, ValuePointerRef};
use deserr::{DeserializeError, Deserr, ErrorKind, MergeWithError, ValuePointerRef};
use fst::IntoStreamer;
use milli::update::Setting;
use milli::{Criterion, CriterionError, Index, DEFAULT_VALUES_PER_FACET};
@@ -13,7 +14,6 @@ use serde::{Deserialize, Serialize, Serializer};
use crate::deserr::DeserrJsonError;
use crate::error::deserr_codes::*;
use crate::error::unwrap_any;
/// The maximimum number of results that the engine
/// will be able to return in one search call.
@@ -41,7 +41,7 @@ pub struct Checked;
#[derive(Clone, Default, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct Unchecked;
impl<E> DeserializeFromValue<E> for Unchecked
impl<E> Deserr<E> for Unchecked
where
E: DeserializeError,
{
@@ -59,13 +59,13 @@ fn validate_min_word_size_for_typo_setting<E: DeserializeError>(
) -> Result<MinWordSizeTyposSetting, E> {
if let (Setting::Set(one), Setting::Set(two)) = (s.one_typo, s.two_typos) {
if one > two {
return Err(unwrap_any(E::error::<Infallible>(None, ErrorKind::Unexpected { msg: format!("`minWordSizeForTypos` setting is invalid. `oneTypo` and `twoTypos` fields should be between `0` and `255`, and `twoTypos` should be greater or equals to `oneTypo` but found `oneTypo: {one}` and twoTypos: {two}`.") }, location)));
return Err(deserr::take_cf_content(E::error::<Infallible>(None, ErrorKind::Unexpected { msg: format!("`minWordSizeForTypos` setting is invalid. `oneTypo` and `twoTypos` fields should be between `0` and `255`, and `twoTypos` should be greater or equals to `oneTypo` but found `oneTypo: {one}` and twoTypos: {two}`.") }, location)));
}
}
Ok(s)
}
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, DeserializeFromValue)]
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, Deserr)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
#[deserr(deny_unknown_fields, rename_all = camelCase, validate = validate_min_word_size_for_typo_setting -> DeserrJsonError<InvalidSettingsTypoTolerance>)]
pub struct MinWordSizeTyposSetting {
@@ -77,7 +77,7 @@ pub struct MinWordSizeTyposSetting {
pub two_typos: Setting<u8>,
}
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, DeserializeFromValue)]
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, Deserr)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
#[deserr(deny_unknown_fields, rename_all = camelCase, where_predicate = __Deserr_E: deserr::MergeWithError<DeserrJsonError<InvalidSettingsTypoTolerance>>)]
pub struct TypoSettings {
@@ -95,7 +95,7 @@ pub struct TypoSettings {
pub disable_on_attributes: Setting<BTreeSet<String>>,
}
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, DeserializeFromValue)]
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, Deserr)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
#[deserr(rename_all = camelCase, deny_unknown_fields)]
pub struct FacetingSettings {
@@ -104,7 +104,7 @@ pub struct FacetingSettings {
pub max_values_per_facet: Setting<usize>,
}
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, DeserializeFromValue)]
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, Deserr)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
#[deserr(rename_all = camelCase, deny_unknown_fields)]
pub struct PaginationSettings {
@@ -118,7 +118,7 @@ impl MergeWithError<milli::CriterionError> for DeserrJsonError<InvalidSettingsRa
_self_: Option<Self>,
other: milli::CriterionError,
merge_location: ValuePointerRef,
) -> Result<Self, Self> {
) -> ControlFlow<Self, Self> {
Self::error::<Infallible>(
None,
ErrorKind::Unexpected { msg: other.to_string() },
@@ -130,7 +130,7 @@ impl MergeWithError<milli::CriterionError> for DeserrJsonError<InvalidSettingsRa
/// Holds all the settings for an index. `T` can either be `Checked` if they represents settings
/// whose validity is guaranteed, or `Unchecked` if they need to be validated. In the later case, a
/// call to `check` will return a `Settings<Checked>` from a `Settings<Unchecked>`.
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, DeserializeFromValue)]
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, Deserr)]
#[serde(
deny_unknown_fields,
rename_all = "camelCase",
@@ -509,8 +509,8 @@ pub fn settings(
})
}
#[derive(Debug, Clone, PartialEq, Eq, DeserializeFromValue)]
#[deserr(from(&String) = FromStr::from_str -> CriterionError)]
#[derive(Debug, Clone, PartialEq, Eq, Deserr)]
#[deserr(try_from(&String) = FromStr::from_str -> CriterionError)]
pub enum RankingRuleView {
/// Sorted by decreasing number of matched query terms.
/// Query words at the front of an attribute is considered better than if it was at the back.

View File

@@ -1,13 +1,13 @@
use std::fmt;
use std::marker::PhantomData;
use std::ops::ControlFlow;
use std::str::FromStr;
use deserr::{DeserializeError, DeserializeFromValue, MergeWithError, ValueKind};
use deserr::{DeserializeError, Deserr, MergeWithError, ValueKind};
use serde::de::Visitor;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use crate::deserr::query_params::FromQueryParameter;
use crate::error::unwrap_any;
/// A type that tries to match either a star (*) or
/// any other thing that implements `FromStr`.
@@ -111,7 +111,7 @@ where
}
}
impl<T, E> DeserializeFromValue<E> for StarOr<T>
impl<T, E> Deserr<E> for StarOr<T>
where
T: FromStr,
E: DeserializeError + MergeWithError<T::Err>,
@@ -127,11 +127,11 @@ where
} else {
match T::from_str(&v) {
Ok(parsed) => Ok(StarOr::Other(parsed)),
Err(e) => Err(unwrap_any(E::merge(None, e, location))),
Err(e) => Err(deserr::take_cf_content(E::merge(None, e, location))),
}
}
}
_ => Err(unwrap_any(E::error::<V>(
_ => Err(deserr::take_cf_content(E::error::<V>(
None,
deserr::ErrorKind::IncorrectValueKind {
actual: value,
@@ -191,7 +191,7 @@ where
}
}
impl<T, E> DeserializeFromValue<E> for OptionStarOr<T>
impl<T, E> Deserr<E> for OptionStarOr<T>
where
E: DeserializeError + MergeWithError<T::Err>,
T: FromQueryParameter,
@@ -205,10 +205,10 @@ where
"*" => Ok(OptionStarOr::Star),
s => match T::from_query_param(s) {
Ok(x) => Ok(OptionStarOr::Other(x)),
Err(e) => Err(unwrap_any(E::merge(None, e, location))),
Err(e) => Err(deserr::take_cf_content(E::merge(None, e, location))),
},
},
_ => Err(unwrap_any(E::error::<V>(
_ => Err(deserr::take_cf_content(E::error::<V>(
None,
deserr::ErrorKind::IncorrectValueKind {
actual: value,
@@ -271,7 +271,7 @@ impl<T> OptionStarOrList<T> {
}
}
impl<T, E> DeserializeFromValue<E> for OptionStarOrList<T>
impl<T, E> Deserr<E> for OptionStarOrList<T>
where
E: DeserializeError + MergeWithError<T::Err>,
T: FromQueryParameter,
@@ -299,7 +299,10 @@ where
Err(e) => {
let location =
if len_cs > 1 { location.push_index(i) } else { location };
error = Some(E::merge(error, e, location)?);
error = match E::merge(error, e, location) {
ControlFlow::Continue(e) => Some(e),
ControlFlow::Break(e) => return Err(e),
};
}
}
}
@@ -314,7 +317,7 @@ where
Ok(OptionStarOrList::List(els))
}
}
_ => Err(unwrap_any(E::error::<V>(
_ => Err(deserr::take_cf_content(E::error::<V>(
None,
deserr::ErrorKind::IncorrectValueKind {
actual: value,

View File

@@ -19,7 +19,7 @@ byte-unit = { version = "4.0.14", default-features = false, features = ["std", "
bytes = "1.2.1"
clap = { version = "4.0.9", features = ["derive", "env"] }
crossbeam-channel = "0.5.6"
deserr = "0.3.0"
deserr = "0.4.1"
dump = { path = "../dump" }
either = "1.8.0"
env_logger = "0.9.1"

View File

@@ -1,7 +1,14 @@
use vergen::{vergen, Config};
use vergen::{vergen, Config, SemverKind};
fn main() {
if let Err(e) = vergen(Config::default()) {
// Note: any code that needs VERGEN_ environment variables should take care to define them manually in the Dockerfile and pass them
// in the corresponding GitHub workflow (publish_docker.yml).
// This is due to the Dockerfile building the binary outside of the git directory.
let mut config = Config::default();
// allow using non-annotated tags
*config.git_mut().semver_kind_mut() = SemverKind::Lightweight;
if let Err(e) = vergen(config) {
println!("cargo:warning=vergen: {}", e);
}

View File

@@ -284,7 +284,8 @@ impl From<Opt> for Infos {
ScheduleSnapshot::Enabled(interval) => Some(interval),
};
let IndexerOpts { max_indexing_memory, max_indexing_threads } = indexer_options;
let IndexerOpts { max_indexing_memory, max_indexing_threads, skip_index_budget: _ } =
indexer_options;
// We're going to override every sensible information.
// We consider information sensible if it contains a path, an address, or a key.
@@ -401,12 +402,19 @@ impl Segment {
if let Ok(stats) =
create_all_stats(index_scheduler.into(), auth_controller, &SearchRules::default())
{
// Replace the version number with the prototype name if any.
let version = if let Some(prototype) = crate::prototype_name() {
prototype
} else {
env!("CARGO_PKG_VERSION")
};
let _ = self
.batcher
.push(Identify {
context: Some(json!({
"app": {
"version": env!("CARGO_PKG_VERSION").to_string(),
"version": version.to_string(),
},
})),
user: self.user.clone(),

View File

@@ -199,6 +199,9 @@ pub mod policies {
token: &str,
index: Option<&str>,
) -> Option<AuthFilter> {
// A tenant token only has access to the search route which always defines an index.
let index = index?;
// Only search action can be accessed by a tenant token.
if A != actions::SEARCH {
return None;
@@ -206,7 +209,7 @@ pub mod policies {
let uid = extract_key_id(token)?;
// check if parent key is authorized to do the action.
if auth.is_key_authorized(uid, Action::Search, index).ok()? {
if auth.is_key_authorized(uid, Action::Search, Some(index)).ok()? {
// Check if tenant token is valid.
let key = auth.generate_key(uid)?;
let data = decode::<Claims>(
@@ -217,10 +220,8 @@ pub mod policies {
.ok()?;
// Check index access if an index restriction is provided.
if let Some(index) = index {
if !data.claims.search_rules.is_index_authorized(index) {
return None;
}
if !data.claims.search_rules.is_index_authorized(index) {
return None;
}
// Check if token is expired.
@@ -230,7 +231,10 @@ pub mod policies {
}
}
return auth.get_key_filters(uid, Some(data.claims.search_rules)).ok();
return match auth.get_key_filters(uid, Some(data.claims.search_rules)) {
Ok(auth) if auth.search_rules.is_index_authorized(index) => Some(auth),
_ => None,
};
}
None

View File

@@ -1,78 +0,0 @@
use std::fmt::Debug;
use std::future::Future;
use std::marker::PhantomData;
use std::pin::Pin;
use std::task::{Context, Poll};
use actix_web::dev::Payload;
use actix_web::web::Json;
use actix_web::{FromRequest, HttpRequest};
use deserr::{DeserializeError, DeserializeFromValue};
use futures::ready;
use meilisearch_types::error::{ErrorCode, ResponseError};
/// Extractor for typed data from Json request payloads
/// deserialised by deserr.
///
/// # Extractor
/// To extract typed data from a request body, the inner type `T` must implement the
/// [`deserr::DeserializeFromError<E>`] trait. The inner type `E` must implement the
/// [`ErrorCode`](meilisearch_error::ErrorCode) trait.
#[derive(Debug)]
pub struct ValidatedJson<T, E>(pub T, PhantomData<*const E>);
impl<T, E> ValidatedJson<T, E> {
pub fn new(data: T) -> Self {
ValidatedJson(data, PhantomData)
}
pub fn into_inner(self) -> T {
self.0
}
}
impl<T, E> FromRequest for ValidatedJson<T, E>
where
E: DeserializeError + ErrorCode + std::error::Error + 'static,
T: DeserializeFromValue<E>,
{
type Error = actix_web::Error;
type Future = ValidatedJsonExtractFut<T, E>;
#[inline]
fn from_request(req: &HttpRequest, payload: &mut Payload) -> Self::Future {
ValidatedJsonExtractFut {
fut: Json::<serde_json::Value>::from_request(req, payload),
_phantom: PhantomData,
}
}
}
pub struct ValidatedJsonExtractFut<T, E> {
fut: <Json<serde_json::Value> as FromRequest>::Future,
_phantom: PhantomData<*const (T, E)>,
}
impl<T, E> Future for ValidatedJsonExtractFut<T, E>
where
T: DeserializeFromValue<E>,
E: DeserializeError + ErrorCode + std::error::Error + 'static,
{
type Output = Result<ValidatedJson<T, E>, actix_web::Error>;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
let ValidatedJsonExtractFut { fut, .. } = self.get_mut();
let fut = Pin::new(fut);
let res = ready!(fut.poll(cx));
let res = match res {
Err(err) => Err(err),
Ok(data) => match deserr::deserialize::<_, _, E>(data.into_inner()) {
Ok(data) => Ok(ValidatedJson::new(data)),
Err(e) => Err(ResponseError::from(e).into()),
},
};
Poll::Ready(res)
}
}

View File

@@ -1,6 +1,4 @@
pub mod payload;
#[macro_use]
pub mod authentication;
pub mod json;
pub mod query_parameters;
pub mod sequential_extractor;

View File

@@ -1,70 +0,0 @@
//! A module to parse query parameter with deserr
use std::marker::PhantomData;
use std::{fmt, ops};
use actix_http::Payload;
use actix_utils::future::{err, ok, Ready};
use actix_web::{FromRequest, HttpRequest};
use deserr::{DeserializeError, DeserializeFromValue};
use meilisearch_types::error::{Code, ErrorCode, ResponseError};
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct QueryParameter<T, E>(pub T, PhantomData<*const E>);
impl<T, E> QueryParameter<T, E> {
/// Unwrap into inner `T` value.
pub fn into_inner(self) -> T {
self.0
}
}
impl<T, E> QueryParameter<T, E>
where
T: DeserializeFromValue<E>,
E: DeserializeError + ErrorCode + std::error::Error + 'static,
{
pub fn from_query(query_str: &str) -> Result<Self, actix_web::Error> {
let value = serde_urlencoded::from_str::<serde_json::Value>(query_str)
.map_err(|e| ResponseError::from_msg(e.to_string(), Code::BadRequest))?;
match deserr::deserialize::<_, _, E>(value) {
Ok(data) => Ok(QueryParameter(data, PhantomData)),
Err(e) => Err(ResponseError::from(e).into()),
}
}
}
impl<T, E> ops::Deref for QueryParameter<T, E> {
type Target = T;
fn deref(&self) -> &T {
&self.0
}
}
impl<T, E> ops::DerefMut for QueryParameter<T, E> {
fn deref_mut(&mut self) -> &mut T {
&mut self.0
}
}
impl<T: fmt::Display, E> fmt::Display for QueryParameter<T, E> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
self.0.fmt(f)
}
}
impl<T, E> FromRequest for QueryParameter<T, E>
where
T: DeserializeFromValue<E>,
E: DeserializeError + ErrorCode + std::error::Error + 'static,
{
type Error = actix_web::Error;
type Future = Ready<Result<Self, actix_web::Error>>;
#[inline]
fn from_request(req: &HttpRequest, _: &mut Payload) -> Self::Future {
QueryParameter::from_query(req.query_string()).map(ok).unwrap_or_else(err)
}
}

View File

@@ -45,6 +45,33 @@ use option::ScheduleSnapshot;
use crate::error::MeilisearchHttpError;
/// Default number of simultaneously opened indexes,
/// lower for Windows that dedicates a smaller virtual address space to processes.
///
/// The value was chosen this way:
///
/// - Windows provides a small virtual address space of about 10TiB to processes.
/// - The chosen value allows for indexes to reach a safe size of 1TiB.
/// - This can accomodate an unlimited number of indexes as long as they stay below 1TiB size.
#[cfg(windows)]
const DEFAULT_INDEX_COUNT: usize = 10;
/// Default number of simultaneously opened indexes.
///
/// The higher, the better for avoiding reopening indexes.
///
/// The value was chosen this way:
///
/// - Opening an index consumes a file descriptor.
/// - The default on many unices is about 256 file descriptors for a process.
/// - 100 is a little bit less than half this value.
///
/// In the future, this value could be computed from the dynamic number of allowed file descriptors for the current process.
///
/// On Unices, this value is largely irrelevant to virtual address space, because due to index resizing the indexes should take virtual memory in the same ballpark
/// as their disk size and it is unlikely for a user to have a sum of index weighing 128TB on a single Meilisearch node.
#[cfg(not(windows))]
const DEFAULT_INDEX_COUNT: usize = 100;
/// Check if a db is empty. It does not provide any information on the
/// validity of the data in it.
/// We consider a database as non empty when it's a non empty directory.
@@ -205,9 +232,11 @@ fn open_or_create_database_unchecked(
snapshots_path: opt.snapshot_dir.clone(),
dumps_path: opt.dump_dir.clone(),
task_db_size: opt.max_task_db_size.get_bytes() as usize,
index_size: opt.max_index_size.get_bytes() as usize,
index_base_map_size: opt.max_index_size.get_bytes() as usize,
indexer_config: (&opt.indexer_options).try_into()?,
autobatching_enabled: true,
index_growth_amount: byte_unit::Byte::from_str("10GiB").unwrap().get_bytes() as usize,
index_count: DEFAULT_INDEX_COUNT,
})?)
};
@@ -427,3 +456,30 @@ pub fn configure_metrics_route(config: &mut web::ServiceConfig, enable_metrics_r
);
}
}
/// Parses the output of
/// [`VERGEN_GIT_SEMVER_LIGHTWEIGHT`](https://docs.rs/vergen/latest/vergen/struct.Git.html#instructions)
/// as a prototype name.
///
/// Returns `Some(prototype_name)` if the following conditions are met on this value:
///
/// 1. starts with `prototype-`,
/// 2. ends with `-<some_number>`,
/// 3. does not end with `<some_number>-<some_number>`.
///
/// Otherwise, returns `None`.
pub fn prototype_name() -> Option<&'static str> {
let prototype: &'static str = option_env!("VERGEN_GIT_SEMVER_LIGHTWEIGHT")?;
if !prototype.starts_with("prototype-") {
return None;
}
let mut rsplit_prototype = prototype.rsplit('-');
// last component MUST be a number
rsplit_prototype.next()?.parse::<u64>().ok()?;
// before than last component SHALL NOT be a number
rsplit_prototype.next()?.parse::<u64>().err()?;
Some(prototype)
}

View File

@@ -8,7 +8,7 @@ use actix_web::web::Data;
use actix_web::HttpServer;
use index_scheduler::IndexScheduler;
use meilisearch::analytics::Analytics;
use meilisearch::{analytics, create_app, setup_meilisearch, Opt};
use meilisearch::{analytics, create_app, prototype_name, setup_meilisearch, Opt};
use meilisearch_auth::{generate_master_key, AuthController, MASTER_KEY_MIN_SIZE};
use termcolor::{Color, ColorChoice, ColorSpec, StandardStream, WriteColor};
@@ -137,6 +137,9 @@ pub fn print_launch_resume(
eprintln!("Commit SHA:\t\t{:?}", commit_sha.to_string());
eprintln!("Commit date:\t\t{:?}", commit_date.to_string());
eprintln!("Package version:\t{:?}", env!("CARGO_PKG_VERSION").to_string());
if let Some(prototype) = prototype_name() {
eprintln!("Prototype:\t\t{:?}", prototype);
}
#[cfg(all(not(debug_assertions), feature = "analytics"))]
{

View File

@@ -65,11 +65,10 @@ const MEILI_MAX_INDEXING_THREADS: &str = "MEILI_MAX_INDEXING_THREADS";
const DEFAULT_LOG_EVERY_N: usize = 100_000;
// Each environment (index and task-db) is taking space in the virtual address space.
//
// The size of the virtual address space is limited by the OS. About 100TB for Linux and about 10TB for Windows.
// This means that the number of indexes is limited to about 200 for Linux and about 20 for Windows.
pub const INDEX_SIZE: u64 = 536_870_912_000; // 500 GiB
pub const TASK_DB_SIZE: u64 = 10_737_418_240; // 10 GiB
// When creating a new environment, it starts its life with 10GiB of virtual address space.
// It is then later resized if needs be.
pub const INDEX_SIZE: u64 = 2 * 1024 * 1024 * 1024 * 1024; // 2 TiB
pub const TASK_DB_SIZE: u64 = 10 * 1024 * 1024 * 1024; // 10 GiB
#[derive(Debug, Default, Clone, Copy, Serialize, Deserialize)]
#[serde(rename_all = "UPPERCASE")]
@@ -494,12 +493,21 @@ pub struct IndexerOpts {
#[clap(long, env = MEILI_MAX_INDEXING_THREADS, default_value_t)]
#[serde(default)]
pub max_indexing_threads: MaxThreads,
/// Whether or not we want to determine the budget of virtual memory address space we have available dynamically
/// (the default), or statically.
///
/// Determining the budget of virtual memory address space dynamically takes some time on some systems (such as macOS)
/// and may make tests non-deterministic, so we want to skip it in tests.
#[clap(skip)]
#[serde(skip)]
pub skip_index_budget: bool,
}
impl IndexerOpts {
/// Exports the values to their corresponding env vars if they are not set.
pub fn export_to_env(self) {
let IndexerOpts { max_indexing_memory, max_indexing_threads } = self;
let IndexerOpts { max_indexing_memory, max_indexing_threads, skip_index_budget: _ } = self;
if let Some(max_indexing_memory) = max_indexing_memory.0 {
export_to_env_if_not_present(
MEILI_MAX_INDEXING_MEMORY,
@@ -527,6 +535,7 @@ impl TryFrom<&IndexerOpts> for IndexerConfig {
max_memory: other.max_indexing_memory.map(|b| b.get_bytes() as usize),
thread_pool: Some(thread_pool),
max_positions_per_attributes: None,
skip_index_budget: other.skip_index_budget,
..Default::default()
})
}

View File

@@ -1,7 +1,8 @@
use std::str;
use actix_web::{web, HttpRequest, HttpResponse};
use deserr::DeserializeFromValue;
use deserr::actix_web::{AwebJson, AwebQueryParameter};
use deserr::Deserr;
use meilisearch_auth::error::AuthControllerError;
use meilisearch_auth::AuthController;
use meilisearch_types::deserr::query_params::Param;
@@ -16,8 +17,6 @@ use uuid::Uuid;
use super::PAGINATION_DEFAULT_LIMIT;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::json::ValidatedJson;
use crate::extractors::query_parameters::QueryParameter;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::routes::Pagination;
@@ -37,7 +36,7 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
pub async fn create_api_key(
auth_controller: GuardedData<ActionPolicy<{ actions::KEYS_CREATE }>, AuthController>,
body: ValidatedJson<CreateApiKey, DeserrJsonError>,
body: AwebJson<CreateApiKey, DeserrJsonError>,
_req: HttpRequest,
) -> Result<HttpResponse, ResponseError> {
let v = body.into_inner();
@@ -51,7 +50,7 @@ pub async fn create_api_key(
Ok(HttpResponse::Created().json(res))
}
#[derive(DeserializeFromValue, Debug, Clone, Copy)]
#[derive(Deserr, Debug, Clone, Copy)]
#[deserr(error = DeserrQueryParamError, rename_all = camelCase, deny_unknown_fields)]
pub struct ListApiKeys {
#[deserr(default, error = DeserrQueryParamError<InvalidApiKeyOffset>)]
@@ -59,6 +58,7 @@ pub struct ListApiKeys {
#[deserr(default = Param(PAGINATION_DEFAULT_LIMIT), error = DeserrQueryParamError<InvalidApiKeyLimit>)]
pub limit: Param<usize>,
}
impl ListApiKeys {
fn as_pagination(self) -> Pagination {
Pagination { offset: self.offset.0, limit: self.limit.0 }
@@ -67,7 +67,7 @@ impl ListApiKeys {
pub async fn list_api_keys(
auth_controller: GuardedData<ActionPolicy<{ actions::KEYS_GET }>, AuthController>,
list_api_keys: QueryParameter<ListApiKeys, DeserrQueryParamError>,
list_api_keys: AwebQueryParameter<ListApiKeys, DeserrQueryParamError>,
) -> Result<HttpResponse, ResponseError> {
let paginate = list_api_keys.into_inner().as_pagination();
let page_view = tokio::task::spawn_blocking(move || -> Result<_, AuthControllerError> {
@@ -104,7 +104,7 @@ pub async fn get_api_key(
pub async fn patch_api_key(
auth_controller: GuardedData<ActionPolicy<{ actions::KEYS_UPDATE }>, AuthController>,
body: ValidatedJson<PatchApiKey, DeserrJsonError>,
body: AwebJson<PatchApiKey, DeserrJsonError>,
path: web::Path<AuthParam>,
) -> Result<HttpResponse, ResponseError> {
let key = path.into_inner().key;

View File

@@ -4,7 +4,8 @@ use actix_web::http::header::CONTENT_TYPE;
use actix_web::web::Data;
use actix_web::{web, HttpMessage, HttpRequest, HttpResponse};
use bstr::ByteSlice;
use deserr::DeserializeFromValue;
use deserr::actix_web::AwebQueryParameter;
use deserr::Deserr;
use futures::StreamExt;
use index_scheduler::IndexScheduler;
use log::debug;
@@ -33,7 +34,6 @@ use crate::error::PayloadError::ReceivePayload;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::payload::Payload;
use crate::extractors::query_parameters::QueryParameter;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::routes::{PaginationView, SummarizedTaskView, PAGINATION_DEFAULT_LIMIT};
@@ -80,7 +80,7 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
);
}
#[derive(Debug, DeserializeFromValue)]
#[derive(Debug, Deserr)]
#[deserr(error = DeserrQueryParamError, rename_all = camelCase, deny_unknown_fields)]
pub struct GetDocument {
#[deserr(default, error = DeserrQueryParamError<InvalidDocumentFields>)]
@@ -90,7 +90,7 @@ pub struct GetDocument {
pub async fn get_document(
index_scheduler: GuardedData<ActionPolicy<{ actions::DOCUMENTS_GET }>, Data<IndexScheduler>>,
document_param: web::Path<DocumentParam>,
params: QueryParameter<GetDocument, DeserrQueryParamError>,
params: AwebQueryParameter<GetDocument, DeserrQueryParamError>,
) -> Result<HttpResponse, ResponseError> {
let DocumentParam { index_uid, document_id } = document_param.into_inner();
let index_uid = IndexUid::try_from(index_uid)?;
@@ -125,7 +125,7 @@ pub async fn delete_document(
Ok(HttpResponse::Accepted().json(task))
}
#[derive(Debug, DeserializeFromValue)]
#[derive(Debug, Deserr)]
#[deserr(error = DeserrQueryParamError, rename_all = camelCase, deny_unknown_fields)]
pub struct BrowseQuery {
#[deserr(default, error = DeserrQueryParamError<InvalidDocumentOffset>)]
@@ -139,7 +139,7 @@ pub struct BrowseQuery {
pub async fn get_all_documents(
index_scheduler: GuardedData<ActionPolicy<{ actions::DOCUMENTS_GET }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
params: QueryParameter<BrowseQuery, DeserrQueryParamError>,
params: AwebQueryParameter<BrowseQuery, DeserrQueryParamError>,
) -> Result<HttpResponse, ResponseError> {
let index_uid = IndexUid::try_from(index_uid.into_inner())?;
debug!("called with params: {:?}", params);
@@ -155,7 +155,7 @@ pub async fn get_all_documents(
Ok(HttpResponse::Ok().json(ret))
}
#[derive(Deserialize, Debug, DeserializeFromValue)]
#[derive(Deserialize, Debug, Deserr)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields)]
pub struct UpdateDocumentsQuery {
#[deserr(default, error = DeserrJsonError<InvalidIndexPrimaryKey>)]
@@ -165,7 +165,7 @@ pub struct UpdateDocumentsQuery {
pub async fn add_documents(
index_scheduler: GuardedData<ActionPolicy<{ actions::DOCUMENTS_ADD }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
params: QueryParameter<UpdateDocumentsQuery, DeserrJsonError>,
params: AwebQueryParameter<UpdateDocumentsQuery, DeserrJsonError>,
body: Payload,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
@@ -195,7 +195,7 @@ pub async fn add_documents(
pub async fn update_documents(
index_scheduler: GuardedData<ActionPolicy<{ actions::DOCUMENTS_ADD }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
params: QueryParameter<UpdateDocumentsQuery, DeserrJsonError>,
params: AwebQueryParameter<UpdateDocumentsQuery, DeserrJsonError>,
body: Payload,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,

View File

@@ -2,14 +2,14 @@ use std::convert::Infallible;
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use deserr::{DeserializeError, DeserializeFromValue, ValuePointerRef};
use deserr::actix_web::{AwebJson, AwebQueryParameter};
use deserr::{DeserializeError, Deserr, ValuePointerRef};
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::deserr::error_messages::immutable_field_error;
use meilisearch_types::deserr::query_params::Param;
use meilisearch_types::deserr::{DeserrJsonError, DeserrQueryParamError};
use meilisearch_types::deserr::{immutable_field_error, DeserrJsonError, DeserrQueryParamError};
use meilisearch_types::error::deserr_codes::*;
use meilisearch_types::error::{unwrap_any, Code, ResponseError};
use meilisearch_types::error::{Code, ResponseError};
use meilisearch_types::index_uid::IndexUid;
use meilisearch_types::milli::{self, FieldDistribution, Index};
use meilisearch_types::tasks::KindWithContent;
@@ -21,8 +21,6 @@ use super::{Pagination, SummarizedTaskView, PAGINATION_DEFAULT_LIMIT};
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::{AuthenticationError, GuardedData};
use crate::extractors::json::ValidatedJson;
use crate::extractors::query_parameters::QueryParameter;
use crate::extractors::sequential_extractor::SeqHandler;
pub mod documents;
@@ -73,7 +71,7 @@ impl IndexView {
}
}
#[derive(DeserializeFromValue, Debug, Clone, Copy)]
#[derive(Deserr, Debug, Clone, Copy)]
#[deserr(error = DeserrQueryParamError, rename_all = camelCase, deny_unknown_fields)]
pub struct ListIndexes {
#[deserr(default, error = DeserrQueryParamError<InvalidIndexOffset>)]
@@ -89,7 +87,7 @@ impl ListIndexes {
pub async fn list_indexes(
index_scheduler: GuardedData<ActionPolicy<{ actions::INDEXES_GET }>, Data<IndexScheduler>>,
paginate: QueryParameter<ListIndexes, DeserrQueryParamError>,
paginate: AwebQueryParameter<ListIndexes, DeserrQueryParamError>,
) -> Result<HttpResponse, ResponseError> {
let search_rules = &index_scheduler.filters().search_rules;
let indexes: Vec<_> = index_scheduler.indexes()?;
@@ -105,7 +103,7 @@ pub async fn list_indexes(
Ok(HttpResponse::Ok().json(ret))
}
#[derive(DeserializeFromValue, Debug)]
#[derive(Deserr, Debug)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields)]
pub struct IndexCreateRequest {
#[deserr(error = DeserrJsonError<InvalidIndexUid>, missing_field_error = DeserrJsonError::missing_index_uid)]
@@ -116,7 +114,7 @@ pub struct IndexCreateRequest {
pub async fn create_index(
index_scheduler: GuardedData<ActionPolicy<{ actions::INDEXES_CREATE }>, Data<IndexScheduler>>,
body: ValidatedJson<IndexCreateRequest, DeserrJsonError>,
body: AwebJson<IndexCreateRequest, DeserrJsonError>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
@@ -149,7 +147,7 @@ fn deny_immutable_fields_index(
"uid" => immutable_field_error(field, accepted, Code::ImmutableIndexUid),
"createdAt" => immutable_field_error(field, accepted, Code::ImmutableIndexCreatedAt),
"updatedAt" => immutable_field_error(field, accepted, Code::ImmutableIndexUpdatedAt),
_ => unwrap_any(DeserrJsonError::<BadRequest>::error::<Infallible>(
_ => deserr::take_cf_content(DeserrJsonError::<BadRequest>::error::<Infallible>(
None,
deserr::ErrorKind::UnknownKey { key: field, accepted },
location,
@@ -157,7 +155,7 @@ fn deny_immutable_fields_index(
}
}
#[derive(DeserializeFromValue, Debug)]
#[derive(Deserr, Debug)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields = deny_immutable_fields_index)]
pub struct UpdateIndexRequest {
#[deserr(default, error = DeserrJsonError<InvalidIndexPrimaryKey>)]
@@ -181,7 +179,7 @@ pub async fn get_index(
pub async fn update_index(
index_scheduler: GuardedData<ActionPolicy<{ actions::INDEXES_UPDATE }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
body: ValidatedJson<UpdateIndexRequest, DeserrJsonError>,
body: AwebJson<UpdateIndexRequest, DeserrJsonError>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {

View File

@@ -1,5 +1,6 @@
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use deserr::actix_web::{AwebJson, AwebQueryParameter};
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_auth::IndexSearchRules;
@@ -14,8 +15,6 @@ use serde_json::Value;
use crate::analytics::{Analytics, SearchAggregator};
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::json::ValidatedJson;
use crate::extractors::query_parameters::QueryParameter;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::search::{
perform_search, MatchingStrategy, SearchQuery, DEFAULT_CROP_LENGTH, DEFAULT_CROP_MARKER,
@@ -31,7 +30,7 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
);
}
#[derive(Debug, deserr::DeserializeFromValue)]
#[derive(Debug, deserr::Deserr)]
#[deserr(error = DeserrQueryParamError, rename_all = camelCase, deny_unknown_fields)]
pub struct SearchQueryGet {
#[deserr(default, error = DeserrQueryParamError<InvalidSearchQ>)]
@@ -150,7 +149,7 @@ fn fix_sort_query_parameters(sort_query: &str) -> Vec<String> {
pub async fn search_with_url_query(
index_scheduler: GuardedData<ActionPolicy<{ actions::SEARCH }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
params: QueryParameter<SearchQueryGet, DeserrQueryParamError>,
params: AwebQueryParameter<SearchQueryGet, DeserrQueryParamError>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
@@ -184,7 +183,7 @@ pub async fn search_with_url_query(
pub async fn search_with_post(
index_scheduler: GuardedData<ActionPolicy<{ actions::SEARCH }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
params: ValidatedJson<SearchQuery, DeserrJsonError>,
params: AwebJson<SearchQuery, DeserrJsonError>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {

View File

@@ -1,5 +1,6 @@
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use deserr::actix_web::AwebJson;
use index_scheduler::IndexScheduler;
use log::debug;
use meilisearch_types::deserr::DeserrJsonError;
@@ -12,7 +13,6 @@ use serde_json::json;
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::json::ValidatedJson;
use crate::routes::SummarizedTaskView;
#[macro_export]
@@ -68,7 +68,7 @@ macro_rules! make_setting_route {
Data<IndexScheduler>,
>,
index_uid: actix_web::web::Path<String>,
body: $crate::routes::indexes::ValidatedJson<Option<$type>, $err_ty>,
body: deserr::actix_web::AwebJson<Option<$type>, $err_ty>,
req: HttpRequest,
$analytics_var: web::Data<dyn Analytics>,
) -> std::result::Result<HttpResponse, ResponseError> {
@@ -468,7 +468,7 @@ generate_configure!(
pub async fn update_all(
index_scheduler: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, Data<IndexScheduler>>,
index_uid: web::Path<String>,
body: ValidatedJson<Settings<Unchecked>, DeserrJsonError>,
body: AwebJson<Settings<Unchecked>, DeserrJsonError>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {

View File

@@ -17,6 +17,8 @@ use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
const PAGINATION_DEFAULT_LIMIT: usize = 20;
mod api_key;
mod dump;
pub mod indexes;
@@ -34,8 +36,6 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
.service(web::scope("/swap-indexes").configure(swap_indexes::configure));
}
const PAGINATION_DEFAULT_LIMIT: usize = 20;
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct SummarizedTaskView {
@@ -59,6 +59,7 @@ impl From<Task> for SummarizedTaskView {
}
}
}
pub struct Pagination {
pub offset: usize,
pub limit: usize,

View File

@@ -1,6 +1,7 @@
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use deserr::DeserializeFromValue;
use deserr::actix_web::AwebJson;
use deserr::Deserr;
use index_scheduler::IndexScheduler;
use meilisearch_types::deserr::DeserrJsonError;
use meilisearch_types::error::deserr_codes::InvalidSwapIndexes;
@@ -14,14 +15,13 @@ use crate::analytics::Analytics;
use crate::error::MeilisearchHttpError;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::{AuthenticationError, GuardedData};
use crate::extractors::json::ValidatedJson;
use crate::extractors::sequential_extractor::SeqHandler;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("").route(web::post().to(SeqHandler(swap_indexes))));
}
#[derive(DeserializeFromValue, Debug, Clone, PartialEq, Eq)]
#[derive(Deserr, Debug, Clone, PartialEq, Eq)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields)]
pub struct SwapIndexesPayload {
#[deserr(error = DeserrJsonError<InvalidSwapIndexes>, missing_field_error = DeserrJsonError::missing_swap_indexes)]
@@ -30,7 +30,7 @@ pub struct SwapIndexesPayload {
pub async fn swap_indexes(
index_scheduler: GuardedData<ActionPolicy<{ actions::INDEXES_SWAP }>, Data<IndexScheduler>>,
params: ValidatedJson<Vec<SwapIndexesPayload>, DeserrJsonError>,
params: AwebJson<Vec<SwapIndexesPayload>, DeserrJsonError>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {

View File

@@ -1,6 +1,7 @@
use actix_web::web::Data;
use actix_web::{web, HttpRequest, HttpResponse};
use deserr::DeserializeFromValue;
use deserr::actix_web::AwebQueryParameter;
use deserr::Deserr;
use index_scheduler::{IndexScheduler, Query, TaskId};
use meilisearch_types::deserr::query_params::Param;
use meilisearch_types::deserr::DeserrQueryParamError;
@@ -23,7 +24,6 @@ use super::SummarizedTaskView;
use crate::analytics::Analytics;
use crate::extractors::authentication::policies::*;
use crate::extractors::authentication::GuardedData;
use crate::extractors::query_parameters::QueryParameter;
use crate::extractors::sequential_extractor::SeqHandler;
const DEFAULT_LIMIT: u32 = 20;
@@ -162,7 +162,7 @@ impl From<Details> for DetailsView {
}
}
#[derive(Debug, DeserializeFromValue)]
#[derive(Debug, Deserr)]
#[deserr(error = DeserrQueryParamError, rename_all = camelCase, deny_unknown_fields)]
pub struct TasksFilterQuery {
#[deserr(default = Param(DEFAULT_LIMIT), error = DeserrQueryParamError<InvalidTaskLimit>)]
@@ -181,19 +181,20 @@ pub struct TasksFilterQuery {
#[deserr(default, error = DeserrQueryParamError<InvalidIndexUid>)]
pub index_uids: OptionStarOrList<IndexUid>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterEnqueuedAt>, from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterEnqueuedAt>, try_from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
pub after_enqueued_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeEnqueuedAt>, from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeEnqueuedAt>, try_from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
pub before_enqueued_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterStartedAt>, from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterStartedAt>, try_from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
pub after_started_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeStartedAt>, from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeStartedAt>, try_from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
pub before_started_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterFinishedAt>, from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterFinishedAt>, try_from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
pub after_finished_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeFinishedAt>, from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeFinishedAt>, try_from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
pub before_finished_at: OptionStarOr<OffsetDateTime>,
}
impl TasksFilterQuery {
fn into_query(self) -> Query {
Query {
@@ -235,7 +236,7 @@ impl TaskDeletionOrCancelationQuery {
}
}
#[derive(Debug, DeserializeFromValue)]
#[derive(Debug, Deserr)]
#[deserr(error = DeserrQueryParamError, rename_all = camelCase, deny_unknown_fields)]
pub struct TaskDeletionOrCancelationQuery {
#[deserr(default, error = DeserrQueryParamError<InvalidTaskUids>)]
@@ -249,19 +250,20 @@ pub struct TaskDeletionOrCancelationQuery {
#[deserr(default, error = DeserrQueryParamError<InvalidIndexUid>)]
pub index_uids: OptionStarOrList<IndexUid>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterEnqueuedAt>, from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterEnqueuedAt>, try_from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
pub after_enqueued_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeEnqueuedAt>, from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeEnqueuedAt>, try_from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
pub before_enqueued_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterStartedAt>, from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterStartedAt>, try_from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
pub after_started_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeStartedAt>, from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeStartedAt>, try_from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
pub before_started_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterFinishedAt>, from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskAfterFinishedAt>, try_from(OptionStarOr<String>) = deserialize_date_after -> InvalidTaskDateError)]
pub after_finished_at: OptionStarOr<OffsetDateTime>,
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeFinishedAt>, from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
#[deserr(default, error = DeserrQueryParamError<InvalidTaskBeforeFinishedAt>, try_from(OptionStarOr<String>) = deserialize_date_before -> InvalidTaskDateError)]
pub before_finished_at: OptionStarOr<OffsetDateTime>,
}
impl TaskDeletionOrCancelationQuery {
fn into_query(self) -> Query {
Query {
@@ -284,7 +286,7 @@ impl TaskDeletionOrCancelationQuery {
async fn cancel_tasks(
index_scheduler: GuardedData<ActionPolicy<{ actions::TASKS_CANCEL }>, Data<IndexScheduler>>,
params: QueryParameter<TaskDeletionOrCancelationQuery, DeserrQueryParamError>,
params: AwebQueryParameter<TaskDeletionOrCancelationQuery, DeserrQueryParamError>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
@@ -330,7 +332,7 @@ async fn cancel_tasks(
async fn delete_tasks(
index_scheduler: GuardedData<ActionPolicy<{ actions::TASKS_DELETE }>, Data<IndexScheduler>>,
params: QueryParameter<TaskDeletionOrCancelationQuery, DeserrQueryParamError>,
params: AwebQueryParameter<TaskDeletionOrCancelationQuery, DeserrQueryParamError>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
@@ -383,7 +385,7 @@ pub struct AllTasks {
async fn get_tasks(
index_scheduler: GuardedData<ActionPolicy<{ actions::TASKS_GET }>, Data<IndexScheduler>>,
params: QueryParameter<TasksFilterQuery, DeserrQueryParamError>,
params: AwebQueryParameter<TasksFilterQuery, DeserrQueryParamError>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
@@ -498,7 +500,7 @@ pub fn deserialize_date_before(
#[cfg(test)]
mod tests {
use deserr::DeserializeFromValue;
use deserr::Deserr;
use meili_snap::snapshot;
use meilisearch_types::deserr::DeserrQueryParamError;
use meilisearch_types::error::{Code, ResponseError};
@@ -507,7 +509,7 @@ mod tests {
fn deserr_query_params<T>(j: &str) -> Result<T, ResponseError>
where
T: DeserializeFromValue<DeserrQueryParamError>,
T: Deserr<DeserrQueryParamError>,
{
let value = serde_urlencoded::from_str::<serde_json::Value>(j)
.map_err(|e| ResponseError::from_msg(e.to_string(), Code::BadRequest))?;

View File

@@ -3,7 +3,7 @@ use std::collections::{BTreeMap, BTreeSet, HashSet};
use std::str::FromStr;
use std::time::Instant;
use deserr::DeserializeFromValue;
use deserr::Deserr;
use either::Either;
use meilisearch_types::deserr::DeserrJsonError;
use meilisearch_types::error::deserr_codes::*;
@@ -29,7 +29,7 @@ pub const DEFAULT_CROP_MARKER: fn() -> String = || "…".to_string();
pub const DEFAULT_HIGHLIGHT_PRE_TAG: fn() -> String = || "<em>".to_string();
pub const DEFAULT_HIGHLIGHT_POST_TAG: fn() -> String = || "</em>".to_string();
#[derive(Debug, Clone, Default, PartialEq, Eq, DeserializeFromValue)]
#[derive(Debug, Clone, Default, PartialEq, Eq, Deserr)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields)]
pub struct SearchQuery {
#[deserr(default, error = DeserrJsonError<InvalidSearchQ>)]
@@ -74,7 +74,7 @@ impl SearchQuery {
}
}
#[derive(Debug, Clone, PartialEq, Eq, DeserializeFromValue)]
#[derive(Debug, Clone, PartialEq, Eq, Deserr)]
#[deserr(rename_all = camelCase)]
pub enum MatchingStrategy {
/// Remove query words from last to first

View File

@@ -377,7 +377,7 @@ async fn error_add_api_key_invalid_index_uids() {
meili_snap::snapshot!(code, @"400 Bad Request");
meili_snap::snapshot!(meili_snap::json_string!(response, { ".createdAt" => "[ignored]", ".updatedAt" => "[ignored]" }), @r###"
{
"message": "Invalid value at `.indexes[0]`: `invalid index # / \\name with spaces` is not a valid index uid. Index uid can be an integer or a string containing only alphanumeric characters, hyphens (-) and underscores (_).",
"message": "Invalid value at `.indexes[0]`: `invalid index # / \\name with spaces` is not a valid index uid pattern. Index uid patterns can be an integer or a string containing only alphanumeric characters, hyphens (-), underscores (_), and optionally end with a star (*).",
"code": "invalid_api_key_indexes",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#invalid_api_key_indexes"

View File

@@ -77,12 +77,14 @@ static INVALID_RESPONSE: Lazy<Value> = Lazy::new(|| {
})
});
const MASTER_KEY: &str = "MASTER_KEY";
#[actix_rt::test]
async fn error_access_expired_key() {
use std::{thread, time};
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
server.use_api_key(MASTER_KEY);
let content = json!({
"indexes": ["products"],
@@ -111,7 +113,7 @@ async fn error_access_expired_key() {
#[actix_rt::test]
async fn error_access_unauthorized_index() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
server.use_api_key(MASTER_KEY);
let content = json!({
"indexes": ["sales"],
@@ -144,7 +146,7 @@ async fn error_access_unauthorized_action() {
for ((method, route), action) in AUTHORIZATIONS.iter() {
// create a new API key letting only the needed action.
server.use_api_key("MASTER_KEY");
server.use_api_key(MASTER_KEY);
let content = json!({
"indexes": ["products"],
@@ -168,7 +170,7 @@ async fn error_access_unauthorized_action() {
#[actix_rt::test]
async fn access_authorized_master_key() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
server.use_api_key(MASTER_KEY);
// master key must have access to all routes.
for ((method, route), _) in AUTHORIZATIONS.iter() {
@@ -185,7 +187,7 @@ async fn access_authorized_restricted_index() {
for ((method, route), actions) in AUTHORIZATIONS.iter() {
for action in actions {
// create a new API key letting only the needed action.
server.use_api_key("MASTER_KEY");
server.use_api_key(MASTER_KEY);
let content = json!({
"indexes": ["products"],
@@ -222,7 +224,7 @@ async fn access_authorized_no_index_restriction() {
for ((method, route), actions) in AUTHORIZATIONS.iter() {
for action in actions {
// create a new API key letting only the needed action.
server.use_api_key("MASTER_KEY");
server.use_api_key(MASTER_KEY);
let content = json!({
"indexes": ["*"],
@@ -255,7 +257,7 @@ async fn access_authorized_no_index_restriction() {
#[actix_rt::test]
async fn access_authorized_stats_restricted_index() {
let mut server = Server::new_auth().await;
server.use_admin_key("MASTER_KEY").await;
server.use_admin_key(MASTER_KEY).await;
// create index `test`
let index = server.index("test");
@@ -295,7 +297,7 @@ async fn access_authorized_stats_restricted_index() {
#[actix_rt::test]
async fn access_authorized_stats_no_index_restriction() {
let mut server = Server::new_auth().await;
server.use_admin_key("MASTER_KEY").await;
server.use_admin_key(MASTER_KEY).await;
// create index `test`
let index = server.index("test");
@@ -335,7 +337,7 @@ async fn access_authorized_stats_no_index_restriction() {
#[actix_rt::test]
async fn list_authorized_indexes_restricted_index() {
let mut server = Server::new_auth().await;
server.use_admin_key("MASTER_KEY").await;
server.use_admin_key(MASTER_KEY).await;
// create index `test`
let index = server.index("test");
@@ -376,7 +378,7 @@ async fn list_authorized_indexes_restricted_index() {
#[actix_rt::test]
async fn list_authorized_indexes_no_index_restriction() {
let mut server = Server::new_auth().await;
server.use_admin_key("MASTER_KEY").await;
server.use_admin_key(MASTER_KEY).await;
// create index `test`
let index = server.index("test");
@@ -414,10 +416,194 @@ async fn list_authorized_indexes_no_index_restriction() {
assert!(response.iter().any(|index| index["uid"] == "test"));
}
#[actix_rt::test]
async fn access_authorized_index_patterns() {
let mut server = Server::new_auth().await;
server.use_admin_key(MASTER_KEY).await;
// create products_1 index
let index_1 = server.index("products_1");
let (response, code) = index_1.create(Some("id")).await;
assert_eq!(202, code, "{:?}", &response);
// create products index
let index_ = server.index("products");
let (response, code) = index_.create(Some("id")).await;
assert_eq!(202, code, "{:?}", &response);
// create key with all document access on indices with product_* pattern.
let content = json!({
"indexes": ["products_*"],
"actions": ["documents.*"],
"expiresAt": (OffsetDateTime::now_utc() + Duration::hours(1)).format(&Rfc3339).unwrap(),
});
// Register the key
let (response, code) = server.add_api_key(content).await;
assert_eq!(201, code, "{:?}", &response);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(key);
// refer to products_1 and products with modified api key.
let index_1 = server.index("products_1");
let index_ = server.index("products");
// try to create a index via add documents route
let documents = json!([
{
"id": 1,
"content": "foo",
}
]);
// Adding document to products_1 index. Should succeed with 202
let (response, code) = index_1.add_documents(documents.clone(), None).await;
assert_eq!(202, code, "{:?}", &response);
let task_id = response["taskUid"].as_u64().unwrap();
// Adding document to products index. Should Fail with 403 -- invalid_api_key
let (response, code) = index_.add_documents(documents, None).await;
assert_eq!(403, code, "{:?}", &response);
server.use_api_key(MASTER_KEY);
// refer to products_1 with modified api key.
let index_1 = server.index("products_1");
index_1.wait_task(task_id).await;
let (response, code) = index_1.get_task(task_id).await;
assert_eq!(200, code, "{:?}", &response);
assert_eq!(response["status"], "succeeded");
}
#[actix_rt::test]
async fn raise_error_non_authorized_index_patterns() {
let mut server = Server::new_auth().await;
server.use_admin_key(MASTER_KEY).await;
// create products_1 index
let product_1_index = server.index("products_1");
let (response, code) = product_1_index.create(Some("id")).await;
assert_eq!(202, code, "{:?}", &response);
// create products_2 index
let product_2_index = server.index("products_2");
let (response, code) = product_2_index.create(Some("id")).await;
assert_eq!(202, code, "{:?}", &response);
// create test index
let test_index = server.index("test");
let (response, code) = test_index.create(Some("id")).await;
assert_eq!(202, code, "{:?}", &response);
// create key with all document access on indices with product_* pattern.
let content = json!({
"indexes": ["products_*"],
"actions": ["documents.*"],
"expiresAt": (OffsetDateTime::now_utc() + Duration::hours(1)).format(&Rfc3339).unwrap(),
});
// Register the key
let (response, code) = server.add_api_key(content).await;
assert_eq!(201, code, "{:?}", &response);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(key);
// refer to products_1 and products_2 with modified api key.
let product_1_index = server.index("products_1");
let product_2_index = server.index("products_2");
// refer to test index
let test_index = server.index("test");
// try to create a index via add documents route
let documents = json!([
{
"id": 1,
"content": "foo",
}
]);
// Adding document to products_1 index. Should succeed with 202
let (response, code) = product_1_index.add_documents(documents.clone(), None).await;
assert_eq!(202, code, "{:?}", &response);
let task1_id = response["taskUid"].as_u64().unwrap();
// Adding document to products_2 index. Should succeed with 202
let (response, code) = product_2_index.add_documents(documents.clone(), None).await;
assert_eq!(202, code, "{:?}", &response);
let task2_id = response["taskUid"].as_u64().unwrap();
// Adding document to test index. Should Fail with 403 -- invalid_api_key
let (response, code) = test_index.add_documents(documents, None).await;
assert_eq!(403, code, "{:?}", &response);
server.use_api_key(MASTER_KEY);
// refer to products_1 with modified api key.
let product_1_index = server.index("products_1");
// refer to products_2 with modified api key.
let product_2_index = server.index("products_2");
product_1_index.wait_task(task1_id).await;
product_2_index.wait_task(task2_id).await;
let (response, code) = product_1_index.get_task(task1_id).await;
assert_eq!(200, code, "{:?}", &response);
assert_eq!(response["status"], "succeeded");
let (response, code) = product_1_index.get_task(task2_id).await;
assert_eq!(200, code, "{:?}", &response);
assert_eq!(response["status"], "succeeded");
}
#[actix_rt::test]
async fn pattern_indexes() {
// Create server with master key
let mut server = Server::new_auth().await;
server.use_admin_key(MASTER_KEY).await;
// index.* constraints on products_* index pattern
let content = json!({
"indexes": ["products_*"],
"actions": ["indexes.*"],
"expiresAt": (OffsetDateTime::now_utc() + Duration::hours(1)).format(&Rfc3339).unwrap(),
});
// Generate and use the api key
let (response, code) = server.add_api_key(content).await;
assert_eq!(201, code, "{:?}", &response);
let key = response["key"].as_str().expect("Key is not string");
server.use_api_key(key);
// Create Index products_1 using generated api key
let products_1 = server.index("products_1");
let (response, code) = products_1.create(Some("id")).await;
assert_eq!(202, code, "{:?}", &response);
// Fail to create products_* using generated api key
let products_1 = server.index("products_*");
let (response, code) = products_1.create(Some("id")).await;
assert_eq!(400, code, "{:?}", &response);
// Fail to create test_1 using generated api key
let products_1 = server.index("test_1");
let (response, code) = products_1.create(Some("id")).await;
assert_eq!(403, code, "{:?}", &response);
}
#[actix_rt::test]
async fn list_authorized_tasks_restricted_index() {
let mut server = Server::new_auth().await;
server.use_admin_key("MASTER_KEY").await;
server.use_admin_key(MASTER_KEY).await;
// create index `test`
let index = server.index("test");
@@ -446,7 +632,6 @@ async fn list_authorized_tasks_restricted_index() {
let (response, code) = server.service.get("/tasks").await;
assert_eq!(200, code, "{:?}", &response);
println!("{}", response);
let response = response["results"].as_array().unwrap();
// key should have access on `products` index.
assert!(response.iter().any(|task| task["indexUid"] == "products"));
@@ -458,7 +643,7 @@ async fn list_authorized_tasks_restricted_index() {
#[actix_rt::test]
async fn list_authorized_tasks_no_index_restriction() {
let mut server = Server::new_auth().await;
server.use_admin_key("MASTER_KEY").await;
server.use_admin_key(MASTER_KEY).await;
// create index `test`
let index = server.index("test");
@@ -499,7 +684,7 @@ async fn list_authorized_tasks_no_index_restriction() {
#[actix_rt::test]
async fn error_creating_index_without_action() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
server.use_api_key(MASTER_KEY);
// create key with access on all indexes.
let content = json!({
@@ -587,7 +772,7 @@ async fn lazy_create_index() {
];
for content in contents {
server.use_api_key("MASTER_KEY");
server.use_api_key(MASTER_KEY);
let (response, code) = server.add_api_key(content).await;
assert_eq!(201, code, "{:?}", &response);
assert!(response["key"].is_string());
@@ -643,14 +828,114 @@ async fn lazy_create_index() {
}
}
#[actix_rt::test]
async fn lazy_create_index_from_pattern() {
let mut server = Server::new_auth().await;
// create key with access on all indexes.
let contents = vec![
json!({
"indexes": ["products_*"],
"actions": ["*"],
"expiresAt": "2050-11-13T00:00:00Z"
}),
json!({
"indexes": ["products_*"],
"actions": ["indexes.*", "documents.*", "settings.*", "tasks.*"],
"expiresAt": "2050-11-13T00:00:00Z"
}),
json!({
"indexes": ["products_*"],
"actions": ["indexes.create", "documents.add", "settings.update", "tasks.get"],
"expiresAt": "2050-11-13T00:00:00Z"
}),
];
for content in contents {
server.use_api_key(MASTER_KEY);
let (response, code) = server.add_api_key(content).await;
assert_eq!(201, code, "{:?}", &response);
assert!(response["key"].is_string());
// use created key.
let key = response["key"].as_str().unwrap();
server.use_api_key(key);
// try to create a index via add documents route
let index = server.index("products_1");
let test = server.index("test");
let documents = json!([
{
"id": 1,
"content": "foo",
}
]);
let (response, code) = index.add_documents(documents.clone(), None).await;
assert_eq!(202, code, "{:?}", &response);
let task_id = response["taskUid"].as_u64().unwrap();
index.wait_task(task_id).await;
let (response, code) = index.get_task(task_id).await;
assert_eq!(200, code, "{:?}", &response);
assert_eq!(response["status"], "succeeded");
// Fail to create test index
let (response, code) = test.add_documents(documents, None).await;
assert_eq!(403, code, "{:?}", &response);
// try to create a index via add settings route
let index = server.index("products_2");
let settings = json!({ "distinctAttribute": "test"});
let (response, code) = index.update_settings(settings).await;
assert_eq!(202, code, "{:?}", &response);
let task_id = response["taskUid"].as_u64().unwrap();
index.wait_task(task_id).await;
let (response, code) = index.get_task(task_id).await;
assert_eq!(200, code, "{:?}", &response);
assert_eq!(response["status"], "succeeded");
// Fail to create test index
let index = server.index("test");
let settings = json!({ "distinctAttribute": "test"});
let (response, code) = index.update_settings(settings).await;
assert_eq!(403, code, "{:?}", &response);
// try to create a index via add specialized settings route
let index = server.index("products_3");
let (response, code) = index.update_distinct_attribute(json!("test")).await;
assert_eq!(202, code, "{:?}", &response);
let task_id = response["taskUid"].as_u64().unwrap();
index.wait_task(task_id).await;
let (response, code) = index.get_task(task_id).await;
assert_eq!(200, code, "{:?}", &response);
assert_eq!(response["status"], "succeeded");
// Fail to create test index
let index = server.index("test");
let settings = json!({ "distinctAttribute": "test"});
let (response, code) = index.update_settings(settings).await;
assert_eq!(403, code, "{:?}", &response);
}
}
#[actix_rt::test]
async fn error_creating_index_without_index() {
let mut server = Server::new_auth().await;
server.use_api_key("MASTER_KEY");
server.use_api_key(MASTER_KEY);
// create key with access on all indexes.
let content = json!({
"indexes": ["unexpected"],
"indexes": ["unexpected","products_*"],
"actions": ["*"],
"expiresAt": "2050-11-13T00:00:00Z"
});
@@ -690,4 +975,32 @@ async fn error_creating_index_without_index() {
let index = server.index("test3");
let (response, code) = index.create(None).await;
assert_eq!(403, code, "{:?}", &response);
// try to create a index via add documents route
let index = server.index("products");
let documents = json!([
{
"id": 1,
"content": "foo",
}
]);
let (response, code) = index.add_documents(documents, None).await;
assert_eq!(403, code, "{:?}", &response);
// try to create a index via add settings route
let index = server.index("products");
let settings = json!({ "distinctAttribute": "test"});
let (response, code) = index.update_settings(settings).await;
assert_eq!(403, code, "{:?}", &response);
// try to create a index via add specialized settings route
let index = server.index("products");
let (response, code) = index.update_distinct_attribute(json!("test")).await;
assert_eq!(403, code, "{:?}", &response);
// try to create a index via create index route
let index = server.index("products");
let (response, code) = index.create(None).await;
assert_eq!(403, code, "{:?}", &response);
}

View File

@@ -120,7 +120,7 @@ async fn create_api_key_bad_indexes() {
snapshot!(code, @"400 Bad Request");
snapshot!(json_string!(response), @r###"
{
"message": "Invalid value at `.indexes[0]`: `good doggo` is not a valid index uid. Index uid can be an integer or a string containing only alphanumeric characters, hyphens (-) and underscores (_).",
"message": "Invalid value at `.indexes[0]`: `good doggo` is not a valid index uid pattern. Index uid patterns can be an integer or a string containing only alphanumeric characters, hyphens (-), underscores (_), and optionally end with a star (*).",
"code": "invalid_api_key_indexes",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#invalid_api_key_indexes"
@@ -138,7 +138,7 @@ async fn create_api_key_bad_expires_at() {
snapshot!(code, @"400 Bad Request");
snapshot!(json_string!(response), @r###"
{
"message": "Unknown field `expires_at`: expected one of `description`, `name`, `uid`, `actions`, `indexes`, `expiresAt`",
"message": "Unknown field `expires_at`: did you mean `expiresAt`? expected one of `description`, `name`, `uid`, `actions`, `indexes`, `expiresAt`",
"code": "bad_request",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#bad_request"
@@ -150,7 +150,7 @@ async fn create_api_key_bad_expires_at() {
snapshot!(code, @"400 Bad Request");
snapshot!(json_string!(response), @r###"
{
"message": "Unknown field `expires_at`: expected one of `description`, `name`, `uid`, `actions`, `indexes`, `expiresAt`",
"message": "Unknown field `expires_at`: did you mean `expiresAt`? expected one of `description`, `name`, `uid`, `actions`, `indexes`, `expiresAt`",
"code": "bad_request",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#bad_request"

View File

@@ -82,6 +82,11 @@ static ACCEPTED_KEYS: Lazy<Vec<Value>> = Lazy::new(|| {
"actions": ["search"],
"expiresAt": (OffsetDateTime::now_utc() + Duration::days(1)).format(&Rfc3339).unwrap()
}),
json!({
"indexes": ["sal*", "prod*"],
"actions": ["search"],
"expiresAt": (OffsetDateTime::now_utc() + Duration::days(1)).format(&Rfc3339).unwrap()
}),
]
});
@@ -104,6 +109,11 @@ static REFUSED_KEYS: Lazy<Vec<Value>> = Lazy::new(|| {
"actions": ["*"],
"expiresAt": (OffsetDateTime::now_utc() + Duration::days(1)).format(&Rfc3339).unwrap()
}),
json!({
"indexes": ["prod*", "p*"],
"actions": ["*"],
"expiresAt": (OffsetDateTime::now_utc() + Duration::days(1)).format(&Rfc3339).unwrap()
}),
json!({
"indexes": ["products"],
"actions": ["search"],
@@ -245,6 +255,10 @@ async fn search_authorized_simple_token() {
"searchRules" => json!(["sales"]),
"exp" => Value::Null
},
hashmap! {
"searchRules" => json!(["sa*"]),
"exp" => Value::Null
},
];
compute_authorized_search!(tenant_tokens, {}, 5);
@@ -351,11 +365,19 @@ async fn filter_search_authorized_filter_token() {
}),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!({
"*": {},
"sal*": {"filter": ["color = blue"]}
}),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
];
compute_authorized_search!(tenant_tokens, "color = yellow", 1);
}
/// Tests that those Tenant Token are incompatible with the REFUSED_KEYS defined above.
#[actix_rt::test]
async fn error_search_token_forbidden_parent_key() {
let tenant_tokens = vec![
@@ -383,6 +405,10 @@ async fn error_search_token_forbidden_parent_key() {
"searchRules" => json!(["sales"]),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
hashmap! {
"searchRules" => json!(["sali*", "s*", "sales*"]),
"exp" => json!((OffsetDateTime::now_utc() + Duration::hours(1)).unix_timestamp())
},
];
compute_forbidden_search!(tenant_tokens, REFUSED_KEYS);

View File

@@ -201,6 +201,7 @@ pub fn default_settings(dir: impl AsRef<Path>) -> Opt {
indexer_options: IndexerOpts {
// memory has to be unlimited because several meilisearch are running in test context.
max_indexing_memory: MaxMemory::unlimited(),
skip_index_budget: true,
..Parser::parse_from(None as Option<&str>)
},
#[cfg(feature = "metrics")]

View File

@@ -12,7 +12,7 @@ byteorder = "1.4.3"
charabia = { version = "0.7.0", default-features = false }
concat-arrays = "0.1.2"
crossbeam-channel = "0.5.6"
deserr = "0.3.0"
deserr = "0.4.1"
either = "1.8.0"
flatten-serde-json = { path = "../flatten-serde-json" }
fst = "0.4.7"

View File

@@ -7,45 +7,31 @@ use serde::{Deserialize, Serialize};
use thiserror::Error;
use crate::error::is_reserved_keyword;
use crate::search::facet::BadGeoError;
use crate::{CriterionError, Error, UserError};
/// This error type is never supposed to be shown to the end user.
/// You must always cast it to a sort error or a criterion error.
#[derive(Debug)]
#[derive(Error, Debug)]
pub enum AscDescError {
InvalidLatitude,
InvalidLongitude,
#[error(transparent)]
GeoError(BadGeoError),
#[error("Invalid syntax for the asc/desc parameter: expected expression ending by `:asc` or `:desc`, found `{name}`.")]
InvalidSyntax { name: String },
#[error("`{name}` is a reserved keyword and thus can't be used as a asc/desc rule.")]
ReservedKeyword { name: String },
}
impl fmt::Display for AscDescError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Self::InvalidLatitude => {
write!(f, "Latitude must be contained between -90 and 90 degrees.",)
}
Self::InvalidLongitude => {
write!(f, "Longitude must be contained between -180 and 180 degrees.",)
}
Self::InvalidSyntax { name } => {
write!(f, "Invalid syntax for the asc/desc parameter: expected expression ending by `:asc` or `:desc`, found `{}`.", name)
}
Self::ReservedKeyword { name } => {
write!(
f,
"`{}` is a reserved keyword and thus can't be used as a asc/desc rule.",
name
)
}
}
impl From<BadGeoError> for AscDescError {
fn from(geo_error: BadGeoError) -> Self {
AscDescError::GeoError(geo_error)
}
}
impl From<AscDescError> for CriterionError {
fn from(error: AscDescError) -> Self {
match error {
AscDescError::InvalidLatitude | AscDescError::InvalidLongitude => {
AscDescError::GeoError(_) => {
CriterionError::ReservedNameForSort { name: "_geoPoint".to_string() }
}
AscDescError::InvalidSyntax { name } => CriterionError::InvalidName { name },
@@ -85,9 +71,9 @@ impl FromStr for Member {
.map_err(|_| AscDescError::ReservedKeyword { name: text.to_string() })
})?;
if !(-90.0..=90.0).contains(&lat) {
return Err(AscDescError::InvalidLatitude)?;
return Err(BadGeoError::Lat(lat))?;
} else if !(-180.0..=180.0).contains(&lng) {
return Err(AscDescError::InvalidLongitude)?;
return Err(BadGeoError::Lng(lng))?;
}
Ok(Member::Geo([lat, lng]))
}
@@ -162,10 +148,8 @@ impl FromStr for AscDesc {
#[derive(Error, Debug)]
pub enum SortError {
#[error("{}", AscDescError::InvalidLatitude)]
InvalidLatitude,
#[error("{}", AscDescError::InvalidLongitude)]
InvalidLongitude,
#[error(transparent)]
ParseGeoError { error: BadGeoError },
#[error("Invalid syntax for the geo parameter: expected expression formated like \
`_geoPoint(latitude, longitude)` and ending by `:asc` or `:desc`, found `{name}`.")]
BadGeoPointUsage { name: String },
@@ -184,8 +168,7 @@ pub enum SortError {
impl From<AscDescError> for SortError {
fn from(error: AscDescError) -> Self {
match error {
AscDescError::InvalidLatitude => SortError::InvalidLatitude,
AscDescError::InvalidLongitude => SortError::InvalidLongitude,
AscDescError::GeoError(error) => SortError::ParseGeoError { error },
AscDescError::InvalidSyntax { name } => SortError::InvalidName { name },
AscDescError::ReservedKeyword { name } if name.starts_with("_geoPoint") => {
SortError::BadGeoPointUsage { name }
@@ -277,11 +260,11 @@ mod tests {
),
("_geoPoint(35, 85, 75):asc", ReservedKeyword { name: S("_geoPoint(35, 85, 75)") }),
("_geoPoint(18):asc", ReservedKeyword { name: S("_geoPoint(18)") }),
("_geoPoint(200, 200):asc", InvalidLatitude),
("_geoPoint(90.000001, 0):asc", InvalidLatitude),
("_geoPoint(0, -180.000001):desc", InvalidLongitude),
("_geoPoint(159.256, 130):asc", InvalidLatitude),
("_geoPoint(12, -2021):desc", InvalidLongitude),
("_geoPoint(200, 200):asc", GeoError(BadGeoError::Lat(200.))),
("_geoPoint(90.000001, 0):asc", GeoError(BadGeoError::Lat(90.000001))),
("_geoPoint(0, -180.000001):desc", GeoError(BadGeoError::Lng(-180.000001))),
("_geoPoint(159.256, 130):asc", GeoError(BadGeoError::Lat(159.256))),
("_geoPoint(12, -2021):desc", GeoError(BadGeoError::Lng(-2021.))),
];
for (req, expected_error) in invalid_req {

View File

@@ -123,7 +123,7 @@ impl<'t> Criterion for Attribute<'t> {
None => {
return Ok(Some(CriterionResult {
query_tree: Some(query_tree),
candidates: Some(RoaringBitmap::new()),
candidates: Some(allowed_candidates),
filtered_candidates: None,
initial_candidates: Some(self.initial_candidates.take()),
}));

View File

@@ -21,18 +21,51 @@ pub struct Filter<'a> {
condition: FilterCondition<'a>,
}
#[derive(Debug)]
pub enum BadGeoError {
Lat(f64),
Lng(f64),
BoundingBoxTopIsBelowBottom(f64, f64),
}
impl std::error::Error for BadGeoError {}
impl Display for BadGeoError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::BoundingBoxTopIsBelowBottom(top, bottom) => {
write!(f, "The top latitude `{top}` is below the bottom latitude `{bottom}`.")
}
Self::Lat(lat) => write!(
f,
"Bad latitude `{}`. Latitude must be contained between -90 and 90 degrees. ",
lat
),
Self::Lng(lng) => write!(
f,
"Bad longitude `{}`. Longitude must be contained between -180 and 180 degrees. ",
lng
),
}
}
}
#[derive(Debug)]
enum FilterError<'a> {
AttributeNotFilterable { attribute: &'a str, filterable_fields: HashSet<String> },
BadGeo(&'a str),
BadGeoLat(f64),
BadGeoLng(f64),
BadGeoBoundingBoxTopIsBelowBottom(f64, f64),
ParseGeoError(BadGeoError),
ReservedGeo(&'a str),
Reserved(&'a str),
TooDeep,
}
impl<'a> std::error::Error for FilterError<'a> {}
impl<'a> From<BadGeoError> for FilterError<'a> {
fn from(geo_error: BadGeoError) -> Self {
FilterError::ParseGeoError(geo_error)
}
}
impl<'a> Display for FilterError<'a> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
@@ -44,7 +77,11 @@ impl<'a> Display for FilterError<'a> {
attribute,
)
} else {
let filterables_list = filterable_fields.iter().map(AsRef::as_ref).collect::<Vec<&str>>().join(" ");
let filterables_list = filterable_fields
.iter()
.map(AsRef::as_ref)
.collect::<Vec<&str>>()
.join(" ");
write!(
f,
@@ -53,20 +90,19 @@ impl<'a> Display for FilterError<'a> {
filterables_list,
)
}
},
Self::TooDeep => write!(f,
}
Self::TooDeep => write!(
f,
"Too many filter conditions, can't process more than {} filters.",
MAX_FILTER_DEPTH
),
Self::ReservedGeo(keyword) => write!(f, "`{}` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` field coordinates.", keyword),
Self::Reserved(keyword) => write!(
f,
"`{}` is a reserved keyword and thus can't be used as a filter expression.",
keyword
),
Self::BadGeo(keyword) => write!(f, "`{}` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance)` or `_geoBoundingBox([latitude, longitude], [latitude, longitude])` built-in rules to filter on `_geo` field coordinates.", keyword),
Self::BadGeoBoundingBoxTopIsBelowBottom(top, bottom) => write!(f, "The top latitude `{top}` is below the bottom latitude `{bottom}`."),
Self::BadGeoLat(lat) => write!(f, "Bad latitude `{}`. Latitude must be contained between -90 and 90 degrees. ", lat),
Self::BadGeoLng(lng) => write!(f, "Bad longitude `{}`. Longitude must be contained between -180 and 180 degrees. ", lng),
Self::ParseGeoError(error) => write!(f, "{}", error),
}
}
}
@@ -298,10 +334,10 @@ impl<'a> Filter<'a> {
} else {
match fid.value() {
attribute @ "_geo" => {
Err(fid.as_external_error(FilterError::BadGeo(attribute)))?
Err(fid.as_external_error(FilterError::ReservedGeo(attribute)))?
}
attribute if attribute.starts_with("_geoPoint(") => {
Err(fid.as_external_error(FilterError::BadGeo("_geoPoint")))?
Err(fid.as_external_error(FilterError::ReservedGeo("_geoPoint")))?
}
attribute @ "_geoDistance" => {
Err(fid.as_external_error(FilterError::Reserved(attribute)))?
@@ -353,14 +389,10 @@ impl<'a> Filter<'a> {
let base_point: [f64; 2] =
[point[0].parse_finite_float()?, point[1].parse_finite_float()?];
if !(-90.0..=90.0).contains(&base_point[0]) {
return Err(
point[0].as_external_error(FilterError::BadGeoLat(base_point[0]))
)?;
return Err(point[0].as_external_error(BadGeoError::Lat(base_point[0])))?;
}
if !(-180.0..=180.0).contains(&base_point[1]) {
return Err(
point[1].as_external_error(FilterError::BadGeoLng(base_point[1]))
)?;
return Err(point[1].as_external_error(BadGeoError::Lng(base_point[1])))?;
}
let radius = radius.parse_finite_float()?;
let rtree = match index.geo_rtree(rtxn)? {
@@ -398,27 +430,26 @@ impl<'a> Filter<'a> {
bottom_right_point[1].parse_finite_float()?,
];
if !(-90.0..=90.0).contains(&top_left[0]) {
return Err(top_left_point[0]
.as_external_error(FilterError::BadGeoLat(top_left[0])))?;
return Err(
top_left_point[0].as_external_error(BadGeoError::Lat(top_left[0]))
)?;
}
if !(-180.0..=180.0).contains(&top_left[1]) {
return Err(top_left_point[1]
.as_external_error(FilterError::BadGeoLng(top_left[1])))?;
return Err(
top_left_point[1].as_external_error(BadGeoError::Lng(top_left[1]))
)?;
}
if !(-90.0..=90.0).contains(&bottom_right[0]) {
return Err(bottom_right_point[0]
.as_external_error(FilterError::BadGeoLat(bottom_right[0])))?;
.as_external_error(BadGeoError::Lat(bottom_right[0])))?;
}
if !(-180.0..=180.0).contains(&bottom_right[1]) {
return Err(bottom_right_point[1]
.as_external_error(FilterError::BadGeoLng(bottom_right[1])))?;
.as_external_error(BadGeoError::Lng(bottom_right[1])))?;
}
if top_left[0] < bottom_right[0] {
return Err(bottom_right_point[1].as_external_error(
FilterError::BadGeoBoundingBoxTopIsBelowBottom(
top_left[0],
bottom_right[0],
),
BadGeoError::BoundingBoxTopIsBelowBottom(top_left[0], bottom_right[0]),
))?;
}

View File

@@ -4,7 +4,7 @@ use heed::types::{ByteSlice, DecodeIgnore};
use heed::{BytesDecode, RoTxn};
pub use self::facet_distribution::{FacetDistribution, DEFAULT_VALUES_PER_FACET};
pub use self::filter::Filter;
pub use self::filter::{BadGeoError, Filter};
use crate::heed_codec::facet::{FacetGroupKeyCodec, FacetGroupValueCodec};
use crate::heed_codec::ByteSliceRefCodec;
mod facet_distribution;

View File

@@ -11,6 +11,7 @@ pub struct IndexerConfig {
pub chunk_compression_level: Option<u32>,
pub thread_pool: Option<ThreadPool>,
pub max_positions_per_attributes: Option<u32>,
pub skip_index_budget: bool,
}
impl Default for IndexerConfig {
@@ -24,6 +25,7 @@ impl Default for IndexerConfig {
chunk_compression_level: None,
thread_pool: None,
max_positions_per_attributes: None,
skip_index_budget: false,
}
}
}

View File

@@ -2,7 +2,7 @@ use std::collections::{BTreeSet, HashMap, HashSet};
use std::result::Result as StdResult;
use charabia::{Tokenizer, TokenizerBuilder};
use deserr::{DeserializeError, DeserializeFromValue};
use deserr::{DeserializeError, Deserr};
use itertools::Itertools;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use time::OffsetDateTime;
@@ -23,9 +23,9 @@ pub enum Setting<T> {
NotSet,
}
impl<T, E> DeserializeFromValue<E> for Setting<T>
impl<T, E> Deserr<E> for Setting<T>
where
T: DeserializeFromValue<E>,
T: Deserr<E>,
E: DeserializeError,
{
fn deserialize_from_value<V: deserr::IntoValue>(