Compare commits

...

702 Commits

Author SHA1 Message Date
f67167fa9f Merge #2178
2178: Refacto docker r=irevoire a=irevoire

closes #2166 and #2085

-----------

I noticed many people had issues with the default configuration of our Dockerfile.
Some examples:
- #2166: If you use ubuntu and mount your `data.ms` in a volume (as shown in the [doc](https://docs.meilisearch.com/learn/getting_started/installation.html#download-and-launch)), you can't run meilisearch
- #2085: Here, meilisearch was not able to erase the `data.ms` when loading a dump because it's the mount point
Currently, we don't show how to use snapshots and dumps with Docker in the documentation. And it's quite hard to do:
  - You either send a big command to meilisearch to change the dump-path, snapshot-path, and db-path to a single directory and then mount that one
  - Or you mount three volumes
- And there were other issues on the slack community

I think this PR solves the problem.
Now the image contains the `meilisearch` binary in the `/bin` directory, so it's easy to find and always in the `PATH`.
It creates a `data` directory and moves the working directory into it.
So now you can find the `dumps`, `snapshots` and `data.ms` directories in `/data`.

Here is the new command to run meilisearch with a volume:
```
docker run -it --rm -v $PWD/meili_data:/data -p 7700:7700 getmeili/meilisearch:latest
```

And if you need to import a dump or a snapshot, you don't need to restart your container and mount another volume. You can directly hit the `POST /dumps` route and then run:
```
docker run -it --rm -v $PWD/meili_data:/data -p 7700:7700 getmeili/meilisearch:latest meilisearch --import-dump dumps/20220217-152115159.dump
```

-------

You can already try this PR with the following docker image:
```
getmeili/meilisearch:test-docker-v0.26.0
```

If you want to use v0.25.2, I created another image:
```
getmeili/meilisearch:test-docker-v0.25.2
```

------

If you're using Helm, I created a branch [here](https://github.com/meilisearch/meilisearch-kubernetes/tree/test-docker-v0.26.0) that uses the v0.26.0 image with the correct volume 👍
If you use this configuration with v0.25.2, it should also work.

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-11 12:01:25 +00:00
31584f34e8 Merge #2298
2298: Nested fields r=irevoire a=irevoire

There are a few things that I want to fix _AFTER_ merging this PR.
For the following RCs.

## Stop the useless conversion
In `search.rs` I convert a `Document` to a `Value`, then the `Value` to a `Document`, and then back to a `Value`, etc. I should stop doing all these conversions and stick to one format.
Probably by merging my `permissive-json-pointer` crate into meilisearch.
That would also give me the opportunity to work directly with obkvs and stop deserializing fields I don't need.

## Add more tests specific to nested fields
Everything seems to work, but I should write tests to double-check that nested fields work well with the `formatted` field.

## See how I could stop iterating on hashmaps and instead fill them correctly
This is related to milli. I very often need to iterate over hashmaps to see if a field is a subset of another field. I could probably generate a structure containing all the possible key values.
i.e., the user says `doggo` is an attribute to retrieve. Instead of iterating over all the attributes to retrieve to check whether `doggo.name` is a subset of `doggo`, I should insert `doggo.name` into the attributes-to-retrieve map.
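
For illustration, a minimal sketch of that idea (the helper and names below are hypothetical, not the actual milli code): the retrievable attributes are expanded once into a set that already contains the nested keys, so later checks become a single lookup instead of a scan.

```rust
use std::collections::HashSet;

/// Hypothetical helper: expand the user-provided attributes to retrieve
/// into a set that directly contains every known nested key,
/// so lookups become a single `contains` instead of a full iteration.
fn expand_attributes<'a>(
    attributes_to_retrieve: &[&'a str],
    known_fields: &[&'a str],
) -> HashSet<&'a str> {
    let mut expanded = HashSet::new();
    for field in known_fields {
        for attr in attributes_to_retrieve {
            // `doggo.name` is a "subset" of `doggo`.
            if field == attr || field.starts_with(&format!("{attr}.")) {
                expanded.insert(*field);
            }
        }
    }
    expanded
}

fn main() {
    let expanded = expand_attributes(&["doggo"], &["doggo.name", "doggo.age", "cat"]);
    assert!(expanded.contains("doggo.name"));
    assert!(!expanded.contains("cat"));
}
```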

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-11 11:45:37 +00:00
a70e0a6422 Merge #2304
2304: chore(bors): comments clippy out r=curquiza a=irevoire

There is currently an issue with clippy that stops us from merging PRs.
https://github.com/rust-lang/rust-clippy/issues/8662#issuecomment-1093899755

We can't use clippy in the CI while that's not merged


Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-11 11:20:18 +00:00
348345f555 chore(bors): comments clippy out
There is currently an issue with clippy that stops us from merging PRs.
https://github.com/rust-lang/rust-clippy/issues/8662#issuecomment-1093899755

We can't use clippy in the CI while that's not merged
2022-04-11 13:19:00 +02:00
683206e140 feat(docker): refactoring the dockerfile
- Move the meilisearch binary to `/bin/meilisearch` so it's always in scope.
- Create a `meili_data` directory used as the default working directory
2022-04-11 13:14:44 +02:00
69d312209e feat(search): Implements the nested fields
See https://github.com/meilisearch/specifications/pull/121
2022-04-07 19:47:20 +02:00
013fe4cbc9 Merge #2297
2297: Feat(Search): Enhance formating search results r=ManyTheFish a=ManyTheFish

Add new settings and change crop_len behavior to count words instead of characters.

- [x] `highlightPreTag`
- [x] `highlightPostTag`
- [x] `cropMarker`
- [x] `cropLength` counts words instead of characters
- [x] `cropLength` 0 is now treated as no `cropLength`
- [ ] ~smart crop finding the best matches interval~ (postponed)

Partially fixes #2214 (no smart crop).
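
For illustration, a search request exercising these parameters could look like the sketch below (built with `serde_json`; the index, query, and highlight/crop attribute values are made up, not taken from this PR):

```rust
use serde_json::json;

fn main() {
    // Hypothetical payload for `POST /indexes/movies/search`,
    // exercising the formatting parameters listed above.
    let search_request = json!({
        "q": "shifu",
        "attributesToHighlight": ["*"],
        "highlightPreTag": "<em>",
        "highlightPostTag": "</em>",
        "attributesToCrop": ["overview"],
        "cropLength": 10, // now counted in words, not characters
        "cropMarker": "…"
    });
    println!("{}", serde_json::to_string_pretty(&search_request).unwrap());
}
```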


Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-04-07 13:29:56 +00:00
dc2cc1ee89 Feat(Search): Enhance formating search results 2022-04-07 15:04:08 +02:00
bb5f0e1485 Merge #2271
2271: Simplify Dockerfile r=ManyTheFish a=Thearas

# Pull Request

## What does this PR do?

1. Fixes #2234
2. Replace `$TARGETPLATFORM` with `apk --print-arch` so the Dockerfile works with plain `docker build` as well, not just `docker buildx` (inspired by [rust-lang/docker-rust](https://github.com/rust-lang/docker-rust/blob/master/1.59.0/alpine3.14/Dockerfile#L13))

PTAL `@curquiza` 

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Thearas <thearas850@gmail.com>
2022-04-07 11:50:21 +00:00
d5e33637b7 Merge #2296
2296: disable typo for attributes r=curquiza a=MarinPostma

Introduce the disable-typos-on-attributes feature as per https://github.com/meilisearch/specifications/pull/117.
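
As a rough illustration (field names follow the spec linked above; treat this as a sketch rather than the final API), the `typoTolerance` setting payload sent to the settings route could look like:

```rust
use serde_json::json;

fn main() {
    // Hypothetical `typoTolerance` settings payload, as described in
    // meilisearch/specifications#117 (field names are assumptions here).
    let typo_settings = json!({
        "typoTolerance": {
            "enabled": true,
            "disableOnWords": ["shrek"],
            "disableOnAttributes": ["serial_number"]
        }
    });
    println!("{}", serde_json::to_string_pretty(&typo_settings).unwrap());
}
```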


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-06 18:10:45 +00:00
c321ac61b5 Merge #2259
2259: disable typos on words r=MarinPostma a=MarinPostma

Introduce the disable typo setting as per https://github.com/meilisearch/specifications/pull/117.

waiting for https://github.com/meilisearch/milli/pull/474.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-06 17:40:08 +00:00
67dea08a0a feat(http, lib): enable disable typos on attributes 2022-04-06 19:25:12 +02:00
e9f66b8766 feat(all): introduce disable typo on words 2022-04-06 19:16:36 +02:00
dd43ba6234 feat(all): introduce disable typos 2022-04-06 19:10:12 +02:00
27a88bcd47 feat(all): introduce minWordLengthForTypo
fix typo in settting

skip serializing not set typo settings
2022-04-06 19:03:24 +02:00
065fe19452 Merge #2249
2249: feat(all): introduce disable typos r=irevoire a=MarinPostma

Introduce the disable typo setting, which allows disabling typos for an index.

waiting on https://github.com/meilisearch/milli/pull/469

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-06 16:55:46 +00:00
981fba5b44 feat(all): introduce disable typos 2022-04-06 15:47:48 +02:00
09734f0732 Merge #2294
2294: chore(lib): bump milli to 0.25.0 r=MarinPostma a=MarinPostma

bump milli


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-06 13:19:20 +00:00
a523828f61 chore(lib): bump milli to 0.25.0 2022-04-06 15:03:10 +02:00
7f7958f815 Merge #2277
2277: fix(http): fix panic when sending document update without content type header r=MarinPostma a=MarinPostma

I found a panic when pushing documents without a content-type. This PR fixes it by returning an unknown content-type error instead of crashing.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-05 12:07:17 +00:00
9e344f6576 Merge #2207
2207: Fix: avoid embedding the user input into the error response. r=Kerollmops a=CNLHC

# Pull Request

## What does this PR do?
Fix #2107. 

The problem is that meilisearch embeds the user input in the error message. 

The reason for this problem is that `milli` throws a `serde_json::Error`, whose `Display` implementation does this embedding.  

I tried to solve this problem in this PR by manually implementing the `Display` trait for `DocumentFormatError` instead of deriving it automatically.
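
A minimal sketch of that approach (the error enum below is a simplified stand-in, not the actual meilisearch-lib type): the manual `Display` implementation reports the error position without echoing the user's payload back.

```rust
use std::fmt;

// Simplified stand-in for the real `DocumentFormatError`.
#[derive(Debug)]
enum DocumentFormatError {
    // Keep the source error around for logs, but never show the raw payload.
    MalformedPayload { format: &'static str, source: serde_json::Error },
}

impl fmt::Display for DocumentFormatError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DocumentFormatError::MalformedPayload { format, source } => {
                // Only surface the error category and position, never the input itself.
                write!(
                    f,
                    "the {format} payload is malformed at line {} column {}",
                    source.line(),
                    source.column()
                )
            }
        }
    }
}

fn main() {
    let source = serde_json::from_str::<serde_json::Value>("{ not json").unwrap_err();
    let err = DocumentFormatError::MalformedPayload { format: "json", source };
    println!("{err}");
}
```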

<!-- Please link the issue you're trying to fix with this PR, if none then please create an issue first. -->

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Liu Hancheng <cn_lhc@qq.com>
Co-authored-by: LiuHanCheng <2463765697@qq.com>
2022-04-04 17:35:17 +00:00
09a72cee03 Merge #2281
2281: Hard limit the number of results returned by a search r=Kerollmops a=Kerollmops

This PR fixes #2133 by hard-limiting the number of results that a search request can return at any time. I would like guidance from `@MarinPostma` on how to test that: should I use a mocking test here, or something else?

I talked about touching the _nb_hits_ value with `@qdequele` and we concluded that it was not correct to do so.

Could you please confirm that it is the right place to change that?
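
In essence, the hard limit boils down to clamping the requested window, roughly like this sketch (the constant and function names are placeholders, not the actual code):

```rust
// Hypothetical hard cap on the number of hits a single search can return.
const HARD_RESULT_LIMIT: usize = 1000;

/// Clamp the requested `offset`/`limit` window so that
/// `offset + limit` never exceeds the hard limit.
fn clamp_window(offset: usize, limit: usize) -> (usize, usize) {
    let offset = offset.min(HARD_RESULT_LIMIT);
    let limit = limit.min(HARD_RESULT_LIMIT - offset);
    (offset, limit)
}

fn main() {
    assert_eq!(clamp_window(0, 20), (0, 20));
    assert_eq!(clamp_window(990, 50), (990, 10));
    assert_eq!(clamp_window(2000, 20), (1000, 0));
}
```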

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-04-04 17:19:05 +00:00
6fc6b83632 Update meilisearch-http/tests/documents/add_documents.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-04-01 09:30:40 +08:00
eee2cd5abf Update meilisearch-http/tests/documents/add_documents.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-04-01 09:30:32 +08:00
87e4125875 Merge #2267
2267: Add instance options for RAM and CPU usage r=Kerollmops a=2shiori17

# Pull Request

## What does this PR do?
Fixes #2212 
<!-- Please link the issue you're trying to fix with this PR, if none then please create an issue first. -->

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: 2shiori17 <98276492+2shiori17@users.noreply.github.com>
Co-authored-by: shiori <98276492+2shiori17@users.noreply.github.com>
2022-03-31 15:29:18 +00:00
7ece7a9d9e change truncate strategy and coresponding test 2022-03-31 10:39:21 +08:00
403f03cb2c Update meilisearch-http/tests/documents/add_documents.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-03-31 10:14:22 +08:00
b28aa8e666 Update meilisearch-lib/src/document_formats.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-03-31 10:14:13 +08:00
98107565c0 Add more detailed comments for max_indexing_threads 2022-03-31 09:32:45 +09:00
a2d7c16f91 Remove indexing_jobs option 2022-03-31 09:27:29 +09:00
ffafd5b976 Add tests for the hard limit 2022-03-30 16:36:02 -07:00
9f1c88680d Fix my mistake when resolving conflicts 2022-03-31 02:48:41 +09:00
9edd407a88 Merge branch 'main' into add-instance-options 2022-03-31 02:38:07 +09:00
8bc6e8dcf9 Make sure that offsets are clamped too 2022-03-30 10:06:15 -07:00
2624c76517 Merge #2254
2254: Test with default CLI opts r=Kerollmops a=Kerollmops

Fixes #2252.

This PR makes sure that we test the HTTP engine with the default CLI parameters and removes some useless internal CLI options.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-03-29 22:39:40 +00:00
891d042164 Remove the memory limit to let Windows tests pass 2022-03-29 11:37:08 -07:00
b3a11e04af Implement Default on IndexerOpts again 2022-03-29 11:37:08 -07:00
acdb10a307 Remove some useless indexer options 2022-03-29 11:37:08 -07:00
8fecc6238d Make the test use the default CLI options 2022-03-29 11:37:08 -07:00
405af09fc8 Hard limit the number of results returned by a search 2022-03-29 11:27:53 -07:00
0d6be2efab Merge #2280
2280: Bump the milli dependency to 0.24.1 r=curquiza a=Kerollmops

We had issues with lindera recently: it was unable to download the official dictionaries from Google Drive, and this was causing issues with our CIs (and other users' CIs too). The maintainer changed the dictionary download source to SourceForge, and it is much more stable now.

This PR bumps the milli dependency to the latest version, which includes the latest version of the tokenizer, which itself includes the latest version of lindera. I advise that we rebase the currently opened pull requests to include this PR once it is merged on main.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-03-29 17:39:03 +00:00
94f04e79eb Bump the milli dependency to 0.24.1 2022-03-29 09:17:25 -07:00
381e98053f Merge #2263
2263: Upgrade the config of the issue management r=curquiza a=curquiza

To reduce the feature feedback in the meilisearch repo

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-03-29 09:11:06 +00:00
6c2fdc7743 fix(http): fix panic when sending document update without content type header 2022-03-29 09:48:25 +02:00
513b37e245 Merge #2253
2253: refactor authentication key extraction r=ManyTheFish a=MarinPostma

I am concerned that the part of the code that performs the key prefix extraction from the JWT token might be misused in the future. Since this is a critical part of the code, I moved it into its own function. Since we deserialize the payload twice anyway, I reordered the verifications, and we now use the data from the validated token.
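
A rough sketch of that idea with the `jsonwebtoken` crate (the claim name and function below are hypothetical, for illustration only; the real code lives in meilisearch-auth): the prefix is read from the claims only after the token has been validated.

```rust
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
use serde::Deserialize;

// Hypothetical tenant-token claims; the real claim names live in meilisearch-auth.
#[derive(Deserialize)]
struct Claims {
    #[serde(rename = "apiKeyPrefix")]
    api_key_prefix: String,
}

/// Validate the token with the API key, then read the prefix from the
/// *validated* claims instead of from a pre-validation peek at the payload.
fn extract_key_prefix(
    token: &str,
    api_key: &[u8],
) -> Result<String, jsonwebtoken::errors::Error> {
    let validation = Validation::new(Algorithm::HS256);
    let data = decode::<Claims>(token, &DecodingKey::from_secret(api_key), &validation)?;
    Ok(data.claims.api_key_prefix)
}
```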


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-03-28 08:53:13 +00:00
13a0e78d3f Update meilisearch-lib/src/document_formats.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-03-28 14:58:00 +08:00
80d8ac40af Update meilisearch-lib/src/document_formats.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-03-28 14:57:51 +08:00
48d107cc13 Simplify Dockerfile 2022-03-26 20:06:00 +08:00
3ef250c30a fix: docker image failed to boot on arm64 node 2022-03-26 17:41:00 +08:00
c7b489f8cb tidy 2022-03-25 21:36:11 +08:00
ce12000af3 add some comments 2022-03-25 21:33:47 +08:00
3c72f4dc51 fix test and add truncate test. 2022-03-25 21:31:23 +08:00
ce85981a4e add truncate logic 2022-03-25 20:53:28 +08:00
193c666bf9 Merge branch 'main' of github.com:meilisearch/meilisearch into CNLHC/change_json_error_message 2022-03-25 19:53:13 +08:00
705d10a96d Add instance options for RAM and CPU usage 2022-03-24 18:52:36 +00:00
c0056ab73f Merge #2264
2264: Import milli from meilisearch-lib r=Kerollmops a=Kerollmops

This PR directly imports the milli dependency used in _meilisearch-http_ from _meilisearch-lib_. I can't import _meilisearch-lib_ in _meilisearch-auth_, so _meilisearch-auth_ can't use the milli exported by _meilisearch-lib_ 😞

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-03-24 16:28:54 +00:00
3df542f072 Export milli's heed from meilisearch-lib 2022-03-24 15:30:10 +01:00
09212abdf7 Update the cargo lock 2022-03-24 14:53:08 +01:00
ee6be4f6b9 Import milli from meilisearch-lib in meilisearch-http 2022-03-24 14:45:37 +01:00
4e3b20ed73 Update config.yml 2022-03-24 12:13:30 +01:00
6a82a055d3 chore(auth): refactor token validation 2022-03-21 11:18:51 +01:00
7e65816d63 Merge #2237
2237: Update dependencies r=MarinPostma a=Kerollmops

This PR upgrades and updates the dependencies of meilisearch, but first I removed three unused dependencies. I used [cargo udeps](https://github.com/est31/cargo-udeps) to detect those and [cargo upgrade](https://github.com/killercup/cargo-edit/blob/master/README.md#available-subcommands) to upgrade ⬆️

~This PR **must** be merged when https://github.com/meilisearch/milli/pull/465 is merged and then must be updated accordingly i.e. using the latest version of milli.~

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-03-17 17:15:19 +00:00
5bffa4b7f9 Tenant token validation is now created by a function 2022-03-17 17:55:50 +01:00
d1c0ecceb9 Merge #2245
2245: Add test to validate cli r=irevoire a=MarinPostma

followup on #2242 and #2243

Add a test to make sure the cli is valid, and add a CI task to run the tests in debug to make sure we hit debug assertions.

FYI `@curquiza,` because of CI changes

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-03-17 16:14:31 +00:00
1d683865cf chore(CI): add debug test to CI 2022-03-17 12:40:20 +01:00
4aef7c5ac5 Fix tenant token validation when exp is null 2022-03-17 11:05:03 +01:00
968053649b Change the jsonwebtoken crate usage 2022-03-17 11:03:32 +01:00
ac48860bbb Upgrade the workspace dependencies 2022-03-17 11:03:31 +01:00
46e6d23dd2 Remove the zstd dependency comming from actix-web/http 2022-03-17 11:00:25 +01:00
55c9514c6b Reorder the Meilisearch features for more readability 2022-03-17 11:00:25 +01:00
86c1e83ea1 Remove three unused dependencies 2022-03-17 11:00:24 +01:00
bb9372114c Merge #2244
2244: chore(all): bump milli r=curquiza a=MarinPostma

continues the work initiated by `@psvnlsaikumar` in #2228

Co-authored-by: Sai Kumar <psvnlsaikumar@gmail.com>
2022-03-16 17:15:10 +00:00
b2bbd13c27 Merge #2243
2243: bug(http): fix panic on startup r=MarinPostma a=MarinPostma

this seems to fix #2242


I am not sure why this doesn't reproduce on v0.26.0, so we should remain vigilant.

`@curquiza` FYI

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-03-16 16:22:13 +00:00
22c61a1ecb chore(http): add test for validity of cli 2022-03-16 17:17:57 +01:00
e271395971 chore(all): bump milli
* updates to Use the milli's heed dependency #2210

* Update index.rs

* Update store.rs

* Update mod.rs

* cargo fmt
2022-03-16 16:34:44 +01:00
32843f30d9 bug(http): fix panic on startup 2022-03-16 13:33:37 +01:00
469aa8feab Merge #2238
2238: cargo: use resolver 2 r=MarinPostma a=happysalada

# Pull Request

## What does this PR do?
use resolver 2 from cargo.
This enables mainly to propagate the `--no-default-features` flag to workspace crates.
I mistakenly thought before that it was enough to have edition 2021 enabled. However, it turns out that for virtual workspaces, this needs to be explicitly defined.
https://doc.rust-lang.org/edition-guide/rust-2021/default-cargo-resolver.html

This will also change a little how your dependencies are compiled. See https://blog.rust-lang.org/2021/03/25/Rust-1.51.0.html#cargos-new-feature-resolver for more details.

Just to give a bit more context, this is for usage in NixOS. I have tried to do the upgrade today with the latest version, and the `--no-default-features` flag is just ignored.

Let me know if you need more details of course.


## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: happysalada <raphael@megzari.com>
2022-03-15 15:50:16 +00:00
748d5b69a5 cargo: use resolver 2 2022-03-14 23:18:59 -04:00
3b2fe3aec8 Merge #2232
2232: Bring `stable` into `main` (v0.26.0) r=Kerollmops a=curquiza

Bring `stable` into `main`

Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ad hoc <postma.marin@protonmail.com>
Co-authored-by: Rob Ede <robjtede@icloud.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-03-14 14:46:12 +00:00
d9eb1f7f00 Merge #2233
2233: Change CI name for publishing binaries r=Kerollmops a=curquiza

Minor change regarding the CI job names. Should not impact the usage.

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-03-14 14:17:05 +00:00
f5a72bb19a Change CI name for publishing binaries 2022-03-14 14:20:25 +01:00
35bf7ee538 fix test 2022-03-08 12:26:02 +08:00
c8895cab77 Update meilisearch-lib/src/document_formats.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-03-08 12:03:59 +08:00
b669a73432 Merge #2209
2209: rename auto batching cli r=curquiza a=MarinPostma

rename `--enable-autobatching` to `--enable-auto-batching`.

as per https://github.com/meilisearch/specifications/pull/96#issuecomment-1060693721

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-03-07 15:58:58 +00:00
833f7fbdbe Merge #2204
2204: Fix blocking auth r=Kerollmops a=MarinPostma

Fix auth blocking runtime

I have decided to remove async code from `meilisearch-auth` and let `meilisearch-http` handle that.

Because Actix polls the extractor futures concurrently, I have made a wrapper extractor that forces the errors from the futures to be returned sequentially (though it still polls them sequentially).
close #2201

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-03-07 15:39:24 +00:00
62ce8e0bda chore(http): rename auto batching cli option 2022-03-07 15:19:19 +01:00
ddd25bfe01 remove token from InvalidToken error 2022-03-07 15:16:07 +01:00
19da45c53b Update meilisearch-http/src/extractors/sequential_extractor.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-03-07 15:02:07 +01:00
0026410c61 review edits 2022-03-07 14:21:44 +01:00
b57c59baa4 sequential extractor 2022-03-04 20:43:12 +01:00
a356c8359c fix broken test 2022-03-04 15:39:18 +08:00
b138b92d39 show detailed error message 2022-03-04 15:31:11 +08:00
58e2903177 first try 2022-03-04 10:46:59 +08:00
af8a5f2c21 async auth 2022-03-02 19:25:51 +01:00
d6400aef27 remove async from meilsearch-authentication 2022-03-02 18:22:34 +01:00
81fe65afed Merge #2200
2200: Fix(dumps): Explicitly define serde for time r=ManyTheFish a=ManyTheFish

fix #2199 


Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-03-02 10:39:24 +00:00
c2b58720d1 Fix(dumps): Explicitly define serde for time 2022-03-02 11:37:48 +01:00
5515aa5045 Merge #2197
2197: Additions to 0.26 (Update actix-web dependency to 4.0) r=curquiza a=MarinPostma

- `@robjtede`
`@MarinPostma`
[update actix-web dependency to 4.0](3b2e467ca6)

From main to release-v0.26.0


Co-authored-by: Rob Ede <robjtede@icloud.com>
2022-02-28 18:09:43 +00:00
15150db957 clippy 2022-02-28 19:03:38 +01:00
3b2e467ca6 update actix-web dependency to 4.0 2022-02-28 19:03:37 +01:00
0c9e8cdf8d Merge #2194
2194: update actix-web dependency to 4.0 r=irevoire a=robjtede

# Pull Request

## What does this PR do?
Updates Actix Web ecosystem crates to 4.0 stable.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] ~~Does this PR fix an existing issue?~~
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?




Co-authored-by: Rob Ede <robjtede@icloud.com>
2022-02-28 15:23:42 +00:00
8d624b3800 clippy 2022-02-28 13:43:22 +00:00
4fbb83a34d bug(snapshot): Correctly open environments in snapshots 2022-02-28 12:37:30 +01:00
961e22493c update actix-web dependency to 4.0 2022-02-25 23:28:55 +00:00
09ee8e34a5 Merge #2192
2192: Fix max dbs error r=Kerollmops a=MarinPostma

Factor the way we open environments to make sure they are always opened with the same options.


The issue was that indexes were first opened in snapshots with incorrect options, and the heed cache returned an environment with incorrect open options on subsequent index opens.

fix #2190


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-02-23 16:18:23 +00:00
7e832105d7 bug(snapshot): Correctly open environments in snapshots 2022-02-23 17:12:08 +01:00
ff6a7b6007 Merge #2191
2191: fix(analytics): flatten the scheduler options r=curquiza a=irevoire

Implement missing part of [this spec](https://github.com/meilisearch/specifications/blob/develop/text/0034-telemetry-policies.md) by flattening the scheduler options.

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-02-22 16:19:07 +00:00
6312e7f1f3 fix(analytics): flatten the scheduler options 2022-02-22 15:55:50 +01:00
bfb375ac87 Merge #2189
2189: fix(all): fix two dates that were wrongly formatted r=irevoire a=irevoire

Fix #2188 and fix #2187 

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-02-22 10:48:45 +00:00
21d277a0ef fix(all): fix two dates that were wrongly formatted 2022-02-22 11:29:11 +01:00
c3e3c900f2 Merge #2173
2173: chore(all): replace chrono with time r=irevoire a=irevoire

Chrono has been unmaintained for a few months now, and there is a CVE on it.

Also I updated all the error messages related to the API key as you can see here: https://github.com/meilisearch/specifications/pull/114

fix #2172

Co-authored-by: Irevoire <tamo@meilisearch.com>
2022-02-17 14:12:23 +00:00
05c8d81e65 chore: get rid of chrono in favor of time
Chrono has been unmaintained for a few months now, and there is a CVE on it.

make clippy happy

bump milli
2022-02-16 18:14:29 +01:00
cd6276eef9 Merge #2174
2174: Update dashboard with v0.1.8 r=curquiza a=mdubus



Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2022-02-16 16:15:50 +00:00
7bcaa2fd13 Update dashboard to v0.1.9 2022-02-16 15:53:15 +01:00
67ecd7c147 Update dashboard with v0.1.8 2022-02-16 14:10:47 +01:00
ce6ff294cf Merge #2171
2171: Update LICENSE with Meili SAS r=curquiza a=curquiza

Checked with Thomas: we must put the real name of the company

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-02-15 18:29:03 +00:00
b318ab46cb Update LICENSE 2022-02-15 15:54:45 +01:00
5890600101 Merge #2170
2170: Update version (v0.26.0) r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-02-14 15:22:09 +00:00
e2a9414c7a Update version (v0.26.0) 2022-02-14 16:11:07 +01:00
216965e9d9 Merge #2122
2122: fix: docker image failed to boot on arm64 node r=curquiza a=Thearas

# Pull Request

## What does this PR do?
Fixes #2115.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?


Co-authored-by: Thearas <thearas850@gmail.com>
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-02-14 11:41:18 +00:00
d0ddbcc2b2 update 2022-02-14 18:08:57 +08:00
41db2601a6 Update Dockerfile 2022-02-14 10:44:03 +01:00
42cb94e1f4 fix: docker image failed to boot on arm64 node 2022-02-14 13:23:20 +08:00
6e7a0cc65d Merge #2157
2157: fix(auth): fix env being closed when dumping r=Kerollmops a=MarinPostma

When creating a dump, the auth store environment would be closed on drop, so subsequent dumps couldn't reopen the environment. I have added a flag in the environment to prevent the closing of the environment on drop when dumping.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-02-09 15:04:19 +00:00
23eba82038 fix(auth): fix env being closed when dumping 2022-02-09 13:56:26 +01:00
001b9acb63 Merge #2155
2155: Fix curl command in download-latest script r=Kerollmops a=curquiza

Fixes #2151 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-02-08 15:29:51 +00:00
af65ccfd6a Fix curl command in download-latest script 2022-02-08 16:28:26 +01:00
0c7251475d Merge #2150
2150: Bump milli to v0.22.1 r=curquiza a=curquiza

Fixes https://github.com/meilisearch/meilisearch/issues/2138 and https://github.com/meilisearch/meilisearch/issues/2123 by bumping milli to [v0.22.1](https://github.com/meilisearch/milli/releases/tag/v0.22.1)

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-02-08 15:05:06 +00:00
1a87b2f37d Bump milli to v0.22.1 2022-02-08 11:21:44 +01:00
752a0e13ad Merge #2136
2136: Refactoring CI regarding ARM binary publish r=curquiza a=curquiza

Fixes https://github.com/meilisearch/meilisearch/issues/1909

- Remove CI file to publish aarch64 binary and put the logic into `publish-binary.yml`
- Remove the job to publish armv8 binary
- Fix download-latest script accordingly
- Adapt dowload-latest with the specific case of the MacOS m1

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com>
2022-02-07 16:07:46 +00:00
ccaca33446 Add --fail-with-body flag to curl in script 2022-02-07 16:16:49 +01:00
2a90e805a2 Fix script 2022-02-07 16:05:48 +01:00
c4a2d70d19 Fix error handler for curl command in script 2022-02-07 16:01:50 +01:00
f7e4a0177d Update download-latest.sh
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-02-07 13:53:30 +01:00
cca65499de Merge #2145
2145: Update LICENSE r=meili-bot a=curquiza



Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-02-07 12:51:50 +00:00
80fa7dbbfa Update LICENSE 2022-02-05 18:29:47 +01:00
c24b1e5250 Merge #2135
2135: bug(auth): Make API keys accept Null descriptions r=curquiza a=ManyTheFish

Fix #2116


Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-02-03 15:26:11 +00:00
78cf8f1f9f Fix typo 2022-02-02 19:32:20 +01:00
1da7277817 Fix dowload-latest.sh according to the new name of the binary 2022-02-02 19:25:52 +01:00
c71c95feb0 Refactor CIs to publish aaarch64 binary 2022-02-02 19:25:28 +01:00
3bee31e6c7 bug(auth): Make API keys accept Null descriptions 2022-02-02 18:18:17 +01:00
9448ca58aa Merge #2005
2005: auto batching r=MarinPostma a=MarinPostma

This PR implements auto-batching. The basic idea is that all updates that can be batched together are batched together while the previous batch is being processed.

For now, the only updates that can be batched together are the document addition updates (both update and replace), for a single index.

The batching is disabled by default for multiple reasons:
- We need more experimentation with the scheduling techniques
- Right now, if one task fails in a batch, the whole batch fails. We need more permissive error handling when processing document indexation.

There are four CLI options, for now, to interact with how the batch is scheduled:
- `enable-autobatching`: enable the autobatching feature.
- `debounce-duration-sec`: When an update is received, wait that number of seconds before batching and performing the updates. Defaults to 0s.
- `max-batch-size`: the maximum number of tasks per batch, defaults to unlimited.
- `max-documents-per-batch`: the maximum number of documents in a batch, defaults to unlimited. The batch will always contain at least 1 task, no matter the number of documents in that task.

# Implementation

The current implementation is made of 3 major components:

## TaskStore
The `TaskStore` contains all the tasks. When a task is pushed, it is directly registered to the task store.

## Scheduler
The scheduler is in charge of making the batches. At its core, there is a `TaskQueue` and a job queue. `Job`s are always processed first. They are *volatile* tasks, that is, they don't have a TaskId and are not persisted to disk. Snapshots and dumps are examples of Jobs.

If no `Job` is available for processing, then the scheduler attempts to make a `Task` batch from the `TaskQueue`. The first step is to gather new tasks from the `TaskStore` to populate the `TaskQueue`. When this is done, we can prepare our batch. The `TaskQueue` is itself a `BinaryHeap` of `TaskList`. Each `index_uid` is associated with a `TaskList` that contains all the updates associated with that index uid. Each `TaskList` in the `TaskQueue` is ordered by the id of its first task.

When preparing a batch, the `TaskList` at the top of the `TaskQueue` is popped, and the tasks are popped from the list to make the next batch. If there are remaining tasks in the list, the list is inserted back in the `TaskQueue`.

## UpdateLoop
The `UpdateLoop`'s role is to perform batches sequentially. Each time updates are pushed to the update store, the scheduler is notified and will in turn notify the update loop that work can be performed. When notified, the update loop waits some time for more incoming updates, then asks the scheduler for the next batch and performs it. When it is done, the status of the tasks is put back into the store, and the next batch is processed.
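
A condensed sketch of that queue shape (type and field names are simplified stand-ins, not the actual scheduler code): each `TaskList` compares by the id of its first task, so the heap always pops the list holding the oldest pending task.

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::{BinaryHeap, VecDeque};

type TaskId = u64;

/// Simplified stand-in for the scheduler's per-index task list.
#[derive(Debug)]
struct TaskList {
    index_uid: String,
    tasks: VecDeque<TaskId>, // tasks for this index, in arrival order
}

impl PartialEq for TaskList {
    fn eq(&self, other: &Self) -> bool {
        self.tasks.front() == other.tasks.front()
    }
}
impl Eq for TaskList {}

impl Ord for TaskList {
    fn cmp(&self, other: &Self) -> Ordering {
        // A list is "smaller" when its first (oldest) task id is smaller.
        self.tasks.front().cmp(&other.tasks.front())
    }
}
impl PartialOrd for TaskList {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

fn main() {
    // `BinaryHeap` is a max-heap, so wrap the lists in `Reverse` to pop the oldest first.
    let mut task_queue: BinaryHeap<Reverse<TaskList>> = BinaryHeap::new();
    task_queue.push(Reverse(TaskList { index_uid: "movies".into(), tasks: VecDeque::from([3, 5]) }));
    task_queue.push(Reverse(TaskList { index_uid: "books".into(), tasks: VecDeque::from([1, 8]) }));

    let Reverse(next) = task_queue.pop().unwrap();
    assert_eq!(next.index_uid, "books"); // first task id 1 < 3
}
```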


Co-authored-by: mpostma <postma.marin@protonmail.com>
2022-02-02 11:04:30 +00:00
c9a236b0af feat(lib): auto-batching 2022-02-01 18:06:20 +01:00
622c15e825 Merge #2096
2096: feat(auth): Tenant token r=Kerollmops a=ManyTheFish

Make meilisearch support JWT authentication signed with meilisearch API keys
using HS256, HS384 or HS512 algorithms.

Related spec: [specifications#89](https://github.com/meilisearch/specifications/pull/89) [rendered](https://github.com/meilisearch/specifications/blob/scoped-api-keys/text/0089-tenant-tokens.md)
Fix #1991 
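
As a loose illustration of what signing such a token can look like with the `jsonwebtoken` crate (the claim names and search rules below are assumptions based on the spec, not a copy of the implementation):

```rust
use jsonwebtoken::{encode, Algorithm, EncodingKey, Header};
use serde_json::json;

fn main() -> Result<(), jsonwebtoken::errors::Error> {
    // Hypothetical tenant-token claims: search rules restricting the token
    // to one index, plus an expiration date, signed with an API key.
    let api_key = "replace-with-a-real-meilisearch-api-key";
    let claims = json!({
        "searchRules": { "patient_records": { "filter": "user_id = 1" } },
        "exp": 2_000_000_000u64 // unix timestamp
    });

    let token = encode(
        &Header::new(Algorithm::HS256),
        &claims,
        &EncodingKey::from_secret(api_key.as_bytes()),
    )?;
    println!("{token}");
    Ok(())
}
```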


Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-01-27 10:38:41 +00:00
054598734a Merge #2120
2120: Bring `stable` into `main` r=curquiza a=curquiza

I forgot to do it; tell me `@Kerollmops` or `@irevoire` if it's useful or not. I would say yes; otherwise I will have conflicts when I try to bring `main` into `stable` for the next release. Maybe I'm wrong

Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-01-27 09:35:21 +00:00
7ca647f0d0 feat(auth): Implement Tenant token
Make meilisearch support JWT authentication signed with meilisearch API keys
using HS256, HS384 or HS512 algorithms.

Related spec: https://github.com/meilisearch/specifications/pull/89
Fix #1991
2022-01-27 08:25:39 +01:00
aa50fcb1f0 Merge branch 'main' into stable 2022-01-26 20:17:41 +01:00
b408de0761 Merge pull request #2117 from meilisearch/rebranding
Changes related to the rebranding
2022-01-26 19:58:54 +01:00
72d9c5ee5c fix(rebranding): Update the ascii art (#2118) 2022-01-26 18:53:07 +01:00
2b7440d4b5 Fix some typo 2022-01-26 17:56:18 +01:00
3bc6a18bcd Update README.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-01-26 17:54:51 +01:00
db56d6cb11 Update download-latest.sh 2022-01-26 17:54:22 +01:00
a5759139bf Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-01-26 17:51:38 +01:00
8a959da120 Update MeiliSearch into Meilisearch everywhere 2022-01-26 17:43:16 +01:00
0a78750465 Replace meilisearch by Meilisearch 2022-01-26 17:35:56 +01:00
372f4fc924 Replace logo 2022-01-26 17:34:31 +01:00
ae5b401e74 Update README.md 2022-01-26 16:31:04 +01:00
c562655be7 Update CONTRIBUTING.md 2022-01-26 16:31:03 +01:00
c8bb54cd94 Merge #2098
2098: feat(dump): Provide the same cli options as the snapshots r=MarinPostma a=irevoire

Add two cli options for the dump:
- `--ignore-missing-dump`
- `--ignore-dump-if-db-exists`

Fix #2087

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-26 14:32:23 +00:00
8c80326dd5 Merge #2086
2086: feat(analytics): send the whole set of cli options instead of only the snapshot r=MarinPostma a=irevoire

Fixes #2088 

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-26 13:53:20 +00:00
bad4bed439 feat(dump): Provide the same cli options as the snapshots
Add two cli options for the dump:
- `--ignore-missing-dump`
- `--ignore-dump-if-db-exists`

Fix #2087
2022-01-26 14:34:06 +01:00
7828da15c3 feat(analytics): send the whole set of cli options instead of only the snapshot 2022-01-26 13:52:41 +01:00
7e2f6063ae Merge #2099 #2108
2099: feat(analytics): Set the timestamp of the aggregated event as the first aggregate r=MarinPostma a=irevoire



2108: meta(auth): Enhance tests on authorization r=MarinPostma a=ManyTheFish

Enhance auth tests in order to be able to add new actions without changing tests.

Helping #2080 

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-01-24 15:13:01 +00:00
2b766a2f26 meta(auth): Enhance tests on authorization
Enhance auth tests in order to be able to add new action without changing tests
2022-01-24 15:35:39 +01:00
8ae504bfb0 Merge #2101
2101: chore(all): update actix-web dependency to 4.0.0-beta.21 r=MarinPostma a=robjtede

# Pull Request

## What does this PR do?
I don't expect any more breaking changes to Actix Web that will affect Meilisearch, so bump to the latest beta.

Fixes #N/A?
<!-- Please link the issue you're trying to fix with this PR, if none then please create an issue first. -->

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to MeiliSearch!


Co-authored-by: Rob Ede <robjtede@icloud.com>
2022-01-24 14:33:46 +00:00
5981e6c57c Merge #2095
2095: feat(error): Update the error message when you have no version file r=MarinPostma a=irevoire

Following this [issue](https://github.com/meilisearch/meilisearch-kubernetes/issues/95) we decided to change the error message from:
```
Version file is missing or the previous MeiliSearch engine version was below 0.24.0. Use a dump to update MeiliSearch.
```
to
```
Version file is missing or the previous MeiliSearch engine version was below 0.25.0. Use a dump to update MeiliSearch.
```

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-24 14:09:13 +00:00
1be3a1e945 Merge #2075
2075: Allow payloads with no documents  r=irevoire a=MarinPostma

Accept additions with 0 documents.

0-byte payloads are still refused, since they are not valid JSON/JSON Lines/CSV anyway...

close #1987


Co-authored-by: mpostma <postma.marin@protonmail.com>
2022-01-24 12:55:29 +00:00
629b897845 feat(error): Update the error message when you have no version file 2022-01-24 13:44:00 +01:00
9f5fee404b chore(all): update actix-web dependency to 4.0.0-beta.21 2022-01-21 20:44:17 +00:00
40bf98711c feat(analytics): Set the timestamp of the aggregated event as the first aggregate 2022-01-20 19:08:57 +01:00
f9f075bca2 Merge #2068
2068: chore(http): migrate from structopt to clap3 r=Kerollmops a=MarinPostma

migrate from structopt to clap3

This fixes the long-lasting issue of flags requiring a value, such as `--no-analytics` or `--schedule-snapshot`.

All flag arguments now take NO argument, i.e:
`meilisearch --schedule-snapshot true` becomes `meilisearch --schedule-snapshot`

as per https://docs.rs/clap/latest/clap/struct.Arg.html#method.env, the env variable is defined as:
> A false literal is n, no, f, false, off or 0. An absent environment variable will also be considered as false. Anything else will considered as true.
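
Concretely, with the clap 3 derive API (assuming the `derive` and `env` cargo features), a boolean flag can be declared roughly like the simplified sketch below; this is not the actual `Opt` struct:

```rust
use clap::Parser;

/// Simplified stand-in for the real options struct.
#[derive(Parser, Debug)]
struct Opt {
    /// Presence of the flag enables it; no value is accepted on the command line.
    #[clap(long, env = "MEILI_NO_ANALYTICS")]
    no_analytics: bool,

    #[clap(long, env = "MEILI_SCHEDULE_SNAPSHOT")]
    schedule_snapshot: bool,
}

fn main() {
    let opt = Opt::parse();
    println!("{opt:?}");
}
```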

`@gmourier` 
`@curquiza` 
`@meilisearch/docs-team` 

Co-authored-by: mpostma <postma.marin@protonmail.com>
2022-01-20 10:59:44 +00:00
0c1a3d59eb fix no-analytics 2022-01-20 11:50:24 +01:00
523fb5cd56 Merge #2084
2084: bump milli r=Kerollmops a=irevoire

- Fix https://github.com/meilisearch/MeiliSearch/issues/2082 by updating milli dependency
- Fix Clippy error
- Change the MeiliSearch version in the cargo.toml to anticipate the coming release (v0.25.2)

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-18 14:30:37 +00:00
436f61a7f4 chore: bump meilisearch 2022-01-18 12:27:15 +01:00
3fab5869fa chore: bump milli 2022-01-18 11:50:17 +01:00
0515c6e844 bug(http): fix task duration 2022-01-13 16:41:07 +01:00
38176181ac fix(dump): Fix the import of dump from the v24 and before 2022-01-13 16:40:58 +01:00
a7e634bd4f Merge #2074
2074: fix(dump): Fix the import of dumps when there is no data.ms r=irevoire a=irevoire



Co-authored-by: Irevoire <tamo@meilisearch.com>
2022-01-13 13:47:03 +00:00
78a381a30b Merge #2076
2076: fix(dump): Fix the import of dump from the v24 and before r=ManyTheFish a=irevoire

Same as https://github.com/meilisearch/MeiliSearch/pull/2073 but on main this time

Co-authored-by: Irevoire <tamo@meilisearch.com>
2022-01-13 13:09:23 +00:00
343bce6a29 fix(dump): Fix the import of dump from the v24 and before 2022-01-13 13:23:57 +01:00
d263f762bf feat(http): accept empty document additions
wip
2022-01-13 12:46:56 +01:00
dfaeb19566 fix(dump): Fix the import of dumps when there is no data.ms 2022-01-13 12:30:58 +01:00
010dcc3e80 Merge #2066
2066: bug(http): fix task duration r=MarinPostma a=MarinPostma

`@gmourier` found that the duration in the task view was not computed correctly; this PR fixes it.

`@curquiza,` I let you decide if we need to make a hotfix out of this or wait for the next release. This is not breaking.


Co-authored-by: mpostma <postma.marin@protonmail.com>
2022-01-12 14:50:58 +00:00
d0aa5f747c Merge #2067
2067: chore(all): fix rust edition r=irevoire a=MarinPostma

I hadn't correctly set the Rust edition in my previous PR, and cargo was returning a warning. This time I followed this guide: https://doc.rust-lang.org/edition-guide/editions/transitioning-an-existing-project-to-a-new-edition.html


Co-authored-by: mpostma <postma.marin@protonmail.com>
2022-01-12 13:32:42 +00:00
f6d53e03f1 chore(http): migrate from structopt to clap3 2022-01-12 14:07:19 +01:00
3ecebd15ee chore(all): fix rust edition 2022-01-12 11:14:50 +01:00
db83e39a7f bug(http): fix task duration 2022-01-11 18:01:25 +01:00
5d48f72ade Merge #2065
2065: MeiliSearch v0.25.0: `stable` -> `main` r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Marin Postma <postma.marin@protonmail.com>
Co-authored-by: Maxime Legendre <maximelegendre@MacBook-Pro-de-Maxime.local>
Co-authored-by: Maxime Legendre <maximelegendre@mbp-de-maxime.home>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-01-11 16:30:22 +00:00
1818026a84 Merge #2057
2057: fix(dump): Uncompress the dump IN the data.ms  r=irevoire a=irevoire

When loading a dump with Docker, we had two problems.
After creating a temp directory, uncompressing, and re-indexing the dump:
1. We try to `move` the new “data.ms” onto the currently present
   one. The problem is that the `data.ms` is often a mount point, because
   that's what people usually do with Docker. We can't override
   a mount point, and thus we were throwing an error.
2. The tempdir is created in `/tmp`, which is usually quite small AND may not
   be on the same partition as the `data.ms`. This means when we tried to move
   the dump over the `data.ms`, it was also failing because we can't move data
   between two partitions.
------------------
1 was fixed by deleting the *content* of the `data.ms` and moving the *content*
of the tempdir *inside* the `data.ms`. If someone tries to create volumes inside
the `data.ms`, that's their problem, not ours.
2 was fixed by creating the tempdir *inside* the `data.ms`. If a user mounted
their `data.ms` on a large partition, there is no reason they could not load a big
dump just because their `/tmp` was too small. This solves the issue; now the dump is
extracted and indexed on the same partition the `data.ms` will lie on.

fix #1833
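
A tiny sketch of fix 2 with the `tempfile` crate (paths are illustrative): creating the temporary extraction directory inside the database directory keeps everything on one partition, so the final move never crosses filesystems.

```rust
use std::path::Path;

fn main() -> std::io::Result<()> {
    let db_path = Path::new("./data.ms");
    std::fs::create_dir_all(db_path)?;

    // Create the temporary directory *inside* the database directory,
    // so extraction happens on the same partition as the final data.
    let tmp = tempfile::tempdir_in(db_path)?;
    println!("extracting the dump into {:?}", tmp.path());

    // ... uncompress and re-index the dump in `tmp`, then move its *content*
    // into `db_path` instead of replacing `db_path` itself (which may be a mount point).
    Ok(())
}
```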

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-10 17:57:16 +00:00
0ad7d38eec Merge #2061
2061: Update dashboard for v0.25.0 r=curquiza a=mdubus



Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2022-01-10 16:29:31 +00:00
b17ad5c2be Update with latest release of the dashboard 2022-01-10 17:10:09 +01:00
1824b3c07b Merge #2060
2060: chore(all) set rust edition to 2021 r=MarinPostma a=MarinPostma

set the rust edition for the project to 2021

this make the MSRV to v1.56

#2058


Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2022-01-10 15:04:14 +00:00
c9c7da3626 fix(dump): Uncompress the dump IN the data.ms
When loading a dump with Docker, we had two problems.
After creating a temp directory, uncompressing, and re-indexing the dump:
1. We try to `move` the new “data.ms” onto the currently present
   one. The problem is that the `data.ms` is often a mount point, because
   that's what people usually do with Docker. We can't override
   a mount point, and thus we were throwing an error.
2. The tempdir is created in `/tmp`, which is usually quite small AND may not
   be on the same partition as the `data.ms`. This means when we tried to move
   the dump over the `data.ms`, it was also failing because we can't move data
   between two partitions.
==============
1 was fixed by deleting the *content* of the `data.ms` and moving the *content*
of the tempdir *inside* the `data.ms`. If someone tries to create volumes inside
the `data.ms`, that's their problem, not ours.
2 was fixed by creating the tempdir *inside* the `data.ms`. If a user mounted
their `data.ms` on a large partition, there is no reason they could not load a big
dump just because their `/tmp` was too small. This solves the issue; now the dump is
extracted and indexed on the same partition the `data.ms` will lie on.

fix #1833
2022-01-10 14:56:03 +01:00
030a90523d Update dashboard for v0.25.0 2022-01-10 10:50:57 +01:00
56d223a51d Merge #2059
2059: change indexed doc count on error r=irevoire a=MarinPostma

change `indexed_documents` and `deleted_documents` to return 0 instead of null when they are empty because the task has failed.

close #2053


Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2022-01-06 15:55:50 +00:00
f558ff826a feat(http): task view indexed and deleted documents return 0 instead of null 2022-01-06 14:55:02 +01:00
5fb4ed60e7 chore(all) set rust edition to 2021 2022-01-06 13:30:45 +01:00
0d2a358cc2 Merge #2056
2056: Allow any header for CORS r=curquiza a=curquiza

Bug fix: a CORS error was triggered when trying to send the `User-Agent` header via the browser

`@bidoubiwa` thanks for the bug report!

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-01-05 16:45:51 +00:00
595250c93e Allow any header for CORS 2022-01-05 15:38:47 +01:00
c636988935 Merge #2055
2055: fix(dump): Fix the loading of dump with empty indexes r=irevoire a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
2022-01-05 14:31:53 +00:00
eea483c470 fix(dump): Fix the loading of dump with empty indexes 2022-01-05 15:08:21 +01:00
d53c61a6d0 Merge #2054
2054: Bug(auth): Wrap key list in results r=irevoire a=ManyTheFish

fix #2052

Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-01-04 15:44:55 +00:00
c0d4f71a34 Bug(auth): Wrap key list in results 2022-01-04 14:10:30 +01:00
f56989e46e Merge #2011
2011: bug(lib): drop env on last use r=curquiza a=MarinPostma

fixes the `too many open files` error when running tests by closing the
environment on last drop

To check that we are actually the last owner of the `env` we plan to drop, I have wrapped all envs in `Arc` and checked that we have the last reference to it.


Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2022-01-04 11:03:08 +00:00
c0251eb680 Merge #2050
2050: Bug(CORS): Add missing allowed headers r=curquiza a=ManyTheFish

fix #2040

## test
html file to test:

```html
<!DOCTYPE html>
<html>
<meta content="text/html;charset=utf-8" http-equiv="Content-Type">
<meta content="utf-8" http-equiv="encoding">

<script>
var xmlHttp = new XMLHttpRequest();
    xmlHttp.open( "GET", "http://127.0.0.1:7700/indexes/toto", false ); // false for synchronous request
    xmlHttp.setRequestHeader("Authorization", "Bearer manythefish");
    xmlHttp.send( null );
    console.log(xmlHttp.responseText);
</script>
</html>
```


Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-01-03 14:23:34 +00:00
450b81ca13 Bug(CORS): Add missing allowed headers
fix #2040
2022-01-03 13:41:12 +01:00
2f3faadcbf Merge #2034
2034: Fix typo r=curquiza a=curquiza

Fix `Meilisearch` typo into `MeiliSearch`

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-01-03 09:40:56 +00:00
5986a2d126 Merge #2036
2036: chore(ci): Enable rust_backtrace in the ci r=curquiza a=irevoire

This should help us understand the unreproducible panics that happen in the CI all the time

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-12-22 19:00:08 +00:00
d75e84f625 chore(ci): Enable rust_backtrace in the ci 2021-12-22 18:20:44 +01:00
c221277fd2 Merge #2035
2035: Use self hosted GitHub runner r=curquiza a=curquiza

Checked with `@tpayet`: we have created a self-hosted GitHub runner to save time when pushing the Docker images.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-12-22 15:28:33 +00:00
3b30fadb55 Merge #2037
2037: test: Ignore the auths tests on windows r=irevoire a=irevoire

Since the auth tests fail sporadically on the Windows CI but we can't reproduce these failures on a real Windows machine, we are going to ignore them.

But we still ensure they compile.

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-12-22 15:13:32 +00:00
d7df4d6b84 test: Ignore the auths tests on windows
Since the auth tests fail sporadically on the Windows CI but we can't
reproduce these failures on a real Windows machine, we are going to
ignore them.
But we still ensure they compile.
2021-12-22 12:39:48 +01:00
fd854035c1 Use self hosted github runner 2021-12-21 18:32:29 +01:00
4d1c138842 Merge #2032
2032: Revert docker as non root PR r=curquiza a=ManyTheFish

Revert #1759

hotfix for #1969

Co-authored-by: Maxime Legendre <maximelegendre@mbp-de-maxime.home>
2021-12-21 16:19:36 +00:00
7649239b08 Revert docker as non root PR 2021-12-21 16:59:15 +01:00
0e2f6ba1b6 Merge #2033
2033: Bug(FS): Consider empty pre-created directory as unexisting DB r=curquiza a=ManyTheFish

When the database directory was pre-created, we considered the DB invalid; we now accept creating a database in it.


Co-authored-by: Maxime Legendre <maximelegendre@mbp-de-maxime.home>
2021-12-21 15:12:07 +00:00
f529c46598 Fix typo in error messages and comments 2021-12-21 16:01:38 +01:00
1ba49d2ddb Bug(FS): Consider empty pre-created directory as unexisting DB 2021-12-21 15:30:11 +01:00
1b5ca88231 Merge #2026
2026: Bug(auth): Parse YMD date r=curquiza a=ManyTheFish

Use NaiveDate to parse YMD date instead of NaiveDatetime

fix #2017
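
In other words, a date-only expiry such as `2021-12-31` is parsed with chrono's `NaiveDate` and then widened into a datetime, roughly like this sketch (not the actual parsing code):

```rust
use chrono::{NaiveDate, NaiveDateTime};

/// Parse either a full datetime or a plain YMD date, falling back from one to the other.
fn parse_expiry(value: &str) -> Option<NaiveDateTime> {
    NaiveDateTime::parse_from_str(value, "%Y-%m-%dT%H:%M:%S")
        .ok()
        .or_else(|| {
            NaiveDate::parse_from_str(value, "%Y-%m-%d")
                .ok()
                // Widen the date to a datetime at midnight.
                .map(|date| date.and_hms(0, 0, 0))
        })
}

fn main() {
    assert!(parse_expiry("2021-12-31").is_some());
    assert!(parse_expiry("2021-12-31T08:30:00").is_some());
    assert!(parse_expiry("not a date").is_none());
}
```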


Co-authored-by: Maxime Legendre <maximelegendre@mbp-de-maxime.home>
2021-12-21 13:48:21 +00:00
37329e0784 Bug(auth): Parse YMD date
Use NaiveDate to parse YMD date instead of NaiveDatetime

fix #2017
2021-12-20 15:30:11 +01:00
eaff393c76 Merge #2025
2025: Fix security index creation r=ManyTheFish a=ManyTheFish

Forbid index creation on alternate routes when the action `index.create` is not given

fix #2024

Co-authored-by: Maxime Legendre <maximelegendre@MacBook-Pro-de-Maxime.local>
2021-12-20 14:04:28 +00:00
a845cd8880 Fix(auth): Forbid index creation on alternates routes
Forbid index creation on alternate routes when the action `index.create` is not given

fix #2024
2021-12-20 14:48:18 +01:00
b28a465304 bug(lib): drop env on last use
fixes the `too many open files` error when running tests by closing the
environment on last drop
2021-12-16 10:57:55 +01:00
845d3114ea Merge #2008
2008: bug(lib): fix get dumps bad error code r=curquiza a=MarinPostma

fix bad error code being returned when getting a dump status, and add a test
close #1994

Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-12-15 18:58:17 +00:00
287fa7ca74 Merge #2006 #2007
2006: chore(http): rename task types r=curquiza a=MarinPostma

Rename
- documentsAddition into documentAddition
- documentsPartial into documentPartial
- documentsDeletion into documentDeletion

close #1999


2007: bug(lib): ignore primary if already set on document addition r=curquiza a=MarinPostma

Ignore the primary key if it is already set on document updates. Add a test to verify the behaviour.

close #2002


Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-12-15 16:55:40 +00:00
80ed9654e1 chore(http): rename task types 2021-12-15 17:01:34 +01:00
7ddab7ef31 bug(lib): fix get dumps bad error code 2021-12-15 16:58:05 +01:00
d534a7f7c8 bug(lib): ignore primary if already set on document addition 2021-12-15 14:58:37 +01:00
5af51c852c Merge #1989
1989: Extend API keys r=curquiza a=ManyTheFish

# Pull Request

## What does this PR do?

- Add API keys in snapshots
- Add API keys in dumps
- fix QA #1979

fix #1979
fix #1995
fix #2001
fix #2003

related to #1890

Co-authored-by: many <maxime@meilisearch.com>
2021-12-14 17:22:58 +00:00
ee7970f603 feat(auth): Extend API keys
- Add API keys in snapshots
- Add API keys in dumps
- Rename action indexes.add to indexes.create
- fix QA #1979

fix #1979
fix #1995
fix #2001
fix #2003
related to #1890
2021-12-14 17:33:39 +01:00
5453877ca7 Merge #1982
1982: Set fail-fast to false in publish-binaries CI r=curquiza a=curquiza

This prevents the other jobs from being cancelled if one of them fails.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-12-14 15:37:26 +00:00
ea0a5271f7 Merge #1908
1908: Update CONTRIBUTING.md r=curquiza a=ferdi05

Added a sentence on other means to contribute. If we like it, we can add it in some other places.

# Pull Request

## What does this PR do?
Improves `CONTRIBUTING`

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to MeiliSearch!


Co-authored-by: Ferdinand Boas <ferdinand.boas@gmail.com>
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2021-12-09 14:02:48 +00:00
879cc4ec26 Merge #1984
1984: Support boolean for the no-analytics flag r=Kerollmops a=Kerollmops

This PR fixes an issue with the `no-analytics` flag that was ignoring the value passed to it: a `no-analytics false` was understood as just `no-analytics` and was effectively disabling the analytics instead of enabling them. I found [a closed issue about this exact behavior on the structopt repository](https://github.com/TeXitoi/structopt/issues/468) and applied it here.

I don't think we should update the documentation as it must have worked like this from the start of this project. I tested it on my machine and it is working great now. Thank you `@nicolasvienot` for this issue report.

Fixes #1983.

Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-12-08 16:41:46 +00:00
6ac2475aba Fix the no-analytics flag in the tests 2021-12-08 12:02:18 +01:00
47d5f659e0 Bump the structopt crate to 0.3.25 2021-12-08 11:24:40 +01:00
8c9e51e94f Make sure that we can also specify the no-analytics flags with a boolean 2021-12-08 11:23:21 +01:00
0da5aca9f6 Set fail-fast to false in publish-binaries CI 2021-12-08 09:54:13 +01:00
9906db9e64 Merge #1978
1978: Fix of `release-v0.25.0` branch into `main` r=curquiza a=curquiza

The fixes in #1976 should be on main to be taken into account by 
```
curl -L https://install.meilisearch.com | sh
```

Co-authored-by: Yann Prono <yann.prono@nist.gov>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
2021-12-07 17:13:38 +00:00
8096b568f0 Merge #1976 #1977
1976: Fix download-latest.sh r=curquiza a=Mcdostone

# Pull Request

## What does this PR do?
Fixes #1975

The script was broken because `grep` matches the word **draft** in the changelog of [v0.25.0rc0](https://github.com/meilisearch/MeiliSearch/releases/tag/v0.25.0rc0)

> Misc
>    Remove email address from the launch message (#1896) `@curquiza`
>    Remove release drafter workflow (#1882) `@curquiza`                           ⚠️👀


## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to MeiliSearch!
> Your product is awesome!


1977: Fix Dockerfile r=MarinPostma a=curquiza

Remove this error

<img width="830" alt="Capture d’écran 2021-12-07 à 17 00 46" src="https://user-images.githubusercontent.com/20380692/145063294-51ae2c50-2468-47e9-a891-542d824cad8e.png">


Co-authored-by: Yann Prono <yann.prono@nist.gov>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-12-07 16:27:51 +00:00
2934a77832 Fix Dockerfile 2021-12-07 17:00:24 +01:00
cf6cb938a6 update download-latest.sh 2021-12-07 16:37:22 +01:00
8ff6b1b540 update download-latest.sh 2021-12-07 16:23:30 +01:00
a938a9ab0f Merge #1974
1974: Update version for the next release (v0.25.0) r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-12-07 13:11:24 +00:00
ae73386723 Update version for the next release (v0.25.0) 2021-12-07 14:00:43 +01:00
34c8a859eb Merge pull request #1973 from meilisearch/drop-dump-v1
feat(dumps): drop dump V1 support
2021-12-07 14:00:06 +01:00
80d039042b Update CONTRIBUTING.md 2021-12-07 11:58:44 +01:00
5606e22d97 Update CONTRIBUTING.md
typo
2021-12-07 11:46:08 +01:00
23e35fa526 feat(dumps): drop dump V1 support 2021-12-07 10:36:27 +01:00
82033f935e Merge #1970
1970: Use milli reexported tokenizer  r=curquiza a=ManyTheFish

Use milli reexported tokenizer instead of importing meilisearch-tokenizer dependency.

fix #1888

Co-authored-by: many <maxime@meilisearch.com>
2021-12-07 08:50:44 +00:00
ae2b0e7aa7 Use milli reexported tokenizer instead of importing meilisearch-tokenizer dependency 2021-12-06 17:18:28 +01:00
948615537b Merge #1965
1965: Reintroduce engine version file r=MarinPostma a=irevoire

Right now if you boot up MeiliSearch and point it to a DB directory created with a previous version of MeiliSearch the existing indexes will be deleted. This [used to be](51d7c84e73) prevented by a startup check which would compare the current engine version vs what was stored in the DB directory's version file, but this functionality seems to have been lost after a few refactorings of the code.

In order to go back to the old behavior we'll need to reintroduce the `VERSION` file that used to be present; I considered reusing the `metadata.json` file used in the dumps feature, but this seemed like the simpler approach. As the intent is just to restore functionality, the implementation is quite basic. I imagine that in the future we could build on this and do things like compatibility across major/minor versions and even migrating between formats.

This PR was made thanks to `@mbStavola` and is basically a port of his PR #1860 after a big refacto of the code #1796.

Closes #1840

Co-authored-by: Matt Stavola <m.freitas@offensive-security.com>
2021-12-06 13:39:37 +00:00
a0e129304c feat(lib): Reintroduce engine version file
Right now if you boot up MeiliSearch and point it to a DB directory created with a previous version of MeiliSearch the existing indexes will be deleted. This used to be prevented by a startup check which would compare the current engine version vs what was stored in the DB directory's version file, but this functionality seems to have been lost after a few refactorings of the code.
In order to go back to the old behavior we'll need to reintroduce the VERSION file that used to be present; I considered reusing the metadata.json file used in the dumps feature, but this seemed like the simpler approach. As the intent is just to restore functionality, the implementation is quite basic. I imagine that in the future we could build on this and do things like compatibility across major/minor versions and even migrating between formats.

This PR was made thanks to @mbStavola

Closes #1840
2021-12-06 14:30:56 +01:00
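A minimal sketch of such a startup version check, assuming a plain-text `VERSION` file stored next to the database; the function name and error handling are illustrative, not the actual implementation:
```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

fn check_version_file(db_path: &Path) -> anyhow::Result<()> {
    let version_path = db_path.join("VERSION");
    let current = env!("CARGO_PKG_VERSION");
    match fs::read_to_string(&version_path) {
        // The DB was created by the same engine version: nothing to do.
        Ok(stored) if stored.trim() == current => Ok(()),
        // Version mismatch: refuse to open instead of silently deleting indexes.
        // (A real check could compare only the major/minor components.)
        Ok(stored) => anyhow::bail!(
            "database was created with Meilisearch v{} but this binary is v{}",
            stored.trim(),
            current
        ),
        // No VERSION file yet: record the current engine version.
        Err(e) if e.kind() == ErrorKind::NotFound => {
            fs::write(&version_path, current)?;
            Ok(())
        }
        Err(e) => Err(e.into()),
    }
}
```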
8d72d538de Merge #1890
1890: Api keys r=MarinPostma a=ManyTheFish

# Pull Request

## API keys management and authorizations
Fixes #1867

## PR checklist

- [x] Test `/keys` routes
- [x] Implement `/keys` routes
- [x] Test API key authorization
- [x] Implement API key authorization
- [x] Rename Authentication header
- [x] default key creation

## Postponed in another PR
- dumps  (waiting for #1796)
- snapshot  (waiting for #1796)




Co-authored-by: many <maxime@meilisearch.com>
2021-12-06 13:23:23 +00:00
ffefd0caf2 feat(auth): API keys
implements:
https://github.com/meilisearch/specifications/blob/develop/text/0085-api-keys.md

- Add tests on API keys management route (meilisearch-http/tests/auth/api_keys.rs)
- Add tests checking authorizations on each meilisearch routes (meilisearch-http/tests/auth/authorization.rs)
- Implement API keys management routes (meilisearch-http/src/routes/api_key.rs)
- Create module to manage API keys and authorizations (meilisearch-auth)
- Reimplement GuardedData to extend authorizations (meilisearch-http/src/extractors/authentication/mod.rs)
- Replace X-MEILI-API-KEY with Authorization Bearer (meilisearch-http/src/extractors/authentication/mod.rs)
- Change meilisearch routes to fit the new authorization feature (meilisearch-http/src/routes/)

- close #1867
2021-12-06 09:52:41 +01:00
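A hypothetical sketch of the header change listed above, assuming actix-web: the key is read from a standard `Authorization: Bearer <key>` header instead of a custom `X-MEILI-API-KEY` header. Meilisearch's real extractor (`GuardedData`) does much more than this:
```rust
use actix_web::HttpRequest;

fn extract_api_key(req: &HttpRequest) -> Option<&str> {
    req.headers()
        .get("Authorization")?   // e.g. "Bearer s3cr3t-master-key"
        .to_str()
        .ok()?
        .strip_prefix("Bearer ")
}
```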
fa196986c2 Merge #1796
1796: Feature branch: Task store r=irevoire a=MarinPostma

# Feature branch: Task Store

## Spec todo
https://github.com/meilisearch/specifications/blob/develop/text/0060-refashion-updates-apis.md

- [x] The update resource is renamed task. The names of existing API routes are also changed to reflect this change.
- [x]    Tasks are now also accessible as an independent resource of an index. GET - /tasks; GET - /tasks/:taskUid
- [x] The task uid is not incremented by index anymore. The sequence is generated globally.
- [x] A task_not_found error is introduced.
- [x] The format of the task object is updated.
- [x] updateId becomes uid.
- [x] Attributes of an error appearing in a failed task are now contained in a dedicated error object.
- [x] type is no longer an object. It now becomes a string containing the values of its name field previously defined in the type object.
- [x] The possible values for the type field are reworked to be more clear and consistent with our naming rules.
- [x] A details object is added to contain specific information related to a task payload that was previously displayed in the type nested object. Previous number key is renamed numberOfDocuments.
- [x] An indexUid field is added to give information about the related index on which the task is performed.
- [x] duration format has been updated to express an ISO 8601 duration.
- [x] processed status changes to succeeded.
- [x] startedProcessingAt is updated to startedAt.
- [x] processedAt is updated to finishedAt.
- [x] 202 Accepted requests previously returning an updateId are now returning a summarized task object.
- [x] MEILI_MAX_UDB_SIZE env var is updated to MEILI_MAX_TASK_DB_SIZE.
- [x] --max-udb-size cli option is updated to --max-task-db-size.
- [x] task object lists are now returned under a results array.
- [x] Each operation on an index (creation, update, deletion) is now asynchronous and represented by a task.

## Todo tech

- [x] Restore Snapshots
- [x] Restore dumps of documents
- [x] Implements the dump of updates
- [x] Error  handling
- [x] Fix stats
- [x] Restore the Analytics
- [x] [Add the new analytics](https://github.com/meilisearch/specifications/pull/92/files)
- [x] Fix tests
- [x] ~Deleting tasks when index is deleted (see below)~ see #1891 instead
- [x] Improve details for documents addition and deletion tasks
- [ ] Add integration test
- [ ] Test task store filtering
- [x] Rename `UuidStore` to `IndexMetaStore`, and simplify the trait.
- [x] Fix task store initialization: fill pending queue from hard state
- [x] Synchronously return error when creating an index with an invalid index_uid and add test
- [x] Task should be returned in decreasing uid + tests (on index task route)
- [x] Summarized task view
- [x] fix snapshot permissions


## Implementation

### Linked PRs
- #1889
- #1891
- #1892
- #1902
- #1906
- #1911
- #1914
- #1915
- #1916
- #1918
- #1924
- #1925
- #1926
- #1930
- #1936
- #1937
- #1942
- #1944
- #1945
- #1946
- #1947
- #1950
- #1951
- #1957
- #1959
- #1960
- #1961
- #1962
- #1964

### Linked PRs in milli:
- https://github.com/meilisearch/milli/pull/414
- https://github.com/meilisearch/milli/pull/409
- https://github.com/meilisearch/milli/pull/406
- https://github.com/meilisearch/milli/pull/418

### Issues
- close #1687
- close #1786
- close #1940
- close #1948
- close #1949
- close #1932
- close #1956

### Spec patches
- https://github.com/meilisearch/specifications/pull/90


Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-12-03 11:36:53 +00:00
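Putting the renamed fields from the spec checklist together, a hypothetical serde view of the new task object could look like the sketch below; the field names follow the spec items above, while the concrete Rust types in Meilisearch differ:
```rust
use serde::Serialize;

#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct TaskView {
    uid: u64,                           // was `updateId`, now sequenced globally
    index_uid: Option<String>,          // the index the task is performed on
    status: String,                     // e.g. "succeeded" (was "processed")
    #[serde(rename = "type")]
    kind: String,                       // plain string instead of a nested object
    details: Option<serde_json::Value>, // payload-specific info, e.g. numberOfDocuments
    error: Option<serde_json::Value>,   // dedicated error object for failed tasks
    duration: Option<String>,           // ISO 8601 duration
    started_at: Option<String>,         // was `startedProcessingAt`
    finished_at: Option<String>,        // was `processedAt`
}
```
Lists of such tasks are returned under a `results` array, as noted in the checklist.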
a30e02c18c feat(all): Task store
implements:
https://github.com/meilisearch/specifications/blob/develop/text/0060-refashion-updates-apis.md

linked PR:

- #1889
- #1891
- #1892
- #1902
- #1906
- #1911
- #1914
- #1915
- #1916
- #1918
- #1924
- #1925
- #1926
- #1930
- #1936
- #1937
- #1942
- #1944
- #1945
- #1946
- #1947
- #1950
- #1951
- #1957
- #1959
- #1960
- #1961
- #1962
- #1964

- https://github.com/meilisearch/milli/pull/414
- https://github.com/meilisearch/milli/pull/409
- https://github.com/meilisearch/milli/pull/406
- https://github.com/meilisearch/milli/pull/418

- close #1687
- close #1786
- close #1940
- close #1948
- close #1949
- close #1932
- close #1956
2021-12-02 20:14:42 +01:00
c9f3726447 Merge #1893
1893: Make matches work with numerical value r=MarinPostma a=Thearas

# Pull Request

## What does this PR do?

Implement #1883.

I have tested this PR with a unit test. It appears to be working properly:
![image](https://user-images.githubusercontent.com/44015907/141141082-dad8cd18-e803-408f-ad6a-c7a212b7ec88.png)

PTAL `@curquiza` 

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?


Co-authored-by: Thearas <thearas850@gmail.com>
2021-11-29 13:38:57 +00:00
8363200fd7 Merge #1910
1910: After v0.24.0: import `stable` in `main` r=MarinPostma a=curquiza



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-17 12:48:56 +00:00
37548eb720 Merge #1896
1896: Remove email address from the message at the launch r=irevoire a=curquiza

I suggest removing this email address from the message at the launch since it can encourage people to think this is an email address for support. Is it something we want `@meilisearch/devrel-team` since we mostly redirect them to the forum or the slack?

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-16 16:29:05 +00:00
dadce6032d Update CONTRIBUTING.md
Added a sentence on other means to contribute. If we like it, we can add it in some other places.
2021-11-16 17:27:18 +01:00
0a1d2ce231 Merge #1904
1904: Update mini-dashboard version to v0.1.5 r=curquiza a=curquiza

Update the mini-dashboard with its latest version (v0.1.5)

Check with `@mdubus,` replaces https://github.com/meilisearch/MeiliSearch/pull/1903

Fixes #1898 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-15 18:02:07 +00:00
9d01c5d882 Update mini-dashboard version to v0.1.5 2021-11-15 18:54:55 +01:00
53fc2edab3 Merge #1897
1897: Add ARM image for Docker to CI r=irevoire a=curquiza

Fixes #1315 

- [x] Publish MeiliSearch's docker image for `arm64`
- [x] Add `workflow_dispatch` event in case we need to re-trigger it after a failure without creating a new release
- [x] Use our own server to run the github runner since this CI is really slow (1h instead of 4h)
- [x] Open an issue for a refactor by merging both files in one file (https://github.com/meilisearch/MeiliSearch/issues/1901)

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-15 16:16:34 +00:00
b7c5b78a61 Remove workflow_dispatch 2021-11-15 14:13:40 +01:00
f081dc2001 Remove checkout 2021-11-12 21:31:18 +01:00
2cf7daa227 Use self-hosted runner 2021-11-12 18:49:02 +01:00
9d75fbc619 Fix docker meta job 2021-11-11 19:42:30 +01:00
3b1b9a277b Remove context 2021-11-11 16:38:45 +01:00
40e87b9544 Add checkout 2021-11-11 16:37:01 +01:00
ded7922be5 Add context and platform 2021-11-11 16:30:16 +01:00
11ef64ee43 Fix credentials 2021-11-11 16:02:32 +01:00
5e6d7b7649 Add workflow dispatch event 2021-11-11 16:00:10 +01:00
5fd9616b5f Add ARM image for Docker to CI 2021-11-11 15:58:44 +01:00
a1227648ba Remove email address from the message at the launch 2021-11-11 14:36:45 +01:00
919f4173cf Merge #1895
1895: Fix aggregated search events name r=irevoire a=gmourier



Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>
2021-11-11 09:35:34 +00:00
7c5aad4073 fix aggregated search event names 2021-11-11 01:38:10 +01:00
d47ccd9199 Merge #1894
1894: Fix the 99th percentile in the analytics r=gmourier a=irevoire



Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-11-10 17:27:39 +00:00
cc5e884b34 fix the 99th percentile in the analytics 2021-11-10 18:26:38 +01:00
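For reference, a minimal sketch of the 99th-percentile computation behind the `99th_response_time` metric (the maximum latency among the fastest 99% of requests); the actual aggregation code in Meilisearch may differ:
```rust
fn percentile_99(mut latencies_ms: Vec<u64>) -> Option<u64> {
    if latencies_ms.is_empty() {
        return None;
    }
    latencies_ms.sort_unstable();
    // Index of the last request that still belongs to the fastest 99%.
    let idx = ((latencies_ms.len() as f64) * 0.99).ceil() as usize - 1;
    latencies_ms.get(idx).copied()
}
```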
ac5535055f Make matches work with numerical value 2021-11-10 23:10:30 +08:00
15cb4dafa9 Merge #1882
1882: Remove release drafter r=curquiza a=curquiza

Remove release drafter since it's not used at the moment due to the specific release process of MeiliSearch.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-11-09 14:36:32 +00:00
8ca76d9fdf Merge #1884
1884: Remove Hacktoberfest section from CONTRIBUTING.md r=curquiza a=meili-bot

_This PR is auto-generated._

Remove Hacktoberfest section from CONTRIBUTING.md


Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com>
2021-11-09 13:19:30 +00:00
f62e52ec68 Update CONTRIBUTING.md 2021-11-09 14:15:50 +01:00
bf01c674ea Remove release drafter 2021-11-08 14:49:50 +01:00
e9b6a05b75 Merge #1878
1878: Add error object in task r=MarinPostma a=ManyTheFish

# Pull Request

## What does this PR do?
Fixes #1877

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Update error test
- [x] Remove flattening of errors during task serialization


Co-authored-by: many <maxime@meilisearch.com>
2021-11-04 17:16:30 +00:00
6bbc1b4316 Remove error flattening in task serialization 2021-11-04 17:40:28 +01:00
3c696da274 Update tests 2021-11-04 17:40:28 +01:00
d9d6dee550 Merge #1873
1873: Change lacking errors r=ManyTheFish a=ManyTheFish



Co-authored-by: many <maxime@meilisearch.com>
2021-11-04 14:21:52 +00:00
cc6306c0e1 Update milli version 2021-11-04 14:57:45 +01:00
b59145385e Fix PR comments 2021-11-04 14:57:27 +01:00
3f4e0ec971 Merge #1875 #1876
1875: Fix search post event and disk size analytics r=irevoire a=gmourier

- Branch POST search on the post_search aggregator
- Use largest disk `total_space` instead of `available_space` 

1876: Update SEGMENT_API_KEY r=irevoire a=gmourier

Branch it on our Segment production stack

Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>
2021-11-04 10:16:13 +00:00
ec0716ddd1 Merge #1874
1874: Fix small typo in download-latest.sh r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2021-11-04 09:50:55 +00:00
6d6725b3b8 Update SEGMENT_API_KEY 2021-11-04 08:10:12 +01:00
6660be2cb7 Branch POST /search on the dedicated analytics aggregator 2021-11-04 08:03:48 +01:00
847fcb570b Use total_space of the largest disk instead of available_space 2021-11-04 08:03:11 +01:00
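A sketch of the `total_space` change, assuming the sysinfo crate's `SystemExt`/`DiskExt` API from that era (whether Meilisearch uses exactly these calls is an assumption here): report the total capacity of the largest disk instead of its free space.
```rust
use sysinfo::{DiskExt, System, SystemExt};

fn largest_disk_total_space() -> Option<u64> {
    let mut sys = System::new_all();
    sys.refresh_disks_list();
    sys.disks().iter().map(|disk| disk.total_space()).max()
}
```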
4095ec462e Merge #1865
1865: Aggregate the search even when it fails r=MarinPostma a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
2021-11-03 16:49:05 +00:00
f7f2421e71 Update download-latest.sh 2021-11-03 17:09:50 +01:00
b664a46e91 Update milli version 2021-11-03 16:11:20 +01:00
06e6eaa7b4 Remove useless Facet variant 2021-11-03 16:11:09 +01:00
30a094cbb2 Change lacking errors 2021-11-03 14:33:33 +01:00
904bae98f8 send the analytics even when the search fails 2021-11-02 12:38:01 +01:00
c32f13a909 Merge #1800
1800: Analytics r=irevoire a=irevoire

Closes #1784
Implements [this spec](https://github.com/meilisearch/specifications/blob/update-analytics-specs/text/0034-telemetry-policies.md) 

# Anonymous Analytics Policy

## 1. Functional Specification

### I. Summary

This specification describes an exhaustive list of anonymous metrics collected by the MeiliSearch binary. It also describes the tools we use for this collection and how we identify a Meilisearch instance.

### II. Motivation

At MeiliSearch, our vision is to provide an easy-to-use search solution that meets the essential needs of our users. At all times, we strive to understand our users better and meet their expectations in the best possible way.

Although we can gather needs and understand our users through several channels such as Github, Slack, surveys, interviews or roadmap votes, we realize that this is not enough to have a complete view of MeiliSearch usage and features adoption. By cross-referencing our product discovery phases with aggregated quantitative data, we want to make the product much better than what it is today. Our decision-making will be taken a step further to make a product that users love.

### III. Explanation

#### General Data Protection Regulation (GDPR)

The metrics collected are non-sensitive, non-personal and do not identify an individual or a group of individuals using MeiliSearch. The data collected is secured and anonymized. We do not collect any data from the values stored in the documents.

We, the MeiliSearch team, provide an email address so that users can request the removal of their data: privacy@meilisearch.com.<br>
Thanks to the unique identifier generated for their MeiliSearch installation (`Instance uuid` when launching MeiliSearch), we can remove the corresponding data from all the tools we describe below. Any questions regarding the management of the data collected can be sent to the email address as well.

#### Tools

##### Segment

The collected data is sent to [Segment](https://segment.com/). Segment is a platform for data collection and provides data management tools.

##### Amplitude

[Amplitude](https://amplitude.com/) is a tool for graphing and highlighting collected data. Segment feeds Amplitude so that we can build visualizations according to our needs.

-----------
# The `identify` call we send every hour:

## System Configuration `system`

This property allows us to gather essential information to better understand on which type of machine MeiliSearch is used. This allows us to better advise users on the machines to choose according to their data volume and their use-cases.

 - [x] `system` => Never changes but is still sent every hour
     - [x] distribution | On which distribution MeiliSearch is launched, eg: Arch Linux
     - [x] kernel_version | On which kernel version MeiliSearch is launched, eg: 5.14.10-arch1-1
     - [x] cores | How many cores does the machine have, eg: 24
     - [x] ram_size | Total capacity of the machine's ram. Expressed in `Kb`, eg: 33604210
     - [x] disk_size | Total capacity of the biggest disk. Expressed in `Kb`, eg: 336042103
     - [x] server_provider | Users can tell us on which provider MeiliSearch is hosted by filling the `MEILI_SERVER_PROVIDER` env var. This is also filled by our providers deploy scripts. e.g. GCP [cloud-config.yaml](56a7c2630c/scripts/providers/gcp/cloud-config.yaml (L33)), eg: gcp

## MeiliSearch Configuration

- [x] `context.app.version`: MeiliSearch version, eg: 0.23.0
- [x] `env`: `production` / `development`, eg: `production`
- [x] `has_snapshot`: Does the MeiliSearch instance have snapshots activated, eg: `true`

## MeiliSearch Statistics `stats`

 - [x] `stats`
     - [x] `database_size`: Size of indexed data. Expressed in `Kb`, eg: 180230
     - [x] `indexes_number`: Number of indexes, eg: 2
     - [x] `documents_number`: Number of indexed documents, eg: 165847
     - [x] `start_since_days`: How many days ago was the instance launched?, eg: 328

---------

- [x] Launched | This is the first event sent to mark that MeiliSearch was launched for the first time

---------

- [x] `Documents Searched POST`: The Documents Searched event is sent once an hour. The event's properties are averaged over all search operations during that time so as not to track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents all the user-agents encountered on this endpoint during one hour, eg: `["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]`
  - [x] `requests`
      - [x] `99th_response_time`: The maximum latency, in ms, for the fastest 99% of requests, eg: `57ms`
      - [x] `total_suceeded`: The total number of succeeded search requests, eg: `3456`
      - [x] `total_failed`: The total number of failed search requests, eg: `24`
      - [x] `total_received`: The total number of received search requests, eg: `3480`
  - [x] `sort`
      - [x] `with_geoPoint`: Has the built-in sort rule _geoPoint been used?, eg: `true` / `false`
      - [x] `avg_criteria_number`: The average number of sort criteria among all the requests containing the sort parameter. "sort": [] equals to 0 while not sending sort does not influence the average, eg: `2`
  - [x] `filter`
      - [x] `with_geoRadius`: Has the built-in filter rule _geoRadius been used?, eg: `true` / `false`
      - [x] `avg_criteria_number`: The average number of filter criteria among all the requests containing the filter parameter. "filter": [] equals to 0 while not sending filter does not influence the average, eg: `4`
      - [x] `most_used_syntax`: The most used filter syntax among all the requests containing the filter parameter. `string` / `array` / `mixed`, eg: `mixed`
  - [x] `q`
      - [x] `avg_terms_number`: The average number of terms for the `q` parameter among all requests, eg: `5`
  - [x] `pagination`:
      - [x] `max_limit`: The maximum limit encountered among all requests, eg: `20`
      - [x] `max_offset`: The maximum offset encountered among all requests, eg: `1000`

---

- [x] `Documents Searched GET`: The Documents Searched event is sent once an hour. The event's properties are averaged over all search operations during that time so as not to track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents all the user-agents encountered on this endpoint during one hour, eg: `["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]`
  - [x] `requests`
      - [x] `99th_response_time`: The maximum latency, in ms, for the fastest 99% of requests, eg: `57ms`
      - [x] `total_suceeded`: The total number of succeeded search requests, eg: `3456`
      - [x] `total_failed`: The total number of failed search requests, eg: `24`
      - [x] `total_received`: The total number of received search requests, eg: `3480`
  - [x] `sort`
      - [x] `with_geoPoint`: Has the built-in sort rule _geoPoint been used?, eg: `true` / `false`
      - [x] `avg_criteria_number`: The average number of sort criteria among all the requests containing the sort parameter. "sort": [] equals to 0 while not sending sort does not influence the average, eg: `2`
  - [x] `filter`
      - [x] `with_geoRadius`: Has the built-in filter rule _geoRadius been used?, eg: `true` / `false`
      - [x] `avg_criteria_number`: The average number of filter criteria among all the requests containing the filter parameter. "filter": [] equals to 0 while not sending filter does not influence the average, eg: `4`
      - [x] `most_used_syntax`: The most used filter syntax among all the requests containing the filter parameter. `string` / `array` / `mixed`, eg: `mixed`
  - [x] `q`
      - [x] `avg_terms_number`: The average number of terms for the `q` parameter among all requests, eg: `5`
  - [x] `pagination`:
      - [x] `max_limit`: The maximum limit encountered among all requests, eg: `20`
      - [x] `max_offset`: The maximum offset encountered among all requests, eg: `1000`

---

- [x] `Index Created`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`

---

- [x] `Index Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`

---

- [x] `Documents Added`: The Documents Added event is sent once an hour. The event's properties are averaged over all POST /documents additions operations during that time to not track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `payload_type`: Represents all the `payload_type` encountered on this endpoint during one hour, eg: [`text/csv`]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`
  - [x] `index_creation`: Did an index creation happen, eg: `false`

---

- [x] `Documents Updated`: The Documents Updated event is sent once an hour. The event's properties are averaged over all PUT /documents addition operations during that time to not track everything and generate unnecessary noise.
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `payload_type`: Represents all the `payload_type` encountered on this endpoint during one hour, eg: [`application/json`]
  - [x] `primary_key`: The name of the field used as primary key if set, otherwise `null`, eg: `id`
  - [x] `index_creation`: Did an index creation happen, eg: `false`

---

- [x] Settings Updated
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `ranking_rules`
      - [x] `sort_position`: Position of the `sort` ranking rule if any, otherwise `null`, eg: `5`
  - [x] `sortable_attributes`
      - [x] `total`: Number of sortable attributes, eg: `3`
      - [x] `has_geo`: Indicate if `_geo` is set as a sortable attribute, eg: `false`
  - [x] `filterable_attributes`
      - [x] `total`: Number of filterable attributes, eg: `3`
      - [x] `has_geo`: Indicate if `_geo` is set as a filterable attribute, eg: `false`

---

- [x] `RankingRules Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `sort_position`: Position of the `sort` ranking rule if any, otherwise `null`, eg: `5`

---

- [x] `SortableAttributes Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `total`: Number of sortable attributes, eg: `3`
  - [x] `has_geo`: Indicate if `_geo` is set as a sortable attribute, eg: `false`

---

- [x] `FilterableAttributes Updated`
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]
  - [x] `total`: Number of filterable attributes, eg: `3`
  - [x] `has_geo`: Indicate if `_geo` is set as a filterable attribute, eg: `false`

---

- [x] Dump Created
  - [x] `user-agent`: Represents the user-agent encountered for this API call, eg: ["MeiliSearch Ruby (2.1)", "Ruby (3.0)"]

---

Ensure the user-id file is well saved and loaded with:
- [x] the dumps
- [x]  the snapshots



- [x] Ensure the CLI uuid is only shown if analytics are activated at launch **or the uuid already exists** (i.e. even if meilisearch was launched without analytics)

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-10-29 16:11:03 +00:00
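The implementation commits below ("replace all mutexes by channel", "send the analytics only once every hours") hint at the overall shape of the analytics module. A rough, hypothetical sketch of that model using tokio, with illustrative names only:
```rust
use tokio::sync::mpsc;
use tokio::time::{interval, Duration};

#[derive(Debug)]
enum AnalyticsEvent {
    DocumentsSearched { response_time_ms: u64 },
    // ... one variant per aggregated event
}

async fn run_analytics(mut rx: mpsc::UnboundedReceiver<AnalyticsEvent>) {
    let mut batch: Vec<AnalyticsEvent> = Vec::new();
    let mut tick = interval(Duration::from_secs(60 * 60)); // flush once an hour
    loop {
        tokio::select! {
            // Routes push events into the channel; no shared mutex is needed.
            Some(event) = rx.recv() => batch.push(event),
            // On every tick, aggregate the batch and send it, then start over.
            _ = tick.tick() => {
                println!("flushing {} aggregated events", batch.len());
                batch.clear();
            }
        }
    }
}
```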
519093ea65 fix bad rebase 2021-10-29 17:32:49 +02:00
bd49d1c4b5 fix one small bug 2021-10-29 17:25:56 +02:00
2665c0099d clippy + fmt 2021-10-29 17:25:56 +02:00
d65f055030 pass analytics into Arc instead of static ref 2021-10-29 17:25:55 +02:00
66d87761b7 align the parameters in the launch resume 2021-10-29 17:25:55 +02:00
ba69ad672a fix the timing issue 2021-10-29 17:25:55 +02:00
7934e3956b replace all mutexes by channel 2021-10-29 17:25:55 +02:00
68fe93b7db add ranking_rules marker before sort_position 2021-10-29 17:25:55 +02:00
efd0ea9e1e makes clippy happier 2021-10-29 17:25:55 +02:00
6ef73eb226 fix all the single settings route and add the searchable attributes Updated event 2021-10-29 17:25:55 +02:00
fc2f23d36c move the start_since_days to the root of the identify 2021-10-29 17:25:54 +02:00
7c39fab453 move the user-agent out of the context in every request 2021-10-29 17:25:54 +02:00
c5164c01c0 set the total of sortable attributes and filterable-attributes to 0 when not set 2021-10-29 17:25:54 +02:00
351ad32d77 fix the index_creation boolean 2021-10-29 17:25:54 +02:00
3ad8311bdd split the analytics in a module 2021-10-29 17:25:54 +02:00
ea5ae2bae5 sort the imports 2021-10-29 17:25:54 +02:00
72e3adc55e display an instance-id instead of a user-id 2021-10-29 17:25:54 +02:00
b250392e8d remove the first - in the path to the db instance in the instance-id 2021-10-29 17:25:53 +02:00
d8b0d68840 use a regex to count the number of filters instead of split + flatten 2021-10-29 17:25:53 +02:00
c4737749ab bump segment to be able to display a user 2021-10-29 17:25:53 +02:00
a1ab02f9fb remove some commented code 2021-10-29 17:25:53 +02:00
bba64b32ca async_traits is not needed anymore 2021-10-29 17:25:53 +02:00
9abd2aa9d7 make the analytics interval a const 2021-10-29 17:25:53 +02:00
de35a9a605 use an official release of segment 2021-10-29 17:25:53 +02:00
ed750e8792 fix start_since_day 2021-10-29 17:25:53 +02:00
37ca50832c fix the sort position 2021-10-29 17:25:52 +02:00
31c7a0105b fix a bug on the batch documents function 2021-10-29 17:25:52 +02:00
ddab9eafa1 fix a typo 2021-10-29 17:25:52 +02:00
76a4f86e0c rename user-id to instance-uid 2021-10-29 17:25:52 +02:00
6b34318274 makes clippy happy 2021-10-29 17:25:52 +02:00
5508c6c154 a bit of styling 2021-10-29 17:25:52 +02:00
9a62ac0c94 send the analytics only once every hours 2021-10-29 17:25:52 +02:00
01737ef847 remove all the debug prints 2021-10-29 17:25:51 +02:00
3144b572c4 remove the debug mode in release 2021-10-29 17:25:51 +02:00
10de92987a compile write_user_id only when the analytics are enabled 2021-10-29 17:25:51 +02:00
c752c14c46 refactorize the dump and snapshot 2021-10-29 17:25:51 +02:00
87a8bf5e96 write and load the user-id in the dumps 2021-10-29 17:25:51 +02:00
ba14ea1243 plug the new batchers into the documents route 2021-10-29 17:25:51 +02:00
9be90011c6 save the user-id in the config dir of the OS 2021-10-29 17:25:51 +02:00
f9b14ca149 simplify the search batcher 2021-10-29 17:25:50 +02:00
6591acfdfa rename the documents batchers 2021-10-29 17:25:50 +02:00
e64ba122e1 factorize the code between the two documents batcher 2021-10-29 17:25:50 +02:00
a9523146a3 simplify the into_events methods 2021-10-29 17:25:50 +02:00
392ee86714 implement the documents batcher 2021-10-29 17:25:50 +02:00
1d73f484f0 update the primary key when creating a new index 2021-10-29 17:25:50 +02:00
cfcd3ae048 move the version to context.app 2021-10-29 17:25:50 +02:00
5395041dcb fix the stats and stop sending events when no request happened 2021-10-29 17:25:49 +02:00
40eabd50d1 integrate the search batcher in the search route 2021-10-29 17:25:49 +02:00
35ffd0ec3a integrate the search batcher in the tick 2021-10-29 17:25:49 +02:00
d3d76bf97a wip create a search batcher 2021-10-29 17:25:49 +02:00
595ae42e94 update the name of the Launched event 2021-10-29 17:25:49 +02:00
0667d940f9 update the name of nb_cores in the identify 2021-10-29 17:25:49 +02:00
75d1272325 log the dump creation 2021-10-29 17:25:49 +02:00
8e2d6cf87d add the content type to all the route 2021-10-29 17:25:48 +02:00
9e1bba40f7 do not print anything if no user id was found 2021-10-29 17:25:48 +02:00
f7bb499c28 send the first identify + launched for the first time events right away instead of batching them 2021-10-29 17:25:48 +02:00
b33b1ef3dd update the way of getting and saving the user-id to the file system 2021-10-29 17:25:48 +02:00
30aeda7a1a update the identify call to the latest spec version 2021-10-29 17:25:48 +02:00
22d9d660cc log all the required settings route 2021-10-29 17:25:48 +02:00
7524bfc07f log the all settings updated route 2021-10-29 17:25:48 +02:00
bda7472880 log the documents updated route 2021-10-29 17:25:48 +02:00
1ed05c6c07 log documents added 2021-10-29 17:25:47 +02:00
0b3e0a59cb log index updated 2021-10-29 17:25:47 +02:00
0616f68eb0 implements part of the search 2021-10-29 17:25:47 +02:00
6b8e5a4c92 log the index created route 2021-10-29 17:25:47 +02:00
d72c887422 makes the analytics available for all the routes 2021-10-29 17:25:47 +02:00
664d09e86a makes the analytics work with the option and the feature 2021-10-29 17:25:47 +02:00
e226b1a87f rewrite the main analytics module and the information sent in the tick 2021-10-29 17:25:42 +02:00
b227666271 Merge #1855
1855: Change `authentication` error type to be  `auth` r=curquiza a=gmourier



Co-authored-by: many <maxime@meilisearch.com>
2021-10-28 15:29:21 +00:00
6fea050813 Change authentication error type to be 2021-10-28 16:57:48 +02:00
cf67964133 Merge #1848
1848: Error format and Definition r=MarinPostma a=ManyTheFish



Co-authored-by: many <maxime@meilisearch.com>
2021-10-28 14:15:35 +00:00
f8d04b11d5 Merge #1854
1854: Update version for the next release (v0.24.0) r=MarinPostma a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-28 14:03:55 +00:00
3a29cbf0ae Use milli v0.20.0 2021-10-28 15:59:06 +02:00
66f5de9703 Change missing authrization code 2021-10-28 15:56:57 +02:00
cbaca2b579 Fix PR comments 2021-10-28 15:42:42 +02:00
a76d9b15c9 Update version for the next release (v0.24.0) 2021-10-28 12:24:49 +02:00
59636fa688 Pimp error where no document is provided 2021-10-28 12:13:51 +02:00
ff0908d3fa Ignore errors tests that show unrelated bugs 2021-10-28 11:41:59 +02:00
21f35762ca Fix content type test 2021-10-28 10:57:11 +02:00
7464720426 Fix some errors 2021-10-28 10:47:59 +02:00
6e57c40c37 Merge #1853
1853: Create SECURITY.md r=curquiza a=CaroFG



Co-authored-by: CaroFG <48251481+CaroFG@users.noreply.github.com>
2021-10-27 15:54:48 +00:00
c8518f4ab2 Create SECURITY.md 2021-10-27 17:49:00 +02:00
b9c061ab3d Merge #1852
1852: Add tests for mini-dashboard status and assets r=curquiza a=CuriousCorrelation

## Summary

Added tests for `mini-dashboard` status including assets.

## Ticket link

PR closes  #1767

Co-authored-by: CuriousCorrelation <CuriousCorrelation@protonmail.com>
2021-10-27 13:26:42 +00:00
d905bbf961 Merge #1787
1787: Handle empty dump r=MarinPostma a=irevoire

Fixes #1701

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-10-27 12:47:45 +00:00
6641e7aa50 Add tests for mini-dashboard status and assets 2021-10-27 17:57:25 +05:30
61c15b69fb Change malformed_payload error 2021-10-27 11:13:12 +02:00
8ec0c4c913 Add bad_request error tests 2021-10-27 11:13:12 +02:00
0a9d6e8210 Merge #1847
1847: Optimize document transform r=MarinPostma a=MarinPostma

integrate the optimization from https://github.com/meilisearch/milli/pull/402.

optimize the payload read by reading it into RAM first instead of streaming it. This means that the payload must fit into RAM, which should not be a problem.

Add BufWriter to the obkv writer to improve write speed.

I have measured a gain of 40-45% in speed after these optimizations.


Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 15:28:58 +00:00
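A minimal illustration of the two ideas described above, with hypothetical helper names: the payload is read entirely into RAM instead of being streamed, and the file behind the obkv writer is wrapped in a `BufWriter`:
```rust
use std::fs::File;
use std::io::{BufWriter, Read};

// Read the whole payload into memory up front (it must fit in RAM).
fn read_payload(mut payload: impl Read) -> std::io::Result<Vec<u8>> {
    let mut buffer = Vec::new();
    payload.read_to_end(&mut buffer)?;
    Ok(buffer)
}

// Batch many small obkv writes into fewer syscalls.
fn buffered_writer(file: File) -> BufWriter<File> {
    BufWriter::new(file)
}
```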
0ed800b612 Merge #1830
1830: Add MEILI_SERVER_PROVIDER to Dockerfile r=irevoire a=curquiza

Add docker information in `MEILI_SERVER_PROVIDER` env variable

It does not impact the telemetry spec since it's an already existing variable used on our side.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-26 14:01:06 +00:00
4ac005b094 optimize document transform
fix error types

bump milli
2021-10-26 13:51:15 +02:00
5e3a53b576 fix a bug in the generation of empty dumps 2021-10-25 14:17:57 +02:00
e87146b0d9 Merge #1811
1811: Reducing ArmV8 binary build time with action-rs (cross build with Rust) r=curquiza a=patrickdung

This pull request is based on [discussion #1790](https://github.com/meilisearch/MeiliSearch/discussions/1790)

Note:
1) The binaries of this PR are additional to the existing binaries built.
The existing binaries would still be produced (by the existing GitHub workflow/action):

meilisearch-linux-amd64
meilisearch-linux-armv8
meilisearch-macos-amd64
meilisearch-windows-amd64.exe
meilisearch.deb

2) This PR produces these binaries. The name 'meilisearch-linux-aarch64' is used to avoid a naming conflict with 'meilisearch-linux-armv8'.

meilisearch-linux-aarch64
meilisearch-linux-aarch64-musl
meilisearch-linux-aarch64-stripped
meilisearch-linux-amd64-musl

3) If it's fine (in the next release), we should submit another PR to stop generating meilisearch-linux-armv8 (which can take two to three hours to build)

Co-authored-by: Patrick Dung <38665827+patrickdung@users.noreply.github.com>
2021-10-24 12:24:39 +00:00
5caa79df67 Update .github/workflows/publish-crossbuild.yml
Update to use the correct syntax

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-24 11:31:04 +00:00
d519e1036f Update .github/workflows/publish-crossbuild.yml
better naming

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-21 17:02:28 +00:00
19eebc0b0a Update .github/workflows/publish-crossbuild.yml
better naming

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-21 17:02:16 +00:00
5585020753 Update .github/workflows/publish-crossbuild.yml
better spacing

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-21 17:02:00 +00:00
ef7e7a8f11 Only generate aarch64 binary with action-rs 2021-10-21 00:40:46 +08:00
eb91f27b65 Add MEILI_SERVER_PROVIDER to dockerfile 2021-10-20 17:53:43 +02:00
24eef577c5 Merge #1822
1822: Tiny improvements in download-latest.sh r=irevoire a=curquiza

- Add a check on `$latest` to see if it's empty. We currently have an issue on the swift SDK where the version number does not seem to be retrieved, but we don't know why: https://github.com/meilisearch/meilisearch-swift/pull/216
- Replace some `"` with `'`
- Rename `$BINARY_NAME` to `$binary_name` to make it consistent with the other variables that are filled throughout the script

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-19 09:32:27 +00:00
e7e4ccf74f Merge pull request #1817 from nfsec/patch-1
Improve RUNs in Dockerfile
2021-10-19 00:00:22 +02:00
017ecf76e3 Replace double quotes by single ones 2021-10-18 23:39:07 +02:00
1c9ceadd8d Merge #1824
1824: Fix indexation perfomances on mounted disk r=ManyTheFish a=ManyTheFish

We were creating all of our tempfiles in the data.ms directory, but when the database directory is stored on a mounted disk, tempfile I/O throughput decreases, impacting indexation time.

Now, only the persisting tempfiles will be created in the database directory. Other tempfiles will stay in the default tmpdir.

Co-authored-by: many <maxime@meilisearch.com>
2021-10-18 12:42:47 +00:00
36ab7b3ebd Fix small typo 2021-10-18 14:17:32 +02:00
b4038597ba Keep persisting tmp files in database directory and put non-persisting tmp files in default tmp dir 2021-10-18 14:16:35 +02:00
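A sketch of that split, assuming the tempfile crate: files that must eventually be persisted are created inside the database directory (so the final rename stays on the same filesystem), while throwaway scratch files go to the OS default tmpdir:
```rust
use std::fs::File;
use std::path::Path;
use tempfile::NamedTempFile;

// Persisting tempfile: created next to the database, later `.persist(path)`ed.
fn persisting_tempfile(db_dir: &Path) -> std::io::Result<NamedTempFile> {
    NamedTempFile::new_in(db_dir)
}

// Non-persisting tempfile: created in the default temporary directory,
// so it no longer competes with the mounted database disk.
fn scratch_tempfile() -> std::io::Result<File> {
    tempfile::tempfile()
}
```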
79817bd465 Merge #1813
1813: Apply highlight tags on numbers in the formatted search result output r=irevoire a=Jhnbrn90

This is my first ever Rust related PR. 

As described in #1758, I've attempted to highlight numbers correctly under the `_formatted` key.

Additionally, I added a test which should assert appropriate highlighting. 

I'm open to suggestions and improvements. 


Co-authored-by: John Braun <john@brn.email>
2021-10-18 09:05:01 +00:00
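A simplified sketch of the idea, assuming serde_json and illustrative highlight tags: numbers in the `_formatted` output are converted to strings so highlight tags can wrap them like any other matched value:
```rust
use serde_json::Value;

fn format_value(value: &Value, is_match: bool) -> Value {
    match value {
        // Numbers cannot carry tags, so matched numbers are stringified first.
        Value::Number(n) if is_match => Value::String(format!("<em>{}</em>", n)),
        Value::Number(n) => Value::String(n.to_string()),
        other => other.clone(),
    }
}
```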
93ad8f04b5 Add check if $latest is empty 2021-10-16 17:36:36 +02:00
e4cb7ed30f Tiny improvement in download-latest.sh 2021-10-16 17:23:50 +02:00
b9e060423f Merge #1760
1760: Add option to use environment variable to increase rate limit r=curquiza a=nav1s

This closes #1655.

Added GITHUB_PAT environment variable and a comment to explain how to create it (I found the ```public_repo``` scope to be the best fit out of the available [scopes](https://docs.github.com/en/developers/apps/building-oauth-apps/scopes-for-oauth-apps#available-scopes)).

Co-authored-by: Aviv <avivnt@gmail.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-16 12:56:02 +00:00
ead1ec3396 Update download-latest.sh 2021-10-16 14:55:10 +02:00
306a8cd059 Update download-latest.sh 2021-10-16 14:55:06 +02:00
4c50deb4b7 2 RUNs less. 2021-10-16 11:37:01 +02:00
be75426e64 Apply formatting according code style guidelines 2021-10-15 21:32:29 +02:00
23458de588 One RUN less
Align apk add commands between images.
2021-10-15 21:18:31 +02:00
9fd849d48b Merge #1808
1808: Add Milestone Check status to bors.toml r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-14 13:09:32 +00:00
2b28bc9510 Merge #1693
1693: Remove dataset r=Kerollmops a=curquiza

Fixes https://github.com/meilisearch/MeiliSearch/issues/1230

⚠️ Should be merged once https://github.com/meilisearch/documentation/pull/1109 is merged! ⚠️

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-14 12:56:23 +00:00
d107b3f46c Merge #1759
1759: Feature docker as non root r=curquiza a=igaul

This closes #1757 . 
Adding a non root user with default name meiliuser.

Co-authored-by: gaul@pdx.edu <gaul@pdx.edu>
Co-authored-by: igaul <40813772+igaul@users.noreply.github.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-14 12:45:49 +00:00
44149bec60 Merge branch 'main' into feature-docker-as-non-root 2021-10-14 14:45:28 +02:00
f80b4fdedd Use pr_status instead of status 2021-10-14 14:21:42 +02:00
fd4a90549b Merge #1803
1803: Import hotfix from `stable` into `main` (v0.23.1) r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
2021-10-14 11:43:45 +00:00
b602a0836a Merge branch 'main' into stable 2021-10-14 13:43:21 +02:00
7349fca607 Add Milestone Check status to bors.toml 2021-10-13 19:10:20 +02:00
4bacc8e47d Merge #1806
1806: Fix csv content-type error message r=curquiza a=sanders41

Fixes #1805

I was not sure if the `application/csv` [here](23f11e355d/meilisearch-http/tests/content_type.rs (L29)) should also be changed. I'm thinking yes, but `application/csv` is a bad type.

Co-authored-by: Paul Sanders <psanders1@gmail.com>
2021-10-13 10:11:47 +00:00
7141f89c5f Split entrypoint and cmd 2021-10-12 11:44:59 -07:00
893654fb15 change default user name
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-12 11:42:08 -07:00
c9e1d054c7 Fix csv content-type error 2021-10-12 13:38:48 -04:00
2e2eeb0a42 Merge #1801
1801: Update milli version to v0.17.3 to fix inference issue r=curquiza a=curquiza

Fixes #1798

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-12 14:47:06 +00:00
0f342ac46e Update MeiliSearch version 2021-10-12 16:43:31 +02:00
29ac324e90 Update milli version to v0.17.3 2021-10-12 16:12:16 +02:00
23f11e355d Merge #1799
1799: Update README.md with Telemetry page r=curquiza a=maryamsulemani97

Updated readme to link the Telemetry page in the documentation 

Co-authored-by: maryamsulemani97 <90181761+maryamsulemani97@users.noreply.github.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-12 11:50:14 +00:00
f09016b2bc Update README.md 2021-10-12 13:49:31 +02:00
1fa3aeceeb Update README.md 2021-10-12 13:47:38 +02:00
443afdc412 Update README.md 2021-10-12 14:37:19 +04:00
776befc1f0 Merge #1797
1797: Import stable into main (v0.22.0) r=MarinPostma a=curquiza



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-11 16:43:01 +00:00
3edbc74430 Merge branch 'main' into stable 2021-10-11 18:30:10 +02:00
3172c96042 Merge #1795
1795: Update milli version r=irevoire a=curquiza

Closes #1788 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-11 14:27:46 +00:00
60473637fe Update milli version 2021-10-11 16:21:19 +02:00
b969f34317 Merge #1793
1793: Remove memmap dependency r=curquiza a=palfrey

Fixes #1792. I was going to replace it with [memmap2](https://github.com/RazrFalcon/memmap2-rs), which should be a drop-in replacement, but I couldn't find anything that actually used it directly. It ends up being a dependency in [milli](https://github.com/meilisearch/milli), so I'm going to go there next and fix that.

Co-authored-by: Tom Parker-Shemilt <palfrey@tevp.net>
2021-10-11 08:29:40 +00:00
6c46fbbc57 Remove memmap dependency 2021-10-10 22:33:40 +01:00
87115b02d9 Fixing the passing of environment variables 2021-10-10 03:27:51 +08:00
c614520405 Cross build with action-rs 2021-10-10 02:21:30 +08:00
3756f5a0ca Add test for highlighting numbers 2021-10-08 15:07:45 +02:00
5b4e4bb858 Highlight numbers (int) as string in formatted JSON 2021-10-08 15:07:15 +02:00
dffd90b966 Merge #1783
1783: Fix too many open file error r=ManyTheFish a=ManyTheFish

- The prepare_for_closing() function wasn't called when an index was deleted; we now call it
- The index wasn't deleted when we couldn't insert `uid` into `index_uuid_store`; we now clean it up

Fix #1736

Co-authored-by: many <maxime@meilisearch.com>
2021-10-07 15:48:07 +00:00
a92a0c3ed3 Log the error instead of returning it when deletion fails 2021-10-07 17:38:22 +02:00
0774b1efa5 Close index's heed environment when index is deleted 2021-10-07 17:09:41 +02:00
7fc7eb7457 Make sure to remove newly created index if uid is already taken 2021-10-07 16:49:21 +02:00
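A rough sketch of the first fix, using the `prepare_for_closing()` call named in the PR; the surrounding code and signatures are assumptions, not Meilisearch's actual index deletion path:
```rust
use std::path::Path;

fn delete_index(env: heed::Env, index_path: &Path) -> anyhow::Result<()> {
    // Ask heed/LMDB to close the environment once every user is done with it,
    // so the process stops holding the index's file descriptors open.
    let closing_event = env.prepare_for_closing();
    closing_event.wait();
    // Only then remove the index files from disk.
    std::fs::remove_dir_all(index_path)?;
    Ok(())
}
```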
602a327aa8 Merge #1781
1781: Optimize build size r=irevoire a=MarinPostma

Remove debug symbols from the release build, and strip the binaries.

We used to need the debug symbols for sentry, but since it was removed in #1616, we don't need them anymore.

Shrinks the binary size from ~300MB to ~50MB on linux.

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-07 13:31:03 +00:00
14c6ae4735 disable stripping 2021-10-07 12:10:36 +02:00
493a0e377d optimize build size 2021-10-07 11:49:52 +02:00
5e3e108143 Merge #1769
1769: Enforce `Content-Type` header for routes requiring a body payload r=MarinPostma a=irevoire

closes #1665 

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-10-06 15:43:50 +00:00
66dbd3cd34 makes clippy happy 2021-10-06 17:39:04 +02:00
9a1e44dc78 Apply suggestion
- remove the payload_error_handler in favor of a PayloadError::from

- merge the two match branches into one

- makes the accepted content type a const instead of recalculating it for every error
2021-10-06 17:15:47 +02:00
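A minimal sketch of such a check, assuming actix-web and an illustrative constant list of accepted content types; the real guards and error types in Meilisearch are richer:
```rust
use actix_web::http::header::CONTENT_TYPE;
use actix_web::HttpRequest;

const ACCEPTED_CONTENT_TYPES: &[&str] =
    &["application/json", "application/x-ndjson", "text/csv"];

fn check_content_type(req: &HttpRequest) -> Result<(), String> {
    match req.headers().get(CONTENT_TYPE).and_then(|v| v.to_str().ok()) {
        Some(ct) if ACCEPTED_CONTENT_TYPES.iter().any(|a| ct.starts_with(a)) => Ok(()),
        Some(ct) => Err(format!("unsupported Content-Type: {}", ct)),
        None => Err("missing Content-Type header".to_string()),
    }
}
```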
37b267ffb3 duplicate the post document tests with the put verb 2021-10-06 17:15:47 +02:00
dfa199f98f add content-type tests
fix typo

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-06 17:15:47 +02:00
c6d107a05f makes the content-type mandatory for every route 2021-10-06 17:15:47 +02:00
ddbcf449da Merge #1763
1763: Index tests r=MarinPostma a=MarinPostma

This PR aims to test more thoroughly the usage of indexes in the meilisearch database by writing unit tests.

work included:
- [x] Create index mock and stub methods
- [x] Test snapshot creation
- [x] Test Dumps
- [x] Test search

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-06 14:39:53 +00:00
9fa61439b1 fix clippy warning & unsafety 2021-10-06 14:51:46 +02:00
02dd1ea29d Merge pull request #1771 from ferdi05/ferdi05-patch-contributing
Update CONTRIBUTING.md
2021-10-06 14:40:38 +02:00
a38215de98 edit documentation 2021-10-06 14:35:18 +02:00
85b5260d9d simple search unit test 2021-10-06 14:20:05 +02:00
4b4ebad9a9 test dumps 2021-10-06 14:10:26 +02:00
ece4c739f4 update store tests 2021-10-06 14:10:26 +02:00
85ae34cf9f test snapshots 2021-10-06 14:10:23 +02:00
0448f0ce56 handle panic in stubs 2021-10-06 14:09:04 +02:00
4835d82a0b implement index mock 2021-10-06 14:09:01 +02:00
eaddee9fe2 Update CONTRIBUTING.md
typo + text improvement
all credits go to @guimachiavelli
2021-10-05 18:07:59 +02:00
2190764162 Merge #1768
1768: Fix auth error r=irevoire a=MarinPostma

fix a small auth error that set the invalid token error token to "hello". This was invisible to the user because the invalid token is not returned.

thank you hawk-eye `@irevoire` 

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-05 15:16:14 +00:00
3b91764587 fix auth error 2021-10-05 09:09:40 +02:00
0c3ec549f8 Merge #1764
1764: Bump Milli to improve geosearch errors r=curquiza a=irevoire

closes #1734 

`@curquiza,` your two examples still don't work because a filter must be composed of multiple operations; look at my screenshot to see what works and what doesn't.
Is this ok? 🤔 

`@gmourier,`
What do you think?

![image](https://user-images.githubusercontent.com/7032172/135846911-588f652d-16db-4d88-89fd-148640bac0f7.png)


And here is a screenshot with all the new errors that have been implemented

![image](https://user-images.githubusercontent.com/7032172/135854851-da469fef-0dd0-4ff1-b15e-89934ed8fb6f.png)

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-10-04 17:30:22 +00:00
fca686e7f8 bump meilisearch 2021-10-04 13:52:37 +02:00
607e28749a Merge #1755
1755: Fix mini dashboard r=curquiza a=anirudhRowjee

This commit is a fix to issue #1750.
As a part of the changes to solve this issue, the following changes have
been made -
1. Route registration for static assets has been modified
2. the `mut` keyword on the `scope` has been removed.

Co-authored-by: Anirudh Rowjee <ani.rowjee@gmail.com>
2021-10-04 09:56:21 +00:00
bffab21b10 Changes
1. Removed redundant scope registration
2021-10-04 14:47:05 +05:30
d9165c7f77 Add option to use environment variable to increase rate limit 2021-10-03 13:07:40 +03:00
2ef58ccce9 Fix formatting 2021-10-02 10:59:01 -07:00
4009804221 Creates non root user to run Meilisearch in Dockerfile 2021-10-02 10:43:13 -07:00
151f691609 Fixes #1750
This commit is a fix to issue #1750.
As a part of the changes to solve this issue, the following changes have
been made -
1. Route registration for static assets has been modified
2. the `mut` keyword on the `scope` has been removed.
2021-10-02 15:24:04 +05:30
81993b6a15 Merge #1747
1747: Add new error types for document additions r=curquiza a=MarinPostma

Adds the missing errors for the documents routes, as specified.

close #1691
close #1690


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-09-30 15:16:44 +00:00
4eb3817b03 missing payload error 2021-09-30 16:58:13 +02:00
18cb514073 invalid content type error 2021-09-30 16:58:13 +02:00
ddd40d87a7 malformed payload error 2021-09-30 16:58:13 +02:00
137272b8de empty content type error 2021-09-30 16:58:13 +02:00
e400ae900d Merge #1746
1746: Do not commit transaction on failed updates r=irevoire a=Kerollmops

This PR fixes an issue where MeiliSearch always committed transactions, even when an update was invalid and the whole transaction should have been discarded. This was the source of a bug where an invalid update (with an invalid primary key) created an index with the specified primary key, when it should instead have failed and done nothing on the server.

Fixes #1735.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-09-30 13:46:43 +00:00
c388dca5ec Check that invalid updates do not create an index with a primary key 2021-09-30 15:46:04 +02:00
6a691db7f8 Do not commit transaction on failed updates 2021-09-30 15:46:03 +02:00
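A minimal sketch of the enforced behavior, assuming a heed/LMDB write transaction and a hypothetical `apply_update` helper: the transaction is committed only when the update succeeds, and dropping it on error aborts every write it contained:
```rust
struct Update; // stand-in for a real update payload

fn apply_update(_wtxn: &mut heed::RwTxn, _update: &Update) -> anyhow::Result<()> {
    // ... perform all index writes for this update here ...
    Ok(())
}

fn process_update(env: &heed::Env, update: Update) -> anyhow::Result<()> {
    let mut wtxn = env.write_txn()?;
    if let Err(e) = apply_update(&mut wtxn, &update) {
        // Dropping the transaction aborts it: an invalid update can no longer
        // leave a half-created index (e.g. with a bad primary key) behind.
        drop(wtxn);
        return Err(e);
    }
    wtxn.commit()?;
    Ok(())
}
```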
ed783b67ca Merge #1742
1742: Create dumps v3 r=irevoire a=MarinPostma

The introduction of the obkv document format has changed the format of the updates by removing the need to store the document format of the addition (it is not necessary since updates are stored in the obkv format). This broke the dumps, which this PR solves by introducing a 3rd version of the dumps.

A v2 compat layer has been created that supports the import of v2 dumps into meilisearch. This made it possible to move the compat code that existed elsewhere in meilisearch into the v2 module. The asc/desc patching is now only done for forward compatibility when loading a v2 dump, and v3 writes the asc/desc to the dump with the new syntax.




Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-09-30 13:17:30 +00:00
ee372a7b30 implement new dump v2 2021-09-30 14:49:13 +02:00
66f39aaa92 fix dump v3 2021-09-30 14:49:13 +02:00
03af99650d fix dumpv1 2021-09-30 14:49:13 +02:00
44a2ff07b1 Merge #1697
1697: Make exec binary for M1 mac available for download r=irevoire a=k-nasa

## Why

fix: https://github.com/meilisearch/MeiliSearch/issues/1661

Currently, getting an executable file for M1 Macs is not supported when using `download-latest.sh`.


## What

Download the x86 binary when running `download-latest.sh` on an M1 Mac, because it can execute binaries targeting x86.

## Proof

I verified it like this.
I got an executable binary on an M1 Mac 💡

```sh
:) % arch
arm64

:) % ./download-latest.sh
Downloading MeiliSearch binary v0.21.1 for macos, architecture amd64...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   631  100   631    0     0   2035      0 --:--:-- --:--:-- --:--:--  2035
100 43.5M  100 43.5M    0     0  7826k      0  0:00:05  0:00:05 --:--:-- 9123k
MeiliSearch binary successfully downloaded as 'meilisearch' file.

Run it:
    $ ./meilisearch
Usage:
    $ ./meilisearch --help

:) % file ./meilisearch
./meilisearch: Mach-O 64-bit executable x86_64

:) % ./meilisearch --help # this is execuable
meilisearch-http 0.21.1
...
...
```

Co-authored-by: k-nasa <htilcs1115@gmail.com>
Co-authored-by: nasa <htilcs1115@gmail.com>
2021-09-30 11:33:48 +00:00
fb95540394 Merge #1748
1748: Add a link to join the cloud-hosted beta r=MarinPostma a=gmourier

The product team would like to add a link to communicate and invite users to fill out the form to test the closed beta of our cloud solution.

We have done the same thing on the documentation side https://github.com/meilisearch/documentation/pull/1148. 😇

Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>
2021-09-30 09:42:22 +00:00
be00fafb29 Add a link to join the cloud-hosted beta 2021-09-30 11:28:51 +02:00
0bc376a17b Merge #1738
1738: Update Dockerfile following the refactor r=MarinPostma a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-30 08:08:19 +00:00
05d5de47cb Merge #1737
1737: Update version for the next release (v0.23.0) r=irevoire a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-29 22:06:46 +00:00
d3eb604d66 Merge #1740
1740: Add Hacktoberfest section to CONTRIBUTING.md r=curquiza a=meili-bot

_This PR is auto-generated._

Add Hacktoberfest section to CONTRIBUTING.md


Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com>
2021-09-29 17:55:47 +00:00
d6db210ef3 Update CONTRIBUTING.md 2021-09-29 19:54:12 +02:00
80ca42922f Merge #1739
1739: Fix add document Content-Type r=curquiza a=MarinPostma

change the `Content-Type` guards of the document addition routes to match the specification.


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-09-29 17:13:45 +00:00
fe5df6d06f fix payload content type guards 2021-09-29 19:04:47 +02:00
535442179e Update Dockerfile following the refactor 2021-09-29 18:54:14 +02:00
b17dae9ac0 Update version for the next release (v0.23.0) 2021-09-29 18:40:35 +02:00
5fad37aebd Merge #1711
1711: MeiliSearch refactor introducing OBKV format r=MarinPostma a=MarinPostma

This PR refactors multiple components of meilisearch and introduces the obkv document format to meilisearch

- [x] Split meilisearch-http and meilisearch-lib
- [x] Replace `IndexActor` and `UuidResolver` with `IndexResolver`
- [x] Remove mentions to Actor
- [x] Remove Actor traits to simplify code
- [x] Integrate obkv document format
- [x] Remove `Data`
- [x] Restore all route
- [x] Replace `Box<dyn error>` with `anyhow::Error`
- [x] Introduce update file store
- [x] Update file store error handling
- [x] Fix dumps
- [x] Fix snapshots
- [x] Fix tests
- [x] Update module documentation
- [x] add csv suppport (feat `@ManyTheFish` #1729 )
- [x] add jsonl support
- [x] integrate geosearch (feat `@irevoire` #1725) 

partially implements #1691 and #1690. The error handling is very basic for now; I will finish it in the next PR.

Some unit tests have been disabled; I will re-enable them ASAP, but they need a bit more work.

close #1531 


P.S: sorry for this monstrous PR :'(

Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: many <maxime@meilisearch.com>
2021-09-29 14:38:55 +00:00
311933614e bump milli to v0.17.0 2021-09-29 15:44:54 +02:00
8fa6502b16 review changes 2021-09-29 14:17:41 +02:00
1f537e1b60 jsonl support 2021-09-29 11:28:02 +02:00
5bac65f8b8 add missing content type errors 2021-09-29 09:55:35 +02:00
911630000f split csv and json document routes 2021-09-29 00:12:25 +02:00
6e8a3fe8de move csv parsing to document_formats 2021-09-28 22:58:48 +02:00
2a14948123 Use an existing revision of milli 2021-09-28 22:30:34 +02:00
61e5eed493 Call csv specialized function 2021-09-28 22:29:26 +02:00
d30830a55c Add csv deserializer for documents 2021-09-28 22:28:13 +02:00
102c46f88b clippy + fmt 2021-09-28 22:22:59 +02:00
5fa9bc67d7 remove unused dependencies 2021-09-28 22:16:18 +02:00
3503fbf7fe re-export milli from meilisearch_lib 2021-09-28 22:08:03 +02:00
1cc733f801 fix get_info 2021-09-28 22:02:04 +02:00
7a27cbcc78 rename RegisterUpdate to store::Update 2021-09-28 20:20:13 +02:00
6f8e670dee move json reader to document_formats module 2021-09-28 20:13:26 +02:00
df4e9f4e1e restore dump v1 2021-09-28 19:49:25 +02:00
3747f5bdd8 replace unwraps with correct error 2021-09-28 19:29:14 +02:00
56766cffc3 remove module level doc 2021-09-28 18:58:56 +02:00
692c676625 fix tests 2021-09-28 18:57:36 +02:00
ddfd7def35 add a TODO while waiting for the tests to be fixed 2021-09-28 18:17:56 +02:00
bcaee4d179 fix uuid store size 2021-09-28 18:17:56 +02:00
539a57026d fix the sort error messages 2021-09-28 14:50:26 +02:00
654f49ccec [WIP] put milli on branch main 2021-09-28 14:50:26 +02:00
c1376a9f2a add the geosearch to Meilisearch 2021-09-28 14:50:26 +02:00
9ac999ca59 remove uuid resolver and index actor 2021-09-28 12:00:35 +02:00
6a1964f146 restore dumps 2021-09-28 11:59:55 +02:00
90018755c5 restore snapshots 2021-09-27 16:48:03 +02:00
95211e2665 Merge #1703
1703: Trigger CodeCoverage manually instead of on each PR r=irevoire a=curquiza

Since no one is using it on PRs now, we would rather get a state of the code coverage once in a while (triggered manually) than on each PR.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-27 13:55:05 +00:00
7cd94e5486 Merge #1724
1724: Redo CONTRIBUTING.md r=curquiza a=curquiza

- Update `Development` section
- Update the `Git Guidelines` section
- Remove `Benchmarking & Profiling` -> done on the milli side at the moment
- Remove `Humans` -> synchronization job done by the manager of the core team at the moment
- Remove `Changelog` section -> done by the manager and the docs team 
- Remove `Documentation` section -> job done by the manager to synchronize both teams.

Fixes #1723 at the same time

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-27 13:20:29 +00:00
35ef6a9204 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:42:56 +02:00
41272e7148 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:42:38 +02:00
8ff39d8432 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:42:28 +02:00
e22f57cae5 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:32:47 +02:00
67afb0e3fe Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:30:53 +02:00
f56f31c277 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:30:42 +02:00
b7c4754be2 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:30:37 +02:00
3118f32221 Update CONTRIBUTING.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-27 14:30:30 +02:00
c9cc504e7b Update CONTRIBUTING.md 2021-09-27 14:18:15 +02:00
b9d189bf12 restore document deletion routes 2021-09-24 15:21:07 +02:00
c32012c44a restore settings updates 2021-09-24 14:55:57 +02:00
dfce44fa3b rename data to meilisearch 2021-09-24 12:03:16 +02:00
42a6260b65 introduce index resolver 2021-09-24 11:53:11 +02:00
5353be74c3 refactor index actor 2021-09-22 15:07:04 +02:00
12542bf922 refactor update actor 2021-09-22 11:52:50 +02:00
def737edee refactor uuid resolver 2021-09-22 10:49:59 +02:00
60518449fc split meilisearch-http and meilisearch-lib 2021-09-21 13:23:22 +02:00
09d4e37044 split data and api keys 2021-09-20 15:31:03 +02:00
e14640e530 refactor meilisearch 2021-09-20 14:54:20 +02:00
7f734f0a18 get x86 binary 2021-09-20 20:57:47 +09:00
cd2f886234 Update download-latest.sh
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-19 23:46:36 +09:00
dda4fe10b3 Update download-latest.sh
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-19 23:46:31 +09:00
148a896cca Merge #1666
1666: Proposed fix for ARM binary on RHEL r=irevoire a=kappa-wingman

https://github.com/meilisearch/MeiliSearch/issues/1410

Co-authored-by: Kappa Wingman <64772920+kappa-wingman@users.noreply.github.com>
2021-09-15 14:16:25 +00:00
e8eb589d6e Merge #1659
1659: deps: unify pest dependency r=MarinPostma a=happysalada

meilisearch depends on two different versions of pest.
This can be problematic for some build systems (e.g. NixOS).
Since the pest repo hasn't received an update in a while, use the later of the two pest versions in the meantime.

Context: this has been discussed previously https://github.com/meilisearch/MeiliSearch/issues/1273
meilisearch has been selected by NGI to be packaged for NixOS. A patch can be applied to make the changes proposed in this PR. This PR intends to see how the maintainers of meilisearch would feel about the patch.

What was done:
- Add an override for the pest dependency in Cargo.toml.
- Recreate the Cargo.lock with `cargo update`. This had the side effect of updating some dependencies.

I ran the tests on darwin. My machine is quite old so I had 8 failures due to a timeout. None of the failures look like they are due to the new dependencies.

Checking the pest repo, it seems there are some recent commits; however, there is no clear date for when a new release might land.

If this gets accepted, there is no need to do a new release, nixos can just target the new commit.

If you feel it's too much pain for not enough gain, no worries at all!

Co-authored-by: happysalada <raphael@megzari.com>
2021-09-15 07:56:03 +00:00
770b6d25ae deps: unify pest dependency 2021-09-15 12:15:44 +09:00
be10d90f07 Trigger CodeCoverage manually instead of on each PR 2021-09-14 18:59:20 +02:00
fd3fa1ef45 Merge #1692
1692: Use tikv-jemallocator instead of jemallocator r=curquiza a=felixonmars

`jemallocator` has been abandoned for nearly two years, and `rustc`
itself moved to use `tikv-jemallocator` instead:
3965773ae7

Let's switch to a better maintained version.

Co-authored-by: Felix Yan <felixonmars@archlinux.org>
2021-09-14 16:26:27 +00:00
a57943b77e Use tikv-jemallocator instead of jemallocator
`jemallocator` has been abandoned for nearly two years, and `rustc`
itself moved to use `tikv-jemallocator` instead:
3965773ae7

Let's switch to a better maintained version.
2021-09-14 18:30:24 +03:00
6fafdb7711 Merge #1651 #1676 #1684
1651: Use reset_sortable_fields r=Kerollmops a=shekhirin

Resolves https://github.com/meilisearch/MeiliSearch/issues/1635

1676: Add curl binary to final stage image r=curquiza a=ook

Reference: #1673 

Changes: * add the `curl` binary to the final Meilisearch docker image.

For metrics: thanks to Docker's layer management, this addition actually shrinks the image from 319MB to 315MB:

```
☁  MeiliSearch [feature/1673-add-curl-to-docker-image]  docker image ls
REPOSITORY                                                         TAG                  IMAGE ID       CREATED         SIZE
getmeili/meilisearch                                               0.22.0_ook_1673      938e239ad989   2 hours ago     315MB
getmeili/meilisearch                                               latest               258fa3aa1230   6 days ago      319MB
```

1684: bump dependencies r=MarinPostma a=MarinPostma

Bump meilisearch dependencies.

We still depend on custom patch that have been upgraded along the way.

Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
Co-authored-by: Thomas Lecavelier <thomas@followanalytics.com>
Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-09-13 13:20:29 +00:00
0f7625e29a bump dependencies 2021-09-13 15:17:08 +02:00
7afccfc92d Merge #1683
1683: Better dependencies cache for CI r=curquiza a=shekhirin



Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
2021-09-13 12:48:35 +00:00
6e59da26d4 Merge #1700
1700: `stable` into `main` after v0.22.0 r=MarinPostma a=curquiza



Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-13 12:32:23 +00:00
928930ddd5 Merge #1699
1699: Bump milli: fix some crashes r=ManyTheFish a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-13 10:17:10 +00:00
6d2f7af642 Bump milli: fix some crashes 2021-09-13 12:14:54 +02:00
5b995ba080 feat: Make exec binary for M1 mac available for download 2021-09-11 23:11:33 +09:00
168a1315de Remove dataset 2021-09-10 10:35:10 +02:00
c101b2a5cb Merge #1686
1686: Bump milli r=curquiza a=irevoire

 fixes #1685, #1678, #1671, #1677 and #1680

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-08 16:31:02 +00:00
971c361e0f Merge #1682
1682: Change the format of custom ranking rules when importing old dumps r=curquiza a=Kerollmops

This PR changes the format of the custom ranking rules from `asc(price)` to `price:asc`, as the format changed between v0.21 and v0.22. The dumps now correctly import the custom ranking rules.

This PR also changes the previous default ranking rules (without sort) to the new default ranking rules (with the new sort).
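
As a hedged illustration (the `movies` index and the `price` field are assumptions), a custom rule that an old dump stored as `asc(price)` would be expressed as follows in the new syntax:

```bash
# Hypothetical settings call using the v0.22 ranking-rule syntax (`price:asc`),
# appended after the new default rules that include `sort`.
curl -X POST 'http://127.0.0.1:7700/indexes/movies/settings/ranking-rules' \
  -H 'Content-Type: application/json' \
  --data '["words", "typo", "proximity", "attribute", "sort", "exactness", "price:asc"]'
```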

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-09-08 16:20:10 +00:00
be50b2bec6 Change the format of custom ranking rules when importing v2 dumps 2021-09-08 17:56:21 +02:00
49c918defa bump milli 2021-09-08 17:41:47 +02:00
d595623162 Merge #1669
1669: Fix windows integration tests r=MarinPostma a=ManyTheFish

Set max_memory value to unlimited during tests:
because tests run several meilisearch instances in parallel,
we overestimate the value for max_memory, making the tests crash on Windows

Co-authored-by: many <maxime@meilisearch.com>
2021-09-08 12:44:50 +00:00
169e739634 Remove useless indexer options 2021-09-08 13:40:05 +02:00
08138c7c23 Use set indexer options instead of create a default one 2021-09-08 13:40:00 +02:00
bbb012ad0f chore(ci): use smarter dependencies cache 2021-09-07 19:56:38 +03:00
331d28102f Change the format of custom ranking rules when importing v1 dumps 2021-09-07 17:16:40 +02:00
efa69875d9 refactor(http): use reset_sortable_fields 2021-09-07 15:04:44 +03:00
59797bfc7b meilisearch/MeiliSearch#1673 Add curl binary to final stage image 2021-09-06 19:35:23 +02:00
c0f9c891f5 Set max_memory value to unlimited during tests
because tests run several meilisearch instances in parallel,
we overestimate the value for max_memory, making the tests crash on Windows
2021-09-06 14:38:10 +02:00
33514b28be Merge pull request #1588 from meilisearch/test-new-indexer
Integrate the new indexer
2021-09-06 10:21:42 +02:00
e3a913e03f Merge #1660
1660: Update version for the next release (v0.22.0) r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-02 16:43:32 +00:00
7e80337e5b Bump milli to v0.12.0 2021-09-02 18:19:12 +02:00
8d4723d91b Update lock file 2021-09-02 18:19:12 +02:00
4cdf680a81 Make the MaxMemory use the default value when undefined 2021-09-02 18:19:11 +02:00
63e67f72e3 Update tokenizer and new milli version 2021-09-02 18:19:00 +02:00
0cd66c3a89 Bump the milli version 2021-09-02 18:19:00 +02:00
b092a624ed Introduce the MaxMemory struct that defaults to 2/3 of the available memory 2021-09-02 18:18:59 +02:00
24e84d7ca1 Test new indexer 2021-09-02 18:11:20 +02:00
14f9056349 Merge #1662
1662: Fix link in download script r=irevoire a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-02 11:27:14 +00:00
b444e2e074 Proposed fix for ARM binary on RHEL
https://github.com/meilisearch/MeiliSearch/issues/1410
2021-09-02 01:01:46 +00:00
723cb4d520 Fix link in download script 2021-09-01 15:57:11 +02:00
90116155b4 Update version for the next release (v0.22.0) 2021-09-01 12:33:30 +02:00
0d01c0e935 Merge #1658
1658: Remove COMMIT_SHA and COMMIT_DATE build arg from the Docker CIs r=irevoire a=curquiza

Since `@irevoire` added the `.git` folder in the Dockerfile, there is no need to compute `COMMIT_SHA` and `COMMIT_DATE` in the CI anymore.
Can you confirm, `@irevoire`?

Also, update some CIs using `checkout@v1` to `checkout@v2`

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-31 15:15:24 +00:00
e002509bf2 Remove COMMIT_SHA and COMMIT_DATE build arg 2021-08-31 17:01:58 +02:00
19c5c74291 Merge #1652 #1654 #1657
1652: Remove dependabot r=MarinPostma a=curquiza

Fixes #1649 

Dependabot for vulnerability and security updates is still activated.

1654: Add Script for Windows r=MarinPostma a=singh08prashant

fixes #1570 

changes (a short sketch follows the list):

1. added a script for detecting Windows OS running Git Bash
2. appended `.exe` to `$release_file` for Windows as listed [here](https://github.com/meilisearch/MeiliSearch/releases/)
3. removed the global `$BINARY_NAME='meilisearch'` since Windows requires the `.exe` file
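
A minimal sketch of what the detection could look like in the download script (variable names and the exact matching are assumptions, not the merged code):

```bash
# Hypothetical sketch: Git Bash on Windows reports an OS name starting with MINGW/MSYS.
os_name="$(uname -s)"
case "$os_name" in
    MINGW*|MSYS*|CYGWIN*)
        binary_name='meilisearch.exe'   # Windows releases need the .exe suffix
        ;;
    *)
        binary_name='meilisearch'
        ;;
esac
echo "Detected OS: $os_name, will download: $binary_name"
```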

1657: Bring vergen hotfix from `stable` to `main` r=MarinPostma a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: singh08prashant <singh08prashant@gmail.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
2021-08-31 14:31:42 +00:00
b6fec60243 Merge #1656
1656: Remove unused Arc import r=MarinPostma a=Kerollmops

This PR removes a warning introduced by #1606: it removed Sentry, which was using an `Arc`, but forgot to remove the scope import, so we remove it here.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-08-31 14:04:16 +00:00
9d0fa8112b Remove unused Arc import 2021-08-31 14:50:36 +02:00
d30f5b1bef add script for git-bash 2021-08-31 08:34:21 +05:30
7691b0d721 Merge #1636
1636: Hotfix: Log but don't panic when vergen can't retrieve commit information r=curquiza a=Kerollmops

This pull request fixes an issue we discovered when we tried to publish meilisearch v0.21 on brew: brew uses the tarball downloaded directly from GitHub, which doesn't contain the `.git` folder.

We use the `.git` folder with [vergen](https://docs.rs/vergen) to retrieve the commit and datetime information. Unfortunately, we were unwrapping the vergen result and it was crashing when the git folder was missing.

We no longer panic when vergen can't find the `.git` folder and instead just log a potential error returned by [the git2 library](https://docs.rs/git2). We then check that the env variables are available at compile-time and replace them with "unknown" if not.

### When the `.git` folder is available

```
xh localhost:7700/version
HTTP/1.1 200 OK
Content-Type: application/json
Date: Thu, 26 Aug 2021 13:44:23 GMT
Transfer-Encoding: chunked

{
    "commitSha": "81a76eab69944de8a8d5006345b5aec7b02acf50",
    "commitDate": "2021-08-26T13:41:30+00:00",
    "pkgVersion": "0.21.0"
}
```

### When the `.git` folder is unavailable

```bash
cp -R meilisearch meilisearch-cpy
cd meilisearch-cpy
rm -rf .git
cargo clean
cargo run --release
   <snip>
   Compiling meilisearch-http v0.21.0 (/Users/clementrenault/Documents/meilisearch-cpy/meilisearch-http)
warning: vergen: could not find repository from '/Users/clementrenault/Documents/meilisearch-cpy/meilisearch-http'; class=Repository (6); code=NotFound (-3)
```

```
xh localhost:7700/version
HTTP/1.1 200 OK
Content-Type: application/json
Date: Thu, 26 Aug 2021 13:46:33 GMT
Transfer-Encoding: chunked

{
    "commitSha": "unknown",
    "commitDate": "unknown",
    "pkgVersion": "0.21.0"
}
```

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-08-30 16:25:12 +00:00
b8c954eb3f Bump the MeiliSearch version to v0.21.1 2021-08-30 17:41:25 +02:00
a8c146fd13 Unwrap or unknown the commit hash 2021-08-30 17:41:24 +02:00
70df41bc62 Remove dependabot 2021-08-30 16:51:50 +02:00
1782753387 Bump vergen and remove unused build feature 2021-08-30 15:03:45 +02:00
23ccf4429e Merge #1639
1639: Add new mini-dashboard gif r=curquiza a=CaroFG



Co-authored-by: CaroFG <48251481+CaroFG@users.noreply.github.com>
Co-authored-by: CaroFG <carolina.ferreira131@gmail.com>
2021-08-26 15:58:39 +00:00
bf4e799dba Update README.md
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-08-26 17:47:29 +02:00
cb695bdec3 Update README with new gif 2021-08-26 17:43:41 +02:00
be70eb881a Remove old gif 2021-08-26 17:42:56 +02:00
867c277088 Add files via upload 2021-08-26 16:40:44 +02:00
96f72f009a Merge #1615
1615: Integrate the query time sort feature r=Kerollmops a=Kerollmops

This pull request integrates the sort at query time feature that was implemented on the milli side https://github.com/meilisearch/milli/pull/320. It follows the specification file https://github.com/meilisearch/specifications/blob/develop/text/0055-sort.md.

A bunch of tests have been added to ensure that the search works correctly and that the settings are fine too!
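
As a quick, hedged illustration (the index and field names are assumptions), a query-time sort looks roughly like this once `release_date` is declared in `sortableAttributes`:

```bash
# Hypothetical search using the new `sort` parameter (POST variant of the route).
curl 'http://127.0.0.1:7700/indexes/movies/search' \
  -H 'Content-Type: application/json' \
  --data '{ "q": "botman", "sort": ["release_date:desc"] }'
```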

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-08-26 14:09:38 +00:00
cf4a466b6b Make sure that the order of the filterableAttributes is constant 2021-08-26 11:06:05 +02:00
087e4626ce Make sure that the order of the sortableAttributes is constant 2021-08-26 11:06:04 +02:00
64462c842b Test the search with sort time queries with POST and GET methods 2021-08-25 17:39:25 +02:00
e0f73fe742 Introduce the sort search parameter 2021-08-25 17:39:25 +02:00
ea4c831de0 Integrate the sortable-attributes into the settings 2021-08-25 17:39:25 +02:00
51387b2c80 Introduce the new invalid sortable error codes 2021-08-25 17:29:30 +02:00
2d8dd87cad Merge #1623
1623: Use Setting enum r=Kerollmops a=shekhirin

Resolves https://github.com/meilisearch/MeiliSearch/issues/1620

Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
2021-08-25 14:58:40 +00:00
d9dd2a038b refactor(http): use Setting enum 2021-08-25 17:43:46 +03:00
1227ce8091 Merge #1622
1622: Update README to welcome the contribution again r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-25 13:08:08 +00:00
cd63c80be8 Merge #1616
1616: Remove sentry r=Kerollmops a=irevoire

closes #1606 

Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-08-25 11:40:30 +00:00
e0a5eebe79 Update README to welcome the contribution again 2021-08-24 20:31:05 +02:00
850069af75 Merge #1610
1610: Fix Docker CI for `latest` tag r=irevoire a=curquiza

Fixes https://github.com/meilisearch/MeiliSearch/issues/1608

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-24 11:46:04 +00:00
672fcee8aa remove sentry 2021-08-24 12:38:31 +02:00
d9b023c11f Update publish-docker-latest.yml 2021-08-23 19:27:48 +02:00
6b228f56cb Merge #1607
1607: Merge changes in `stable` into `main` r=Kerollmops a=curquiza

Containing all the fixes since v0.21.0rc0

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: Charlotte Vermandel <charlottevermandel@gmail.com>
2021-08-23 16:27:46 +00:00
dd645e6da4 Merge #1605
1605: Fix panic when decoding r=curquiza a=curquiza

Update milli to fix the panic during document deletion

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-23 11:06:45 +00:00
149f46c184 Fix panic when decoding 2021-08-23 12:37:51 +02:00
96839c48c9 Direct users to milli for the core library in the README (#1520)
* Update README.md

* Update README.md

* Update README.md

* Update README.md

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>

* Update README.md

Co-authored-by: gui machiavelli <hey@guimachiavelli.com>

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: gui machiavelli <hey@guimachiavelli.com>
2021-08-19 16:24:12 +02:00
3e27d5e885 Merge #1596
1596: Update milli and tokenizer version: fix panic during indexation r=curquiza a=curquiza

Fixes #1590 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-18 13:44:30 +00:00
38fc876704 Update tokenizer and new milli version with new tags 2021-08-18 14:55:10 +02:00
39d5a99095 Update milli and tokenizer version 2021-08-18 12:09:34 +02:00
2beb306834 Merge #1577
1577: Update milli dependency: fix facet values bugs r=Kerollmops a=curquiza

Fixes #1576 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-16 16:13:42 +00:00
f3e595e2f0 Update milli dependency 2021-08-16 13:36:42 +02:00
5d80d11b23 Merge #1580
1580: Update telemetry link r=curquiza a=curquiza

Here is the page the user will have: https://dev.docs.meilisearch.com/learn/what_is_meilisearch/telemetry.html
instead of: https://docs.meilisearch.com/reference/features/configuration.html#disable-analytics

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-12 17:11:30 +00:00
621529e9dc Update telemetry link 2021-08-12 18:58:07 +02:00
535aff8f7e Merge #1578
1578: Update tokenizer version to v0.2.4 r=ManyTheFish a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-12 15:27:12 +00:00
7531280764 Update tokenizer version to v0.2.4 2021-08-12 13:55:47 +02:00
63daa8b15a Update README.md (#1568) 2021-08-09 16:38:52 +02:00
92913e1eb8 Add information about product repo (#1567)
* Add information about product repo

* Update README.md

Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>

Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com>
2021-08-09 14:56:43 +02:00
418be3daa8 Update issue templates (#1564) 2021-08-09 10:51:02 +02:00
7e3b2ddff2 Merge #1554
1554: Fix dump v1 (attributesForFaceting, and criteria) r=curquiza a=MarinPostma

close #1553


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-08-05 19:45:52 +00:00
312d93961a Merge #1556
1556: Update milli to v0.9.0 r=MarinPostma a=curquiza

Fixes #1552 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-05 14:04:55 +00:00
8f05d8d546 fix clippy warnings 2021-08-05 16:00:47 +02:00
f5ddea481a reintroduce exactness 2021-08-05 15:59:39 +02:00
29ca8271b3 test dumpv1 format regression 2021-08-05 15:59:39 +02:00
3084537d1e restore attributes for faceting in dump v1 2021-08-05 15:59:39 +02:00
86ac994543 Merge #1557
1557: Fix docs link anchor r=MarinPostma a=curquiza

thank you `@guimachiavelli` 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-05 13:34:48 +00:00
992b082c6f Fix docs link anchor 2021-08-05 13:28:32 +02:00
31fe263356 Update milli to v0.9.0 2021-08-05 13:08:27 +02:00
7a0b20c740 Merge #1532
1532: Start writing documentation for newcomers r=MarinPostma a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
2021-08-03 09:26:45 +00:00
9810f6b695 Merge #1540
1540: Update milli to version 0.8.1 r=curquiza a=curquiza

Integrates this fix into MeiliSearch https://github.com/meilisearch/milli/pull/296

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-07-29 17:15:52 +00:00
09c74c04a0 Merge #1539
1539: Use serdeval for validating json format. r=curquiza a=MarinPostma

uses [serdeval](https://github.com/MarinPostma/serdeval) to validate that the json payload is valid json, and in the correct format.

fix #1535


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-07-29 17:05:13 +00:00
b6cc932c09 Merge #1541
1541: Make clippy happy r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-07-29 16:53:53 +00:00
1b5d918cb9 Fix rustfmt 2021-07-29 18:32:09 +02:00
bf76d4a43c Make clippy happy 2021-07-29 18:14:36 +02:00
53b4b2fcbc Use serdeval for validating json format. 2021-07-29 18:02:54 +02:00
9a8629a6a9 Update milli 2021-07-29 17:45:31 +02:00
78308365ec fix typos 2021-07-29 14:40:41 +02:00
976075578f Merge #1537
1537: Import `.git` to the docker build image to fix vergen r=curquiza a=irevoire

I observed a small difference in the size of the build image, but I think we can allow it:
![image](https://user-images.githubusercontent.com/7032172/127369567-d03f9a41-3ad5-4933-888e-a3777df8c6cf.png)

I was not able to see any difference in build time, though.

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-07-29 12:31:55 +00:00
243233f652 import .git to docker to fix vergen 2021-07-28 19:12:40 +02:00
d66eea42bb Merge #1536
1536: Remove ARMv7 binary publish r=MarinPostma a=curquiza

Fixes #1315 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-07-28 15:39:34 +00:00
c55f73bbc3 Remove ARMv7 support 2021-07-28 17:29:40 +02:00
3e30d4270b Merge #1533
1533: Update milli version to v0.8.0 r=MarinPostma a=curquiza

- Update milli, heed and obkv
- fix relevancy issue and the `facetsDistribution` display

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-07-28 11:15:31 +00:00
80916baa21 Add FieldId in import 2021-07-28 12:25:13 +02:00
1df8f041bd Update meilisearch-http/src/index/search.rs
Co-authored-by: marin <postma.marin@protonmail.com>
2021-07-28 12:10:25 +02:00
6a6e2a8cd1 Update meilisearch-http/src/index/search.rs
Co-authored-by: marin <postma.marin@protonmail.com>
2021-07-28 12:08:51 +02:00
f9d337b320 Update meilisearch-http/src/index/search.rs
Co-authored-by: marin <postma.marin@protonmail.com>
2021-07-28 12:08:36 +02:00
feb069f604 Update meilisearch-http/src/index/search.rs
Co-authored-by: marin <postma.marin@protonmail.com>
2021-07-28 12:08:28 +02:00
7e0eed5772 Update meilisearch-http/src/index/search.rs
Co-authored-by: marin <postma.marin@protonmail.com>
2021-07-28 12:08:24 +02:00
9bdd040dd0 Update meilisearch-http/src/index/mod.rs
Co-authored-by: marin <postma.marin@protonmail.com>
2021-07-28 12:08:19 +02:00
e5dabf265a Update milli version to v0.8.0 2021-07-28 10:52:47 +02:00
1a1046a0ef start writing some documentation for newcomers 2021-07-27 16:35:42 +02:00
dd18319b44 Merge #1530
1530: Update mini-dashboard version to v0.1.4 r=irevoire a=mdubus



Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2021-07-27 10:11:02 +00:00
d3cd7e92d1 Update mini-dashboard version to v0.1.4 2021-07-27 11:44:20 +02:00
553e7d8aaa Merge #1528
1528: Update of the Date Time Format in commitDate  r=MarinPostma a=irevoire

Since we were relying on a [super old version of `vergen`](https://docs.rs/crate/vergen/3.0.1), we could not get the `commit timestamp`, so I updated `vergen` to the latest version.
This also allows us to remove all the features we don't use.

closes #1522

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-07-27 07:49:31 +00:00
f79b8287f5 update vergen 2021-07-26 15:25:30 +02:00
b4c98f6cc3 Merge #1521
1521: Sentry was never sending anything r=Kerollmops a=irevoire

@Kerollmops noticed that we had no log of this release in sentry, and it looks like I badly tested my code after ignoring the “No space left on device” errors.

Now it should be fixed.

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-07-21 14:46:56 +00:00
5d4a0ac844 sentry was never sending anything 2021-07-21 11:50:54 +02:00
0136b02e5b Merge #1498
1498: Show the filterable and not the faceted attributes in the settings r=Kerollmops a=Kerollmops

Fixes #1497

Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-07-13 07:27:14 +00:00
f49a01703a Show the filterable and not the faceted attributes in the settings 2021-07-09 16:11:37 +02:00
e4f82aa441 Merge #1494
1494: Add cache to the ci r=irevoire a=irevoire

closes #1446 

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-07-08 14:06:03 +00:00
751d1af2a6 Merge #1492
1492: auth tests r=irevoire a=MarinPostma

add regression tests on route authentication.


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-07-08 13:13:55 +00:00
076d8fbb84 add cache to the ci 2021-07-08 11:19:12 +02:00
4b2d01a453 Merge #1484
1484: Add MeiliSearch version to issue template r=irevoire a=bidoubiwa

It is relevant to know the version of MeiliSearch before any other additional information that might be important.

We could also reduce the amount of required information asked of the user. I would like to suggest the following:

Instead of the `Desktop` and `Smartphone` sections, I would just improve the last section:

```
**Additional context**
Additional information that may be relevant to the issue.
[e.g. architecture, device, OS, browser]
```

By applying this, the template's final look will be the following:

-----

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**MeiliSearch version:** [e.g. v0.20.0]

**Additional context**
Additional information that may be relevant to the issue.
[e.g. architecture, device, OS, browser]

Co-authored-by: Charlotte Vermandel <charlottevermandel@gmail.com>
2021-07-08 08:49:35 +00:00
a71fa25ebe auth tests 2021-07-07 17:47:48 +02:00
b4db54cb1f Merge #1488
1488: fix search permissions r=MarinPostma a=MarinPostma

fix #1485


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-07-07 13:16:07 +00:00
b2ca600e79 Remove unnecessary questions 2021-07-07 11:15:08 +02:00
83725a1330 fix search permissions 2021-07-07 10:39:04 +02:00
587b837a6c Add MeiliSearch version to issue template 2021-07-06 22:04:15 +02:00
2844fe959f Merge #1483
1483: search tests r=MarinPostma a=MarinPostma

adds search tests from #1440.

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-07-06 15:13:07 +00:00
41e271974a add tests 2021-07-06 16:21:15 +02:00
520d37983c implement index search methods 2021-07-06 11:54:09 +02:00
487d82773a Merge #1481
1481: fix bug in index deletion r=Kerollmops a=MarinPostma

this bug was caused by a heed iterator entry being deleted while still holding a reference to it.


close #1333


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-07-06 08:07:30 +00:00
066085f6f5 fix index deletion bug 2021-07-05 18:42:13 +02:00
0d1f5b7193 Merge #1469
1469: Return 201 on index creation r=Kerollmops a=MarinPostma

fix #1467
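
A hedged way to observe the change (the `movies` uid is an assumption):

```bash
# Creating an index is now expected to answer 201 Created instead of 200 OK.
curl -i -X POST 'http://127.0.0.1:7700/indexes' \
  -H 'Content-Type: application/json' \
  --data '{ "uid": "movies" }'
```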


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-07-05 15:42:13 +00:00
2f3a439566 fix tests 2021-07-05 16:31:52 +02:00
9681ffca52 change index create http code 2021-07-05 16:31:51 +02:00
fddc60f893 Merge #1471
1471: Bump milli to 0.7.2 r=irevoire a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
2021-07-05 13:29:38 +00:00
0f024cc225 Merge #1478
1478: refactor routes r=irevoire a=MarinPostma

refactor the route directory, so the module tree follows the route structure


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-07-05 12:55:39 +00:00
575ec2a06f refactor routes 2021-07-05 14:33:48 +02:00
83aef0a27d Merge #1473
1473: Update loop r=MarinPostma a=irevoire



Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-07-05 12:32:29 +00:00
bc85d30076 add test 2021-07-05 12:33:28 +02:00
bc417726fc fix update loop bug 2021-07-05 12:33:22 +02:00
9949a2a930 bump milli to 0.7.2 2021-07-05 12:19:27 +02:00
185 changed files with 20276 additions and 28940 deletions

View File

@ -1,5 +1,4 @@
target
Dockerfile
.dockerignore
.git
.gitignore

View File

@ -23,16 +23,8 @@ A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Desktop (please complete the following information):**
- OS: [e.g. iOS]
- Browser [e.g. chrome, safari]
- Version [e.g. 22]
**Smartphone (please complete the following information):**
- Device: [e.g. iPhone6]
- OS: [e.g. iOS8.1]
- Browser [e.g. stock browser, safari]
- Version [e.g. 22]
**Meilisearch version:** [e.g. v0.20.0]
**Additional context**
Add any other context about the problem here.
Additional information that may be relevant to the issue.
[e.g. architecture, device, OS, browser]

10
.github/ISSUE_TEMPLATE/config.yml vendored Normal file
View File

@ -0,0 +1,10 @@
contact_links:
- name: Feature request & feedback
url: https://github.com/meilisearch/product/discussions/categories/feedback-feature-proposal
about: The feature requests and feedback regarding the already existing features are not managed in this repository. Please open a discussion in our dedicated product repository
- name: Documentation issue
url: https://github.com/meilisearch/documentation/issues/new
about: For documentation issues, open an issue or a PR in the documentation repository
- name: Support questions & other
url: https://github.com/meilisearch/meilisearch/discussions/new
about: For any other question, open a discussion in this repository

View File

@ -1,20 +0,0 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
Add any other context or screenshots about the feature request here.

View File

@ -1,40 +0,0 @@
---
name: Tracking issue
about: Template for a tracking issue
title: ''
labels: tracking-issue
assignees: ''
---
# Summary
One paragraph to explain the feature.
# Motivations
Why are we doing this? What use cases does it support? What is the expected outcome?
# Explanation
Explain the proposal like it was the final documentation of this proposal.
- What is changing for end-users.
- How it works.
- What is breaking?
- Examples.
# Implementation
Explain the technical specificities that will need to be known or done in order to implement this proposal.
## Steps
Describe each step to create the feature with it's associated issue/PR.
# Related
- [ ] Validated by the team (@people needed)
- [ ] Test added
- [ ] [Documentation](https://github.com/meilisearch/documentation/issues/#xxx) //Change xxx or remove the line
- [ ] [SDK/Integrations](https://github.com/meilisearch/integration-guides/issues/#xxx) //Change xxx or remove the line

View File

@ -1,6 +0,0 @@
version: 2
updates:
- package-ecosystem: "cargo"
directory: "/"
schedule:
interval: "monthly"

View File

@ -74,7 +74,7 @@ semverLT() {
# Returns the tag of the latest stable release (in terms of semver and not of release date)
get_latest() {
temp_file='temp_file' # temp_file needed because the grep would start before the download is over
curl -s 'https://api.github.com/repos/meilisearch/MeiliSearch/releases' > "$temp_file"
curl -s 'https://api.github.com/repos/meilisearch/meilisearch/releases' > "$temp_file"
releases=$(cat "$temp_file" | \
grep -E "tag_name|draft|prerelease" \
| tr -d ',"' | cut -d ':' -f2 | tr -d ' ')

View File

@ -1,13 +0,0 @@
name-template: 'v$RESOLVED_VERSION'
tag-template: 'v$RESOLVED_VERSION'
version-template: '0.21.0-alpha.$PATCH'
exclude-labels:
- 'skip-changelog'
template: |
## Changes
$CHANGES
no-changes-template: 'Changes are coming soon 😎'
sort-direction: 'ascending'
version-resolver:
default: patch

View File

@ -1,4 +1,4 @@
# GitHub Actions Workflow for MeiliSearch
# GitHub Actions Workflow for Meilisearch
> **Note:**

View File

@ -1,7 +1,6 @@
---
on:
pull_request:
types: [review_requested, ready_for_review]
workflow_dispatch:
name: Execute code coverage

View File

@ -6,9 +6,10 @@ name: Publish binaries to release
jobs:
publish:
name: Publish for ${{ matrix.os }}
name: Publish binary for ${{ matrix.os }}
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-18.04, macos-latest, windows-latest]
include:
@ -26,7 +27,7 @@ jobs:
- uses: hecrj/setup-rust-action@master
with:
rust-version: stable
- uses: actions/checkout@v1
- uses: actions/checkout@v2
- name: Build
run: cargo build --release --locked
- name: Upload binaries to release
@ -37,50 +38,69 @@ jobs:
asset_name: ${{ matrix.asset_name }}
tag: ${{ github.ref }}
publish-armv7:
name: Publish for ARMv7
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v1.0.0
- uses: uraimo/run-on-arch-action@v1.0.7
id: runcmd
with:
architecture: armv7
distribution: ubuntu18.04
run: |
apt update
apt install -y curl gcc make
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal --default-toolchain stable
source $HOME/.cargo/env
cargo build --release --locked
- name: Upload the binary to release
uses: svenstaro/upload-release-action@v1-release
with:
repo_token: ${{ secrets.PUBLISH_TOKEN }}
file: target/release/meilisearch
asset_name: meilisearch-linux-armv7
tag: ${{ github.ref }}
publish-aarch64:
name: Publish binary for aarch64
runs-on: ${{ matrix.os }}
continue-on-error: false
strategy:
fail-fast: false
matrix:
include:
- build: aarch64
os: ubuntu-18.04
target: aarch64-unknown-linux-gnu
linker: gcc-aarch64-linux-gnu
use-cross: true
asset_name: meilisearch-linux-aarch64
publish-armv8:
name: Publish for ARMv8
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v1.0.0
- uses: uraimo/run-on-arch-action@v1.0.7
id: runcmd
- name: Checkout repository
uses: actions/checkout@v2
- name: Installing Rust toolchain
uses: actions-rs/toolchain@v1
with:
architecture: aarch64 # aka ARMv8
distribution: ubuntu18.04
run: |
apt update
apt install -y curl gcc make
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal --default-toolchain stable
source $HOME/.cargo/env
cargo build --release --locked
toolchain: stable
profile: minimal
target: ${{ matrix.target }}
override: true
- name: APT update
run: |
sudo apt update
- name: Install target specific tools
if: matrix.use-cross
run: |
sudo apt-get install -y ${{ matrix.linker }}
- name: Configure target aarch64 GNU
if: matrix.target == 'aarch64-unknown-linux-gnu'
## Environment variable is not passed using env:
## LD gold won't work with MUSL
# env:
# JEMALLOC_SYS_WITH_LG_PAGE: 16
# RUSTFLAGS: '-Clink-arg=-fuse-ld=gold'
run: |
echo '[target.aarch64-unknown-linux-gnu]' >> ~/.cargo/config
echo 'linker = "aarch64-linux-gnu-gcc"' >> ~/.cargo/config
echo 'JEMALLOC_SYS_WITH_LG_PAGE=16' >> $GITHUB_ENV
echo RUSTFLAGS="-Clink-arg=-fuse-ld=gold" >> $GITHUB_ENV
- name: Cargo build
uses: actions-rs/cargo@v1
with:
command: build
use-cross: ${{ matrix.use-cross }}
args: --release --target ${{ matrix.target }}
- name: List target output files
run: ls -lR ./target
- name: Upload the binary to release
uses: svenstaro/upload-release-action@v1-release
with:
repo_token: ${{ secrets.PUBLISH_TOKEN }}
file: target/release/meilisearch
asset_name: meilisearch-linux-armv8
file: target/${{ matrix.target }}/release/meilisearch
asset_name: ${{ matrix.asset_name }}
tag: ${{ github.ref }}

View File

@ -14,7 +14,7 @@ jobs:
rust-version: stable
- name: Install cargo-deb
run: cargo install cargo-deb
- uses: actions/checkout@v1
- uses: actions/checkout@v2
- name: Build deb package
run: cargo deb -p meilisearch-http -o target/debian/meilisearch.deb
- name: Upload debian pkg to release

View File

@ -6,22 +6,25 @@ on:
name: Publish latest image to Docker Hub
jobs:
build:
runs-on: ubuntu-18.04
docker-latest:
runs-on: docker
steps:
- uses: actions/checkout@v2
- name: Check if current release is latest
run: echo "##[set-output name=is_latest;]$(sh .github/is-latest-release.sh)"
id: release
- name: Set COMMIT_DATE env variable
run: |
echo "COMMIT_DATE=$( git log --pretty=format:'%ad' -n1 --date=short )" >> $GITHUB_ENV
- name: Publish to Registry
if: steps.release.outputs.is_latest == 'true'
uses: elgohr/Publish-Docker-Github-Action@master
- name: Set up QEMU
uses: docker/setup-qemu-action@v1
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
- name: Login to DockerHub
uses: docker/login-action@v1
with:
name: getmeili/meilisearch
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
tag_names: true
buildargs: COMMIT_SHA,COMMIT_DATE
- name: Build and push
id: docker_build
uses: docker/build-push-action@v2
with:
push: true
platforms: linux/amd64,linux/arm64
tags: getmeili/meilisearch:latest

View File

@ -7,20 +7,33 @@ on:
name: Publish tagged image to Docker Hub
jobs:
build:
runs-on: ubuntu-18.04
docker-tag:
runs-on: docker
steps:
- uses: actions/checkout@v1
- name: Set COMMIT_DATE env variable
run: |
echo "COMMIT_DATE=$( git log --pretty=format:'%ad' -n1 --date=short )" >> $GITHUB_ENV
- name: Publish to Registry
uses: elgohr/Publish-Docker-Github-Action@master
env:
COMMIT_SHA: ${{ github.sha }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v1
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
- name: Login to DockerHub
uses: docker/login-action@v1
with:
name: getmeili/meilisearch
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
tag_names: true
buildargs: COMMIT_SHA,COMMIT_DATE
- name: Docker meta
id: meta
uses: docker/metadata-action@v3
with:
images: getmeili/meilisearch
flavor: latest=false
tags: type=ref,event=tag
- name: Build and push
id: docker_build
uses: docker/build-push-action@v2
with:
push: true
platforms: linux/amd64,linux/arm64
tags: ${{ steps.meta.outputs.tags }}

View File

@ -1,16 +0,0 @@
name: Release Drafter
on:
push:
branches:
- main
jobs:
update_release_draft:
runs-on: ubuntu-latest
steps:
- uses: release-drafter/release-drafter@v5
with:
config-name: release-draft-template.yml
env:
GITHUB_TOKEN: ${{ secrets.RELEASE_DRAFTER_TOKEN }}

View File

@ -11,6 +11,7 @@ on:
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
jobs:
tests:
@ -22,6 +23,8 @@ jobs:
os: [ubuntu-18.04, macos-latest, windows-latest]
steps:
- uses: actions/checkout@v2
- name: Cache dependencies
uses: Swatinem/rust-cache@v1.3.0
- name: Run cargo check without any default features
uses: actions-rs/cargo@v1
with:
@ -33,6 +36,25 @@ jobs:
command: test
args: --locked --release
# We run tests in debug also, to make sure that the debug_assertions are hit
test-debug:
name: Run tests in debug
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v2
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
- name: Cache dependencies
uses: Swatinem/rust-cache@v1.3.0
- name: Run tests in debug
uses: actions-rs/cargo@v1
with:
command: test
args: --locked
clippy:
name: Run Clippy
runs-on: ubuntu-18.04
@ -44,6 +66,8 @@ jobs:
toolchain: stable
override: true
components: clippy
- name: Cache dependencies
uses: Swatinem/rust-cache@v1.3.0
- name: Run cargo clippy
uses: actions-rs/cargo@v1
with:
@ -61,5 +85,7 @@ jobs:
toolchain: nightly
override: true
components: rustfmt
- name: Cache dependencies
uses: Swatinem/rust-cache@v1.3.0
- name: Run cargo fmt
run: cargo fmt --all -- --check

View File

@ -1,112 +1,83 @@
# Contributing
First, thank you for contributing to MeiliSearch! The goal of this document is to
provide everything you need to start contributing to MeiliSearch. The
following TOC is sorted progressively, starting with the basics and
expanding into more specifics.
First, thank you for contributing to Meilisearch! The goal of this document is to provide everything you need to start contributing to Meilisearch.
<!-- MarkdownTOC autolink="true" style="ordered" indent=" " -->
Remember that there are many ways to contribute other than writing code: writing [tutorials or blog posts](https://github.com/meilisearch/awesome-meilisearch), improving [the documentation](https://github.com/meilisearch/documentation), submitting [bug reports](https://github.com/meilisearch/meilisearch/issues/new?assignees=&labels=&template=bug_report.md&title=) and [feature requests](https://github.com/meilisearch/product/discussions/categories/feedback-feature-proposal)...
1. [Assumptions](#assumptions)
1. [Your First Contribution](#your-first-contribution)
1. [Change Control](#change-control)
1. [Git Branches](#git-branches)
1. [Git Commits](#git-commits)
1. [Style](#style)
1. [Github Pull Requests](#github-pull-requests)
1. [Reviews & Approvals](#reviews--approvals)
1. [Merge Style](#merge-style)
1. [CI](#ci)
1. [Development](#development)
1. [Setup](#setup)
1. [Testing](#testing)
1. [Benchmarking](#benchmarking--profiling)
1. [Humans](#humans)
1. [Documentation](#documentation)
1. [Changelog](#changelog)
<!-- /MarkdownTOC -->
## Table of Contents
- [Assumptions](#assumptions)
- [How to Contribute](#how-to-contribute)
- [Development Workflow](#development-workflow)
- [Git Guidelines](#git-guidelines)
## Assumptions
1. **You're familiar with [Github](https://github.com) and the [pull request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)
workflow.**
2. **You've read the MeiliSearch [docs](https://docs.meilisearch.com).**
3. **You know about the [MeiliSearch community](https://docs.meilisearch.com/learn/what_is_meilisearch/contact.html).
1. **You're familiar with [Github](https://github.com) and the [Pull Requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)(PR) workflow.**
2. **You've read the Meilisearch [documentation](https://docs.meilisearch.com).**
3. **You know about the [Meilisearch community](https://docs.meilisearch.com/learn/what_is_meilisearch/contact.html).
Please use this for help.**
## Your First Contribution
## How to Contribute
1. Ensure your change has an issue! Find an
[existing issue](https://github.com/meilisearch/meilisearch/issues/) or [open a new issue](https://github.com/meilisearch/meilisearch/issues/new).
* This is where you can get a feel if the change will be accepted or not.
2. Once approved, [fork the MeiliSearch repository](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) in your own
Github account.
2. Once approved, [fork the Meilisearch repository](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) in your own Github account.
3. [Create a new Git branch](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-and-deleting-branches-within-your-repository)
4. Review the MeiliSearch [workflow](#workflow) and [development](#development).
5. Make your changes.
6. [Submit the branch as a pull request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request-from-a-fork) to the main MeiliSearch
repo. A MeiliSearch team member should comment and/or review your pull request
with a few days. Although, depending on the circumstances, it may take
longer.
4. Review the [Development Workflow](#development-workflow) section that describes the steps to maintain the repository.
5. Make your changes on your branch.
6. [Submit the branch as a Pull Request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request-from-a-fork) pointing to the `main` branch of the Meilisearch repository. A maintainer should comment and/or review your Pull Request within a few days. Although depending on the circumstances, it may take longer.
## Change Control
## Development Workflow
### Setup and run Meilisearch
```bash
cargo run --release
```
We recommend using the `--release` flag to test the full performance of Meilisearch.
### Test
```bash
cargo test
```
If you get a "Too many open files" error you might want to increase the open file limit using this command:
```bash
ulimit -Sn 3000
```
## Git Guidelines
### Git Branches
_All_ changes must be made in a branch and submitted as [pull requests](#pull-requests).
MeiliSearch does not adopt any type of branch naming style, but please use something
descriptive of your changes.
All changes must be made in a branch and submitted as PR.
We do not enforce any branch naming style, but please use something descriptive of your changes.
### Git Commits
#### Style
As minimal requirements, your commit message should:
- be capitalized
- not finish by a dot or any other punctuation character (!,?)
- start with a verb so that we can read your commit message this way: "This commit will ...", where "..." is the commit message.
e.g.: "Fix the home page button" or "Add more tests for create_index method"
Please ensure your commits are small and focused; they should tell a story of
your change. This helps reviewers to follow your changes, especially for more
complex changes.
Familiarise yourself with [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/).
We don't follow any other convention, but if you want to use one, we recommend [the Chris Beams one](https://chris.beams.io/posts/git-commit/).
### Github Pull Requests
Once your changes are ready you must submit your branch as a pull request.
Some notes on GitHub PRs:
#### Reviews & Approvals
- All PRs must be reviewed and approved by at least one maintainer.
- The PR title should be accurate and descriptive of the changes.
- [Convert your PR as a draft](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/changing-the-stage-of-a-pull-request) if your changes are a work in progress: no one will review it until you pass your PR as ready for review.<br>
The draft PRs are recommended when you want to show that you are working on something and make your work visible.
- The branch related to the PR must be **up-to-date with `main`** before merging. Fortunately, this project uses [Bors](https://github.com/bors-ng/bors-ng) to automatically enforce this requirement without the PR author having to rebase manually.
All pull requests must be reviewed and approved by at least one MeiliSearch team
member.
<hr>
#### Merge Style
All pull requests are squashed and merged. We generally discourage large pull
requests that are over 300-500 lines of diff. If you would like to propose
a change that is larger we suggest coming onto our chat channel and
discuss it with one of our engineers. This way we can talk through the
solution and discuss if a change that large is even needed! This overall
will produce a quicker response to the change and likely produce code that
aligns better with our process.
## Development
### Setup
See the [MeiliSearch Docs](https://docs.meilisearch.com/reference/features/installation.html) for how to set up a development environment.
### Benchmarking & Profiling
We do not yet do any benchmarking, nor have we formalised our profiling. If you'd like to work on this please get in touch!
## Humans
After making your change, you'll want to prepare it for MeiliSearch users (mostly humans). This usually entails updating documentation and announcing your feature.
### Documentation
Documentation is very important to MeiliSearch. All contributions that
alter user-facing behavior MUST include documentation changes. Please see
[GitHub.com/meilisearch/documentation](https://github.com/meilisearch/documentation) for more info.
### Changelog
Until we have guidelines in place, updating the [`Changelog`](/CHANGELOG.md) is solely the responsibility of MeiliSearch team members.
Thank you again for reading this through, we can not wait to begin to work with you if you made your way through this contributing guide ❤️

2919
Cargo.lock generated

File diff suppressed because it is too large

View File

@ -1,8 +1,8 @@
[workspace]
resolver = "2"
members = [
"meilisearch-http",
"meilisearch-error",
"meilisearch-lib",
"meilisearch-auth",
]
[profile.release]
debug = true

7
Cross.toml Normal file
View File

@ -0,0 +1,7 @@
[build.env]
passthrough = [
"RUST_BACKTRACE",
"CARGO_TERM_COLOR",
"RUSTFLAGS",
"JEMALLOC_SYS_WITH_LG_PAGE"
]

View File

@ -1,45 +1,46 @@
# Compile
FROM alpine:3.14 AS compiler
FROM rust:alpine3.14 AS compiler
RUN apk update --quiet
RUN apk add curl
RUN apk add build-base
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
RUN apk add -q --update-cache --no-cache build-base openssl-dev
WORKDIR /meilisearch
COPY Cargo.lock .
COPY Cargo.toml .
COPY meilisearch-error/Cargo.toml meilisearch-error/
COPY meilisearch-http/Cargo.toml meilisearch-http/
ENV RUSTFLAGS="-C target-feature=-crt-static"
# Create dummy main.rs files for each workspace member to be able to compile all the dependencies
RUN find . -type d -name "meilisearch-*" | xargs -I{} sh -c 'mkdir {}/src; echo "fn main() { }" > {}/src/main.rs;'
# Use `cargo build` instead of `cargo vendor` because we need to not only download but compile dependencies too
RUN $HOME/.cargo/bin/cargo build --release
# Cleanup dummy main.rs files
RUN find . -path "*/src/main.rs" -delete
ARG COMMIT_SHA
ARG COMMIT_DATE
ENV COMMIT_SHA=${COMMIT_SHA} COMMIT_DATE=${COMMIT_DATE}
ENV RUSTFLAGS="-C target-feature=-crt-static"
COPY . .
RUN $HOME/.cargo/bin/cargo build --release
RUN set -eux; \
apkArch="$(apk --print-arch)"; \
if [ "$apkArch" = "aarch64" ]; then \
export JEMALLOC_SYS_WITH_LG_PAGE=16; \
fi && \
cargo build --release
# Run
FROM alpine:3.14
RUN apk add -q --no-cache libgcc tini
COPY --from=compiler /meilisearch/target/release/meilisearch .
ENV MEILI_HTTP_ADDR 0.0.0.0:7700
ENV MEILI_SERVER_PROVIDER docker
RUN apk update --quiet \
&& apk add -q --no-cache libgcc tini curl
# add meilisearch to the `/bin` so you can run it from anywhere and it's easy
# to find.
COPY --from=compiler /meilisearch/target/release/meilisearch /bin/meilisearch
# To stay compatible with the older version of the container (pre v0.27.0) we're
# going to symlink the meilisearch binary in the path to `/meilisearch`
RUN ln -s /bin/meilisearch /meilisearch
# This directory should hold all the data related to meilisearch so we're going
# to move our PWD in there.
# We don't want to put the meilisearch binary
WORKDIR /meili_data
EXPOSE 7700/tcp
ENTRYPOINT ["tini", "--"]
CMD ./meilisearch
CMD /bin/meilisearch

View File

@ -1,6 +1,6 @@
MIT License
Copyright (c) 2019-2021 Meili SAS
Copyright (c) 2019-2022 Meili SAS
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

View File

@ -1,8 +1,8 @@
<p align="center">
<img src="assets/logo.svg" alt="MeiliSearch" width="200" height="200" />
<img src="assets/logo.svg" alt="Meilisearch" width="200" height="200" />
</p>
<h1 align="center">MeiliSearch</h1>
<h1 align="center">Meilisearch</h1>
<h4 align="center">
<a href="https://www.meilisearch.com">Website</a> |
@ -15,21 +15,21 @@
</h4>
<p align="center">
<a href="https://github.com/meilisearch/MeiliSearch/actions"><img src="https://github.com/meilisearch/MeiliSearch/workflows/Cargo%20test/badge.svg" alt="Build Status"></a>
<a href="https://deps.rs/repo/github/meilisearch/MeiliSearch"><img src="https://deps.rs/repo/github/meilisearch/MeiliSearch/status.svg" alt="Dependency status"></a>
<a href="https://github.com/meilisearch/MeiliSearch/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-informational" alt="License"></a>
<a href="https://slack.meilisearch.com"><img src="https://img.shields.io/badge/slack-MeiliSearch-blue.svg?logo=slack" alt="Slack"></a>
<a href="https://github.com/meilisearch/MeiliSearch/discussions" alt="Discussions"><img src="https://img.shields.io/badge/github-discussions-red" /></a>
<a href="https://github.com/meilisearch/meilisearch/actions"><img src="https://github.com/meilisearch/meilisearch/workflows/Cargo%20test/badge.svg" alt="Build Status"></a>
<a href="https://deps.rs/repo/github/meilisearch/meilisearch"><img src="https://deps.rs/repo/github/meilisearch/meilisearch/status.svg" alt="Dependency status"></a>
<a href="https://github.com/meilisearch/meilisearch/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-informational" alt="License"></a>
<a href="https://slack.meilisearch.com"><img src="https://img.shields.io/badge/slack-meilisearch-blue.svg?logo=slack" alt="Slack"></a>
<a href="https://github.com/meilisearch/meilisearch/discussions" alt="Discussions"><img src="https://img.shields.io/badge/github-discussions-red" /></a>
<a href="https://app.bors.tech/repositories/26457"><img src="https://bors.tech/images/badge_small.svg" alt="Bors enabled"></a>
</p>
<p align="center">⚡ Lightning Fast, Ultra Relevant, and Typo-Tolerant Search Engine 🔍</p>
**MeiliSearch** is a powerful, fast, open-source, easy to use and deploy search engine. Both searching and indexing are highly customizable. Features such as typo-tolerance, filters, and synonyms are provided out-of-the-box.
**Meilisearch** is a powerful, fast, open-source, easy to use and deploy search engine. Both searching and indexing are highly customizable. Features such as typo-tolerance, filters, and synonyms are provided out-of-the-box.
For more information about features go to [our documentation](https://docs.meilisearch.com/).
<p align="center">
<img src="assets/trumen_quick_loop.gif" alt="Web interface gif" />
<img src="assets/trumen-fast.gif" alt="Web interface gif" />
</p>
## ✨ Features
@ -61,9 +61,13 @@ meilisearch
docker run -p 7700:7700 -v "$(pwd)/data.ms:/data.ms" getmeili/meilisearch
```
#### Try MeiliSearch in our Sandbox
#### Announcing a cloud-hosted Meilisearch
Create a MeiliSearch instance in [MeiliSearch Sandbox](https://sandbox.meilisearch.com/). This instance is free, and will be active for 48 hours.
Join the closed beta by filling out this [form](https://meilisearch.typeform.com/to/FtnzvZfh).
#### Try Meilisearch in our Sandbox
Create a Meilisearch instance in [Meilisearch Sandbox](https://sandbox.meilisearch.com/). This instance is free, and will be active for 48 hours.
#### Run on Digital Ocean
@ -95,8 +99,8 @@ curl -L https://install.meilisearch.com | sh
If you have the latest stable Rust toolchain installed on your local system, clone the repository and make it your working directory.
```bash
git clone https://github.com/meilisearch/MeiliSearch.git
cd MeiliSearch
git clone https://github.com/meilisearch/meilisearch.git
cd meilisearch
cargo run --release
```
@ -108,7 +112,7 @@ Let's create an index! If you need a sample dataset, use [this movie database](h
curl -L 'https://bit.ly/2PAcw9l' -o movies.json
```
Now, you're ready to index some data.
Now, you're ready to index some data.
```bash
curl -i -X POST 'http://127.0.0.1:7700/indexes/movies/documents' \
@ -157,36 +161,45 @@ curl 'http://127.0.0.1:7700/indexes/movies/search?q=botman+robin&limit=2' | jq
#### Use the Web Interface
We also deliver an **out-of-the-box [web interface](https://github.com/meilisearch/mini-dashboard)** in which you can test MeiliSearch interactively.
We also deliver an **out-of-the-box [web interface](https://github.com/meilisearch/mini-dashboard)** in which you can test Meilisearch interactively.
You can access the web interface in your web browser at the root of the server. The default URL is [http://127.0.0.1:7700](http://127.0.0.1:7700). All you need to do is open your web browser and enter MeiliSearch's address to visit it. This will lead you to a web page with a search bar that will allow you to search in the selected index.
You can access the web interface in your web browser at the root of the server. The default URL is [http://127.0.0.1:7700](http://127.0.0.1:7700). All you need to do is open your web browser and enter Meilisearch's address to visit it. This will lead you to a web page with a search bar that will allow you to search in the selected index.
| [See the gif above](#demo)
## Documentation
Now that your MeiliSearch server is up and running, you can learn more about how to tune your search engine in [the documentation](https://docs.meilisearch.com).
Now that your Meilisearch server is up and running, you can learn more about how to tune your search engine in [the documentation](https://docs.meilisearch.com).
## Contributing
Hey! We're glad you're thinking about contributing to MeiliSearch! However, we are currently working on a huge refactor and accepting PRs on this repository wouldn't be productive. We are sorry about this! Be sure we are doing our best so that you can contribute to MeiliSearch again as soon as possible ❤️
Hey! We're glad you're thinking about contributing to Meilisearch! Feel free to pick an [issue labeled as `good first issue`](https://github.com/meilisearch/meilisearch/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22), and to ask any question you need. Some points might not be clear and we are available to help you!
Also, we recommend following the [CONTRIBUTING](./CONTRIBUTING.md) to create your PR.
## Core engine and tokenizer
The code in this repository is only concerned with managing multiple indexes, handling the update store, and exposing an HTTP API.
Search and indexation are the domain of our core engine, [`milli`](https://github.com/meilisearch/milli), while tokenization is handled by [our `tokenizer` library](https://github.com/meilisearch/tokenizer/).
## Telemetry
MeiliSearch collects anonymous data regarding general usage.
This helps us better understand developers' usage of MeiliSearch features.
Meilisearch collects anonymous data regarding general usage.
This helps us better understand developers' usage of Meilisearch features.
To see what information we're retrieving, please see the complete list [on the dedicated issue](https://github.com/meilisearch/MeiliSearch/issues/720).
We also use Sentry to send us crash and error reports. If you want to know more about what Sentry collects, please visit their [privacy policy website](https://sentry.io/privacy/).
To find out more on what information we're retrieving, please see our documentation on [Telemetry](https://docs.meilisearch.com/learn/what_is_meilisearch/telemetry.html).
This program is optional; you can disable these analytics by using the `MEILI_NO_ANALYTICS` env variable.
## Feature request
The feature requests are not managed in this repository. Please visit our [dedicated repository](https://github.com/meilisearch/product) to see our work about the Meilisearch product.
If you have a feature request or any feedback about an existing feature, please open [a discussion](https://github.com/meilisearch/product/discussions).
Also, feel free to participate in the current discussions, we are looking forward to reading your comments.
## 💌 Contact
Feel free to contact us with any questions you may have:
* 🆕 Join our [GitHub Discussions forum](https://github.com/meilisearch/MeiliSearch/discussions)
* Join our [Slack community](https://slack.meilisearch.com/).
* By opening an issue.
Please visit [this page](https://docs.meilisearch.com/learn/what_is_meilisearch/contact.html#contact-us).
MeiliSearch is developed by [Meili](https://www.meilisearch.com), a young company. To know more about us, you can [read our blog](https://blog.meilisearch.com). Any suggestion or feedback is highly appreciated. Thank you for your support!
Meilisearch is developed by [Meili](https://www.meilisearch.com), a young company. To know more about us, you can [read our blog](https://blog.meilisearch.com). Any suggestion or feedback is highly appreciated. Thank you for your support!

33
SECURITY.md Normal file
View File

@ -0,0 +1,33 @@
# Security
Meilisearch takes the security of our software products and services seriously.
If you believe you have found a security vulnerability in any Meilisearch-owned repository, please report it to us as described below.
## Supported versions
As long as we are pre-v1.0, only the latest version of Meilisearch will be supported with security updates.
## Reporting security issues
⚠️ Please do not report security vulnerabilities through public GitHub issues. ⚠️
Instead, please kindly email us at security@meilisearch.com
Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
- Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
- Full paths of source file(s) related to the manifestation of the issue
- The location of the affected source code (tag/branch/commit or direct URL)
- Any special configuration required to reproduce the issue
- Step-by-step instructions to reproduce the issue
- Proof-of-concept or exploit code (if possible)
- Impact of the issue, including how an attacker might exploit the issue
This information will help us triage your report more quickly.
You will receive a response from us within 72 hours. If the issue is confirmed, we will release a patch as soon as possible depending on complexity.
## Preferred languages
We prefer all communications to be in English.

View File

@ -1,17 +1,19 @@
<svg width="360" height="360" viewBox="0 0 360 360" fill="none" xmlns="http://www.w3.org/2000/svg">
<g id="logo_main">
<rect id="Rectangle" x="107.333" y="0.150146" width="274.315" height="274.315" rx="98.8334" transform="rotate(23 107.333 0.150146)" fill="url(#paint0_linear)"/>
<path id="Rectangle_2" fill-rule="evenodd" clip-rule="evenodd" d="M61.3296 230.199C46.2224 194.608 38.6688 176.813 38.208 160.329C37.5286 136.025 47.0175 112.539 64.3891 95.5282C76.1718 83.9904 93.9669 76.4368 129.557 61.3296C165.147 46.2224 182.943 38.6688 199.427 38.208C223.731 37.5286 247.217 47.0175 264.228 64.3891C275.766 76.1718 283.319 93.9669 298.426 129.557C313.534 165.147 321.087 182.943 321.548 199.427C322.227 223.731 312.738 247.217 295.367 264.228C283.584 275.766 265.789 283.319 230.199 298.426C194.608 313.534 176.813 321.087 160.329 321.548C136.025 322.227 112.539 312.738 95.5282 295.367C83.9903 283.584 76.4368 265.789 61.3296 230.199Z" fill="url(#paint1_linear)"/>
<path id="m" fill-rule="evenodd" clip-rule="evenodd" d="M219.568 130.748C242.363 130.748 259.263 147.451 259.263 174.569V229.001H227.232V179.678C227.232 166.119 220.747 159.634 210.136 159.634C205.223 159.634 200.311 161.796 195.595 167.494C195.791 169.852 195.988 172.21 195.988 174.569V229.001H164.154V179.678C164.154 166.119 157.472 159.634 147.057 159.634C142.145 159.634 137.429 161.992 132.712 168.084V229.001H100.878V133.695H132.712V139.394C139.197 133.892 145.878 130.748 156.49 130.748C168.477 130.748 178.695 135.267 185.769 143.52C195.791 134.678 205.42 130.748 219.568 130.748Z" fill="white"/>
</g>
<svg width="300" height="300" viewBox="0 0 300 300" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M0 237L55.426 96.7678C63.2367 77.0063 82.499 64 103.955 64H137.371L81.9447 204.232C74.1341 223.993 54.8717 237 33.4156 237H0Z" fill="url(#paint0_linear_1_898)"/>
<path d="M81.3123 237L136.738 96.7682C144.549 77.0067 163.811 64.0004 185.267 64.0004H218.683L163.257 204.232C155.446 223.994 136.184 237 114.728 237H81.3123Z" fill="url(#paint1_linear_1_898)"/>
<path d="M162.629 237L218.055 96.7682C225.866 77.0067 245.128 64.0004 266.584 64.0004H300L244.574 204.232C236.763 223.994 217.501 237 196.045 237H162.629Z" fill="url(#paint2_linear_1_898)"/>
<defs>
<linearGradient id="paint0_linear" x1="-13.6248" y1="129.208" x2="244.49" y2="403.522" gradientUnits="userSpaceOnUse">
<stop stop-color="#E41359"/>
<stop offset="1" stop-color="#F23C79"/>
<linearGradient id="paint0_linear_1_898" x1="300.001" y1="50.7858" x2="1.63474" y2="221.244" gradientUnits="userSpaceOnUse">
<stop stop-color="#FF5CAA"/>
<stop offset="1" stop-color="#FF4E62"/>
</linearGradient>
<linearGradient id="paint1_linear" x1="11.0088" y1="111.65" x2="111.65" y2="348.747" gradientUnits="userSpaceOnUse">
<stop stop-color="#24222F"/>
<stop offset="1" stop-color="#2B2937"/>
<linearGradient id="paint1_linear_1_898" x1="300.001" y1="50.7858" x2="1.63474" y2="221.244" gradientUnits="userSpaceOnUse">
<stop stop-color="#FF5CAA"/>
<stop offset="1" stop-color="#FF4E62"/>
</linearGradient>
<linearGradient id="paint2_linear_1_898" x1="300.001" y1="50.7858" x2="1.63474" y2="221.244" gradientUnits="userSpaceOnUse">
<stop stop-color="#FF5CAA"/>
<stop offset="1" stop-color="#FF4E62"/>
</linearGradient>
</defs>
</svg>

Before: Size: 2.0 KiB | After: Size: 1.3 KiB

BIN
assets/trumen-fast.gif Normal file

Binary file not shown. | After: Size: 1.4 MiB

Binary file not shown. | Before: Size: 1.2 MiB

View File

@ -2,8 +2,10 @@ status = [
'Tests on ubuntu-18.04',
'Tests on macos-latest',
'Tests on windows-latest',
'Run Clippy',
'Run Rustfmt'
# 'Run Clippy',
'Run Rustfmt',
'Run tests in debug',
]
pr_status = ['Milestone Check']
# 3 hours timeout
timeout-sec = 10800

View File

@ -1 +0,0 @@
_datas in movies.json are from https://www.themoviedb.org/_

File diff suppressed because it is too large

View File

@ -6,7 +6,6 @@ GREEN='\033[32m'
DEFAULT='\033[0m'
# GLOBALS
BINARY_NAME='meilisearch'
GREP_SEMVER_REGEXP='v\([0-9]*\)[.]\([0-9]*\)[.]\([0-9]*\)$' # i.e. v[number].[number].[number]
# FUNCTIONS
@ -52,13 +51,13 @@ semverLT() {
if [ $MAJOR_A -le $MAJOR_B ] && [ $MINOR_A -le $MINOR_B ] && [ $PATCH_A -lt $PATCH_B ]; then
return 0
fi
if [ "_$SPECIAL_A" == "_" ] && [ "_$SPECIAL_B" == "_" ] ; then
if [ "_$SPECIAL_A" == '_' ] && [ "_$SPECIAL_B" == '_' ] ; then
return 1
fi
if [ "_$SPECIAL_A" == "_" ] && [ "_$SPECIAL_B" != "_" ] ; then
if [ "_$SPECIAL_A" == '_' ] && [ "_$SPECIAL_B" != '_' ] ; then
return 1
fi
if [ "_$SPECIAL_A" != "_" ] && [ "_$SPECIAL_B" == "_" ] ; then
if [ "_$SPECIAL_A" != '_' ] && [ "_$SPECIAL_B" == '_' ] ; then
return 0
fi
if [ "_$SPECIAL_A" < "_$SPECIAL_B" ]; then
@ -68,39 +67,47 @@ semverLT() {
return 1
}
# Get a token from https://github.com/settings/tokens to increase rate limit (from 60 to 5000), make sure the token scope is set to 'public_repo'
# Create GITHUB_PAT environment variable once you acquired the token to start using it
# Returns the tag of the latest stable release (in terms of semver and not of release date)
get_latest() {
temp_file='temp_file' # temp_file needed because the grep would start before the download is over
curl -s 'https://api.github.com/repos/meilisearch/MeiliSearch/releases' > "$temp_file" || return 1
if [ -z "$GITHUB_PAT" ]; then
curl -s 'https://api.github.com/repos/meilisearch/meilisearch/releases' > "$temp_file" || return 1
else
curl -H "Authorization: token $GITHUB_PAT" -s 'https://api.github.com/repos/meilisearch/meilisearch/releases' > "$temp_file" || return 1
fi
releases=$(cat "$temp_file" | \
grep -E "tag_name|draft|prerelease" \
grep -E '"tag_name":|"draft":|"prerelease":' \
| tr -d ',"' | cut -d ':' -f2 | tr -d ' ')
# Returns a list of [tag_name draft_boolean prerelease_boolean ...]
# Ex: v0.10.1 false false v0.9.1-rc.1 false true v0.9.0 false false...
i=0
latest=""
current_tag=""
latest=''
current_tag=''
for release_info in $releases; do
if [ $i -eq 0 ]; then # Checking tag_name
if echo "$release_info" | grep -q "$GREP_SEMVER_REGEXP"; then # If it's not an alpha or beta release
current_tag=$release_info
else
current_tag=""
current_tag=''
fi
i=1
elif [ $i -eq 1 ]; then # Checking draft boolean
if [ "$release_info" = "true" ]; then
current_tag=""
if [ "$release_info" = 'true' ]; then
current_tag=''
fi
i=2
elif [ $i -eq 2 ]; then # Checking prerelease boolean
if [ "$release_info" = "true" ]; then
current_tag=""
if [ "$release_info" = 'true' ]; then
current_tag=''
fi
i=0
if [ "$current_tag" != "" ]; then # If the current_tag is valid
if [ "$latest" = "" ]; then # If there is no latest yet
if [ "$current_tag" != '' ]; then # If the current_tag is valid
if [ "$latest" = '' ]; then # If there is no latest yet
latest="$current_tag"
else
semverLT $current_tag $latest # Comparing latest and the current tag
@ -113,7 +120,7 @@ get_latest() {
done
rm -f "$temp_file"
echo $latest
return 0
}
# Gets the OS by setting the $os variable
@ -127,6 +134,9 @@ get_os() {
'Linux')
os='linux'
;;
'MINGW'*)
os='windows'
;;
*)
return 1
esac
@ -138,11 +148,18 @@ get_os() {
get_archi() {
architecture=$(uname -m)
case "$architecture" in
'x86_64' | 'amd64')
'x86_64' | 'amd64' )
archi='amd64'
;;
'arm64')
if [ $os = 'macos' ]; then # MacOS M1
archi='amd64'
else
archi='aarch64'
fi
;;
'aarch64')
archi='armv8'
archi='aarch64'
;;
*)
return 1
@ -151,7 +168,7 @@ get_archi() {
}
success_usage() {
printf "$GREEN%s\n$DEFAULT" "MeiliSearch binary successfully downloaded as '$BINARY_NAME' file."
printf "$GREEN%s\n$DEFAULT" "Meilisearch $latest binary successfully downloaded as '$binary_name' file."
echo ''
echo 'Run it:'
echo ' $ ./meilisearch'
@ -159,30 +176,65 @@ success_usage() {
echo ' $ ./meilisearch --help'
}
failure_usage() {
printf "$RED%s\n$DEFAULT" 'ERROR: MeiliSearch binary is not available for your OS distribution or your architecture yet.'
not_available_failure_usage() {
printf "$RED%s\n$DEFAULT" 'ERROR: Meilisearch binary is not available for your OS distribution or your architecture yet.'
echo ''
echo 'However, you can easily compile the binary from the source files.'
echo 'Follow the steps at the page ("Source" tab): https://docs.meilisearch.com/guides/advanced_guides/installation.html'
echo 'Follow the steps at the page ("Source" tab): https://docs.meilisearch.com/learn/getting_started/installation.html'
}
fetch_release_failure_usage() {
echo ''
printf "$RED%s\n$DEFAULT" 'ERROR: Impossible to get the latest stable version of Meilisearch.'
echo 'Please let us know about this issue: https://github.com/meilisearch/meilisearch/issues/new/choose'
}
# MAIN
latest="$(get_latest)"
# Fill $latest variable
if ! get_latest; then
fetch_release_failure_usage # TO CHANGE
exit 1
fi
if [ "$latest" = '' ]; then
fetch_release_failure_usage
exit 1
fi
# Fill $os variable
if ! get_os; then
failure_usage
not_available_failure_usage
exit 1
fi
# Fill $archi variable
if ! get_archi; then
failure_usage
not_available_failure_usage
exit 1
fi
echo "Downloading MeiliSearch binary $latest for $os, architecture $archi..."
release_file="meilisearch-$os-$archi"
link="https://github.com/meilisearch/MeiliSearch/releases/download/$latest/$release_file"
curl -OL "$link"
mv "$release_file" "$BINARY_NAME"
chmod 744 "$BINARY_NAME"
echo "Downloading Meilisearch binary $latest for $os, architecture $archi..."
case "$os" in
'windows')
release_file="meilisearch-$os-$archi.exe"
binary_name='meilisearch.exe'
;;
*)
release_file="meilisearch-$os-$archi"
binary_name='meilisearch'
esac
# Fetch the Meilisearch binary
link="https://github.com/meilisearch/meilisearch/releases/download/$latest/$release_file"
curl --fail -OL "$link"
if [ $? -ne 0 ]; then
fetch_release_failure_usage
exit 1
fi
mv "$release_file" "$binary_name"
chmod 744 "$binary_name"
success_usage

View File

@ -0,0 +1,15 @@
[package]
name = "meilisearch-auth"
version = "0.26.0"
edition = "2021"
[dependencies]
enum-iterator = "0.7.0"
meilisearch-error = { path = "../meilisearch-error" }
milli = { git = "https://github.com/meilisearch/milli.git", tag = "v0.26.0" }
rand = "0.8.4"
serde = { version = "1.0.136", features = ["derive"] }
serde_json = { version = "1.0.79", features = ["preserve_order"] }
sha2 = "0.10.2"
thiserror = "1.0.30"
time = { version = "0.3.7", features = ["serde-well-known", "formatting", "parsing", "macros"] }

View File

@ -0,0 +1,104 @@
use enum_iterator::IntoEnumIterator;
use serde::{Deserialize, Serialize};
#[derive(IntoEnumIterator, Copy, Clone, Serialize, Deserialize, Debug, Eq, PartialEq)]
#[repr(u8)]
pub enum Action {
#[serde(rename = "*")]
All = 0,
#[serde(rename = "search")]
Search = actions::SEARCH,
#[serde(rename = "documents.add")]
DocumentsAdd = actions::DOCUMENTS_ADD,
#[serde(rename = "documents.get")]
DocumentsGet = actions::DOCUMENTS_GET,
#[serde(rename = "documents.delete")]
DocumentsDelete = actions::DOCUMENTS_DELETE,
#[serde(rename = "indexes.create")]
IndexesAdd = actions::INDEXES_CREATE,
#[serde(rename = "indexes.get")]
IndexesGet = actions::INDEXES_GET,
#[serde(rename = "indexes.update")]
IndexesUpdate = actions::INDEXES_UPDATE,
#[serde(rename = "indexes.delete")]
IndexesDelete = actions::INDEXES_DELETE,
#[serde(rename = "tasks.get")]
TasksGet = actions::TASKS_GET,
#[serde(rename = "settings.get")]
SettingsGet = actions::SETTINGS_GET,
#[serde(rename = "settings.update")]
SettingsUpdate = actions::SETTINGS_UPDATE,
#[serde(rename = "stats.get")]
StatsGet = actions::STATS_GET,
#[serde(rename = "dumps.create")]
DumpsCreate = actions::DUMPS_CREATE,
#[serde(rename = "dumps.get")]
DumpsGet = actions::DUMPS_GET,
#[serde(rename = "version")]
Version = actions::VERSION,
}
impl Action {
pub fn from_repr(repr: u8) -> Option<Self> {
use actions::*;
match repr {
0 => Some(Self::All),
SEARCH => Some(Self::Search),
DOCUMENTS_ADD => Some(Self::DocumentsAdd),
DOCUMENTS_GET => Some(Self::DocumentsGet),
DOCUMENTS_DELETE => Some(Self::DocumentsDelete),
INDEXES_CREATE => Some(Self::IndexesAdd),
INDEXES_GET => Some(Self::IndexesGet),
INDEXES_UPDATE => Some(Self::IndexesUpdate),
INDEXES_DELETE => Some(Self::IndexesDelete),
TASKS_GET => Some(Self::TasksGet),
SETTINGS_GET => Some(Self::SettingsGet),
SETTINGS_UPDATE => Some(Self::SettingsUpdate),
STATS_GET => Some(Self::StatsGet),
DUMPS_CREATE => Some(Self::DumpsCreate),
DUMPS_GET => Some(Self::DumpsGet),
VERSION => Some(Self::Version),
_otherwise => None,
}
}
pub fn repr(&self) -> u8 {
use actions::*;
match self {
Self::All => 0,
Self::Search => SEARCH,
Self::DocumentsAdd => DOCUMENTS_ADD,
Self::DocumentsGet => DOCUMENTS_GET,
Self::DocumentsDelete => DOCUMENTS_DELETE,
Self::IndexesAdd => INDEXES_CREATE,
Self::IndexesGet => INDEXES_GET,
Self::IndexesUpdate => INDEXES_UPDATE,
Self::IndexesDelete => INDEXES_DELETE,
Self::TasksGet => TASKS_GET,
Self::SettingsGet => SETTINGS_GET,
Self::SettingsUpdate => SETTINGS_UPDATE,
Self::StatsGet => STATS_GET,
Self::DumpsCreate => DUMPS_CREATE,
Self::DumpsGet => DUMPS_GET,
Self::Version => VERSION,
}
}
}
pub mod actions {
pub const SEARCH: u8 = 1;
pub const DOCUMENTS_ADD: u8 = 2;
pub const DOCUMENTS_GET: u8 = 3;
pub const DOCUMENTS_DELETE: u8 = 4;
pub const INDEXES_CREATE: u8 = 5;
pub const INDEXES_GET: u8 = 6;
pub const INDEXES_UPDATE: u8 = 7;
pub const INDEXES_DELETE: u8 = 8;
pub const TASKS_GET: u8 = 9;
pub const SETTINGS_GET: u8 = 10;
pub const SETTINGS_UPDATE: u8 = 11;
pub const STATS_GET: u8 = 12;
pub const DUMPS_CREATE: u8 = 13;
pub const DUMPS_GET: u8 = 14;
pub const VERSION: u8 = 15;
}
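A hedged sketch (not part of this diff) of how the `u8` representation and the serde names above relate. It assumes the `meilisearch-auth` crate (which re-exports `Action` and `actions`) and `serde_json` are available as dependencies:

```rust
use meilisearch_auth::{actions, Action};

fn main() {
    // Every action round-trips through its `u8` representation...
    let action = Action::from_repr(actions::DOCUMENTS_ADD).unwrap();
    assert_eq!(action, Action::DocumentsAdd);
    assert_eq!(action.repr(), actions::DOCUMENTS_ADD);

    // ...and serializes to the dotted name used in API key payloads.
    assert_eq!(
        serde_json::to_string(&action).unwrap(),
        r#""documents.add""#
    );
}
```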

View File

@ -0,0 +1,47 @@
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::io::Write;
use std::path::Path;
use crate::{AuthController, HeedAuthStore, Result};
const KEYS_PATH: &str = "keys";
impl AuthController {
pub fn dump(src: impl AsRef<Path>, dst: impl AsRef<Path>) -> Result<()> {
let mut store = HeedAuthStore::new(&src)?;
// do not attempt to close the database on drop!
store.set_drop_on_close(false);
let keys_file_path = dst.as_ref().join(KEYS_PATH);
let keys = store.list_api_keys()?;
let mut keys_file = File::create(&keys_file_path)?;
for key in keys {
serde_json::to_writer(&mut keys_file, &key)?;
keys_file.write_all(b"\n")?;
}
Ok(())
}
pub fn load_dump(src: impl AsRef<Path>, dst: impl AsRef<Path>) -> Result<()> {
let store = HeedAuthStore::new(&dst)?;
let keys_file_path = src.as_ref().join(KEYS_PATH);
if !keys_file_path.exists() {
return Ok(());
}
let mut reader = BufReader::new(File::open(&keys_file_path)?).lines();
while let Some(key) = reader.next().transpose()? {
let key = serde_json::from_str(&key)?;
store.put_api_key(key)?;
}
Ok(())
}
}

View File

@ -0,0 +1,46 @@
use std::error::Error;
use meilisearch_error::ErrorCode;
use meilisearch_error::{internal_error, Code};
use serde_json::Value;
pub type Result<T> = std::result::Result<T, AuthControllerError>;
#[derive(Debug, thiserror::Error)]
pub enum AuthControllerError {
#[error("`{0}` field is mandatory.")]
MissingParameter(&'static str),
#[error("`actions` field value `{0}` is invalid. It should be an array of string representing action names.")]
InvalidApiKeyActions(Value),
#[error("`indexes` field value `{0}` is invalid. It should be an array of string representing index names.")]
InvalidApiKeyIndexes(Value),
#[error("`expiresAt` field value `{0}` is invalid. It should follow the RFC 3339 format to represents a date or datetime in the future or specified as a null value. e.g. 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM:SS'.")]
InvalidApiKeyExpiresAt(Value),
#[error("`description` field value `{0}` is invalid. It should be a string or specified as a null value.")]
InvalidApiKeyDescription(Value),
#[error("API key `{0}` not found.")]
ApiKeyNotFound(String),
#[error("Internal error: {0}")]
Internal(Box<dyn Error + Send + Sync + 'static>),
}
internal_error!(
AuthControllerError: milli::heed::Error,
std::io::Error,
serde_json::Error,
std::str::Utf8Error
);
impl ErrorCode for AuthControllerError {
fn error_code(&self) -> Code {
match self {
Self::MissingParameter(_) => Code::MissingParameter,
Self::InvalidApiKeyActions(_) => Code::InvalidApiKeyActions,
Self::InvalidApiKeyIndexes(_) => Code::InvalidApiKeyIndexes,
Self::InvalidApiKeyExpiresAt(_) => Code::InvalidApiKeyExpiresAt,
Self::InvalidApiKeyDescription(_) => Code::InvalidApiKeyDescription,
Self::ApiKeyNotFound(_) => Code::ApiKeyNotFound,
Self::Internal(_) => Code::Internal,
}
}
}

181
meilisearch-auth/src/key.rs Normal file
View File

@ -0,0 +1,181 @@
use crate::action::Action;
use crate::error::{AuthControllerError, Result};
use crate::store::{KeyId, KEY_ID_LENGTH};
use rand::Rng;
use serde::{Deserialize, Serialize};
use serde_json::{from_value, Value};
use time::format_description::well_known::Rfc3339;
use time::macros::{format_description, time};
use time::{Date, OffsetDateTime, PrimitiveDateTime};
#[derive(Debug, Deserialize, Serialize)]
pub struct Key {
#[serde(skip_serializing_if = "Option::is_none")]
pub description: Option<String>,
pub id: KeyId,
pub actions: Vec<Action>,
pub indexes: Vec<String>,
#[serde(with = "time::serde::rfc3339::option")]
pub expires_at: Option<OffsetDateTime>,
#[serde(with = "time::serde::rfc3339")]
pub created_at: OffsetDateTime,
#[serde(with = "time::serde::rfc3339")]
pub updated_at: OffsetDateTime,
}
impl Key {
pub fn create_from_value(value: Value) -> Result<Self> {
let description = match value.get("description") {
Some(Value::Null) => None,
Some(des) => Some(
from_value(des.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyDescription(des.clone()))?,
),
None => None,
};
let id = generate_id();
let actions = value
.get("actions")
.map(|act| {
from_value(act.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyActions(act.clone()))
})
.ok_or(AuthControllerError::MissingParameter("actions"))??;
let indexes = value
.get("indexes")
.map(|ind| {
from_value(ind.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyIndexes(ind.clone()))
})
.ok_or(AuthControllerError::MissingParameter("indexes"))??;
let expires_at = value
.get("expiresAt")
.map(parse_expiration_date)
.ok_or(AuthControllerError::MissingParameter("expiresAt"))??;
let created_at = OffsetDateTime::now_utc();
let updated_at = created_at;
Ok(Self {
description,
id,
actions,
indexes,
expires_at,
created_at,
updated_at,
})
}
pub fn update_from_value(&mut self, value: Value) -> Result<()> {
if let Some(des) = value.get("description") {
let des = from_value(des.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyDescription(des.clone()));
self.description = des?;
}
if let Some(act) = value.get("actions") {
let act = from_value(act.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyActions(act.clone()));
self.actions = act?;
}
if let Some(ind) = value.get("indexes") {
let ind = from_value(ind.clone())
.map_err(|_| AuthControllerError::InvalidApiKeyIndexes(ind.clone()));
self.indexes = ind?;
}
if let Some(exp) = value.get("expiresAt") {
self.expires_at = parse_expiration_date(exp)?;
}
self.updated_at = OffsetDateTime::now_utc();
Ok(())
}
pub(crate) fn default_admin() -> Self {
let now = OffsetDateTime::now_utc();
Self {
description: Some("Default Admin API Key (Use it for all other operations. Caution! Do not use it on a public frontend)".to_string()),
id: generate_id(),
actions: vec![Action::All],
indexes: vec!["*".to_string()],
expires_at: None,
created_at: now,
updated_at: now,
}
}
pub(crate) fn default_search() -> Self {
let now = OffsetDateTime::now_utc();
Self {
description: Some(
"Default Search API Key (Use it to search from the frontend)".to_string(),
),
id: generate_id(),
actions: vec![Action::Search],
indexes: vec!["*".to_string()],
expires_at: None,
created_at: now,
updated_at: now,
}
}
}
/// Generate a printable key id of `KEY_ID_LENGTH` characters using thread_rng.
fn generate_id() -> [u8; KEY_ID_LENGTH] {
const CHARSET: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
let mut rng = rand::thread_rng();
let mut bytes = [0; KEY_ID_LENGTH];
for byte in bytes.iter_mut() {
*byte = CHARSET[rng.gen_range(0..CHARSET.len())];
}
bytes
}
fn parse_expiration_date(value: &Value) -> Result<Option<OffsetDateTime>> {
match value {
Value::String(string) => OffsetDateTime::parse(string, &Rfc3339)
.or_else(|_| {
PrimitiveDateTime::parse(
string,
format_description!(
"[year repr:full base:calendar]-[month repr:numerical]-[day]T[hour]:[minute]:[second]"
),
).map(|datetime| datetime.assume_utc())
})
.or_else(|_| {
PrimitiveDateTime::parse(
string,
format_description!(
"[year repr:full base:calendar]-[month repr:numerical]-[day] [hour]:[minute]:[second]"
),
).map(|datetime| datetime.assume_utc())
})
.or_else(|_| {
Date::parse(string, format_description!(
"[year repr:full base:calendar]-[month repr:numerical]-[day]"
)).map(|date| PrimitiveDateTime::new(date, time!(00:00)).assume_utc())
})
.map_err(|_| AuthControllerError::InvalidApiKeyExpiresAt(value.clone()))
// check if the key is already expired.
.and_then(|d| {
if d > OffsetDateTime::now_utc() {
Ok(d)
} else {
Err(AuthControllerError::InvalidApiKeyExpiresAt(value.clone()))
}
})
.map(Option::Some),
Value::Null => Ok(None),
_otherwise => Err(AuthControllerError::InvalidApiKeyExpiresAt(value.clone())),
}
}
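A minimal, self-contained sketch (not part of this diff) of the `expiresAt` formats handled by `parse_expiration_date`, exercised through `Key::create_from_value`. It assumes the `meilisearch-auth` crate above and `serde_json` are available as dependencies, and uses illustrative values only:

```rust
use meilisearch_auth::Key;
use serde_json::json;

fn main() {
    // All of these are accepted, as long as the date lies in the future:
    // RFC 3339, "YYYY-MM-DDTHH:MM:SS", "YYYY-MM-DD HH:MM:SS" and "YYYY-MM-DD".
    for expires_at in [
        json!("2042-04-02T00:42:42Z"),
        json!("2042-04-02T00:42:42"),
        json!("2042-04-02 00:42:42"),
        json!("2042-04-02"),
        json!(null), // the key never expires
    ] {
        let key = Key::create_from_value(json!({
            "actions": ["search"],
            "indexes": ["*"],
            "expiresAt": expires_at
        }))
        .expect("a valid key payload");
        println!("parsed expiresAt: {:?}", key.expires_at);
    }
}
```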

272
meilisearch-auth/src/lib.rs Normal file
View File

@ -0,0 +1,272 @@
mod action;
mod dump;
pub mod error;
mod key;
mod store;
use std::collections::{HashMap, HashSet};
use std::path::Path;
use std::str::from_utf8;
use std::sync::Arc;
use serde::{Deserialize, Serialize};
use serde_json::Value;
use sha2::{Digest, Sha256};
use time::OffsetDateTime;
pub use action::{actions, Action};
use error::{AuthControllerError, Result};
pub use key::Key;
pub use store::open_auth_store_env;
use store::HeedAuthStore;
#[derive(Clone)]
pub struct AuthController {
store: Arc<HeedAuthStore>,
master_key: Option<String>,
}
impl AuthController {
pub fn new(db_path: impl AsRef<Path>, master_key: &Option<String>) -> Result<Self> {
let store = HeedAuthStore::new(db_path)?;
if store.is_empty()? {
generate_default_keys(&store)?;
}
Ok(Self {
store: Arc::new(store),
master_key: master_key.clone(),
})
}
pub fn create_key(&self, value: Value) -> Result<Key> {
let key = Key::create_from_value(value)?;
self.store.put_api_key(key)
}
pub fn update_key(&self, key: impl AsRef<str>, value: Value) -> Result<Key> {
let mut key = self.get_key(key)?;
key.update_from_value(value)?;
self.store.put_api_key(key)
}
pub fn get_key(&self, key: impl AsRef<str>) -> Result<Key> {
self.store
.get_api_key(&key)?
.ok_or_else(|| AuthControllerError::ApiKeyNotFound(key.as_ref().to_string()))
}
pub fn get_key_filters(
&self,
key: impl AsRef<str>,
search_rules: Option<SearchRules>,
) -> Result<AuthFilter> {
let mut filters = AuthFilter::default();
if self
.master_key
.as_ref()
.map_or(false, |master_key| master_key != key.as_ref())
{
let key = self
.store
.get_api_key(&key)?
.ok_or_else(|| AuthControllerError::ApiKeyNotFound(key.as_ref().to_string()))?;
if !key.indexes.iter().any(|i| i.as_str() == "*") {
filters.search_rules = match search_rules {
// Intersect search_rules with parent key authorized indexes.
Some(search_rules) => SearchRules::Map(
key.indexes
.into_iter()
.filter_map(|index| {
search_rules
.get_index_search_rules(&index)
.map(|index_search_rules| (index, Some(index_search_rules)))
})
.collect(),
),
None => SearchRules::Set(key.indexes.into_iter().collect()),
};
} else if let Some(search_rules) = search_rules {
filters.search_rules = search_rules;
}
filters.allow_index_creation = key
.actions
.iter()
.any(|&action| action == Action::IndexesAdd || action == Action::All);
}
Ok(filters)
}
pub fn list_keys(&self) -> Result<Vec<Key>> {
self.store.list_api_keys()
}
pub fn delete_key(&self, key: impl AsRef<str>) -> Result<()> {
if self.store.delete_api_key(&key)? {
Ok(())
} else {
Err(AuthControllerError::ApiKeyNotFound(
key.as_ref().to_string(),
))
}
}
pub fn get_master_key(&self) -> Option<&String> {
self.master_key.as_ref()
}
/// Generate a valid key from a key id using the current master key.
/// Returns None if no master key has been set.
pub fn generate_key(&self, id: &str) -> Option<String> {
self.master_key
.as_ref()
.map(|master_key| generate_key(master_key.as_bytes(), id))
}
/// Check if the provided key is authorized to make a specific action
/// without checking if the key is valid.
pub fn is_key_authorized(
&self,
key: &[u8],
action: Action,
index: Option<&str>,
) -> Result<bool> {
match self
.store
// check if the key has access to all indexes.
.get_expiration_date(key, action, None)?
.or(match index {
// else check if the key has access to the requested index.
Some(index) => {
self.store
.get_expiration_date(key, action, Some(index.as_bytes()))?
}
// or to any index if no index has been requested.
None => self.store.prefix_first_expiration_date(key, action)?,
}) {
// check expiration date.
Some(Some(exp)) => Ok(OffsetDateTime::now_utc() < exp),
// no expiration date.
Some(None) => Ok(true),
// action or index forbidden.
None => Ok(false),
}
}
/// Check if the provided key is valid
/// without checking if the key is authorized to make a specific action.
pub fn is_key_valid(&self, key: &[u8]) -> Result<bool> {
if let Some(id) = self.store.get_key_id(key) {
let id = from_utf8(&id)?;
if let Some(generated) = self.generate_key(id) {
return Ok(generated.as_bytes() == key);
}
}
Ok(false)
}
/// Check if the provided key is valid
/// and is authorized to make a specific action.
pub fn authenticate(&self, key: &[u8], action: Action, index: Option<&str>) -> Result<bool> {
if self.is_key_authorized(key, action, index)? {
self.is_key_valid(key)
} else {
Ok(false)
}
}
}
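A minimal end-to-end sketch (not part of this diff) of the `AuthController` API above: create a key, derive the client-facing key from the master key, then authenticate a request. It assumes `meilisearch-auth`, `serde_json` and `tempfile` are available as dependencies; the master key and payload values are illustrative only:

```rust
use std::error::Error;
use std::str::from_utf8;

use meilisearch_auth::{Action, AuthController};
use serde_json::json;

fn main() -> Result<(), Box<dyn Error>> {
    let dir = tempfile::tempdir()?;
    let master_key = Some("MASTER_KEY".to_string());
    let auth = AuthController::new(dir.path(), &master_key)?;

    // Create a key restricted to searching the `movies` index.
    let key = auth.create_key(json!({
        "description": "search-only key",
        "actions": ["search"],
        "indexes": ["movies"],
        "expiresAt": null
    }))?;

    // The key handed to clients is the key id followed by a SHA-256 digest
    // of the id and the master key (see `generate_key` below).
    let client_key = auth
        .generate_key(from_utf8(&key.id)?)
        .expect("a master key is set");

    // Authorized action and index; then a forbidden action on the same index.
    assert!(auth.authenticate(client_key.as_bytes(), Action::Search, Some("movies"))?);
    assert!(!auth.authenticate(client_key.as_bytes(), Action::DocumentsAdd, Some("movies"))?);
    Ok(())
}
```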
pub struct AuthFilter {
pub search_rules: SearchRules,
pub allow_index_creation: bool,
}
impl Default for AuthFilter {
fn default() -> Self {
Self {
search_rules: SearchRules::default(),
allow_index_creation: true,
}
}
}
/// Transparent wrapper around a list of allowed indexes with the search rules to apply for each.
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(untagged)]
pub enum SearchRules {
Set(HashSet<String>),
Map(HashMap<String, Option<IndexSearchRules>>),
}
impl Default for SearchRules {
fn default() -> Self {
Self::Set(Some("*".to_string()).into_iter().collect())
}
}
impl SearchRules {
pub fn is_index_authorized(&self, index: &str) -> bool {
match self {
Self::Set(set) => set.contains("*") || set.contains(index),
Self::Map(map) => map.contains_key("*") || map.contains_key(index),
}
}
pub fn get_index_search_rules(&self, index: &str) -> Option<IndexSearchRules> {
match self {
Self::Set(set) => {
if set.contains("*") || set.contains(index) {
Some(IndexSearchRules::default())
} else {
None
}
}
Self::Map(map) => map
.get(index)
.or_else(|| map.get("*"))
.map(|isr| isr.clone().unwrap_or_default()),
}
}
}
impl IntoIterator for SearchRules {
type Item = (String, IndexSearchRules);
type IntoIter = Box<dyn Iterator<Item = Self::Item>>;
fn into_iter(self) -> Self::IntoIter {
match self {
Self::Set(array) => {
Box::new(array.into_iter().map(|i| (i, IndexSearchRules::default())))
}
Self::Map(map) => {
Box::new(map.into_iter().map(|(i, isr)| (i, isr.unwrap_or_default())))
}
}
}
}
/// Contains the rules to apply on the top of the search query for a specific index.
///
/// filter: search filter to apply in addition to query filters.
#[derive(Debug, Serialize, Deserialize, Default, Clone)]
pub struct IndexSearchRules {
pub filter: Option<serde_json::Value>,
}
fn generate_key(master_key: &[u8], keyid: &str) -> String {
let key = [keyid.as_bytes(), master_key].concat();
let sha = Sha256::digest(&key);
format!("{}{:x}", keyid, sha)
}
fn generate_default_keys(store: &HeedAuthStore) -> Result<()> {
store.put_api_key(Key::default_admin())?;
store.put_api_key(Key::default_search())?;
Ok(())
}
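A small sketch (not part of this diff) of the two JSON shapes the untagged `SearchRules` enum accepts, assuming the `meilisearch-auth` and `serde_json` crates are available as dependencies:

```rust
use meilisearch_auth::SearchRules;
use serde_json::json;

fn main() -> serde_json::Result<()> {
    // A plain list of authorized indexes deserializes into `SearchRules::Set`.
    let set: SearchRules = serde_json::from_value(json!(["movies", "books"]))?;
    assert!(set.is_index_authorized("movies"));

    // A map of index to optional rules deserializes into `SearchRules::Map`;
    // a null value means "no extra rules" for that index.
    let map: SearchRules = serde_json::from_value(json!({
        "movies": { "filter": "genre = horror" },
        "books": null
    }))?;
    assert!(map.is_index_authorized("books"));
    Ok(())
}
```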

View File

@ -0,0 +1,256 @@
use enum_iterator::IntoEnumIterator;
use std::borrow::Cow;
use std::cmp::Reverse;
use std::convert::TryFrom;
use std::convert::TryInto;
use std::fs::create_dir_all;
use std::path::Path;
use std::str;
use std::sync::Arc;
use milli::heed::types::{ByteSlice, DecodeIgnore, SerdeJson};
use milli::heed::{Database, Env, EnvOpenOptions, RwTxn};
use time::OffsetDateTime;
use super::error::Result;
use super::{Action, Key};
const AUTH_STORE_SIZE: usize = 1_073_741_824; //1GiB
pub const KEY_ID_LENGTH: usize = 8;
const AUTH_DB_PATH: &str = "auth";
const KEY_DB_NAME: &str = "api-keys";
const KEY_ID_ACTION_INDEX_EXPIRATION_DB_NAME: &str = "keyid-action-index-expiration";
pub type KeyId = [u8; KEY_ID_LENGTH];
#[derive(Clone)]
pub struct HeedAuthStore {
env: Arc<Env>,
keys: Database<ByteSlice, SerdeJson<Key>>,
action_keyid_index_expiration: Database<KeyIdActionCodec, SerdeJson<Option<OffsetDateTime>>>,
should_close_on_drop: bool,
}
impl Drop for HeedAuthStore {
fn drop(&mut self) {
if self.should_close_on_drop && Arc::strong_count(&self.env) == 1 {
self.env.as_ref().clone().prepare_for_closing();
}
}
}
pub fn open_auth_store_env(path: &Path) -> milli::heed::Result<milli::heed::Env> {
let mut options = EnvOpenOptions::new();
options.map_size(AUTH_STORE_SIZE); // 1GB
options.max_dbs(2);
options.open(path)
}
impl HeedAuthStore {
pub fn new(path: impl AsRef<Path>) -> Result<Self> {
let path = path.as_ref().join(AUTH_DB_PATH);
create_dir_all(&path)?;
let env = Arc::new(open_auth_store_env(path.as_ref())?);
let keys = env.create_database(Some(KEY_DB_NAME))?;
let action_keyid_index_expiration =
env.create_database(Some(KEY_ID_ACTION_INDEX_EXPIRATION_DB_NAME))?;
Ok(Self {
env,
keys,
action_keyid_index_expiration,
should_close_on_drop: true,
})
}
pub fn set_drop_on_close(&mut self, v: bool) {
self.should_close_on_drop = v;
}
pub fn is_empty(&self) -> Result<bool> {
let rtxn = self.env.read_txn()?;
Ok(self.keys.len(&rtxn)? == 0)
}
pub fn put_api_key(&self, key: Key) -> Result<Key> {
let mut wtxn = self.env.write_txn()?;
self.keys.put(&mut wtxn, &key.id, &key)?;
let id = key.id;
// delete key from inverted database before refilling it.
self.delete_key_from_inverted_db(&mut wtxn, &id)?;
// create inverted database.
let db = self.action_keyid_index_expiration;
let actions = if key.actions.contains(&Action::All) {
// if key.actions contains All, we iterate over all actions.
Action::into_enum_iter().collect()
} else {
key.actions.clone()
};
let no_index_restriction = key.indexes.contains(&"*".to_owned());
for action in actions {
if no_index_restriction {
// If there is no index restriction we put None.
db.put(&mut wtxn, &(&id, &action, None), &key.expires_at)?;
} else {
// else we create a key for each index.
for index in key.indexes.iter() {
db.put(
&mut wtxn,
&(&id, &action, Some(index.as_bytes())),
&key.expires_at,
)?;
}
}
}
wtxn.commit()?;
Ok(key)
}
pub fn get_api_key(&self, key: impl AsRef<str>) -> Result<Option<Key>> {
let rtxn = self.env.read_txn()?;
match self.get_key_id(key.as_ref().as_bytes()) {
Some(id) => self.keys.get(&rtxn, &id).map_err(|e| e.into()),
None => Ok(None),
}
}
pub fn delete_api_key(&self, key: impl AsRef<str>) -> Result<bool> {
let mut wtxn = self.env.write_txn()?;
let existing = match self.get_key_id(key.as_ref().as_bytes()) {
Some(id) => {
let existing = self.keys.delete(&mut wtxn, &id)?;
self.delete_key_from_inverted_db(&mut wtxn, &id)?;
existing
}
None => false,
};
wtxn.commit()?;
Ok(existing)
}
pub fn list_api_keys(&self) -> Result<Vec<Key>> {
let mut list = Vec::new();
let rtxn = self.env.read_txn()?;
for result in self.keys.remap_key_type::<DecodeIgnore>().iter(&rtxn)? {
let (_, content) = result?;
list.push(content);
}
list.sort_unstable_by_key(|k| Reverse(k.created_at));
Ok(list)
}
pub fn get_expiration_date(
&self,
key: &[u8],
action: Action,
index: Option<&[u8]>,
) -> Result<Option<Option<OffsetDateTime>>> {
let rtxn = self.env.read_txn()?;
match self.get_key_id(key) {
Some(id) => {
let tuple = (&id, &action, index);
Ok(self.action_keyid_index_expiration.get(&rtxn, &tuple)?)
}
None => Ok(None),
}
}
pub fn prefix_first_expiration_date(
&self,
key: &[u8],
action: Action,
) -> Result<Option<Option<OffsetDateTime>>> {
let rtxn = self.env.read_txn()?;
match self.get_key_id(key) {
Some(id) => {
let tuple = (&id, &action, None);
Ok(self
.action_keyid_index_expiration
.prefix_iter(&rtxn, &tuple)?
.next()
.transpose()?
.map(|(_, expiration)| expiration))
}
None => Ok(None),
}
}
pub fn get_key_id(&self, key: &[u8]) -> Option<KeyId> {
try_split_array_at::<_, KEY_ID_LENGTH>(key).map(|(id, _)| *id)
}
fn delete_key_from_inverted_db(&self, wtxn: &mut RwTxn, key: &KeyId) -> Result<()> {
let mut iter = self
.action_keyid_index_expiration
.remap_types::<ByteSlice, DecodeIgnore>()
.prefix_iter_mut(wtxn, key)?;
while iter.next().transpose()?.is_some() {
// safety: we don't keep references from inside the LMDB database.
unsafe { iter.del_current()? };
}
Ok(())
}
}
/// Codec allowing to retrieve the expiration date of an action,
/// optionally on a specific index, for a given key.
pub struct KeyIdActionCodec;
impl<'a> milli::heed::BytesDecode<'a> for KeyIdActionCodec {
type DItem = (KeyId, Action, Option<&'a [u8]>);
fn bytes_decode(bytes: &'a [u8]) -> Option<Self::DItem> {
let (key_id, action_bytes) = try_split_array_at(bytes)?;
let (action_bytes, index) = match try_split_array_at(action_bytes)? {
(action, []) => (action, None),
(action, index) => (action, Some(index)),
};
let action = Action::from_repr(u8::from_be_bytes(*action_bytes))?;
Some((*key_id, action, index))
}
}
impl<'a> milli::heed::BytesEncode<'a> for KeyIdActionCodec {
type EItem = (&'a KeyId, &'a Action, Option<&'a [u8]>);
fn bytes_encode((key_id, action, index): &Self::EItem) -> Option<Cow<[u8]>> {
let mut bytes = Vec::new();
bytes.extend_from_slice(*key_id);
let action_bytes = u8::to_be_bytes(action.repr());
bytes.extend_from_slice(&action_bytes);
if let Some(index) = index {
bytes.extend_from_slice(index);
}
Some(Cow::Owned(bytes))
}
}
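A standalone sketch (not part of this diff) of the byte layout produced by `KeyIdActionCodec`: the 8-byte key id, one byte for the action, then the optional index bytes. It avoids the `heed` traits so it can run on its own; the key id, action and index values are illustrative:

```rust
// [ 8-byte key id | 1-byte action repr | optional index bytes ]
fn encode(key_id: &[u8; 8], action_repr: u8, index: Option<&[u8]>) -> Vec<u8> {
    let mut bytes = Vec::new();
    bytes.extend_from_slice(key_id);
    bytes.push(action_repr);
    if let Some(index) = index {
        bytes.extend_from_slice(index);
    }
    bytes
}

fn main() {
    let encoded = encode(b"ab12CD34", 1 /* actions::SEARCH */, Some(b"movies"));
    assert_eq!(encoded.len(), 8 + 1 + "movies".len());
    println!("{:?}", encoded);
}
```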
/// Divides one slice into two at an index, returns `None` if mid is out of bounds.
pub fn try_split_at<T>(slice: &[T], mid: usize) -> Option<(&[T], &[T])> {
if mid <= slice.len() {
Some(slice.split_at(mid))
} else {
None
}
}
/// Divides one slice into an array and the tail at an index,
/// returns `None` if `N` is out of bounds.
pub fn try_split_array_at<T, const N: usize>(slice: &[T]) -> Option<(&[T; N], &[T])>
where
[T; N]: for<'a> TryFrom<&'a [T]>,
{
let (head, tail) = try_split_at(slice, N)?;
let head = head.try_into().ok()?;
Some((head, tail))
}

View File

@ -1,8 +1,15 @@
[package]
name = "meilisearch-error"
version = "0.21.0"
version = "0.26.0"
authors = ["marin <postma.marin@protonmail.com>"]
edition = "2018"
edition = "2021"
[dependencies]
actix-http = "=3.0.0-beta.6"
actix-web = { version = "4.0.1", default-features = false }
proptest = { version = "1.0.0", optional = true }
proptest-derive = { version = "0.3.0", optional = true }
serde = { version = "1.0.136", features = ["derive"] }
serde_json = "1.0.79"
[features]
test-traits = ["proptest", "proptest-derive"]

View File

@ -1,6 +1,74 @@
use std::fmt;
use actix_http::http::StatusCode;
use actix_web::{self as aweb, http::StatusCode, HttpResponseBuilder};
use serde::{Deserialize, Serialize};
#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
#[serde(rename_all = "camelCase")]
#[cfg_attr(feature = "test-traits", derive(proptest_derive::Arbitrary))]
pub struct ResponseError {
#[serde(skip)]
#[cfg_attr(
feature = "test-traits",
proptest(strategy = "strategy::status_code_strategy()")
)]
code: StatusCode,
message: String,
#[serde(rename = "code")]
error_code: String,
#[serde(rename = "type")]
error_type: String,
#[serde(rename = "link")]
error_link: String,
}
impl ResponseError {
pub fn from_msg(message: String, code: Code) -> Self {
Self {
code: code.http(),
message,
error_code: code.err_code().error_name.to_string(),
error_type: code.type_(),
error_link: code.url(),
}
}
}
impl fmt::Display for ResponseError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
self.message.fmt(f)
}
}
impl std::error::Error for ResponseError {}
impl<T> From<T> for ResponseError
where
T: ErrorCode,
{
fn from(other: T) -> Self {
Self {
code: other.http_status(),
message: other.to_string(),
error_code: other.error_name(),
error_type: other.error_type(),
error_link: other.error_url(),
}
}
}
impl aweb::error::ResponseError for ResponseError {
fn error_response(&self) -> aweb::HttpResponse {
let json = serde_json::to_vec(self).unwrap();
HttpResponseBuilder::new(self.status_code())
.content_type("application/json")
.body(json)
}
fn status_code(&self) -> StatusCode {
self.code
}
}
pub trait ErrorCode: std::error::Error {
fn error_code(&self) -> Code;
@ -38,20 +106,21 @@ impl fmt::Display for ErrorType {
use ErrorType::*;
match self {
InternalError => write!(f, "internal_error"),
InvalidRequestError => write!(f, "invalid_request_error"),
AuthenticationError => write!(f, "authentication_error"),
InternalError => write!(f, "internal"),
InvalidRequestError => write!(f, "invalid_request"),
AuthenticationError => write!(f, "auth"),
}
}
}
#[derive(Serialize, Deserialize, Debug, Clone, Copy)]
pub enum Code {
// index related error
CreateIndex,
IndexAlreadyExists,
IndexNotFound,
InvalidIndexUid,
OpenIndex,
InvalidMinWordLengthForTypo,
// invalid state error
InvalidState,
@ -60,17 +129,24 @@ pub enum Code {
MaxFieldsLimitExceeded,
MissingDocumentId,
InvalidDocumentId,
Facet,
Filter,
Sort,
BadParameter,
BadRequest,
DatabaseSizeLimitReached,
DocumentNotFound,
Internal,
InvalidGeoField,
InvalidRankingRule,
InvalidStore,
InvalidToken,
MissingAuthorizationHeader,
NotFound,
NoSpaceLeftOnDevice,
DumpNotFound,
TaskNotFound,
PayloadTooLarge,
RetrieveDocument,
SearchDocuments,
@ -78,6 +154,18 @@ pub enum Code {
DumpAlreadyInProgress,
DumpProcessFailed,
InvalidContentType,
MissingContentType,
MalformedPayload,
MissingPayload,
ApiKeyNotFound,
MissingParameter,
InvalidApiKeyActions,
InvalidApiKeyIndexes,
InvalidApiKeyExpiresAt,
InvalidApiKeyDescription,
}
impl Code {
@ -88,22 +176,30 @@ impl Code {
match self {
// index related errors
// create index is thrown on internal error while creating an index.
CreateIndex => ErrCode::internal("index_creation_failed", StatusCode::BAD_REQUEST),
IndexAlreadyExists => ErrCode::invalid("index_already_exists", StatusCode::BAD_REQUEST),
CreateIndex => {
ErrCode::internal("index_creation_failed", StatusCode::INTERNAL_SERVER_ERROR)
}
IndexAlreadyExists => ErrCode::invalid("index_already_exists", StatusCode::CONFLICT),
// thrown when requesting an unexisting index
IndexNotFound => ErrCode::invalid("index_not_found", StatusCode::NOT_FOUND),
InvalidIndexUid => ErrCode::invalid("invalid_index_uid", StatusCode::BAD_REQUEST),
OpenIndex => {
ErrCode::internal("index_not_accessible", StatusCode::INTERNAL_SERVER_ERROR)
}
// invalid state error
InvalidState => ErrCode::internal("invalid_state", StatusCode::INTERNAL_SERVER_ERROR),
// thrown when no primary key has been set
MissingPrimaryKey => ErrCode::invalid("missing_primary_key", StatusCode::BAD_REQUEST),
MissingPrimaryKey => {
ErrCode::invalid("primary_key_inference_failed", StatusCode::BAD_REQUEST)
}
// error thrown when trying to set an already existing primary key
PrimaryKeyAlreadyPresent => {
ErrCode::invalid("primary_key_already_present", StatusCode::BAD_REQUEST)
ErrCode::invalid("index_primary_key_already_exists", StatusCode::BAD_REQUEST)
}
// invalid ranking rule
InvalidRankingRule => ErrCode::invalid("invalid_ranking_rule", StatusCode::BAD_REQUEST),
// invalid database
InvalidStore => {
ErrCode::internal("invalid_store_file", StatusCode::INTERNAL_SERVER_ERROR)
}
// invalid document
@ -111,21 +207,31 @@ impl Code {
ErrCode::invalid("max_fields_limit_exceeded", StatusCode::BAD_REQUEST)
}
MissingDocumentId => ErrCode::invalid("missing_document_id", StatusCode::BAD_REQUEST),
InvalidDocumentId => ErrCode::invalid("invalid_document_id", StatusCode::BAD_REQUEST),
// error related to facets
Facet => ErrCode::invalid("invalid_facet", StatusCode::BAD_REQUEST),
// error related to filters
Filter => ErrCode::invalid("invalid_filter", StatusCode::BAD_REQUEST),
// error related to sorts
Sort => ErrCode::invalid("invalid_sort", StatusCode::BAD_REQUEST),
BadParameter => ErrCode::invalid("bad_parameter", StatusCode::BAD_REQUEST),
BadRequest => ErrCode::invalid("bad_request", StatusCode::BAD_REQUEST),
DatabaseSizeLimitReached => ErrCode::internal(
"database_size_limit_reached",
StatusCode::INTERNAL_SERVER_ERROR,
),
DocumentNotFound => ErrCode::invalid("document_not_found", StatusCode::NOT_FOUND),
Internal => ErrCode::internal("internal", StatusCode::INTERNAL_SERVER_ERROR),
InvalidToken => ErrCode::authentication("invalid_token", StatusCode::FORBIDDEN),
InvalidGeoField => ErrCode::invalid("invalid_geo_field", StatusCode::BAD_REQUEST),
InvalidToken => ErrCode::authentication("invalid_api_key", StatusCode::FORBIDDEN),
MissingAuthorizationHeader => {
ErrCode::authentication("missing_authorization_header", StatusCode::UNAUTHORIZED)
}
NotFound => ErrCode::invalid("not_found", StatusCode::NOT_FOUND),
TaskNotFound => ErrCode::invalid("task_not_found", StatusCode::NOT_FOUND),
DumpNotFound => ErrCode::invalid("dump_not_found", StatusCode::NOT_FOUND),
NoSpaceLeftOnDevice => {
ErrCode::internal("no_space_left_on_device", StatusCode::INTERNAL_SERVER_ERROR)
}
PayloadTooLarge => ErrCode::invalid("payload_too_large", StatusCode::PAYLOAD_TOO_LARGE),
RetrieveDocument => {
ErrCode::internal("unretrievable_document", StatusCode::BAD_REQUEST)
@ -137,11 +243,38 @@ impl Code {
// error related to dump
DumpAlreadyInProgress => {
ErrCode::invalid("dump_already_in_progress", StatusCode::CONFLICT)
ErrCode::invalid("dump_already_processing", StatusCode::CONFLICT)
}
DumpProcessFailed => {
ErrCode::internal("dump_process_failed", StatusCode::INTERNAL_SERVER_ERROR)
}
MissingContentType => {
ErrCode::invalid("missing_content_type", StatusCode::UNSUPPORTED_MEDIA_TYPE)
}
MalformedPayload => ErrCode::invalid("malformed_payload", StatusCode::BAD_REQUEST),
InvalidContentType => {
ErrCode::invalid("invalid_content_type", StatusCode::UNSUPPORTED_MEDIA_TYPE)
}
MissingPayload => ErrCode::invalid("missing_payload", StatusCode::BAD_REQUEST),
// error related to keys
ApiKeyNotFound => ErrCode::invalid("api_key_not_found", StatusCode::NOT_FOUND),
MissingParameter => ErrCode::invalid("missing_parameter", StatusCode::BAD_REQUEST),
InvalidApiKeyActions => {
ErrCode::invalid("invalid_api_key_actions", StatusCode::BAD_REQUEST)
}
InvalidApiKeyIndexes => {
ErrCode::invalid("invalid_api_key_indexes", StatusCode::BAD_REQUEST)
}
InvalidApiKeyExpiresAt => {
ErrCode::invalid("invalid_api_key_expires_at", StatusCode::BAD_REQUEST)
}
InvalidApiKeyDescription => {
ErrCode::invalid("invalid_api_key_description", StatusCode::BAD_REQUEST)
}
InvalidMinWordLengthForTypo => {
ErrCode::invalid("invalid_min_word_length_for_typo", StatusCode::BAD_REQUEST)
}
}
}
@ -198,3 +331,27 @@ impl ErrCode {
}
}
}
#[cfg(feature = "test-traits")]
mod strategy {
use proptest::strategy::Strategy;
use super::*;
pub(super) fn status_code_strategy() -> impl Strategy<Value = StatusCode> {
(100..999u16).prop_map(|i| StatusCode::from_u16(i).unwrap())
}
}
#[macro_export]
macro_rules! internal_error {
($target:ty : $($other:path), *) => {
$(
impl From<$other> for $target {
fn from(other: $other) -> Self {
Self::Internal(Box::new(other))
}
}
)*
}
}

View File

@ -1,109 +1,95 @@
[package]
authors = ["Quentin de Quelen <quentin@dequelen.me>", "Clément Renault <clement@meilisearch.com>"]
description = "MeiliSearch HTTP server"
edition = "2018"
description = "Meilisearch HTTP server"
edition = "2021"
license = "MIT"
name = "meilisearch-http"
version = "0.21.0"
version = "0.26.0"
[[bin]]
name = "meilisearch"
path = "src/main.rs"
[build-dependencies]
actix-web-static-files = { git = "https://github.com/MarinPostma/actix-web-static-files.git", rev = "6db8c3e", optional = true }
anyhow = { version = "*", optional = true }
cargo_toml = { version = "0.9.0", optional = true }
anyhow = { version = "1.0.56", optional = true }
cargo_toml = { version = "0.11.4", optional = true }
hex = { version = "0.4.3", optional = true }
reqwest = { version = "0.11.3", features = ["blocking", "rustls-tls"], default-features = false, optional = true }
sha-1 = { version = "0.9.4", optional = true }
tempfile = { version = "3.1.0", optional = true }
vergen = "3.1.0"
zip = { version = "0.5.12", optional = true }
reqwest = { version = "0.11.9", features = ["blocking", "rustls-tls"], default-features = false, optional = true }
sha-1 = { version = "0.10.0", optional = true }
static-files = { version = "0.2.3", optional = true }
tempfile = { version = "3.3.0", optional = true }
vergen = { version = "7.0.0", default-features = false, features = ["git"] }
zip = { version = "0.5.13", optional = true }
[dependencies]
actix-cors = { git = "https://github.com/MarinPostma/actix-extras.git", rev = "2dac1a4"}
actix-http = { version = "=3.0.0-beta.6" }
actix-service = "2.0.0"
actix-web = { version = "=4.0.0-beta.6", features = ["rustls"] }
actix-web-static-files = { git = "https://github.com/MarinPostma/actix-web-static-files.git", rev = "6db8c3e", optional = true }
anyhow = "1.0.36"
async-stream = "0.3.0"
async-trait = "0.1.42"
arc-swap = "1.2.0"
byte-unit = { version = "4.0.9", default-features = false, features = ["std"] }
bytes = "0.6.0"
chrono = { version = "0.4.19", features = ["serde"] }
crossbeam-channel = "0.5.0"
actix-cors = "0.6.1"
actix-web = { version = "4.0.1", default-features = false, features = ["macros", "compress-brotli", "compress-gzip", "cookies", "rustls"] }
actix-web-static-files = { git = "https://github.com/kilork/actix-web-static-files.git", rev = "2d3b6160", optional = true }
anyhow = { version = "1.0.56", features = ["backtrace"] }
async-stream = "0.3.3"
async-trait = "0.1.52"
bstr = "0.2.17"
byte-unit = { version = "4.0.14", default-features = false, features = ["std", "serde"] }
bytes = "1.1.0"
clap = { version = "3.1.6", features = ["derive", "env"] }
crossbeam-channel = "0.5.2"
either = "1.6.1"
env_logger = "0.8.2"
flate2 = "1.0.19"
fst = "0.4.5"
futures = "0.3.7"
futures-util = "0.3.8"
grenad = { git = "https://github.com/Kerollmops/grenad.git", rev = "3adcb26" }
heed = { git = "https://github.com/Kerollmops/heed", tag = "v0.12.0" }
http = "0.2.1"
indexmap = { version = "1.3.2", features = ["serde-1"] }
itertools = "0.10.0"
log = "0.4.8"
main_error = "0.1.0"
env_logger = "0.9.0"
flate2 = "1.0.22"
fst = "0.4.7"
futures = "0.3.21"
futures-util = "0.3.21"
http = "0.2.6"
indexmap = { version = "1.8.0", features = ["serde-1"] }
itertools = "0.10.3"
jsonwebtoken = "8.0.1"
log = "0.4.14"
meilisearch-auth = { path = "../meilisearch-auth" }
meilisearch-error = { path = "../meilisearch-error" }
meilisearch-tokenizer = { git = "https://github.com/meilisearch/tokenizer.git", tag = "v0.2.3" }
memmap = "0.7.0"
milli = { git = "https://github.com/meilisearch/milli.git", tag = "v0.7.1" }
meilisearch-lib = { path = "../meilisearch-lib" }
mime = "0.3.16"
num_cpus = "1.13.0"
once_cell = "1.5.2"
oxidized-json-checker = "0.3.2"
parking_lot = "0.11.1"
rand = "0.7.3"
rayon = "1.5.0"
regex = "1.4.2"
rustls = "0.19"
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0.59", features = ["preserve_order"] }
sha2 = "0.9.1"
siphasher = "0.3.2"
slice-group-by = "0.2.6"
structopt = "0.3.20"
tar = "0.4.29"
tempfile = "3.1.0"
thiserror = "1.0.24"
tokio = { version = "1", features = ["full"] }
uuid = { version = "0.8.2", features = ["serde"] }
num_cpus = "1.13.1"
obkv = "0.2.0"
once_cell = "1.10.0"
parking_lot = "0.12.0"
pin-project-lite = "0.2.8"
platform-dirs = "0.3.0"
rand = "0.8.5"
rayon = "1.5.1"
regex = "1.5.5"
rustls = "0.20.4"
rustls-pemfile = "0.3.0"
segment = { version = "0.2.0", optional = true }
serde = { version = "1.0.136", features = ["derive"] }
serde_json = { version = "1.0.79", features = ["preserve_order"] }
sha2 = "0.10.2"
siphasher = "0.3.10"
slice-group-by = "0.3.0"
static-files = { version = "0.2.3", optional = true }
sysinfo = "0.23.5"
tar = "0.4.38"
tempfile = "3.3.0"
thiserror = "1.0.30"
time = { version = "0.3.7", features = ["serde-well-known", "formatting", "parsing", "macros"] }
tokio = { version = "1.17.0", features = ["full"] }
tokio-stream = "0.1.8"
uuid = { version = "0.8.2", features = ["serde"] }
walkdir = "2.3.2"
obkv = "0.1.1"
pin-project = "1.0.7"
whoami = { version = "1.1.2", optional = true }
reqwest = { version = "0.11.3", features = ["json", "rustls-tls"], default-features = false, optional = true }
[dependencies.sentry]
default-features = false
features = [
"backtrace",
"contexts",
"panic",
"reqwest",
"rustls",
"log",
]
optional = true
version = "0.22.0"
[dev-dependencies]
actix-rt = "2.1.0"
assert-json-diff = { branch = "master", git = "https://github.com/qdequele/assert-json-diff" }
mockall = "0.9.1"
paste = "1.0.5"
serde_url_params = "0.2.0"
tempdir = "0.3.7"
urlencoding = "1.1.1"
actix-rt = "2.7.0"
assert-json-diff = "2.0.1"
maplit = "1.0.2"
paste = "1.0.6"
serde_url_params = "0.2.1"
urlencoding = "2.1.0"
[features]
default = ["analytics", "mini-dashboard"]
analytics = ["segment"]
mini-dashboard = [
"actix-web-static-files",
"static-files",
"anyhow",
"cargo_toml",
"hex",
@ -112,12 +98,10 @@ mini-dashboard = [
"tempfile",
"zip",
]
analytics = ["sentry", "whoami", "reqwest"]
default = ["analytics", "mini-dashboard"]
[target.'cfg(target_os = "linux")'.dependencies]
jemallocator = "0.3.2"
tikv-jemallocator = "0.4.3"
[package.metadata.mini-dashboard]
assets-url = "https://github.com/meilisearch/mini-dashboard/releases/download/v0.1.3/build.zip"
sha1 = "fea1780e13d8e570e35a1921e7a45cabcd501d5e"
assets-url = "https://github.com/meilisearch/mini-dashboard/releases/download/v0.1.9/build.zip"
sha1 = "b1833c3e5dc6b5d9d519ae4834935ae6c8a47024"


@ -1,12 +1,9 @@
use vergen::{generate_cargo_keys, ConstantsFlags};
use vergen::{vergen, Config};
fn main() {
// Setup the flags, toggling off the 'SEMVER_FROM_CARGO_PKG' flag
let mut flags = ConstantsFlags::all();
flags.toggle(ConstantsFlags::SEMVER_FROM_CARGO_PKG);
// Generate the 'cargo:' key output
generate_cargo_keys(ConstantsFlags::all()).expect("Unable to generate the cargo keys!");
if let Err(e) = vergen(Config::default()) {
println!("cargo:warning=vergen: {}", e);
}
#[cfg(feature = "mini-dashboard")]
mini_dashboard::setup_mini_dashboard().expect("Could not load the mini-dashboard assets");
@ -19,11 +16,11 @@ mod mini_dashboard {
use std::io::{Cursor, Read, Write};
use std::path::PathBuf;
use actix_web_static_files::resource_dir;
use anyhow::Context;
use cargo_toml::Manifest;
use reqwest::blocking::get;
use sha1::{Digest, Sha1};
use static_files::resource_dir;
pub fn setup_mini_dashboard() -> anyhow::Result<()> {
let cargo_manifest_dir = PathBuf::from(env::var("CARGO_MANIFEST_DIR").unwrap());


@ -1,126 +0,0 @@
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
use log::debug;
use serde::Serialize;
use siphasher::sip::SipHasher;
use crate::Data;
use crate::Opt;
const AMPLITUDE_API_KEY: &str = "f7fba398780e06d8fe6666a9be7e3d47";
#[derive(Debug, Serialize)]
struct EventProperties {
database_size: u64,
last_update_timestamp: Option<i64>, //timestamp
number_of_documents: Vec<u64>,
}
impl EventProperties {
async fn from(data: Data) -> anyhow::Result<EventProperties> {
let stats = data.index_controller.get_all_stats().await?;
let database_size = stats.database_size;
let last_update_timestamp = stats.last_update.map(|u| u.timestamp());
let number_of_documents = stats
.indexes
.values()
.map(|index| index.number_of_documents)
.collect();
Ok(EventProperties {
database_size,
last_update_timestamp,
number_of_documents,
})
}
}
#[derive(Debug, Serialize)]
struct UserProperties<'a> {
env: &'a str,
start_since_days: u64,
user_email: Option<String>,
server_provider: Option<String>,
}
#[derive(Debug, Serialize)]
struct Event<'a> {
user_id: &'a str,
event_type: &'a str,
device_id: &'a str,
time: u64,
app_version: &'a str,
user_properties: UserProperties<'a>,
event_properties: Option<EventProperties>,
}
#[derive(Debug, Serialize)]
struct AmplitudeRequest<'a> {
api_key: &'a str,
events: Vec<Event<'a>>,
}
pub async fn analytics_sender(data: Data, opt: Opt) {
let username = whoami::username();
let hostname = whoami::hostname();
let platform = whoami::platform();
let uid = username + &hostname + &platform.to_string();
let mut hasher = SipHasher::new();
uid.hash(&mut hasher);
let hash = hasher.finish();
let uid = format!("{:X}", hash);
let platform = platform.to_string();
let first_start = Instant::now();
loop {
let n = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
let user_id = &uid;
let device_id = &platform;
let time = n.as_secs();
let event_type = "runtime_tick";
let elapsed_since_start = first_start.elapsed().as_secs() / 86_400; // One day
let event_properties = EventProperties::from(data.clone()).await.ok();
let app_version = env!("CARGO_PKG_VERSION").to_string();
let app_version = app_version.as_str();
let user_email = std::env::var("MEILI_USER_EMAIL").ok();
let server_provider = std::env::var("MEILI_SERVER_PROVIDER").ok();
let user_properties = UserProperties {
env: &opt.env,
start_since_days: elapsed_since_start,
user_email,
server_provider,
};
let event = Event {
user_id,
event_type,
device_id,
time,
app_version,
user_properties,
event_properties,
};
let request = AmplitudeRequest {
api_key: AMPLITUDE_API_KEY,
events: vec![event],
};
let response = reqwest::Client::new()
.post("https://api2.amplitude.com/2/httpapi")
.timeout(Duration::from_secs(60)) // 1 minute max
.json(&request)
.send()
.await;
if let Err(e) = response {
debug!("Unsuccessful call to Amplitude: {}", e);
}
tokio::time::sleep(Duration::from_secs(3600)).await;
}
}


@ -0,0 +1,51 @@
use std::{any::Any, sync::Arc};
use actix_web::HttpRequest;
use serde_json::Value;
use crate::{routes::indexes::documents::UpdateDocumentsQuery, Opt};
use super::{find_user_id, Analytics};
pub struct MockAnalytics;
#[derive(Default)]
pub struct SearchAggregator {}
#[allow(dead_code)]
impl SearchAggregator {
pub fn from_query(_: &dyn Any, _: &dyn Any) -> Self {
Self::default()
}
pub fn succeed(&mut self, _: &dyn Any) {}
}
impl MockAnalytics {
#[allow(clippy::new_ret_no_self)]
pub fn new(opt: &Opt) -> (Arc<dyn Analytics>, String) {
let user = find_user_id(&opt.db_path).unwrap_or_default();
(Arc::new(Self), user)
}
}
impl Analytics for MockAnalytics {
// These methods are noop and should be optimized out
fn publish(&self, _event_name: String, _send: Value, _request: Option<&HttpRequest>) {}
fn get_search(&self, _aggregate: super::SearchAggregator) {}
fn post_search(&self, _aggregate: super::SearchAggregator) {}
fn add_documents(
&self,
_documents_query: &UpdateDocumentsQuery,
_index_creation: bool,
_request: &HttpRequest,
) {
}
fn update_documents(
&self,
_documents_query: &UpdateDocumentsQuery,
_index_creation: bool,
_request: &HttpRequest,
) {
}
}


@ -0,0 +1,84 @@
mod mock_analytics;
// if we are in release mode and the analytics feature is enabled
#[cfg(all(not(debug_assertions), feature = "analytics"))]
mod segment_analytics;
use std::fs;
use std::path::{Path, PathBuf};
use actix_web::HttpRequest;
use once_cell::sync::Lazy;
use platform_dirs::AppDirs;
use serde_json::Value;
use crate::routes::indexes::documents::UpdateDocumentsQuery;
pub use mock_analytics::MockAnalytics;
// if we are in debug mode OR the analytics feature is disabled
// the `SegmentAnalytics` points to the mock instead of the real analytics
#[cfg(any(debug_assertions, not(feature = "analytics")))]
pub type SegmentAnalytics = mock_analytics::MockAnalytics;
#[cfg(any(debug_assertions, not(feature = "analytics")))]
pub type SearchAggregator = mock_analytics::SearchAggregator;
// if we are in release mode and the analytics feature is enabled
// we use the real analytics
#[cfg(all(not(debug_assertions), feature = "analytics"))]
pub type SegmentAnalytics = segment_analytics::SegmentAnalytics;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
pub type SearchAggregator = segment_analytics::SearchAggregator;
/// The Meilisearch config dir:
/// `~/.config/Meilisearch` on *NIX or *BSD.
/// `~/Library/Application Support` on macOS.
/// `%APPDATA%` (= `C:\Users\%USERNAME%\AppData\Roaming`) on Windows.
static MEILISEARCH_CONFIG_PATH: Lazy<Option<PathBuf>> =
Lazy::new(|| AppDirs::new(Some("Meilisearch"), false).map(|appdir| appdir.config_dir));
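/// Builds the per-instance uid file path inside the config dir: a hypothetical `db_path` of
/// `/var/lib/meilisearch/data.ms` maps to `~/.config/Meilisearch/var-lib-meilisearch-data.ms-instance-uid`.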
fn config_user_id_path(db_path: &Path) -> Option<PathBuf> {
db_path
.canonicalize()
.ok()
.map(|path| {
path.join("instance-uid")
.display()
.to_string()
.replace('/', "-")
})
.zip(MEILISEARCH_CONFIG_PATH.as_ref())
.map(|(filename, config_path)| config_path.join(filename.trim_start_matches('-')))
}
/// Look for the instance-uid in the `data.ms` or in `~/.config/Meilisearch/path-to-db-instance-uid`
fn find_user_id(db_path: &Path) -> Option<String> {
fs::read_to_string(db_path.join("instance-uid"))
.ok()
.or_else(|| fs::read_to_string(&config_user_id_path(db_path)?).ok())
}
pub trait Analytics: Sync + Send {
/// The method used to publish most analytics that do not need to be batched every hour
fn publish(&self, event_name: String, send: Value, request: Option<&HttpRequest>);
/// This method should be called to aggregate a get search
fn get_search(&self, aggregate: SearchAggregator);
/// This method should be called to aggregate a post search
fn post_search(&self, aggregate: SearchAggregator);
// This method should be called to aggregate an add documents request
fn add_documents(
&self,
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
);
// This method should be called to batch an update documents request
fn update_documents(
&self,
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
);
}
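For orientation, here is a minimal sketch of how a route handler is expected to drive this trait, assuming the `Arc<dyn Analytics>` is exposed to handlers through `web::Data`; the handler name, the `perform_search` helper, and the event name are hypothetical and not part of this diff:
```
use actix_web::{web, HttpRequest, HttpResponse};
use serde_json::json;

// Hypothetical handler: `Analytics`, `SearchAggregator`, `SearchQuery` and
// `ResponseError` are the types from this changeset; `perform_search` is a
// stand-in for the real search call.
async fn search_with_post(
    analytics: web::Data<dyn Analytics>,
    params: web::Json<SearchQuery>,
    req: HttpRequest,
) -> Result<HttpResponse, ResponseError> {
    // Build the aggregate up front so the request is counted even if it fails.
    let mut aggregate = SearchAggregator::from_query(&params, &req);
    let result = perform_search(params.into_inner()).await;
    if let Ok(ref result) = result {
        aggregate.succeed(result);
    }
    // The aggregate is only turned into a Segment event when the hourly tick fires.
    analytics.post_search(aggregate);
    // One-off events can also be published directly (the event name here is made up).
    analytics.publish("Search Performed".to_string(), json!({}), Some(&req));
    result.map(|r| HttpResponse::Ok().json(r))
}
```
In the mock build the same calls compile but do nothing, so handlers never need to know which implementation is active.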


@ -0,0 +1,586 @@
use std::collections::{BinaryHeap, HashMap, HashSet};
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use std::time::{Duration, Instant};
use actix_web::http::header::USER_AGENT;
use actix_web::HttpRequest;
use http::header::CONTENT_TYPE;
use meilisearch_auth::SearchRules;
use meilisearch_lib::index::{SearchQuery, SearchResult};
use meilisearch_lib::index_controller::Stats;
use meilisearch_lib::MeiliSearch;
use once_cell::sync::Lazy;
use regex::Regex;
use segment::message::{Identify, Track, User};
use segment::{AutoBatcher, Batcher, HttpClient};
use serde_json::{json, Value};
use sysinfo::{DiskExt, System, SystemExt};
use time::OffsetDateTime;
use tokio::select;
use tokio::sync::mpsc::{self, Receiver, Sender};
use uuid::Uuid;
use crate::analytics::Analytics;
use crate::routes::indexes::documents::UpdateDocumentsQuery;
use crate::Opt;
use super::{config_user_id_path, MEILISEARCH_CONFIG_PATH};
/// Write the instance-uid in the `data.ms` and in `~/.config/MeiliSearch/path-to-db-instance-uid`. Ignore the errors.
fn write_user_id(db_path: &Path, user_id: &str) {
let _ = fs::write(db_path.join("instance-uid"), user_id.as_bytes());
if let Some((meilisearch_config_path, user_id_path)) = MEILISEARCH_CONFIG_PATH
.as_ref()
.zip(config_user_id_path(db_path))
{
let _ = fs::create_dir_all(&meilisearch_config_path);
let _ = fs::write(user_id_path, user_id.as_bytes());
}
}
const SEGMENT_API_KEY: &str = "P3FWhhEsJiEDCuEHpmcN9DHcK4hVfBvb";
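/// Extracts the user agents from the request by splitting the `User-Agent` header on `;`
/// (e.g. a header of `foo; bar` yields `["foo", "bar"]`); a missing header yields `["unknown"]`.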
pub fn extract_user_agents(request: &HttpRequest) -> Vec<String> {
request
.headers()
.get(USER_AGENT)
.map(|header| header.to_str().ok())
.flatten()
.unwrap_or("unknown")
.split(';')
.map(str::trim)
.map(ToString::to_string)
.collect()
}
pub enum AnalyticsMsg {
BatchMessage(Track),
AggregateGetSearch(SearchAggregator),
AggregatePostSearch(SearchAggregator),
AggregateAddDocuments(DocumentsAggregator),
AggregateUpdateDocuments(DocumentsAggregator),
}
pub struct SegmentAnalytics {
sender: Sender<AnalyticsMsg>,
user: User,
}
impl SegmentAnalytics {
pub async fn new(opt: &Opt, meilisearch: &MeiliSearch) -> (Arc<dyn Analytics>, String) {
let user_id = super::find_user_id(&opt.db_path);
let first_time_run = user_id.is_none();
let user_id = user_id.unwrap_or_else(|| Uuid::new_v4().to_string());
write_user_id(&opt.db_path, &user_id);
let client = HttpClient::default();
let user = User::UserId { user_id };
let mut batcher = AutoBatcher::new(client, Batcher::new(None), SEGMENT_API_KEY.to_string());
// If Meilisearch is launched for the first time:
// 1. Send a Launched event associated with the user `total_launch`.
// 2. Batch a Launched event with the real instance-id and send it in one hour.
if first_time_run {
let _ = batcher
.push(Track {
user: User::UserId {
user_id: "total_launch".to_string(),
},
event: "Launched".to_string(),
..Default::default()
})
.await;
let _ = batcher.flush().await;
let _ = batcher
.push(Track {
user: user.clone(),
event: "Launched".to_string(),
..Default::default()
})
.await;
}
let (sender, inbox) = mpsc::channel(100); // How many analytics events we can buffer
let segment = Box::new(Segment {
inbox,
user: user.clone(),
opt: opt.clone(),
batcher,
post_search_aggregator: SearchAggregator::default(),
get_search_aggregator: SearchAggregator::default(),
add_documents_aggregator: DocumentsAggregator::default(),
update_documents_aggregator: DocumentsAggregator::default(),
});
tokio::spawn(segment.run(meilisearch.clone()));
let this = Self {
sender,
user: user.clone(),
};
(Arc::new(this), user.to_string())
}
}
impl super::Analytics for SegmentAnalytics {
fn publish(&self, event_name: String, mut send: Value, request: Option<&HttpRequest>) {
let user_agent = request
.map(|req| req.headers().get(USER_AGENT))
.flatten()
.map(|header| header.to_str().unwrap_or("unknown"))
.map(|s| s.split(';').map(str::trim).collect::<Vec<&str>>());
send["user-agent"] = json!(user_agent);
let event = Track {
user: self.user.clone(),
event: event_name.clone(),
properties: send,
..Default::default()
};
let _ = self
.sender
.try_send(AnalyticsMsg::BatchMessage(event.into()));
}
fn get_search(&self, aggregate: SearchAggregator) {
let _ = self
.sender
.try_send(AnalyticsMsg::AggregateGetSearch(aggregate));
}
fn post_search(&self, aggregate: SearchAggregator) {
let _ = self
.sender
.try_send(AnalyticsMsg::AggregatePostSearch(aggregate));
}
fn add_documents(
&self,
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
) {
let aggregate = DocumentsAggregator::from_query(documents_query, index_creation, request);
let _ = self
.sender
.try_send(AnalyticsMsg::AggregateAddDocuments(aggregate));
}
fn update_documents(
&self,
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
) {
let aggregate = DocumentsAggregator::from_query(documents_query, index_creation, request);
let _ = self
.sender
.try_send(AnalyticsMsg::AggregateUpdateDocuments(aggregate));
}
}
pub struct Segment {
inbox: Receiver<AnalyticsMsg>,
user: User,
opt: Opt,
batcher: AutoBatcher,
get_search_aggregator: SearchAggregator,
post_search_aggregator: SearchAggregator,
add_documents_aggregator: DocumentsAggregator,
update_documents_aggregator: DocumentsAggregator,
}
impl Segment {
fn compute_traits(opt: &Opt, stats: Stats) -> Value {
static FIRST_START_TIMESTAMP: Lazy<Instant> = Lazy::new(Instant::now);
static SYSTEM: Lazy<Value> = Lazy::new(|| {
let mut sys = System::new_all();
sys.refresh_all();
let kernel_version = sys
.kernel_version()
.map(|k| k.split_once("-").map(|(k, _)| k.to_string()))
.flatten();
json!({
"distribution": sys.name(),
"kernel_version": kernel_version,
"cores": sys.processors().len(),
"ram_size": sys.total_memory(),
"disk_size": sys.disks().iter().map(|disk| disk.total_space()).max(),
"server_provider": std::env::var("MEILI_SERVER_PROVIDER").ok(),
})
});
// The infos are all the cli options, except those containing sensitive information.
// We consider an option sensitive if it contains a path, an address, or a key.
let infos = {
// First we see if any sensitive fields were used.
let db_path = opt.db_path != PathBuf::from("./data.ms");
let import_dump = opt.import_dump.is_some();
let dumps_dir = opt.dumps_dir != PathBuf::from("dumps/");
let import_snapshot = opt.import_snapshot.is_some();
let snapshots_dir = opt.snapshot_dir != PathBuf::from("snapshots/");
let http_addr = opt.http_addr != "127.0.0.1:7700";
let mut infos = serde_json::to_value(opt).unwrap();
// Then we overwrite all sensitive fields with a boolean representing whether
// the feature was used or not.
infos["db_path"] = json!(db_path);
infos["import_dump"] = json!(import_dump);
infos["dumps_dir"] = json!(dumps_dir);
infos["import_snapshot"] = json!(import_snapshot);
infos["snapshot_dir"] = json!(snapshots_dir);
infos["http_addr"] = json!(http_addr);
infos
};
let number_of_documents = stats
.indexes
.values()
.map(|index| index.number_of_documents)
.collect::<Vec<u64>>();
json!({
"start_since_days": FIRST_START_TIMESTAMP.elapsed().as_secs() / (60 * 60 * 24), // one day
"system": *SYSTEM,
"stats": {
"database_size": stats.database_size,
"indexes_number": stats.indexes.len(),
"documents_number": number_of_documents,
},
"infos": infos,
})
}
async fn run(mut self, meilisearch: MeiliSearch) {
const INTERVAL: Duration = Duration::from_secs(60 * 60); // one hour
// The first batch must be sent after one hour.
let mut interval =
tokio::time::interval_at(tokio::time::Instant::now() + INTERVAL, INTERVAL);
loop {
select! {
_ = interval.tick() => {
self.tick(meilisearch.clone()).await;
},
msg = self.inbox.recv() => {
match msg {
Some(AnalyticsMsg::BatchMessage(msg)) => drop(self.batcher.push(msg).await),
Some(AnalyticsMsg::AggregateGetSearch(agreg)) => self.get_search_aggregator.aggregate(agreg),
Some(AnalyticsMsg::AggregatePostSearch(agreg)) => self.post_search_aggregator.aggregate(agreg),
Some(AnalyticsMsg::AggregateAddDocuments(agreg)) => self.add_documents_aggregator.aggregate(agreg),
Some(AnalyticsMsg::AggregateUpdateDocuments(agreg)) => self.update_documents_aggregator.aggregate(agreg),
None => (),
}
}
}
}
}
async fn tick(&mut self, meilisearch: MeiliSearch) {
if let Ok(stats) = meilisearch.get_all_stats(&SearchRules::default()).await {
let _ = self
.batcher
.push(Identify {
context: Some(json!({
"app": {
"version": env!("CARGO_PKG_VERSION").to_string(),
},
})),
user: self.user.clone(),
traits: Self::compute_traits(&self.opt, stats),
..Default::default()
})
.await;
}
let get_search = std::mem::take(&mut self.get_search_aggregator)
.into_event(&self.user, "Documents Searched GET");
let post_search = std::mem::take(&mut self.post_search_aggregator)
.into_event(&self.user, "Documents Searched POST");
let add_documents = std::mem::take(&mut self.add_documents_aggregator)
.into_event(&self.user, "Documents Added");
let update_documents = std::mem::take(&mut self.update_documents_aggregator)
.into_event(&self.user, "Documents Updated");
if let Some(get_search) = get_search {
let _ = self.batcher.push(get_search).await;
}
if let Some(post_search) = post_search {
let _ = self.batcher.push(post_search).await;
}
if let Some(add_documents) = add_documents {
let _ = self.batcher.push(add_documents).await;
}
if let Some(update_documents) = update_documents {
let _ = self.batcher.push(update_documents).await;
}
let _ = self.batcher.flush().await;
}
}
#[derive(Default)]
pub struct SearchAggregator {
timestamp: Option<OffsetDateTime>,
// context
user_agents: HashSet<String>,
// requests
total_received: usize,
total_succeeded: usize,
time_spent: BinaryHeap<usize>,
// sort
sort_with_geo_point: bool,
// every time a request has a sort, this field must be incremented by the number of sort criteria it contains
sort_sum_of_criteria_terms: usize,
// every time a request has a sort, this field must be incremented by one
sort_total_number_of_criteria: usize,
// filter
filter_with_geo_radius: bool,
// every time a request has a filter, this field must be incremented by the number of terms it contains
filter_sum_of_criteria_terms: usize,
// every time a request has a filter, this field must be incremented by one
filter_total_number_of_criteria: usize,
used_syntax: HashMap<String, usize>,
// q
// The maximum number of terms in a q request
max_terms_number: usize,
// pagination
max_limit: usize,
max_offset: usize,
}
impl SearchAggregator {
pub fn from_query(query: &SearchQuery, request: &HttpRequest) -> Self {
let mut ret = Self::default();
ret.timestamp = Some(OffsetDateTime::now_utc());
ret.total_received = 1;
ret.user_agents = extract_user_agents(request).into_iter().collect();
if let Some(ref sort) = query.sort {
ret.sort_total_number_of_criteria = 1;
ret.sort_with_geo_point = sort.iter().any(|s| s.contains("_geoPoint("));
ret.sort_sum_of_criteria_terms = sort.len();
}
if let Some(ref filter) = query.filter {
static RE: Lazy<Regex> = Lazy::new(|| Regex::new("AND | OR").unwrap());
ret.filter_total_number_of_criteria = 1;
let syntax = match filter {
Value::String(_) => "string".to_string(),
Value::Array(values) => {
if values
.iter()
.map(|v| v.to_string())
.any(|s| RE.is_match(&s))
{
"mixed".to_string()
} else {
"array".to_string()
}
}
_ => "none".to_string(),
};
// convert the string to a HashMap
ret.used_syntax.insert(syntax, 1);
let stringified_filters = filter.to_string();
ret.filter_with_geo_radius = stringified_filters.contains("_geoRadius(");
ret.filter_sum_of_criteria_terms = RE.split(&stringified_filters).count();
}
if let Some(ref q) = query.q {
ret.max_terms_number = q.split_whitespace().count();
}
ret.max_limit = query.limit;
ret.max_offset = query.offset.unwrap_or_default();
ret
}
pub fn succeed(&mut self, result: &SearchResult) {
self.total_succeeded = self.total_succeeded.saturating_add(1);
self.time_spent.push(result.processing_time_ms as usize);
}
/// Aggregate one [SearchAggregator] into another.
pub fn aggregate(&mut self, mut other: Self) {
if self.timestamp.is_none() {
self.timestamp = other.timestamp;
}
// context
for user_agent in other.user_agents.into_iter() {
self.user_agents.insert(user_agent);
}
// request
self.total_received = self.total_received.saturating_add(other.total_received);
self.total_succeeded = self.total_succeeded.saturating_add(other.total_succeeded);
self.time_spent.append(&mut other.time_spent);
// sort
self.sort_with_geo_point |= other.sort_with_geo_point;
self.sort_sum_of_criteria_terms = self
.sort_sum_of_criteria_terms
.saturating_add(other.sort_sum_of_criteria_terms);
self.sort_total_number_of_criteria = self
.sort_total_number_of_criteria
.saturating_add(other.sort_total_number_of_criteria);
// filter
self.filter_with_geo_radius |= other.filter_with_geo_radius;
self.filter_sum_of_criteria_terms = self
.filter_sum_of_criteria_terms
.saturating_add(other.filter_sum_of_criteria_terms);
self.filter_total_number_of_criteria = self
.filter_total_number_of_criteria
.saturating_add(other.filter_total_number_of_criteria);
for (key, value) in other.used_syntax.into_iter() {
let used_syntax = self.used_syntax.entry(key).or_insert(0);
*used_syntax = used_syntax.saturating_add(value);
}
// q
self.max_terms_number = self.max_terms_number.max(other.max_terms_number);
// pagination
self.max_limit = self.max_limit.max(other.max_limit);
self.max_offset = self.max_offset.max(other.max_offset);
}
pub fn into_event(self, user: &User, event_name: &str) -> Option<Track> {
if self.total_received == 0 {
None
} else {
// the index of the 99th percentile value
let percentile_99th = 0.99 * (self.total_succeeded as f64 - 1.) + 1.;
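// e.g. for 200 succeeded requests this gives 0.99 * 199. + 1. = 198.01, truncated to index 198 below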
// we get all the values in a sorted manner
let time_spent = self.time_spent.into_sorted_vec();
// We are only interested in the slowest value among the 99% fastest results
let time_spent = time_spent.get(percentile_99th as usize);
let properties = json!({
"user-agent": self.user_agents,
"requests": {
"99th_response_time": time_spent.map(|t| format!("{:.2}", t)),
"total_succeeded": self.total_succeeded,
"total_failed": self.total_received.saturating_sub(self.total_succeeded), // just to be sure we never panics
"total_received": self.total_received,
},
"sort": {
"with_geoPoint": self.sort_with_geo_point,
"avg_criteria_number": format!("{:.2}", self.sort_sum_of_criteria_terms as f64 / self.sort_total_number_of_criteria as f64),
},
"filter": {
"with_geoRadius": self.filter_with_geo_radius,
"avg_criteria_number": format!("{:.2}", self.filter_sum_of_criteria_terms as f64 / self.filter_total_number_of_criteria as f64),
"most_used_syntax": self.used_syntax.iter().max_by_key(|(_, v)| *v).map(|(k, _)| json!(k)).unwrap_or_else(|| json!(null)),
},
"q": {
"max_terms_number": self.max_terms_number,
},
"pagination": {
"max_limit": self.max_limit,
"max_offset": self.max_offset,
},
});
Some(Track {
timestamp: self.timestamp,
user: user.clone(),
event: event_name.to_string(),
properties,
..Default::default()
})
}
}
}
#[derive(Default)]
pub struct DocumentsAggregator {
timestamp: Option<OffsetDateTime>,
// set to true when at least one request was received
updated: bool,
// context
user_agents: HashSet<String>,
content_types: HashSet<String>,
primary_keys: HashSet<String>,
index_creation: bool,
}
impl DocumentsAggregator {
pub fn from_query(
documents_query: &UpdateDocumentsQuery,
index_creation: bool,
request: &HttpRequest,
) -> Self {
let mut ret = Self::default();
ret.timestamp = Some(OffsetDateTime::now_utc());
ret.updated = true;
ret.user_agents = extract_user_agents(request).into_iter().collect();
if let Some(primary_key) = documents_query.primary_key.clone() {
ret.primary_keys.insert(primary_key);
}
let content_type = request
.headers()
.get(CONTENT_TYPE)
.map(|s| s.to_str().unwrap_or("unkown"))
.unwrap_or("unkown")
.to_string();
ret.content_types.insert(content_type);
ret.index_creation = index_creation;
ret
}
/// Aggregate one [DocumentsAggregator] into another.
pub fn aggregate(&mut self, other: Self) {
if self.timestamp.is_none() {
self.timestamp = other.timestamp;
}
self.updated |= other.updated;
// we can't create a union because there is no `into_union` method
for user_agent in other.user_agents.into_iter() {
self.user_agents.insert(user_agent);
}
for primary_key in other.primary_keys.into_iter() {
self.primary_keys.insert(primary_key);
}
for content_type in other.content_types.into_iter() {
self.content_types.insert(content_type);
}
self.index_creation |= other.index_creation;
}
pub fn into_event(self, user: &User, event_name: &str) -> Option<Track> {
if !self.updated {
None
} else {
let properties = json!({
"user-agent": self.user_agents,
"payload_type": self.content_types,
"primary_key": self.primary_keys,
"index_creation": self.index_creation,
});
Some(Track {
timestamp: self.timestamp,
user: user.clone(),
event: event_name.to_string(),
properties,
..Default::default()
})
}
}
}


@ -1,133 +0,0 @@
use std::ops::Deref;
use std::sync::Arc;
use sha2::Digest;
use crate::index::{Checked, Settings};
use crate::index_controller::{
error::Result, DumpInfo, IndexController, IndexMetadata, IndexSettings, IndexStats, Stats,
};
use crate::option::Opt;
pub mod search;
mod updates;
#[derive(Clone)]
pub struct Data {
inner: Arc<DataInner>,
}
impl Deref for Data {
type Target = DataInner;
fn deref(&self) -> &Self::Target {
&self.inner
}
}
pub struct DataInner {
pub index_controller: IndexController,
pub api_keys: ApiKeys,
options: Opt,
}
#[derive(Clone)]
pub struct ApiKeys {
pub public: Option<String>,
pub private: Option<String>,
pub master: Option<String>,
}
impl ApiKeys {
pub fn generate_missing_api_keys(&mut self) {
if let Some(master_key) = &self.master {
if self.private.is_none() {
let key = format!("{}-private", master_key);
let sha = sha2::Sha256::digest(key.as_bytes());
self.private = Some(format!("{:x}", sha));
}
if self.public.is_none() {
let key = format!("{}-public", master_key);
let sha = sha2::Sha256::digest(key.as_bytes());
self.public = Some(format!("{:x}", sha));
}
}
}
}
impl Data {
pub fn new(options: Opt) -> anyhow::Result<Data> {
let path = options.db_path.clone();
let index_controller = IndexController::new(&path, &options)?;
let mut api_keys = ApiKeys {
master: options.clone().master_key,
private: None,
public: None,
};
api_keys.generate_missing_api_keys();
let inner = DataInner {
index_controller,
api_keys,
options,
};
let inner = Arc::new(inner);
Ok(Data { inner })
}
pub async fn settings(&self, uid: String) -> Result<Settings<Checked>> {
self.index_controller.settings(uid).await
}
pub async fn list_indexes(&self) -> Result<Vec<IndexMetadata>> {
self.index_controller.list_indexes().await
}
pub async fn index(&self, uid: String) -> Result<IndexMetadata> {
self.index_controller.get_index(uid).await
}
pub async fn create_index(
&self,
uid: String,
primary_key: Option<String>,
) -> Result<IndexMetadata> {
let settings = IndexSettings {
uid: Some(uid),
primary_key,
};
let meta = self.index_controller.create_index(settings).await?;
Ok(meta)
}
pub async fn get_index_stats(&self, uid: String) -> Result<IndexStats> {
Ok(self.index_controller.get_index_stats(uid).await?)
}
pub async fn get_all_stats(&self) -> Result<Stats> {
Ok(self.index_controller.get_all_stats().await?)
}
pub async fn create_dump(&self) -> Result<DumpInfo> {
Ok(self.index_controller.create_dump().await?)
}
pub async fn dump_status(&self, uid: String) -> Result<DumpInfo> {
Ok(self.index_controller.dump_info(uid).await?)
}
#[inline]
pub fn http_payload_size_limit(&self) -> usize {
self.options.http_payload_size_limit.get_bytes() as usize
}
#[inline]
pub fn api_keys(&self) -> &ApiKeys {
&self.api_keys
}
}


@ -1,34 +0,0 @@
use serde_json::{Map, Value};
use super::Data;
use crate::index::{SearchQuery, SearchResult};
use crate::index_controller::error::Result;
impl Data {
pub async fn search(&self, index: String, search_query: SearchQuery) -> Result<SearchResult> {
self.index_controller.search(index, search_query).await
}
pub async fn retrieve_documents(
&self,
index: String,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Map<String, Value>>> {
self.index_controller
.documents(index, offset, limit, attributes_to_retrieve)
.await
}
pub async fn retrieve_document(
&self,
index: String,
document_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Map<String, Value>> {
self.index_controller
.document(index, document_id, attributes_to_retrieve)
.await
}
}


@ -1,80 +0,0 @@
use milli::update::{IndexDocumentsMethod, UpdateFormat};
use crate::extractors::payload::Payload;
use crate::index::{Checked, Settings};
use crate::index_controller::{error::Result, IndexMetadata, IndexSettings, UpdateStatus};
use crate::Data;
impl Data {
pub async fn add_documents(
&self,
index: String,
method: IndexDocumentsMethod,
format: UpdateFormat,
stream: Payload,
primary_key: Option<String>,
) -> Result<UpdateStatus> {
let update_status = self
.index_controller
.add_documents(index, method, format, stream, primary_key)
.await?;
Ok(update_status)
}
pub async fn update_settings(
&self,
index: String,
settings: Settings<Checked>,
create: bool,
) -> Result<UpdateStatus> {
let update = self
.index_controller
.update_settings(index, settings, create)
.await?;
Ok(update)
}
pub async fn clear_documents(&self, index: String) -> Result<UpdateStatus> {
let update = self.index_controller.clear_documents(index).await?;
Ok(update)
}
pub async fn delete_documents(
&self,
index: String,
document_ids: Vec<String>,
) -> Result<UpdateStatus> {
let update = self
.index_controller
.delete_documents(index, document_ids)
.await?;
Ok(update)
}
pub async fn delete_index(&self, index: String) -> Result<()> {
self.index_controller.delete_index(index).await?;
Ok(())
}
pub async fn get_update_status(&self, index: String, uid: u64) -> Result<UpdateStatus> {
self.index_controller.update_status(index, uid).await
}
pub async fn get_updates_status(&self, index: String) -> Result<Vec<UpdateStatus>> {
self.index_controller.all_update_status(index).await
}
pub async fn update_index(
&self,
uid: String,
primary_key: Option<String>,
new_uid: Option<String>,
) -> Result<IndexMetadata> {
let settings = IndexSettings {
uid: new_uid,
primary_key,
};
self.index_controller.update_index(uid, settings).await
}
}


@ -1,141 +1,57 @@
use std::error::Error;
use std::fmt;
use actix_web as aweb;
use actix_web::body::Body;
use actix_web::dev::BaseHttpResponseBuilder;
use actix_web::http::StatusCode;
use aweb::error::{JsonPayloadError, QueryPayloadError};
use meilisearch_error::{Code, ErrorCode};
use milli::UserError;
use serde::{Deserialize, Serialize};
use meilisearch_error::{Code, ErrorCode, ResponseError};
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct ResponseError {
#[serde(skip)]
code: StatusCode,
message: String,
error_code: String,
error_type: String,
error_link: String,
#[derive(Debug, thiserror::Error)]
pub enum MeilisearchHttpError {
#[error("A Content-Type header is missing. Accepted values for the Content-Type header are: {}",
.0.iter().map(|s| format!("`{}`", s)).collect::<Vec<_>>().join(", "))]
MissingContentType(Vec<String>),
#[error(
"The Content-Type `{0}` is invalid. Accepted values for the Content-Type header are: {}",
.1.iter().map(|s| format!("`{}`", s)).collect::<Vec<_>>().join(", ")
)]
InvalidContentType(String, Vec<String>),
}
impl fmt::Display for ResponseError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
self.message.fmt(f)
}
}
impl<T> From<T> for ResponseError
where
T: ErrorCode,
{
fn from(other: T) -> Self {
Self {
code: other.http_status(),
message: other.to_string(),
error_code: other.error_name(),
error_type: other.error_type(),
error_link: other.error_url(),
}
}
}
impl aweb::error::ResponseError for ResponseError {
fn error_response(&self) -> aweb::BaseHttpResponse<Body> {
let json = serde_json::to_vec(self).unwrap();
BaseHttpResponseBuilder::new(self.status_code())
.content_type("application/json")
.body(json)
}
fn status_code(&self) -> StatusCode {
self.code
}
}
macro_rules! internal_error {
($target:ty : $($other:path), *) => {
$(
impl From<$other> for $target {
fn from(other: $other) -> Self {
Self::Internal(Box::new(other))
}
}
)*
}
}
#[derive(Debug)]
pub struct MilliError<'a>(pub &'a milli::Error);
impl Error for MilliError<'_> {}
impl fmt::Display for MilliError<'_> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
self.0.fmt(f)
}
}
impl ErrorCode for MilliError<'_> {
impl ErrorCode for MeilisearchHttpError {
fn error_code(&self) -> Code {
match self.0 {
milli::Error::InternalError(_) => Code::Internal,
milli::Error::IoError(_) => Code::Internal,
milli::Error::UserError(ref error) => {
match error {
// TODO: wait for spec for new error codes.
UserError::Csv(_)
| UserError::SerdeJson(_)
| UserError::MaxDatabaseSizeReached
| UserError::InvalidCriterionName { .. }
| UserError::InvalidDocumentId { .. }
| UserError::InvalidStoreFile
| UserError::NoSpaceLeftOnDevice
| UserError::DocumentLimitReached => Code::Internal,
UserError::AttributeLimitReached => Code::MaxFieldsLimitExceeded,
UserError::InvalidFilter(_) => Code::Filter,
UserError::InvalidFilterAttribute(_) => Code::Filter,
UserError::MissingDocumentId { .. } => Code::MissingDocumentId,
UserError::MissingPrimaryKey => Code::MissingPrimaryKey,
UserError::PrimaryKeyCannotBeChanged => Code::PrimaryKeyAlreadyPresent,
UserError::PrimaryKeyCannotBeReset => Code::PrimaryKeyAlreadyPresent,
UserError::UnknownInternalDocumentId { .. } => Code::DocumentNotFound,
UserError::InvalidFacetsDistribution { .. } => Code::BadRequest,
}
}
}
}
}
impl fmt::Display for PayloadError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
PayloadError::Json(e) => e.fmt(f),
PayloadError::Query(e) => e.fmt(f),
MeilisearchHttpError::MissingContentType(_) => Code::MissingContentType,
MeilisearchHttpError::InvalidContentType(_, _) => Code::InvalidContentType,
}
}
}
#[derive(Debug)]
pub enum PayloadError {
Json(JsonPayloadError),
Query(QueryPayloadError),
impl From<MeilisearchHttpError> for aweb::Error {
fn from(other: MeilisearchHttpError) -> Self {
aweb::Error::from(ResponseError::from(other))
}
}
impl Error for PayloadError {}
#[derive(Debug, thiserror::Error)]
pub enum PayloadError {
#[error("{0}")]
Json(JsonPayloadError),
#[error("{0}")]
Query(QueryPayloadError),
#[error("The json payload provided is malformed. `{0}`.")]
MalformedPayload(serde_json::error::Error),
#[error("A json payload is missing.")]
MissingPayload,
}
impl ErrorCode for PayloadError {
fn error_code(&self) -> Code {
match self {
PayloadError::Json(err) => match err {
JsonPayloadError::Overflow => Code::PayloadTooLarge,
JsonPayloadError::Overflow { .. } => Code::PayloadTooLarge,
JsonPayloadError::ContentType => Code::UnsupportedMediaType,
JsonPayloadError::Payload(aweb::error::PayloadError::Overflow) => {
Code::PayloadTooLarge
}
JsonPayloadError::Deserialize(_) | JsonPayloadError::Payload(_) => Code::BadRequest,
JsonPayloadError::Payload(_) => Code::BadRequest,
JsonPayloadError::Deserialize(_) => Code::BadRequest,
JsonPayloadError::Serialize(_) => Code::Internal,
_ => Code::Internal,
},
@ -143,13 +59,29 @@ impl ErrorCode for PayloadError {
QueryPayloadError::Deserialize(_) => Code::BadRequest,
_ => Code::Internal,
},
PayloadError::MissingPayload => Code::MissingPayload,
PayloadError::MalformedPayload(_) => Code::MalformedPayload,
}
}
}
impl From<JsonPayloadError> for PayloadError {
fn from(other: JsonPayloadError) -> Self {
Self::Json(other)
match other {
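// A deserialize error at line 1, column 0 means the body was empty:
// report it as a missing payload rather than a malformed one.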
JsonPayloadError::Deserialize(e)
if e.classify() == serde_json::error::Category::Eof
&& e.line() == 1
&& e.column() == 0 =>
{
Self::MissingPayload
}
JsonPayloadError::Deserialize(e)
if e.classify() != serde_json::error::Category::Data =>
{
Self::MalformedPayload(e)
}
_ => Self::Json(other),
}
}
}
@ -159,9 +91,8 @@ impl From<QueryPayloadError> for PayloadError {
}
}
pub fn payload_error_handler<E>(err: E) -> ResponseError
where
E: Into<PayloadError>,
{
err.into().into()
impl From<PayloadError> for aweb::Error {
fn from(other: PayloadError) -> Self {
aweb::Error::from(ResponseError::from(other))
}
}


@ -2,24 +2,21 @@ use meilisearch_error::{Code, ErrorCode};
#[derive(Debug, thiserror::Error)]
pub enum AuthenticationError {
#[error("You must have an authorization token")]
#[error("The Authorization header is missing. It must use the bearer authorization method.")]
MissingAuthorizationHeader,
#[error("Invalid API key")]
InvalidToken(String),
#[error("The provided API key is invalid.")]
InvalidToken,
// Triggered on configuration error.
#[error("Irretrievable state")]
#[error("An internal error has occurred. `Irretrievable state`.")]
IrretrievableState,
#[error("Unknown authentication policy")]
UnknownPolicy,
}
impl ErrorCode for AuthenticationError {
fn error_code(&self) -> Code {
match self {
AuthenticationError::MissingAuthorizationHeader => Code::MissingAuthorizationHeader,
AuthenticationError::InvalidToken(_) => Code::InvalidToken,
AuthenticationError::InvalidToken => Code::InvalidToken,
AuthenticationError::IrretrievableState => Code::Internal,
AuthenticationError::UnknownPolicy => Code::Internal,
}
}
}


@ -1,84 +1,82 @@
mod error;
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::marker::PhantomData;
use std::ops::Deref;
use std::pin::Pin;
use actix_web::FromRequest;
use futures::future::err;
use futures::future::{ok, Ready};
use futures::Future;
use meilisearch_error::{Code, ResponseError};
use crate::error::ResponseError;
use error::AuthenticationError;
use meilisearch_auth::{AuthController, AuthFilter};
macro_rules! create_policies {
($($name:ident), *) => {
pub mod policies {
use std::collections::HashSet;
use crate::extractors::authentication::Policy;
$(
#[derive(Debug, Default)]
pub struct $name {
inner: HashSet<Vec<u8>>
}
impl $name {
pub fn new() -> Self {
Self { inner: HashSet::new() }
}
pub fn add(&mut self, token: Vec<u8>) {
self.inner.insert(token);
}
}
impl Policy for $name {
fn authenticate(&self, token: &[u8]) -> bool {
self.inner.contains(token)
}
}
)*
}
};
}
create_policies!(Public, Private, Admin);
/// Instantiate a `Policies`, filled with the given policies.
macro_rules! init_policies {
($($name:ident), *) => {
{
let mut policies = crate::extractors::authentication::Policies::new();
$(
let policy = $name::new();
policies.insert(policy);
)*
policies
}
};
}
/// Adds user to all specified policies.
macro_rules! create_users {
($policies:ident, $($user:expr => { $($policy:ty), * }), *) => {
{
$(
$(
$policies.get_mut::<$policy>().map(|p| p.add($user.to_owned()));
)*
)*
}
};
}
pub struct GuardedData<T, D> {
pub struct GuardedData<P, D> {
data: D,
_marker: PhantomData<T>,
filters: AuthFilter,
_marker: PhantomData<P>,
}
impl<T, D> Deref for GuardedData<T, D> {
impl<P, D> GuardedData<P, D> {
pub fn filters(&self) -> &AuthFilter {
&self.filters
}
async fn auth_bearer(
auth: AuthController,
token: String,
index: Option<String>,
data: Option<D>,
) -> Result<Self, ResponseError>
where
P: Policy + 'static,
{
match Self::authenticate(auth, token, index).await? {
Some(filters) => match data {
Some(data) => Ok(Self {
data,
filters,
_marker: PhantomData,
}),
None => Err(AuthenticationError::IrretrievableState.into()),
},
None => Err(AuthenticationError::InvalidToken.into()),
}
}
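/// Used when no `Authorization` header is present; with `ActionPolicy` this only succeeds
/// when the instance runs without a master key.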
async fn auth_token(auth: AuthController, data: Option<D>) -> Result<Self, ResponseError>
where
P: Policy + 'static,
{
match Self::authenticate(auth, String::new(), None).await? {
Some(filters) => match data {
Some(data) => Ok(Self {
data,
filters,
_marker: PhantomData,
}),
None => Err(AuthenticationError::IrretrievableState.into()),
},
None => Err(AuthenticationError::MissingAuthorizationHeader.into()),
}
}
async fn authenticate(
auth: AuthController,
token: String,
index: Option<String>,
) -> Result<Option<AuthFilter>, ResponseError>
where
P: Policy + 'static,
{
tokio::task::spawn_blocking(move || P::authenticate(auth, token.as_ref(), index.as_deref()))
.await
.map_err(|e| ResponseError::from_msg(e.to_string(), Code::Internal))
}
}
impl<P, D> Deref for GuardedData<P, D> {
type Target = D;
fn deref(&self) -> &Self::Target {
@ -86,97 +84,180 @@ impl<T, D> Deref for GuardedData<T, D> {
}
}
pub trait Policy {
fn authenticate(&self, token: &[u8]) -> bool;
}
#[derive(Debug)]
pub struct Policies {
inner: HashMap<TypeId, Box<dyn Any>>,
}
impl Policies {
pub fn new() -> Self {
Self {
inner: HashMap::new(),
}
}
pub fn insert<S: Policy + 'static>(&mut self, policy: S) {
self.inner.insert(TypeId::of::<S>(), Box::new(policy));
}
pub fn get<S: Policy + 'static>(&self) -> Option<&S> {
self.inner
.get(&TypeId::of::<S>())
.and_then(|p| p.downcast_ref::<S>())
}
pub fn get_mut<S: Policy + 'static>(&mut self) -> Option<&mut S> {
self.inner
.get_mut(&TypeId::of::<S>())
.and_then(|p| p.downcast_mut::<S>())
}
}
impl Default for Policies {
fn default() -> Self {
Self::new()
}
}
pub enum AuthConfig {
NoAuth,
Auth(Policies),
}
impl Default for AuthConfig {
fn default() -> Self {
Self::NoAuth
}
}
impl<P: Policy + 'static, D: 'static + Clone> FromRequest for GuardedData<P, D> {
type Config = AuthConfig;
type Error = ResponseError;
type Future = Ready<Result<Self, Self::Error>>;
type Future = Pin<Box<dyn Future<Output = Result<Self, Self::Error>>>>;
fn from_request(
req: &actix_web::HttpRequest,
_payload: &mut actix_http::Payload,
_payload: &mut actix_web::dev::Payload,
) -> Self::Future {
match req.app_data::<Self::Config>() {
Some(config) => match config {
AuthConfig::NoAuth => match req.app_data::<D>().cloned() {
Some(data) => ok(Self {
data,
_marker: PhantomData,
}),
None => err(AuthenticationError::IrretrievableState.into()),
},
AuthConfig::Auth(policies) => match policies.get::<P>() {
Some(policy) => match req.headers().get("x-meili-api-key") {
Some(token) => {
if policy.authenticate(token.as_bytes()) {
match req.app_data::<D>().cloned() {
Some(data) => ok(Self {
data,
_marker: PhantomData,
}),
None => err(AuthenticationError::IrretrievableState.into()),
}
} else {
err(AuthenticationError::InvalidToken(String::from("hello")).into())
}
match req.app_data::<AuthController>().cloned() {
Some(auth) => match req
.headers()
.get("Authorization")
.map(|type_token| type_token.to_str().unwrap_or_default().splitn(2, ' '))
{
Some(mut type_token) => match type_token.next() {
Some("Bearer") => {
// TODO: find a less hardcoded way?
let index = req.match_info().get("index_uid");
match type_token.next() {
Some(token) => Box::pin(Self::auth_bearer(
auth,
token.to_string(),
index.map(String::from),
req.app_data::<D>().cloned(),
)),
None => Box::pin(err(AuthenticationError::InvalidToken.into())),
}
None => err(AuthenticationError::MissingAuthorizationHeader.into()),
},
None => err(AuthenticationError::UnknownPolicy.into()),
}
_otherwise => {
Box::pin(err(AuthenticationError::MissingAuthorizationHeader.into()))
}
},
None => Box::pin(Self::auth_token(auth, req.app_data::<D>().cloned())),
},
None => err(AuthenticationError::IrretrievableState.into()),
None => Box::pin(err(AuthenticationError::IrretrievableState.into())),
}
}
}
pub trait Policy {
fn authenticate(auth: AuthController, token: &str, index: Option<&str>) -> Option<AuthFilter>;
}
pub mod policies {
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;
use crate::extractors::authentication::Policy;
use meilisearch_auth::{Action, AuthController, AuthFilter, SearchRules};
// reexport actions in policies in order to be used in routes configuration.
pub use meilisearch_auth::actions;
fn tenant_token_validation() -> Validation {
let mut validation = Validation::default();
validation.validate_exp = false;
validation.required_spec_claims.remove("exp");
validation.algorithms = vec![Algorithm::HS256, Algorithm::HS384, Algorithm::HS512];
validation
}
/// Extracts the key prefix that was used to sign the payload, without performing any validation.
fn extract_key_prefix(token: &str) -> Option<String> {
let mut validation = tenant_token_validation();
validation.insecure_disable_signature_validation();
let dummy_key = DecodingKey::from_secret(b"secret");
let token_data = decode::<Claims>(token, &dummy_key, &validation).ok()?;
// get token fields without validating it.
let Claims { api_key_prefix, .. } = token_data.claims;
Some(api_key_prefix)
}
pub struct MasterPolicy;
impl Policy for MasterPolicy {
fn authenticate(
auth: AuthController,
token: &str,
_index: Option<&str>,
) -> Option<AuthFilter> {
if let Some(master_key) = auth.get_master_key() {
if master_key == token {
return Some(AuthFilter::default());
}
}
None
}
}
pub struct ActionPolicy<const A: u8>;
impl<const A: u8> Policy for ActionPolicy<A> {
fn authenticate(
auth: AuthController,
token: &str,
index: Option<&str>,
) -> Option<AuthFilter> {
// authenticate if token is the master key.
if auth.get_master_key().map_or(true, |mk| mk == token) {
return Some(AuthFilter::default());
}
// Tenant token
if let Some(filters) = ActionPolicy::<A>::authenticate_tenant_token(&auth, token, index)
{
return Some(filters);
} else if let Some(action) = Action::from_repr(A) {
// API key
if let Ok(true) = auth.authenticate(token.as_bytes(), action, index) {
return auth.get_key_filters(token, None).ok();
}
}
None
}
}
impl<const A: u8> ActionPolicy<A> {
fn authenticate_tenant_token(
auth: &AuthController,
token: &str,
index: Option<&str>,
) -> Option<AuthFilter> {
// Only search action can be accessed by a tenant token.
if A != actions::SEARCH {
return None;
}
let api_key_prefix = extract_key_prefix(token)?;
// check if parent key is authorized to do the action.
if auth
.is_key_authorized(api_key_prefix.as_bytes(), Action::Search, index)
.ok()?
{
// Check if tenant token is valid.
let key = auth.generate_key(&api_key_prefix)?;
let data = decode::<Claims>(
token,
&DecodingKey::from_secret(key.as_bytes()),
&tenant_token_validation(),
)
.ok()?;
// Check index access if an index restriction is provided.
if let Some(index) = index {
if !data.claims.search_rules.is_index_authorized(index) {
return None;
}
}
// Check if token is expired.
if let Some(exp) = data.claims.exp {
if OffsetDateTime::now_utc().unix_timestamp() > exp {
return None;
}
}
return auth
.get_key_filters(api_key_prefix, Some(data.claims.search_rules))
.ok();
}
None
}
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct Claims {
search_rules: SearchRules,
exp: Option<i64>,
api_key_prefix: String,
}
}


@ -1,3 +1,4 @@
pub mod payload;
#[macro_use]
pub mod authentication;
pub mod sequential_extractor;


@ -1,7 +1,7 @@
use std::pin::Pin;
use std::task::{Context, Poll};
use actix_http::error::PayloadError;
use actix_web::error::PayloadError;
use actix_web::{dev, web, FromRequest, HttpRequest};
use futures::future::{ready, Ready};
use futures::Stream;
@ -28,8 +28,6 @@ impl Default for PayloadConfig {
}
impl FromRequest for Payload {
type Config = PayloadConfig;
type Error = PayloadError;
type Future = Ready<Result<Payload, Self::Error>>;
@ -39,7 +37,7 @@ impl FromRequest for Payload {
let limit = req
.app_data::<PayloadConfig>()
.map(|c| c.limit)
.unwrap_or(Self::Config::default().limit);
.unwrap_or(PayloadConfig::default().limit);
ready(Ok(Payload {
payload: payload.take(),
limit,


@ -0,0 +1,148 @@
#![allow(non_snake_case)]
use std::{future::Future, pin::Pin, task::Poll};
use actix_web::{dev::Payload, FromRequest, Handler, HttpRequest};
use pin_project_lite::pin_project;
/// `SeqHandler` is an actix `Handler` that enforces that extractor errors are returned in the
/// same order as the extractors are declared in the wrapped handler. This is needed because, by
/// default, actix resolves the extractors concurrently, whereas we always need the authentication
/// extractor to be the first to report an error.
#[derive(Clone)]
pub struct SeqHandler<H>(pub H);
pub struct SeqFromRequest<T>(T);
/// This macro implements `FromRequest` for handlers of arbitrary arity, except arity one, which is
/// not needed anyway.
macro_rules! gen_seq {
($ty:ident; $($T:ident)+) => {
pin_project! {
pub struct $ty<$($T: FromRequest), +> {
$(
#[pin]
$T: ExtractFuture<$T::Future, $T, $T::Error>,
)+
}
}
impl<$($T: FromRequest), +> Future for $ty<$($T),+> {
type Output = Result<SeqFromRequest<($($T),+)>, actix_web::Error>;
fn poll(self: Pin<&mut Self>, cx: &mut std::task::Context<'_>) -> Poll<Self::Output> {
let mut this = self.project();
let mut count_fut = 0;
let mut count_finished = 0;
$(
count_fut += 1;
match this.$T.as_mut().project() {
ExtractProj::Future { fut } => match fut.poll(cx) {
Poll::Ready(Ok(output)) => {
count_finished += 1;
let _ = this
.$T
.as_mut()
.project_replace(ExtractFuture::Done { output });
}
Poll::Ready(Err(error)) => {
count_finished += 1;
let _ = this
.$T
.as_mut()
.project_replace(ExtractFuture::Error { error });
}
Poll::Pending => (),
},
ExtractProj::Done { .. } => count_finished += 1,
ExtractProj::Error { .. } => {
// short circuit if all previous are finished and we had an error.
if count_finished == count_fut {
match this.$T.project_replace(ExtractFuture::Empty) {
ExtractReplaceProj::Error { error } => {
return Poll::Ready(Err(error.into()))
}
_ => unreachable!("Invalid future state"),
}
} else {
count_finished += 1;
}
}
ExtractProj::Empty => unreachable!("From request polled after being finished. {}", stringify!($T)),
}
)+
if count_fut == count_finished {
let result = (
$(
match this.$T.project_replace(ExtractFuture::Empty) {
ExtractReplaceProj::Done { output } => output,
ExtractReplaceProj::Error { error } => return Poll::Ready(Err(error.into())),
_ => unreachable!("Invalid future state"),
},
)+
);
Poll::Ready(Ok(SeqFromRequest(result)))
} else {
Poll::Pending
}
}
}
impl<$($T: FromRequest,)+> FromRequest for SeqFromRequest<($($T,)+)> {
type Error = actix_web::Error;
type Future = $ty<$($T),+>;
fn from_request(req: &HttpRequest, payload: &mut Payload) -> Self::Future {
$ty {
$(
$T: ExtractFuture::Future {
fut: $T::from_request(req, payload),
},
)+
}
}
}
impl<Han, $($T: FromRequest),+> Handler<SeqFromRequest<($($T),+)>> for SeqHandler<Han>
where
Han: Handler<($($T),+)>,
{
type Output = Han::Output;
type Future = Han::Future;
fn call(&self, args: SeqFromRequest<($($T),+)>) -> Self::Future {
self.0.call(args.0)
}
}
};
}
// Not working for a single argument, but then, it is not really necessary.
// gen_seq! { SeqFromRequestFut1; A }
gen_seq! { SeqFromRequestFut2; A B }
gen_seq! { SeqFromRequestFut3; A B C }
gen_seq! { SeqFromRequestFut4; A B C D }
gen_seq! { SeqFromRequestFut5; A B C D E }
gen_seq! { SeqFromRequestFut6; A B C D E F }
pin_project! {
#[project = ExtractProj]
#[project_replace = ExtractReplaceProj]
enum ExtractFuture<Fut, Res, Err> {
Future {
#[pin]
fut: Fut,
},
Done {
output: Res,
},
Error {
error: Err,
},
Empty,
}
}
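As a usage sketch, the doc comment above implies routes wrap their handlers like this when the service is configured; the path and handler name are assumptions, not code from this diff:
```
use actix_web::web;

// Hypothetical route registration; `search_with_url_query` stands in for a real
// handler whose first extractor is the authentication `GuardedData`.
fn configure(cfg: &mut web::ServiceConfig) {
    cfg.service(
        web::resource("/indexes/{index_uid}/search")
            // SeqHandler resolves the extractors in declaration order, so an
            // authentication error is reported before any query-parsing error.
            .route(web::get().to(SeqHandler(search_with_url_query))),
    );
}
```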


@ -1,10 +1,11 @@
use meilisearch_lib::heed::Env;
use walkdir::WalkDir;
pub trait EnvSizer {
fn size(&self) -> u64;
}
impl EnvSizer for heed::Env {
impl EnvSizer for Env {
fn size(&self) -> u64 {
WalkDir::new(self.path())
.into_iter()


@ -1,4 +1,3 @@
pub mod compression;
mod env;
pub use env::EnvSizer;


@ -1,194 +0,0 @@
use std::collections::{BTreeSet, HashSet};
use std::fs::create_dir_all;
use std::marker::PhantomData;
use std::ops::Deref;
use std::path::Path;
use std::sync::Arc;
use heed::{EnvOpenOptions, RoTxn};
use milli::obkv_to_json;
use serde::{de::Deserializer, Deserialize};
use serde_json::{Map, Value};
use crate::helpers::EnvSizer;
use error::Result;
pub use search::{default_crop_length, SearchQuery, SearchResult, DEFAULT_SEARCH_LIMIT};
pub use updates::{Checked, Facets, Settings, Unchecked};
use self::error::IndexError;
pub mod error;
pub mod update_handler;
mod dump;
mod search;
mod updates;
pub type Document = Map<String, Value>;
#[derive(Clone)]
pub struct Index(pub Arc<milli::Index>);
impl Deref for Index {
type Target = milli::Index;
fn deref(&self) -> &Self::Target {
self.0.as_ref()
}
}
pub fn deserialize_some<'de, T, D>(deserializer: D) -> std::result::Result<Option<T>, D::Error>
where
T: Deserialize<'de>,
D: Deserializer<'de>,
{
Deserialize::deserialize(deserializer).map(Some)
}
impl Index {
pub fn open(path: impl AsRef<Path>, size: usize) -> Result<Self> {
create_dir_all(&path)?;
let mut options = EnvOpenOptions::new();
options.map_size(size);
let index = milli::Index::new(options, &path)?;
Ok(Index(Arc::new(index)))
}
pub fn settings(&self) -> Result<Settings<Checked>> {
let txn = self.read_txn()?;
self.settings_txn(&txn)
}
pub fn settings_txn(&self, txn: &RoTxn) -> Result<Settings<Checked>> {
let displayed_attributes = self
.displayed_fields(&txn)?
.map(|fields| fields.into_iter().map(String::from).collect());
let searchable_attributes = self
.searchable_fields(&txn)?
.map(|fields| fields.into_iter().map(String::from).collect());
let faceted_attributes = self.faceted_fields(&txn)?.into_iter().collect();
let criteria = self
.criteria(&txn)?
.into_iter()
.map(|c| c.to_string())
.collect();
let stop_words = self
.stop_words(&txn)?
.map(|stop_words| -> Result<BTreeSet<_>> {
Ok(stop_words.stream().into_strs()?.into_iter().collect())
})
.transpose()?
.unwrap_or_else(BTreeSet::new);
let distinct_field = self.distinct_field(&txn)?.map(String::from);
// in milli, each word in the synonyms map was split on its separator. Since we lost
// this information, we put a space between words.
let synonyms = self
.synonyms(&txn)?
.iter()
.map(|(key, values)| {
(
key.join(" "),
values.iter().map(|value| value.join(" ")).collect(),
)
})
.collect();
Ok(Settings {
displayed_attributes: Some(displayed_attributes),
searchable_attributes: Some(searchable_attributes),
filterable_attributes: Some(Some(faceted_attributes)),
ranking_rules: Some(Some(criteria)),
stop_words: Some(Some(stop_words)),
distinct_attribute: Some(distinct_field),
synonyms: Some(Some(synonyms)),
_kind: PhantomData,
})
}
pub fn retrieve_documents<S: AsRef<str>>(
&self,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<S>>,
) -> Result<Vec<Map<String, Value>>> {
let txn = self.read_txn()?;
let fields_ids_map = self.fields_ids_map(&txn)?;
let fields_to_display =
self.fields_to_display(&txn, &attributes_to_retrieve, &fields_ids_map)?;
let iter = self.documents.range(&txn, &(..))?.skip(offset).take(limit);
let mut documents = Vec::new();
for entry in iter {
let (_id, obkv) = entry?;
let object = obkv_to_json(&fields_to_display, &fields_ids_map, obkv)?;
documents.push(object);
}
Ok(documents)
}
pub fn retrieve_document<S: AsRef<str>>(
&self,
doc_id: String,
attributes_to_retrieve: Option<Vec<S>>,
) -> Result<Map<String, Value>> {
let txn = self.read_txn()?;
let fields_ids_map = self.fields_ids_map(&txn)?;
let fields_to_display =
self.fields_to_display(&txn, &attributes_to_retrieve, &fields_ids_map)?;
let internal_id = self
.external_documents_ids(&txn)?
.get(doc_id.as_bytes())
.ok_or_else(|| IndexError::DocumentNotFound(doc_id.clone()))?;
let document = self
.documents(&txn, std::iter::once(internal_id))?
.into_iter()
.next()
.map(|(_, d)| d)
.ok_or(IndexError::DocumentNotFound(doc_id))?;
let document = obkv_to_json(&fields_to_display, &fields_ids_map, document)?;
Ok(document)
}
pub fn size(&self) -> u64 {
self.env.size()
}
fn fields_to_display<S: AsRef<str>>(
&self,
txn: &heed::RoTxn,
attributes_to_retrieve: &Option<Vec<S>>,
fields_ids_map: &milli::FieldsIdsMap,
) -> Result<Vec<u8>> {
let mut displayed_fields_ids = match self.displayed_fields_ids(&txn)? {
Some(ids) => ids.into_iter().collect::<Vec<_>>(),
None => fields_ids_map.iter().map(|(id, _)| id).collect(),
};
let attributes_to_retrieve_ids = match attributes_to_retrieve {
Some(attrs) => attrs
.iter()
.filter_map(|f| fields_ids_map.id(f.as_ref()))
.collect::<HashSet<_>>(),
None => fields_ids_map.iter().map(|(id, _)| id).collect(),
};
displayed_fields_ids.retain(|fid| attributes_to_retrieve_ids.contains(fid));
Ok(displayed_fields_ids)
}
}


@ -1,92 +0,0 @@
use std::fs::File;
use crate::index::Index;
use grenad::CompressionType;
use milli::update::UpdateBuilder;
use rayon::ThreadPool;
use crate::index_controller::UpdateMeta;
use crate::index_controller::{Failed, Processed, Processing};
use crate::option::IndexerOpts;
pub struct UpdateHandler {
max_nb_chunks: Option<usize>,
chunk_compression_level: Option<u32>,
thread_pool: ThreadPool,
log_frequency: usize,
max_memory: usize,
linked_hash_map_size: usize,
chunk_compression_type: CompressionType,
chunk_fusing_shrink_size: u64,
}
impl UpdateHandler {
pub fn new(opt: &IndexerOpts) -> anyhow::Result<Self> {
let thread_pool = rayon::ThreadPoolBuilder::new()
.num_threads(opt.indexing_jobs.unwrap_or(num_cpus::get() / 2))
.build()?;
Ok(Self {
max_nb_chunks: opt.max_nb_chunks,
chunk_compression_level: opt.chunk_compression_level,
thread_pool,
log_frequency: opt.log_every_n,
max_memory: opt.max_memory.get_bytes() as usize,
linked_hash_map_size: opt.linked_hash_map_size,
chunk_compression_type: opt.chunk_compression_type,
chunk_fusing_shrink_size: opt.chunk_fusing_shrink_size.get_bytes(),
})
}
pub fn update_builder(&self, update_id: u64) -> UpdateBuilder {
// We prepare the update by using the update builder.
let mut update_builder = UpdateBuilder::new(update_id);
if let Some(max_nb_chunks) = self.max_nb_chunks {
update_builder.max_nb_chunks(max_nb_chunks);
}
if let Some(chunk_compression_level) = self.chunk_compression_level {
update_builder.chunk_compression_level(chunk_compression_level);
}
update_builder.thread_pool(&self.thread_pool);
update_builder.log_every_n(self.log_frequency);
update_builder.max_memory(self.max_memory);
update_builder.linked_hash_map_size(self.linked_hash_map_size);
update_builder.chunk_compression_type(self.chunk_compression_type);
update_builder.chunk_fusing_shrink_size(self.chunk_fusing_shrink_size);
update_builder
}
pub fn handle_update(
&self,
meta: Processing,
content: Option<File>,
index: Index,
) -> Result<Processed, Failed> {
use UpdateMeta::*;
let update_id = meta.id();
let update_builder = self.update_builder(update_id);
let result = match meta.meta() {
DocumentsAddition {
method,
format,
primary_key,
} => index.update_documents(
*format,
*method,
content,
update_builder,
primary_key.as_deref(),
),
ClearDocuments => index.clear_documents(update_builder),
DeleteDocuments { ids } => index.delete_documents(ids, update_builder),
Settings(settings) => index.update_settings(&settings.clone().check(), update_builder),
};
match result {
Ok(result) => Ok(meta.process(result)),
Err(e) => Err(meta.fail(e.into())),
}
}
}


@ -1,382 +0,0 @@
use std::collections::{BTreeMap, BTreeSet, HashSet};
use std::io;
use std::marker::PhantomData;
use std::num::NonZeroUsize;
use flate2::read::GzDecoder;
use log::{debug, info, trace};
use milli::update::{IndexDocumentsMethod, UpdateBuilder, UpdateFormat};
use serde::{Deserialize, Serialize, Serializer};
use crate::index_controller::UpdateResult;
use super::error::Result;
use super::{deserialize_some, Index};
fn serialize_with_wildcard<S>(
field: &Option<Option<Vec<String>>>,
s: S,
) -> std::result::Result<S::Ok, S::Error>
where
S: Serializer,
{
let wildcard = vec!["*".to_string()];
s.serialize_some(&field.as_ref().map(|o| o.as_ref().unwrap_or(&wildcard)))
}
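
In other words, when the settings are serialized back to the client, a field that was reset (`Some(None)`) is rendered as the `["*"]` wildcard, an explicit list is passed through, and an untouched field (`None`) is skipped entirely by `skip_serializing_if`. A minimal sketch of the same mapping without serde, using a hypothetical `display_value` helper:

```
/// Hypothetical helper mirroring what `serialize_with_wildcard` produces.
fn display_value(field: Option<Option<Vec<String>>>) -> Option<Vec<String>> {
    let wildcard = vec!["*".to_string()];
    field.map(|o| o.unwrap_or(wildcard))
}

fn main() {
    // reset field -> wildcard
    assert_eq!(display_value(Some(None)), Some(vec!["*".to_string()]));
    // explicit list -> passed through
    assert_eq!(
        display_value(Some(Some(vec!["title".to_string()]))),
        Some(vec!["title".to_string()])
    );
    // untouched field -> not serialized at all
    assert_eq!(display_value(None), None);
}
```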
#[derive(Clone, Default, Debug, Serialize)]
pub struct Checked;
#[derive(Clone, Default, Debug, Serialize, Deserialize)]
pub struct Unchecked;
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]
#[serde(bound(serialize = "T: Serialize", deserialize = "T: Deserialize<'static>"))]
pub struct Settings<T> {
#[serde(
default,
deserialize_with = "deserialize_some",
serialize_with = "serialize_with_wildcard",
skip_serializing_if = "Option::is_none"
)]
pub displayed_attributes: Option<Option<Vec<String>>>,
#[serde(
default,
deserialize_with = "deserialize_some",
serialize_with = "serialize_with_wildcard",
skip_serializing_if = "Option::is_none"
)]
pub searchable_attributes: Option<Option<Vec<String>>>,
#[serde(
default,
deserialize_with = "deserialize_some",
skip_serializing_if = "Option::is_none"
)]
pub filterable_attributes: Option<Option<HashSet<String>>>,
#[serde(
default,
deserialize_with = "deserialize_some",
skip_serializing_if = "Option::is_none"
)]
pub ranking_rules: Option<Option<Vec<String>>>,
#[serde(
default,
deserialize_with = "deserialize_some",
skip_serializing_if = "Option::is_none"
)]
pub stop_words: Option<Option<BTreeSet<String>>>,
#[serde(
default,
deserialize_with = "deserialize_some",
skip_serializing_if = "Option::is_none"
)]
pub synonyms: Option<Option<BTreeMap<String, Vec<String>>>>,
#[serde(
default,
deserialize_with = "deserialize_some",
skip_serializing_if = "Option::is_none"
)]
pub distinct_attribute: Option<Option<String>>,
#[serde(skip)]
pub _kind: PhantomData<T>,
}
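
The double `Option` together with `deserialize_some` is what lets a settings payload distinguish three cases: a field that is absent (leave it untouched), a field explicitly set to `null` (reset it), and a field with a value. The `deserialize_some` helper itself is not shown in this file, but it presumably works along these lines; a small self-contained sketch assuming `serde` and `serde_json`:

```
use serde::Deserialize;

// Same trick as `deserialize_some`: wrap the parsed value in `Some`, so that an
// explicit `null` becomes `Some(None)` while a missing field stays `None`.
fn deserialize_some<'de, T, D>(d: D) -> Result<Option<T>, D::Error>
where
    T: Deserialize<'de>,
    D: serde::Deserializer<'de>,
{
    T::deserialize(d).map(Some)
}

#[derive(Deserialize, Debug)]
struct Partial {
    #[serde(default, deserialize_with = "deserialize_some")]
    distinct_attribute: Option<Option<String>>,
}

fn main() {
    let absent: Partial = serde_json::from_str("{}").unwrap();
    let reset: Partial = serde_json::from_str(r#"{"distinct_attribute": null}"#).unwrap();
    let set: Partial = serde_json::from_str(r#"{"distinct_attribute": "sku"}"#).unwrap();
    assert_eq!(absent.distinct_attribute, None);
    assert_eq!(reset.distinct_attribute, Some(None));
    assert_eq!(set.distinct_attribute, Some(Some("sku".to_string())));
}
```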
impl Settings<Checked> {
pub fn cleared() -> Settings<Checked> {
Settings {
displayed_attributes: Some(None),
searchable_attributes: Some(None),
filterable_attributes: Some(None),
ranking_rules: Some(None),
stop_words: Some(None),
synonyms: Some(None),
distinct_attribute: Some(None),
_kind: PhantomData,
}
}
pub fn into_unchecked(self) -> Settings<Unchecked> {
let Self {
displayed_attributes,
searchable_attributes,
filterable_attributes,
ranking_rules,
stop_words,
synonyms,
distinct_attribute,
..
} = self;
Settings {
displayed_attributes,
searchable_attributes,
filterable_attributes,
ranking_rules,
stop_words,
synonyms,
distinct_attribute,
_kind: PhantomData,
}
}
}
impl Settings<Unchecked> {
pub fn check(mut self) -> Settings<Checked> {
let displayed_attributes = match self.displayed_attributes.take() {
Some(Some(fields)) => {
if fields.iter().any(|f| f == "*") {
Some(None)
} else {
Some(Some(fields))
}
}
otherwise => otherwise,
};
let searchable_attributes = match self.searchable_attributes.take() {
Some(Some(fields)) => {
if fields.iter().any(|f| f == "*") {
Some(None)
} else {
Some(Some(fields))
}
}
otherwise => otherwise,
};
Settings {
displayed_attributes,
searchable_attributes,
filterable_attributes: self.filterable_attributes,
ranking_rules: self.ranking_rules,
stop_words: self.stop_words,
synonyms: self.synonyms,
distinct_attribute: self.distinct_attribute,
_kind: PhantomData,
}
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "camelCase")]
pub struct Facets {
pub level_group_size: Option<NonZeroUsize>,
pub min_level_size: Option<NonZeroUsize>,
}
impl Index {
pub fn update_documents(
&self,
format: UpdateFormat,
method: IndexDocumentsMethod,
content: Option<impl io::Read>,
update_builder: UpdateBuilder,
primary_key: Option<&str>,
) -> Result<UpdateResult> {
let mut txn = self.write_txn()?;
let result = self.update_documents_txn(
&mut txn,
format,
method,
content,
update_builder,
primary_key,
)?;
txn.commit()?;
Ok(result)
}
pub fn update_documents_txn<'a, 'b>(
&'a self,
txn: &mut heed::RwTxn<'a, 'b>,
format: UpdateFormat,
method: IndexDocumentsMethod,
content: Option<impl io::Read>,
update_builder: UpdateBuilder,
primary_key: Option<&str>,
) -> Result<UpdateResult> {
trace!("performing document addition");
// Set the primary key if not set already, ignore if already set.
if let (None, Some(primary_key)) = (self.primary_key(txn)?, primary_key) {
let mut builder = UpdateBuilder::new(0).settings(txn, &self);
builder.set_primary_key(primary_key.to_string());
builder.execute(|_, _| ())?;
}
let mut builder = update_builder.index_documents(txn, self);
builder.update_format(format);
builder.index_documents_method(method);
let indexing_callback =
|indexing_step, update_id| debug!("update {}: {:?}", update_id, indexing_step);
// Gzip decoding is currently disabled: `gzipped` is hard-coded to false, so the
// GzDecoder branch below is never taken.
let gzipped = false;
let addition = match content {
Some(content) if gzipped => {
builder.execute(GzDecoder::new(content), indexing_callback)?
}
Some(content) => builder.execute(content, indexing_callback)?,
None => builder.execute(std::io::empty(), indexing_callback)?,
};
info!("document addition done: {:?}", addition);
Ok(UpdateResult::DocumentsAddition(addition))
}
pub fn clear_documents(&self, update_builder: UpdateBuilder) -> Result<UpdateResult> {
// We must use the write transaction of the update here.
let mut wtxn = self.write_txn()?;
let builder = update_builder.clear_documents(&mut wtxn, self);
let _count = builder.execute()?;
wtxn.commit()
.and(Ok(UpdateResult::Other))
.map_err(Into::into)
}
pub fn update_settings_txn<'a, 'b>(
&'a self,
txn: &mut heed::RwTxn<'a, 'b>,
settings: &Settings<Checked>,
update_builder: UpdateBuilder,
) -> Result<UpdateResult> {
// We must use the write transaction of the update here.
let mut builder = update_builder.settings(txn, self);
if let Some(ref names) = settings.searchable_attributes {
match names {
Some(names) => builder.set_searchable_fields(names.clone()),
None => builder.reset_searchable_fields(),
}
}
if let Some(ref names) = settings.displayed_attributes {
match names {
Some(names) => builder.set_displayed_fields(names.clone()),
None => builder.reset_displayed_fields(),
}
}
if let Some(ref facet_types) = settings.filterable_attributes {
let facet_types = facet_types.clone().unwrap_or_else(HashSet::new);
builder.set_filterable_fields(facet_types);
}
if let Some(ref criteria) = settings.ranking_rules {
match criteria {
Some(criteria) => builder.set_criteria(criteria.clone()),
None => builder.reset_criteria(),
}
}
if let Some(ref stop_words) = settings.stop_words {
match stop_words {
Some(stop_words) => builder.set_stop_words(stop_words.clone()),
None => builder.reset_stop_words(),
}
}
if let Some(ref synonyms) = settings.synonyms {
match synonyms {
Some(synonyms) => builder.set_synonyms(synonyms.clone().into_iter().collect()),
None => builder.reset_synonyms(),
}
}
if let Some(ref distinct_attribute) = settings.distinct_attribute {
match distinct_attribute {
Some(attr) => builder.set_distinct_field(attr.clone()),
None => builder.reset_distinct_field(),
}
}
builder.execute(|indexing_step, update_id| {
debug!("update {}: {:?}", update_id, indexing_step)
})?;
Ok(UpdateResult::Other)
}
pub fn update_settings(
&self,
settings: &Settings<Checked>,
update_builder: UpdateBuilder,
) -> Result<UpdateResult> {
let mut txn = self.write_txn()?;
let result = self.update_settings_txn(&mut txn, settings, update_builder)?;
txn.commit()?;
Ok(result)
}
pub fn delete_documents(
&self,
document_ids: &[String],
update_builder: UpdateBuilder,
) -> Result<UpdateResult> {
let mut txn = self.write_txn()?;
let mut builder = update_builder.delete_documents(&mut txn, self)?;
// We ignore nonexistent document ids
document_ids.iter().for_each(|id| {
builder.delete_external_id(id);
});
let deleted = builder.execute()?;
txn.commit()
.and(Ok(UpdateResult::DocumentDeletion { deleted }))
.map_err(Into::into)
}
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_setting_check() {
// test no changes
let settings = Settings {
displayed_attributes: Some(Some(vec![String::from("hello")])),
searchable_attributes: Some(Some(vec![String::from("hello")])),
filterable_attributes: None,
ranking_rules: None,
stop_words: None,
synonyms: None,
distinct_attribute: None,
_kind: PhantomData::<Unchecked>,
};
let checked = settings.clone().check();
assert_eq!(settings.displayed_attributes, checked.displayed_attributes);
assert_eq!(
settings.searchable_attributes,
checked.searchable_attributes
);
// test wildcard
// test no changes
let settings = Settings {
displayed_attributes: Some(Some(vec![String::from("*")])),
searchable_attributes: Some(Some(vec![String::from("hello"), String::from("*")])),
filterable_attributes: None,
ranking_rules: None,
stop_words: None,
synonyms: None,
distinct_attribute: None,
_kind: PhantomData::<Unchecked>,
};
let checked = settings.check();
assert_eq!(checked.displayed_attributes, Some(None));
assert_eq!(checked.searchable_attributes, Some(None));
}
}


@ -1,52 +0,0 @@
use meilisearch_error::{Code, ErrorCode};
use crate::index_controller::update_actor::error::UpdateActorError;
use crate::index_controller::uuid_resolver::error::UuidResolverError;
pub type Result<T> = std::result::Result<T, DumpActorError>;
#[derive(thiserror::Error, Debug)]
pub enum DumpActorError {
#[error("Another dump is already in progress")]
DumpAlreadyRunning,
#[error("Dump `{0}` not found")]
DumpDoesNotExist(String),
#[error("Internal error: {0}")]
Internal(Box<dyn std::error::Error + Send + Sync + 'static>),
#[error("{0}")]
UuidResolver(#[from] UuidResolverError),
#[error("{0}")]
UpdateActor(#[from] UpdateActorError),
}
macro_rules! internal_error {
($($other:path), *) => {
$(
impl From<$other> for DumpActorError {
fn from(other: $other) -> Self {
Self::Internal(Box::new(other))
}
}
)*
}
}
internal_error!(
heed::Error,
std::io::Error,
tokio::task::JoinError,
serde_json::error::Error,
tempfile::PersistError
);
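
For reference, each entry of the `internal_error!` invocation above expands to a plain `From` implementation that boxes the underlying error, e.g. for `heed::Error`:

```
impl From<heed::Error> for DumpActorError {
    fn from(other: heed::Error) -> Self {
        Self::Internal(Box::new(other))
    }
}
```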
impl ErrorCode for DumpActorError {
fn error_code(&self) -> Code {
match self {
DumpActorError::DumpAlreadyRunning => Code::DumpAlreadyInProgress,
DumpActorError::DumpDoesNotExist(_) => Code::NotFound,
DumpActorError::Internal(_) => Code::Internal,
DumpActorError::UuidResolver(e) => e.error_code(),
DumpActorError::UpdateActor(e) => e.error_code(),
}
}
}


@ -1,53 +0,0 @@
use std::path::Path;
use actix_web::web::Bytes;
use tokio::sync::{mpsc, oneshot};
use super::error::Result;
use super::{DumpActor, DumpActorHandle, DumpInfo, DumpMsg};
#[derive(Clone)]
pub struct DumpActorHandleImpl {
sender: mpsc::Sender<DumpMsg>,
}
#[async_trait::async_trait]
impl DumpActorHandle for DumpActorHandleImpl {
async fn create_dump(&self) -> Result<DumpInfo> {
let (ret, receiver) = oneshot::channel();
let msg = DumpMsg::CreateDump { ret };
let _ = self.sender.send(msg).await;
receiver.await.expect("IndexActor has been killed")
}
async fn dump_info(&self, uid: String) -> Result<DumpInfo> {
let (ret, receiver) = oneshot::channel();
let msg = DumpMsg::DumpInfo { ret, uid };
let _ = self.sender.send(msg).await;
receiver.await.expect("IndexActor has been killed")
}
}
impl DumpActorHandleImpl {
pub fn new(
path: impl AsRef<Path>,
uuid_resolver: crate::index_controller::uuid_resolver::UuidResolverHandleImpl,
update: crate::index_controller::update_actor::UpdateActorHandleImpl<Bytes>,
index_db_size: usize,
update_db_size: usize,
) -> anyhow::Result<Self> {
let (sender, receiver) = mpsc::channel(10);
let actor = DumpActor::new(
receiver,
uuid_resolver,
update,
path,
index_db_size,
update_db_size,
);
tokio::task::spawn(actor.run());
Ok(Self { sender })
}
}


@ -1,2 +0,0 @@
pub mod v1;
pub mod v2;


@ -1,182 +0,0 @@
use std::collections::{BTreeMap, BTreeSet};
use std::fs::{create_dir_all, File};
use std::io::BufRead;
use std::marker::PhantomData;
use std::path::Path;
use std::sync::Arc;
use heed::EnvOpenOptions;
use log::{error, info, warn};
use milli::update::{IndexDocumentsMethod, UpdateFormat};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use crate::index_controller::{self, uuid_resolver::HeedUuidStore, IndexMetadata};
use crate::{
index::{deserialize_some, update_handler::UpdateHandler, Index, Unchecked},
option::IndexerOpts,
};
#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct MetadataV1 {
db_version: String,
indexes: Vec<IndexMetadata>,
}
impl MetadataV1 {
pub fn load_dump(
self,
src: impl AsRef<Path>,
dst: impl AsRef<Path>,
size: usize,
indexer_options: &IndexerOpts,
) -> anyhow::Result<()> {
info!(
"Loading dump, dump database version: {}, dump version: V1",
self.db_version
);
let uuid_store = HeedUuidStore::new(&dst)?;
for index in self.indexes {
let uuid = Uuid::new_v4();
uuid_store.insert(index.uid.clone(), uuid)?;
let src = src.as_ref().join(index.uid);
load_index(
&src,
&dst,
uuid,
index.meta.primary_key.as_deref(),
size,
indexer_options,
)?;
}
Ok(())
}
}
// These are the settings used in legacy meilisearch (<v0.21.0).
#[derive(Default, Clone, Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
struct Settings {
#[serde(default, deserialize_with = "deserialize_some")]
pub ranking_rules: Option<Option<Vec<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub distinct_attribute: Option<Option<String>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub searchable_attributes: Option<Option<Vec<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub displayed_attributes: Option<Option<BTreeSet<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub stop_words: Option<Option<BTreeSet<String>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub synonyms: Option<Option<BTreeMap<String, Vec<String>>>>,
#[serde(default, deserialize_with = "deserialize_some")]
pub filterable_attributes: Option<Option<Vec<String>>>,
}
fn load_index(
src: impl AsRef<Path>,
dst: impl AsRef<Path>,
uuid: Uuid,
primary_key: Option<&str>,
size: usize,
indexer_options: &IndexerOpts,
) -> anyhow::Result<()> {
let index_path = dst.as_ref().join(&format!("indexes/index-{}", uuid));
create_dir_all(&index_path)?;
let mut options = EnvOpenOptions::new();
options.map_size(size);
let index = milli::Index::new(options, index_path)?;
let index = Index(Arc::new(index));
// extract `settings.json` file and import content
let settings = import_settings(&src)?;
let settings: index_controller::Settings<Unchecked> = settings.into();
let mut txn = index.write_txn()?;
let handler = UpdateHandler::new(&indexer_options)?;
index.update_settings_txn(&mut txn, &settings.check(), handler.update_builder(0))?;
let file = File::open(&src.as_ref().join("documents.jsonl"))?;
let mut reader = std::io::BufReader::new(file);
reader.fill_buf()?;
if !reader.buffer().is_empty() {
index.update_documents_txn(
&mut txn,
UpdateFormat::JsonStream,
IndexDocumentsMethod::ReplaceDocuments,
Some(reader),
handler.update_builder(0),
primary_key,
)?;
}
txn.commit()?;
// Finally, we extract the original milli::Index and close it
Arc::try_unwrap(index.0)
.map_err(|_e| "Couldn't close the index properly")
.unwrap()
.prepare_for_closing()
.wait();
// Updates are ignored in dumps V1.
Ok(())
}
/// we need to **always** be able to convert the old settings to the settings currently being used
impl From<Settings> for index_controller::Settings<Unchecked> {
fn from(settings: Settings) -> Self {
Self {
distinct_attribute: settings.distinct_attribute,
// we need to convert the old `BTreeSet<String>` into a `Vec<String>`
displayed_attributes: settings.displayed_attributes.map(|o| o.map(|vec| vec.into_iter().collect())),
searchable_attributes: settings.searchable_attributes,
// we previously had a `Vec<String>` but now we have a `HashSet<String>` containing the
// names of the faceted fields. The type of each field is not known in a V1 dump, so we
// simply keep every field name as-is
filterable_attributes: settings.filterable_attributes.map(|o| o.map(|vec| vec.into_iter().collect())),
// we need to migrate the ranking rules: some criteria were renamed or removed between versions
ranking_rules: settings.ranking_rules.map(|o| o.map(|vec| vec.into_iter().filter_map(|criterion| {
match criterion.as_str() {
"words" | "typo" | "proximity" | "attribute" => Some(criterion),
s if s.starts_with("asc") || s.starts_with("desc") => Some(criterion),
"wordsPosition" => {
warn!("The criteria `words` and `wordsPosition` have been merged into a single criterion `words` so `wordsPosition` will be ignored");
Some(String::from("words"))
}
"exactness" => {
error!("The criterion `{}` is not implemented currently and thus will be ignored", criterion);
None
}
s => {
error!("Unknown criterion found in the dump: `{}`, it will be ignored", s);
None
}
}
}).collect())),
// the stop words were already a `BTreeSet<String>`, we simply rebuild the set
stop_words: settings.stop_words.map(|o| o.map(|vec| vec.into_iter().collect())),
// the synonyms were already a `BTreeMap<String, Vec<String>>`, we simply rebuild the map
synonyms: settings.synonyms.map(|o| o.map(|vec| vec.into_iter().collect())),
_kind: PhantomData,
}
}
}
/// Extract Settings from `settings.json` file present at provided `dir_path`
fn import_settings(dir_path: impl AsRef<Path>) -> anyhow::Result<Settings> {
let path = dir_path.as_ref().join("settings.json");
let file = File::open(path)?;
let reader = std::io::BufReader::new(file);
let metadata = serde_json::from_reader(reader)?;
Ok(metadata)
}


@ -1,59 +0,0 @@
use std::path::Path;
use chrono::{DateTime, Utc};
use log::info;
use serde::{Deserialize, Serialize};
use crate::index::Index;
use crate::index_controller::{update_actor::UpdateStore, uuid_resolver::HeedUuidStore};
use crate::option::IndexerOpts;
#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct MetadataV2 {
db_version: String,
index_db_size: usize,
update_db_size: usize,
dump_date: DateTime<Utc>,
}
impl MetadataV2 {
pub fn new(index_db_size: usize, update_db_size: usize) -> Self {
Self {
db_version: env!("CARGO_PKG_VERSION").to_string(),
index_db_size,
update_db_size,
dump_date: Utc::now(),
}
}
pub fn load_dump(
self,
src: impl AsRef<Path>,
dst: impl AsRef<Path>,
index_db_size: usize,
update_db_size: usize,
indexing_options: &IndexerOpts,
) -> anyhow::Result<()> {
info!(
"Loading dump from {}, dump database version: {}, dump version: V2",
self.dump_date, self.db_version
);
info!("Loading index database.");
HeedUuidStore::load_dump(src.as_ref(), &dst)?;
info!("Loading updates.");
UpdateStore::load_dump(&src, &dst, update_db_size)?;
info!("Loading indexes.");
let indexes_path = src.as_ref().join("indexes");
let indexes = indexes_path.read_dir()?;
for index in indexes {
let index = index?;
Index::load_dump(&index.path(), &dst, index_db_size, indexing_options)?;
}
Ok(())
}
}


@ -1,203 +0,0 @@
use std::fs::File;
use std::path::{Path, PathBuf};
use anyhow::Context;
use chrono::{DateTime, Utc};
use log::{info, trace, warn};
#[cfg(test)]
use mockall::automock;
use serde::{Deserialize, Serialize};
use tokio::fs::create_dir_all;
use loaders::v1::MetadataV1;
use loaders::v2::MetadataV2;
pub use actor::DumpActor;
pub use handle_impl::*;
pub use message::DumpMsg;
use super::{update_actor::UpdateActorHandle, uuid_resolver::UuidResolverHandle};
use crate::index_controller::dump_actor::error::DumpActorError;
use crate::{helpers::compression, option::IndexerOpts};
use error::Result;
mod actor;
pub mod error;
mod handle_impl;
mod loaders;
mod message;
const META_FILE_NAME: &str = "metadata.json";
#[async_trait::async_trait]
#[cfg_attr(test, automock)]
pub trait DumpActorHandle {
/// Start the creation of a dump
/// Implementation: [handle_impl::DumpActorHandleImpl::create_dump]
async fn create_dump(&self) -> Result<DumpInfo>;
/// Return the status of an already created dump
/// Implementation: [handle_impl::DumpActorHandleImpl::dump_status]
async fn dump_info(&self, uid: String) -> Result<DumpInfo>;
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "dumpVersion")]
pub enum Metadata {
V1(MetadataV1),
V2(MetadataV2),
}
impl Metadata {
pub fn new_v2(index_db_size: usize, update_db_size: usize) -> Self {
let meta = MetadataV2::new(index_db_size, update_db_size);
Self::V2(meta)
}
}
#[derive(Debug, Serialize, Deserialize, PartialEq, Clone)]
#[serde(rename_all = "snake_case")]
pub enum DumpStatus {
Done,
InProgress,
Failed,
}
#[derive(Debug, Serialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct DumpInfo {
pub uid: String,
pub status: DumpStatus,
#[serde(skip_serializing_if = "Option::is_none")]
pub error: Option<String>,
started_at: DateTime<Utc>,
#[serde(skip_serializing_if = "Option::is_none")]
finished_at: Option<DateTime<Utc>>,
}
impl DumpInfo {
pub fn new(uid: String, status: DumpStatus) -> Self {
Self {
uid,
status,
error: None,
started_at: Utc::now(),
finished_at: None,
}
}
pub fn with_error(&mut self, error: String) {
self.status = DumpStatus::Failed;
self.finished_at = Some(Utc::now());
self.error = Some(error);
}
pub fn done(&mut self) {
self.finished_at = Some(Utc::now());
self.status = DumpStatus::Done;
}
pub fn dump_already_in_progress(&self) -> bool {
self.status == DumpStatus::InProgress
}
}
pub fn load_dump(
dst_path: impl AsRef<Path>,
src_path: impl AsRef<Path>,
index_db_size: usize,
update_db_size: usize,
indexer_opts: &IndexerOpts,
) -> anyhow::Result<()> {
let tmp_src = tempfile::tempdir_in(".")?;
let tmp_src_path = tmp_src.path();
compression::from_tar_gz(&src_path, tmp_src_path)?;
let meta_path = tmp_src_path.join(META_FILE_NAME);
let mut meta_file = File::open(&meta_path)?;
let meta: Metadata = serde_json::from_reader(&mut meta_file)?;
let dst_dir = dst_path
.as_ref()
.parent()
.with_context(|| format!("Invalid db path: {}", dst_path.as_ref().display()))?;
let tmp_dst = tempfile::tempdir_in(dst_dir)?;
match meta {
Metadata::V1(meta) => {
meta.load_dump(&tmp_src_path, tmp_dst.path(), index_db_size, indexer_opts)?
}
Metadata::V2(meta) => meta.load_dump(
&tmp_src_path,
tmp_dst.path(),
index_db_size,
update_db_size,
indexer_opts,
)?,
}
// Persist and atomically rename the db
let persisted_dump = tmp_dst.into_path();
if dst_path.as_ref().exists() {
warn!("Overwriting database at {}", dst_path.as_ref().display());
std::fs::remove_dir_all(&dst_path)?;
}
std::fs::rename(&persisted_dump, &dst_path)?;
Ok(())
}
struct DumpTask<U, P> {
path: PathBuf,
uuid_resolver: U,
update_handle: P,
uid: String,
update_db_size: usize,
index_db_size: usize,
}
impl<U, P> DumpTask<U, P>
where
U: UuidResolverHandle + Send + Sync + Clone + 'static,
P: UpdateActorHandle + Send + Sync + Clone + 'static,
{
async fn run(self) -> Result<()> {
trace!("Performing dump.");
create_dir_all(&self.path).await?;
let path_clone = self.path.clone();
let temp_dump_dir =
tokio::task::spawn_blocking(|| tempfile::TempDir::new_in(path_clone)).await??;
let temp_dump_path = temp_dump_dir.path().to_owned();
let meta = Metadata::new_v2(self.index_db_size, self.update_db_size);
let meta_path = temp_dump_path.join(META_FILE_NAME);
let mut meta_file = File::create(&meta_path)?;
serde_json::to_writer(&mut meta_file, &meta)?;
let uuids = self.uuid_resolver.dump(temp_dump_path.clone()).await?;
self.update_handle
.dump(uuids, temp_dump_path.clone())
.await?;
let dump_path = tokio::task::spawn_blocking(move || -> Result<PathBuf> {
let temp_dump_file = tempfile::NamedTempFile::new_in(&self.path)?;
compression::to_tar_gz(temp_dump_path, temp_dump_file.path())
.map_err(|e| DumpActorError::Internal(e.into()))?;
let dump_path = self.path.join(self.uid).with_extension("dump");
temp_dump_file.persist(&dump_path)?;
Ok(dump_path)
})
.await??;
info!("Created dump in {:?}.", dump_path);
Ok(())
}
}
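
Both the dump task above and the snapshot service later in this changeset rely on the same write-to-a-temporary-file-then-persist pattern, so that a half-written archive is never visible under its final name. A minimal standalone sketch of that pattern (hypothetical file name, assuming the `tempfile` crate):

```
use std::io::Write;
use std::path::Path;

fn write_atomically(dir: &Path, name: &str, bytes: &[u8]) -> std::io::Result<()> {
    // Create the temp file in the target directory so the final rename stays on
    // the same filesystem and remains atomic.
    let mut tmp = tempfile::NamedTempFile::new_in(dir)?;
    tmp.write_all(bytes)?;
    // Atomically move the finished file to its final name.
    tmp.persist(dir.join(name)).map_err(|e| e.error)?;
    Ok(())
}

fn main() -> std::io::Result<()> {
    write_atomically(&std::env::temp_dir(), "example.dump", b"archive contents")
}
```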


@ -1,40 +0,0 @@
use meilisearch_error::Code;
use meilisearch_error::ErrorCode;
use crate::index::error::IndexError;
use super::dump_actor::error::DumpActorError;
use super::index_actor::error::IndexActorError;
use super::update_actor::error::UpdateActorError;
use super::uuid_resolver::error::UuidResolverError;
pub type Result<T> = std::result::Result<T, IndexControllerError>;
#[derive(Debug, thiserror::Error)]
pub enum IndexControllerError {
#[error("Index creation must have an uid")]
MissingUid,
#[error("{0}")]
Uuid(#[from] UuidResolverError),
#[error("{0}")]
IndexActor(#[from] IndexActorError),
#[error("{0}")]
UpdateActor(#[from] UpdateActorError),
#[error("{0}")]
DumpActor(#[from] DumpActorError),
#[error("{0}")]
IndexError(#[from] IndexError),
}
impl ErrorCode for IndexControllerError {
fn error_code(&self) -> Code {
match self {
IndexControllerError::MissingUid => Code::BadRequest,
IndexControllerError::Uuid(e) => e.error_code(),
IndexControllerError::IndexActor(e) => e.error_code(),
IndexControllerError::UpdateActor(e) => e.error_code(),
IndexControllerError::DumpActor(e) => e.error_code(),
IndexControllerError::IndexError(e) => e.error_code(),
}
}
}


@ -1,348 +0,0 @@
use std::fs::File;
use std::path::PathBuf;
use std::sync::Arc;
use async_stream::stream;
use futures::stream::StreamExt;
use heed::CompactionOption;
use log::debug;
use milli::update::UpdateBuilder;
use tokio::task::spawn_blocking;
use tokio::{fs, sync::mpsc};
use uuid::Uuid;
use crate::index::{
update_handler::UpdateHandler, Checked, Document, SearchQuery, SearchResult, Settings,
};
use crate::index_controller::{
get_arc_ownership_blocking, Failed, IndexStats, Processed, Processing,
};
use crate::option::IndexerOpts;
use super::error::{IndexActorError, Result};
use super::{IndexMeta, IndexMsg, IndexSettings, IndexStore};
pub const CONCURRENT_INDEX_MSG: usize = 10;
pub struct IndexActor<S> {
receiver: Option<mpsc::Receiver<IndexMsg>>,
update_handler: Arc<UpdateHandler>,
store: S,
}
impl<S: IndexStore + Sync + Send> IndexActor<S> {
pub fn new(receiver: mpsc::Receiver<IndexMsg>, store: S) -> anyhow::Result<Self> {
let options = IndexerOpts::default();
let update_handler = UpdateHandler::new(&options)?;
let update_handler = Arc::new(update_handler);
let receiver = Some(receiver);
Ok(Self {
receiver,
update_handler,
store,
})
}
/// `run` polls the write_receiver and read_receiver concurrently, but while messages sent
/// through the read channel are processed concurrently, the messages sent through the write
/// channel are processed one at a time.
pub async fn run(mut self) {
let mut receiver = self
.receiver
.take()
.expect("Index Actor must have an inbox at this point.");
let stream = stream! {
loop {
match receiver.recv().await {
Some(msg) => yield msg,
None => break,
}
}
};
stream
.for_each_concurrent(Some(CONCURRENT_INDEX_MSG), |msg| self.handle_message(msg))
.await;
}
async fn handle_message(&self, msg: IndexMsg) {
use IndexMsg::*;
match msg {
CreateIndex {
uuid,
primary_key,
ret,
} => {
let _ = ret.send(self.handle_create_index(uuid, primary_key).await);
}
Update {
ret,
meta,
data,
uuid,
} => {
let _ = ret.send(self.handle_update(uuid, meta, data).await);
}
Search { ret, query, uuid } => {
let _ = ret.send(self.handle_search(uuid, query).await);
}
Settings { ret, uuid } => {
let _ = ret.send(self.handle_settings(uuid).await);
}
Documents {
ret,
uuid,
attributes_to_retrieve,
offset,
limit,
} => {
let _ = ret.send(
self.handle_fetch_documents(uuid, offset, limit, attributes_to_retrieve)
.await,
);
}
Document {
uuid,
attributes_to_retrieve,
doc_id,
ret,
} => {
let _ = ret.send(
self.handle_fetch_document(uuid, doc_id, attributes_to_retrieve)
.await,
);
}
Delete { uuid, ret } => {
let _ = ret.send(self.handle_delete(uuid).await);
}
GetMeta { uuid, ret } => {
let _ = ret.send(self.handle_get_meta(uuid).await);
}
UpdateIndex {
uuid,
index_settings,
ret,
} => {
let _ = ret.send(self.handle_update_index(uuid, index_settings).await);
}
Snapshot { uuid, path, ret } => {
let _ = ret.send(self.handle_snapshot(uuid, path).await);
}
Dump { uuid, path, ret } => {
let _ = ret.send(self.handle_dump(uuid, path).await);
}
GetStats { uuid, ret } => {
let _ = ret.send(self.handle_get_stats(uuid).await);
}
}
}
async fn handle_search(&self, uuid: Uuid, query: SearchQuery) -> Result<SearchResult> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result = spawn_blocking(move || index.perform_search(query)).await??;
Ok(result)
}
async fn handle_create_index(
&self,
uuid: Uuid,
primary_key: Option<String>,
) -> Result<IndexMeta> {
let index = self.store.create(uuid, primary_key).await?;
let meta = spawn_blocking(move || IndexMeta::new(&index)).await??;
Ok(meta)
}
async fn handle_update(
&self,
uuid: Uuid,
meta: Processing,
data: Option<File>,
) -> Result<std::result::Result<Processed, Failed>> {
debug!("Processing update {}", meta.id());
let update_handler = self.update_handler.clone();
let index = match self.store.get(uuid).await? {
Some(index) => index,
None => self.store.create(uuid, None).await?,
};
Ok(spawn_blocking(move || update_handler.handle_update(meta, data, index)).await?)
}
async fn handle_settings(&self, uuid: Uuid) -> Result<Settings<Checked>> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result = spawn_blocking(move || index.settings()).await??;
Ok(result)
}
async fn handle_fetch_documents(
&self,
uuid: Uuid,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result =
spawn_blocking(move || index.retrieve_documents(offset, limit, attributes_to_retrieve))
.await??;
Ok(result)
}
async fn handle_fetch_document(
&self,
uuid: Uuid,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result =
spawn_blocking(move || index.retrieve_document(doc_id, attributes_to_retrieve))
.await??;
Ok(result)
}
async fn handle_delete(&self, uuid: Uuid) -> Result<()> {
let index = self.store.delete(uuid).await?;
if let Some(index) = index {
tokio::task::spawn(async move {
let index = index.0;
let store = get_arc_ownership_blocking(index).await;
spawn_blocking(move || {
store.prepare_for_closing().wait();
debug!("Index closed");
});
});
}
Ok(())
}
async fn handle_get_meta(&self, uuid: Uuid) -> Result<IndexMeta> {
match self.store.get(uuid).await? {
Some(index) => {
let meta = spawn_blocking(move || IndexMeta::new(&index)).await??;
Ok(meta)
}
None => Err(IndexActorError::UnexistingIndex),
}
}
async fn handle_update_index(
&self,
uuid: Uuid,
index_settings: IndexSettings,
) -> Result<IndexMeta> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let result = spawn_blocking(move || match index_settings.primary_key {
Some(primary_key) => {
let mut txn = index.write_txn()?;
if index.primary_key(&txn)?.is_some() {
return Err(IndexActorError::ExistingPrimaryKey);
}
let mut builder = UpdateBuilder::new(0).settings(&mut txn, &index);
builder.set_primary_key(primary_key);
builder.execute(|_, _| ())?;
let meta = IndexMeta::new_txn(&index, &txn)?;
txn.commit()?;
Ok(meta)
}
None => {
let meta = IndexMeta::new(&index)?;
Ok(meta)
}
})
.await??;
Ok(result)
}
async fn handle_snapshot(&self, uuid: Uuid, mut path: PathBuf) -> Result<()> {
use tokio::fs::create_dir_all;
path.push("indexes");
create_dir_all(&path).await?;
if let Some(index) = self.store.get(uuid).await? {
let mut index_path = path.join(format!("index-{}", uuid));
create_dir_all(&index_path).await?;
index_path.push("data.mdb");
spawn_blocking(move || -> Result<()> {
// Get write txn to wait for ongoing write transaction before snapshot.
let _txn = index.write_txn()?;
index
.env
.copy_to_path(index_path, CompactionOption::Enabled)?;
Ok(())
})
.await??;
}
Ok(())
}
/// Create a `documents.jsonl` and a `settings.json` in `path/uid/` with a dump of all the
/// documents and all the settings.
async fn handle_dump(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
let path = path.join(format!("indexes/index-{}/", uuid));
fs::create_dir_all(&path).await?;
tokio::task::spawn_blocking(move || index.dump(path)).await??;
Ok(())
}
async fn handle_get_stats(&self, uuid: Uuid) -> Result<IndexStats> {
let index = self
.store
.get(uuid)
.await?
.ok_or(IndexActorError::UnexistingIndex)?;
spawn_blocking(move || {
let rtxn = index.read_txn()?;
Ok(IndexStats {
size: index.size(),
number_of_documents: index.number_of_documents(&rtxn)?,
is_indexing: None,
field_distribution: index.field_distribution(&rtxn)?,
})
})
.await?
}
}
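
The `run` loop above is a common tokio pattern: wrap the `mpsc` receiver in a stream so that messages can be handled with bounded concurrency instead of strictly one at a time. A stripped-down, self-contained sketch of the same pattern (hypothetical `u32` messages, assuming `tokio`, `futures`, and `async-stream`):

```
use async_stream::stream;
use futures::stream::StreamExt;
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (sender, mut receiver) = mpsc::channel::<u32>(10);

    tokio::spawn(async move {
        for i in 0..5u32 {
            let _ = sender.send(i).await;
        }
    });

    // Turn the receiver into a stream, then handle up to 3 messages concurrently.
    let messages = stream! {
        while let Some(msg) = receiver.recv().await {
            yield msg;
        }
    };
    messages
        .for_each_concurrent(Some(3), |msg| async move {
            println!("handling message {}", msg);
        })
        .await;
}
```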


@ -1,48 +0,0 @@
use meilisearch_error::{Code, ErrorCode};
use crate::{error::MilliError, index::error::IndexError};
pub type Result<T> = std::result::Result<T, IndexActorError>;
#[derive(thiserror::Error, Debug)]
pub enum IndexActorError {
#[error("{0}")]
IndexError(#[from] IndexError),
#[error("Index already exists")]
IndexAlreadyExists,
#[error("Index not found")]
UnexistingIndex,
#[error("A primary key is already present. It's impossible to update it")]
ExistingPrimaryKey,
#[error("Internal Error: {0}")]
Internal(Box<dyn std::error::Error + Send + Sync + 'static>),
#[error("{0}")]
Milli(#[from] milli::Error),
}
macro_rules! internal_error {
($($other:path), *) => {
$(
impl From<$other> for IndexActorError {
fn from(other: $other) -> Self {
Self::Internal(Box::new(other))
}
}
)*
}
}
internal_error!(heed::Error, tokio::task::JoinError, std::io::Error);
impl ErrorCode for IndexActorError {
fn error_code(&self) -> Code {
match self {
IndexActorError::IndexError(e) => e.error_code(),
IndexActorError::IndexAlreadyExists => Code::IndexAlreadyExists,
IndexActorError::UnexistingIndex => Code::IndexNotFound,
IndexActorError::ExistingPrimaryKey => Code::PrimaryKeyAlreadyPresent,
IndexActorError::Internal(_) => Code::Internal,
IndexActorError::Milli(e) => MilliError(e).error_code(),
}
}
}


@ -1,159 +0,0 @@
use std::path::{Path, PathBuf};
use tokio::sync::{mpsc, oneshot};
use uuid::Uuid;
use crate::{
index::Checked,
index_controller::{IndexSettings, IndexStats, Processing},
};
use crate::{
index::{Document, SearchQuery, SearchResult, Settings},
index_controller::{Failed, Processed},
};
use super::error::Result;
use super::{IndexActor, IndexActorHandle, IndexMeta, IndexMsg, MapIndexStore};
#[derive(Clone)]
pub struct IndexActorHandleImpl {
sender: mpsc::Sender<IndexMsg>,
}
#[async_trait::async_trait]
impl IndexActorHandle for IndexActorHandleImpl {
async fn create_index(&self, uuid: Uuid, primary_key: Option<String>) -> Result<IndexMeta> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::CreateIndex {
ret,
uuid,
primary_key,
};
let _ = self.sender.send(msg).await;
receiver.await.expect("IndexActor has been killed")
}
async fn update(
&self,
uuid: Uuid,
meta: Processing,
data: Option<std::fs::File>,
) -> Result<std::result::Result<Processed, Failed>> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Update {
ret,
meta,
data,
uuid,
};
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn search(&self, uuid: Uuid, query: SearchQuery) -> Result<SearchResult> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Search { uuid, query, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn settings(&self, uuid: Uuid) -> Result<Settings<Checked>> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Settings { uuid, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn documents(
&self,
uuid: Uuid,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Documents {
uuid,
ret,
offset,
attributes_to_retrieve,
limit,
};
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn document(
&self,
uuid: Uuid,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Document {
uuid,
ret,
doc_id,
attributes_to_retrieve,
};
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn delete(&self, uuid: Uuid) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Delete { uuid, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn get_index_meta(&self, uuid: Uuid) -> Result<IndexMeta> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::GetMeta { uuid, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn update_index(&self, uuid: Uuid, index_settings: IndexSettings) -> Result<IndexMeta> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::UpdateIndex {
uuid,
index_settings,
ret,
};
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn snapshot(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Snapshot { uuid, path, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn dump(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::Dump { uuid, path, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
async fn get_index_stats(&self, uuid: Uuid) -> Result<IndexStats> {
let (ret, receiver) = oneshot::channel();
let msg = IndexMsg::GetStats { uuid, ret };
let _ = self.sender.send(msg).await;
Ok(receiver.await.expect("IndexActor has been killed")?)
}
}
impl IndexActorHandleImpl {
pub fn new(path: impl AsRef<Path>, index_size: usize) -> anyhow::Result<Self> {
let (sender, receiver) = mpsc::channel(100);
let store = MapIndexStore::new(path, index_size);
let actor = IndexActor::new(receiver, store)?;
tokio::task::spawn(actor.run());
Ok(Self { sender })
}
}
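
Every method of this handle follows the same request/response shape: build a message carrying a `oneshot` sender, push it onto the actor's `mpsc` channel, then await the reply. A stripped-down, self-contained sketch of that pattern (hypothetical `Ping` message, assuming `tokio`):

```
use tokio::sync::{mpsc, oneshot};

enum Msg {
    Ping { ret: oneshot::Sender<String> },
}

async fn actor(mut receiver: mpsc::Receiver<Msg>) {
    while let Some(msg) = receiver.recv().await {
        match msg {
            Msg::Ping { ret } => {
                // The actor does its work, then answers on the oneshot channel.
                let _ = ret.send("pong".to_string());
            }
        }
    }
}

#[tokio::main]
async fn main() {
    let (sender, receiver) = mpsc::channel(10);
    tokio::spawn(actor(receiver));

    // Handle side: send a message with a reply channel and await the answer.
    let (ret, recv) = oneshot::channel();
    let _ = sender.send(Msg::Ping { ret }).await;
    let answer = recv.await.expect("actor has been killed");
    assert_eq!(answer, "pong");
}
```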


@ -1,74 +0,0 @@
use std::path::PathBuf;
use tokio::sync::oneshot;
use uuid::Uuid;
use super::error::Result as IndexResult;
use crate::index::{Checked, Document, SearchQuery, SearchResult, Settings};
use crate::index_controller::{Failed, IndexStats, Processed, Processing};
use super::{IndexMeta, IndexSettings};
#[allow(clippy::large_enum_variant)]
pub enum IndexMsg {
CreateIndex {
uuid: Uuid,
primary_key: Option<String>,
ret: oneshot::Sender<IndexResult<IndexMeta>>,
},
Update {
uuid: Uuid,
meta: Processing,
data: Option<std::fs::File>,
ret: oneshot::Sender<IndexResult<Result<Processed, Failed>>>,
},
Search {
uuid: Uuid,
query: SearchQuery,
ret: oneshot::Sender<IndexResult<SearchResult>>,
},
Settings {
uuid: Uuid,
ret: oneshot::Sender<IndexResult<Settings<Checked>>>,
},
Documents {
uuid: Uuid,
attributes_to_retrieve: Option<Vec<String>>,
offset: usize,
limit: usize,
ret: oneshot::Sender<IndexResult<Vec<Document>>>,
},
Document {
uuid: Uuid,
attributes_to_retrieve: Option<Vec<String>>,
doc_id: String,
ret: oneshot::Sender<IndexResult<Document>>,
},
Delete {
uuid: Uuid,
ret: oneshot::Sender<IndexResult<()>>,
},
GetMeta {
uuid: Uuid,
ret: oneshot::Sender<IndexResult<IndexMeta>>,
},
UpdateIndex {
uuid: Uuid,
index_settings: IndexSettings,
ret: oneshot::Sender<IndexResult<IndexMeta>>,
},
Snapshot {
uuid: Uuid,
path: PathBuf,
ret: oneshot::Sender<IndexResult<()>>,
},
Dump {
uuid: Uuid,
path: PathBuf,
ret: oneshot::Sender<IndexResult<()>>,
},
GetStats {
uuid: Uuid,
ret: oneshot::Sender<IndexResult<IndexStats>>,
},
}


@ -1,169 +0,0 @@
use std::fs::File;
use std::path::PathBuf;
use chrono::{DateTime, Utc};
#[cfg(test)]
use mockall::automock;
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use actor::IndexActor;
pub use actor::CONCURRENT_INDEX_MSG;
pub use handle_impl::IndexActorHandleImpl;
use message::IndexMsg;
use store::{IndexStore, MapIndexStore};
use crate::index::{Checked, Document, Index, SearchQuery, SearchResult, Settings};
use crate::index_controller::{Failed, IndexStats, Processed, Processing};
use error::Result;
use super::IndexSettings;
mod actor;
pub mod error;
mod handle_impl;
mod message;
mod store;
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct IndexMeta {
created_at: DateTime<Utc>,
pub updated_at: DateTime<Utc>,
pub primary_key: Option<String>,
}
impl IndexMeta {
fn new(index: &Index) -> Result<Self> {
let txn = index.read_txn()?;
Self::new_txn(index, &txn)
}
fn new_txn(index: &Index, txn: &heed::RoTxn) -> Result<Self> {
let created_at = index.created_at(&txn)?;
let updated_at = index.updated_at(&txn)?;
let primary_key = index.primary_key(&txn)?.map(String::from);
Ok(Self {
created_at,
updated_at,
primary_key,
})
}
}
#[async_trait::async_trait]
#[cfg_attr(test, automock)]
pub trait IndexActorHandle {
async fn create_index(&self, uuid: Uuid, primary_key: Option<String>) -> Result<IndexMeta>;
async fn update(
&self,
uuid: Uuid,
meta: Processing,
data: Option<File>,
) -> Result<std::result::Result<Processed, Failed>>;
async fn search(&self, uuid: Uuid, query: SearchQuery) -> Result<SearchResult>;
async fn settings(&self, uuid: Uuid) -> Result<Settings<Checked>>;
async fn documents(
&self,
uuid: Uuid,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>>;
async fn document(
&self,
uuid: Uuid,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document>;
async fn delete(&self, uuid: Uuid) -> Result<()>;
async fn get_index_meta(&self, uuid: Uuid) -> Result<IndexMeta>;
async fn update_index(&self, uuid: Uuid, index_settings: IndexSettings) -> Result<IndexMeta>;
async fn snapshot(&self, uuid: Uuid, path: PathBuf) -> Result<()>;
async fn dump(&self, uuid: Uuid, path: PathBuf) -> Result<()>;
async fn get_index_stats(&self, uuid: Uuid) -> Result<IndexStats>;
}
#[cfg(test)]
mod test {
use std::sync::Arc;
use super::*;
#[async_trait::async_trait]
/// Useful for passing around an `Arc<MockIndexActorHandle>` in tests.
impl IndexActorHandle for Arc<MockIndexActorHandle> {
async fn create_index(&self, uuid: Uuid, primary_key: Option<String>) -> Result<IndexMeta> {
self.as_ref().create_index(uuid, primary_key).await
}
async fn update(
&self,
uuid: Uuid,
meta: Processing,
data: Option<std::fs::File>,
) -> Result<std::result::Result<Processed, Failed>> {
self.as_ref().update(uuid, meta, data).await
}
async fn search(&self, uuid: Uuid, query: SearchQuery) -> Result<SearchResult> {
self.as_ref().search(uuid, query).await
}
async fn settings(&self, uuid: Uuid) -> Result<Settings<Checked>> {
self.as_ref().settings(uuid).await
}
async fn documents(
&self,
uuid: Uuid,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>> {
self.as_ref()
.documents(uuid, offset, limit, attributes_to_retrieve)
.await
}
async fn document(
&self,
uuid: Uuid,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document> {
self.as_ref()
.document(uuid, doc_id, attributes_to_retrieve)
.await
}
async fn delete(&self, uuid: Uuid) -> Result<()> {
self.as_ref().delete(uuid).await
}
async fn get_index_meta(&self, uuid: Uuid) -> Result<IndexMeta> {
self.as_ref().get_index_meta(uuid).await
}
async fn update_index(
&self,
uuid: Uuid,
index_settings: IndexSettings,
) -> Result<IndexMeta> {
self.as_ref().update_index(uuid, index_settings).await
}
async fn snapshot(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
self.as_ref().snapshot(uuid, path).await
}
async fn dump(&self, uuid: Uuid, path: PathBuf) -> Result<()> {
self.as_ref().dump(uuid, path).await
}
async fn get_index_stats(&self, uuid: Uuid) -> Result<IndexStats> {
self.as_ref().get_index_stats(uuid).await
}
}
}


@ -1,441 +0,0 @@
use std::collections::BTreeMap;
use std::path::Path;
use std::sync::Arc;
use std::time::Duration;
use actix_web::web::Bytes;
use chrono::{DateTime, Utc};
use futures::stream::StreamExt;
use log::error;
use log::info;
use milli::FieldDistribution;
use serde::{Deserialize, Serialize};
use tokio::sync::mpsc;
use tokio::time::sleep;
use uuid::Uuid;
use dump_actor::DumpActorHandle;
pub use dump_actor::{DumpInfo, DumpStatus};
use index_actor::IndexActorHandle;
use snapshot::{load_snapshot, SnapshotService};
use update_actor::UpdateActorHandle;
pub use updates::*;
use uuid_resolver::{error::UuidResolverError, UuidResolverHandle};
use crate::extractors::payload::Payload;
use crate::index::{Checked, Document, SearchQuery, SearchResult, Settings};
use crate::option::Opt;
use error::Result;
use self::dump_actor::load_dump;
use self::error::IndexControllerError;
mod dump_actor;
pub mod error;
pub mod index_actor;
mod snapshot;
mod update_actor;
mod updates;
mod uuid_resolver;
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct IndexMetadata {
#[serde(skip)]
pub uuid: Uuid,
pub uid: String,
name: String,
#[serde(flatten)]
pub meta: index_actor::IndexMeta,
}
#[derive(Clone, Debug)]
pub struct IndexSettings {
pub uid: Option<String>,
pub primary_key: Option<String>,
}
#[derive(Serialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct IndexStats {
#[serde(skip)]
pub size: u64,
pub number_of_documents: u64,
/// Whether the current index is performing an update. It is initially `None` when the
/// index returns it, since it is the `UpdateStore` that knows what index is currently indexing. It is
/// later set to either true or false once we retrieve the information from the `UpdateStore`.
pub is_indexing: Option<bool>,
pub field_distribution: FieldDistribution,
}
#[derive(Clone)]
pub struct IndexController {
uuid_resolver: uuid_resolver::UuidResolverHandleImpl,
index_handle: index_actor::IndexActorHandleImpl,
update_handle: update_actor::UpdateActorHandleImpl<Bytes>,
dump_handle: dump_actor::DumpActorHandleImpl,
}
#[derive(Serialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Stats {
pub database_size: u64,
pub last_update: Option<DateTime<Utc>>,
pub indexes: BTreeMap<String, IndexStats>,
}
impl IndexController {
pub fn new(path: impl AsRef<Path>, options: &Opt) -> anyhow::Result<Self> {
let index_size = options.max_index_size.get_bytes() as usize;
let update_store_size = options.max_index_size.get_bytes() as usize;
if let Some(ref path) = options.import_snapshot {
info!("Loading from snapshot {:?}", path);
load_snapshot(
&options.db_path,
path,
options.ignore_snapshot_if_db_exists,
options.ignore_missing_snapshot,
)?;
} else if let Some(ref src_path) = options.import_dump {
load_dump(
&options.db_path,
src_path,
options.max_index_size.get_bytes() as usize,
options.max_udb_size.get_bytes() as usize,
&options.indexer_options,
)?;
}
std::fs::create_dir_all(&path)?;
let uuid_resolver = uuid_resolver::UuidResolverHandleImpl::new(&path)?;
let index_handle = index_actor::IndexActorHandleImpl::new(&path, index_size)?;
let update_handle = update_actor::UpdateActorHandleImpl::new(
index_handle.clone(),
&path,
update_store_size,
)?;
let dump_handle = dump_actor::DumpActorHandleImpl::new(
&options.dumps_dir,
uuid_resolver.clone(),
update_handle.clone(),
options.max_index_size.get_bytes() as usize,
options.max_udb_size.get_bytes() as usize,
)?;
if options.schedule_snapshot {
let snapshot_service = SnapshotService::new(
uuid_resolver.clone(),
update_handle.clone(),
Duration::from_secs(options.snapshot_interval_sec),
options.snapshot_dir.clone(),
options
.db_path
.file_name()
.map(|n| n.to_owned().into_string().expect("invalid path"))
.unwrap_or_else(|| String::from("data.ms")),
);
tokio::task::spawn(snapshot_service.run());
}
Ok(Self {
uuid_resolver,
index_handle,
update_handle,
dump_handle,
})
}
pub async fn add_documents(
&self,
uid: String,
method: milli::update::IndexDocumentsMethod,
format: milli::update::UpdateFormat,
payload: Payload,
primary_key: Option<String>,
) -> Result<UpdateStatus> {
let perform_update = |uuid| async move {
let meta = UpdateMeta::DocumentsAddition {
method,
format,
primary_key,
};
let (sender, receiver) = mpsc::channel(10);
// It is necessary to spawn a local task to send the payload to the update handle to
// prevent a deadlock between the update_handle::update that waits for the update to be
// registered and the update_actor that waits for the payload to be sent to it.
tokio::task::spawn_local(async move {
payload
.for_each(|r| async {
let _ = sender.send(r).await;
})
.await
});
// This must be done *AFTER* spawning the task.
self.update_handle.update(meta, receiver, uuid).await
};
match self.uuid_resolver.get(uid).await {
Ok(uuid) => Ok(perform_update(uuid).await?),
Err(UuidResolverError::UnexistingIndex(name)) => {
let uuid = Uuid::new_v4();
let status = perform_update(uuid).await?;
// ignore if index creation fails now, since it may already have been created
let _ = self.index_handle.create_index(uuid, None).await;
self.uuid_resolver.insert(name, uuid).await?;
Ok(status)
}
Err(e) => Err(e.into()),
}
}
pub async fn clear_documents(&self, uid: String) -> Result<UpdateStatus> {
let uuid = self.uuid_resolver.get(uid).await?;
let meta = UpdateMeta::ClearDocuments;
let (_, receiver) = mpsc::channel(1);
let status = self.update_handle.update(meta, receiver, uuid).await?;
Ok(status)
}
pub async fn delete_documents(
&self,
uid: String,
documents: Vec<String>,
) -> Result<UpdateStatus> {
let uuid = self.uuid_resolver.get(uid).await?;
let meta = UpdateMeta::DeleteDocuments { ids: documents };
let (_, receiver) = mpsc::channel(1);
let status = self.update_handle.update(meta, receiver, uuid).await?;
Ok(status)
}
pub async fn update_settings(
&self,
uid: String,
settings: Settings<Checked>,
create: bool,
) -> Result<UpdateStatus> {
let perform_update = |uuid| async move {
let meta = UpdateMeta::Settings(settings.into_unchecked());
// Nothing to send, drop the sender right away so as not to block the update actor.
let (_, receiver) = mpsc::channel(1);
self.update_handle.update(meta, receiver, uuid).await
};
match self.uuid_resolver.get(uid).await {
Ok(uuid) => Ok(perform_update(uuid).await?),
Err(UuidResolverError::UnexistingIndex(name)) if create => {
let uuid = Uuid::new_v4();
let status = perform_update(uuid).await?;
// ignore if index creation fails now, since it may already have been created
let _ = self.index_handle.create_index(uuid, None).await;
self.uuid_resolver.insert(name, uuid).await?;
Ok(status)
}
Err(e) => Err(e.into()),
}
}
pub async fn create_index(&self, index_settings: IndexSettings) -> Result<IndexMetadata> {
let IndexSettings { uid, primary_key } = index_settings;
let uid = uid.ok_or(IndexControllerError::MissingUid)?;
let uuid = Uuid::new_v4();
let meta = self.index_handle.create_index(uuid, primary_key).await?;
self.uuid_resolver.insert(uid.clone(), uuid).await?;
let meta = IndexMetadata {
uuid,
name: uid.clone(),
uid,
meta,
};
Ok(meta)
}
pub async fn delete_index(&self, uid: String) -> Result<()> {
let uuid = self.uuid_resolver.delete(uid).await?;
// We remove the index from the resolver synchronously, and effectively perform the index
// deletion as a background task.
let update_handle = self.update_handle.clone();
let index_handle = self.index_handle.clone();
tokio::spawn(async move {
if let Err(e) = update_handle.delete(uuid).await {
error!("Error while deleting index: {}", e);
}
if let Err(e) = index_handle.delete(uuid).await {
error!("Error while deleting index: {}", e);
}
});
Ok(())
}
pub async fn update_status(&self, uid: String, id: u64) -> Result<UpdateStatus> {
let uuid = self.uuid_resolver.get(uid).await?;
let result = self.update_handle.update_status(uuid, id).await?;
Ok(result)
}
pub async fn all_update_status(&self, uid: String) -> Result<Vec<UpdateStatus>> {
let uuid = self.uuid_resolver.get(uid).await?;
let result = self.update_handle.get_all_updates_status(uuid).await?;
Ok(result)
}
pub async fn list_indexes(&self) -> Result<Vec<IndexMetadata>> {
let uuids = self.uuid_resolver.list().await?;
let mut ret = Vec::new();
for (uid, uuid) in uuids {
let meta = self.index_handle.get_index_meta(uuid).await?;
let meta = IndexMetadata {
uuid,
name: uid.clone(),
uid,
meta,
};
ret.push(meta);
}
Ok(ret)
}
pub async fn settings(&self, uid: String) -> Result<Settings<Checked>> {
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let settings = self.index_handle.settings(uuid).await?;
Ok(settings)
}
pub async fn documents(
&self,
uid: String,
offset: usize,
limit: usize,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Vec<Document>> {
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let documents = self
.index_handle
.documents(uuid, offset, limit, attributes_to_retrieve)
.await?;
Ok(documents)
}
pub async fn document(
&self,
uid: String,
doc_id: String,
attributes_to_retrieve: Option<Vec<String>>,
) -> Result<Document> {
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let document = self
.index_handle
.document(uuid, doc_id, attributes_to_retrieve)
.await?;
Ok(document)
}
pub async fn update_index(
&self,
uid: String,
mut index_settings: IndexSettings,
) -> Result<IndexMetadata> {
if index_settings.uid.is_some() {
index_settings.uid.take();
}
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let meta = self.index_handle.update_index(uuid, index_settings).await?;
let meta = IndexMetadata {
uuid,
name: uid.clone(),
uid,
meta,
};
Ok(meta)
}
pub async fn search(&self, uid: String, query: SearchQuery) -> Result<SearchResult> {
let uuid = self.uuid_resolver.get(uid).await?;
let result = self.index_handle.search(uuid, query).await?;
Ok(result)
}
pub async fn get_index(&self, uid: String) -> Result<IndexMetadata> {
let uuid = self.uuid_resolver.get(uid.clone()).await?;
let meta = self.index_handle.get_index_meta(uuid).await?;
let meta = IndexMetadata {
uuid,
name: uid.clone(),
uid,
meta,
};
Ok(meta)
}
pub async fn get_uuids_size(&self) -> Result<u64> {
Ok(self.uuid_resolver.get_size().await?)
}
pub async fn get_index_stats(&self, uid: String) -> Result<IndexStats> {
let uuid = self.uuid_resolver.get(uid).await?;
let update_infos = self.update_handle.get_info().await?;
let mut stats = self.index_handle.get_index_stats(uuid).await?;
// Check if the currently indexing update is from our index.
stats.is_indexing = Some(Some(uuid) == update_infos.processing);
Ok(stats)
}
pub async fn get_all_stats(&self) -> Result<Stats> {
let update_infos = self.update_handle.get_info().await?;
let mut database_size = self.get_uuids_size().await? + update_infos.size;
let mut last_update: Option<DateTime<_>> = None;
let mut indexes = BTreeMap::new();
for index in self.list_indexes().await? {
let mut index_stats = self.index_handle.get_index_stats(index.uuid).await?;
database_size += index_stats.size;
last_update = last_update.map_or(Some(index.meta.updated_at), |last| {
Some(last.max(index.meta.updated_at))
});
index_stats.is_indexing = Some(Some(index.uuid) == update_infos.processing);
indexes.insert(index.uid, index_stats);
}
Ok(Stats {
database_size,
last_update,
indexes,
})
}
pub async fn create_dump(&self) -> Result<DumpInfo> {
Ok(self.dump_handle.create_dump().await?)
}
pub async fn dump_info(&self, uid: String) -> Result<DumpInfo> {
Ok(self.dump_handle.dump_info(uid).await?)
}
}
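/// Polls until `item` is the only remaining `Arc` reference, then returns the inner value.
/// Used to take back ownership of a shared component before shutting it down.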
pub async fn get_arc_ownership_blocking<T>(mut item: Arc<T>) -> T {
loop {
match Arc::try_unwrap(item) {
Ok(item) => return item,
Err(item_arc) => {
item = item_arc;
sleep(Duration::from_millis(100)).await;
continue;
}
}
}
}


@ -1,268 +0,0 @@
use std::path::{Path, PathBuf};
use std::time::Duration;
use anyhow::bail;
use log::{error, info, trace};
use tokio::fs;
use tokio::task::spawn_blocking;
use tokio::time::sleep;
use super::update_actor::UpdateActorHandle;
use super::uuid_resolver::UuidResolverHandle;
use crate::helpers::compression;
pub struct SnapshotService<U, R> {
uuid_resolver_handle: R,
update_handle: U,
snapshot_period: Duration,
snapshot_path: PathBuf,
db_name: String,
}
impl<U, R> SnapshotService<U, R>
where
U: UpdateActorHandle,
R: UuidResolverHandle,
{
pub fn new(
uuid_resolver_handle: R,
update_handle: U,
snapshot_period: Duration,
snapshot_path: PathBuf,
db_name: String,
) -> Self {
Self {
uuid_resolver_handle,
update_handle,
snapshot_period,
snapshot_path,
db_name,
}
}
pub async fn run(self) {
info!(
"Snapshot scheduled every {}s.",
self.snapshot_period.as_secs()
);
loop {
if let Err(e) = self.perform_snapshot().await {
error!("Error while performing snapshot: {}", e);
}
sleep(self.snapshot_period).await;
}
}
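/// Builds the snapshot in a temporary directory (uuid store first, then every index via the
/// update handle), compresses it to a tar.gz and atomically persists it as
/// `<snapshot_path>/<db_name>.snapshot` through a named temp file.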
async fn perform_snapshot(&self) -> anyhow::Result<()> {
trace!("Performing snapshot.");
let snapshot_dir = self.snapshot_path.clone();
fs::create_dir_all(&snapshot_dir).await?;
let temp_snapshot_dir =
spawn_blocking(move || tempfile::tempdir_in(snapshot_dir)).await??;
let temp_snapshot_path = temp_snapshot_dir.path().to_owned();
let uuids = self
.uuid_resolver_handle
.snapshot(temp_snapshot_path.clone())
.await?;
if uuids.is_empty() {
return Ok(());
}
self.update_handle
.snapshot(uuids, temp_snapshot_path.clone())
.await?;
let snapshot_dir = self.snapshot_path.clone();
let snapshot_path = self
.snapshot_path
.join(format!("{}.snapshot", self.db_name));
let snapshot_path = spawn_blocking(move || -> anyhow::Result<PathBuf> {
let temp_snapshot_file = tempfile::NamedTempFile::new_in(snapshot_dir)?;
let temp_snapshot_file_path = temp_snapshot_file.path().to_owned();
compression::to_tar_gz(temp_snapshot_path, temp_snapshot_file_path)?;
temp_snapshot_file.persist(&snapshot_path)?;
Ok(snapshot_path)
})
.await??;
trace!("Created snapshot in {:?}.", snapshot_path);
Ok(())
}
}
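/// Restores a database from a `.snapshot` archive before the server starts, or bails depending on
/// the `ignore_*` flags. A minimal, illustrative call (paths are placeholders):
/// ```ignore
/// load_snapshot("./data.ms", "./snapshots/data.ms.snapshot", false, false)?;
/// ```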
pub fn load_snapshot(
db_path: impl AsRef<Path>,
snapshot_path: impl AsRef<Path>,
ignore_snapshot_if_db_exists: bool,
ignore_missing_snapshot: bool,
) -> anyhow::Result<()> {
if !db_path.as_ref().exists() && snapshot_path.as_ref().exists() {
match compression::from_tar_gz(snapshot_path, &db_path) {
Ok(()) => Ok(()),
Err(e) => {
// clean created db folder
std::fs::remove_dir_all(&db_path)?;
Err(e)
}
}
} else if db_path.as_ref().exists() && !ignore_snapshot_if_db_exists {
bail!(
"database already exists at {:?}, try to delete it or rename it",
db_path
.as_ref()
.canonicalize()
.unwrap_or_else(|_| db_path.as_ref().to_owned())
)
} else if !snapshot_path.as_ref().exists() && !ignore_missing_snapshot {
bail!(
"snapshot doesn't exist at {:?}",
snapshot_path
.as_ref()
.canonicalize()
.unwrap_or_else(|_| snapshot_path.as_ref().to_owned())
)
} else {
Ok(())
}
}
#[cfg(test)]
mod test {
use std::iter::FromIterator;
use std::{collections::HashSet, sync::Arc};
use futures::future::{err, ok};
use rand::Rng;
use tokio::time::timeout;
use uuid::Uuid;
use super::*;
use crate::index_controller::index_actor::MockIndexActorHandle;
use crate::index_controller::update_actor::{
error::UpdateActorError, MockUpdateActorHandle, UpdateActorHandleImpl,
};
use crate::index_controller::uuid_resolver::{
error::UuidResolverError, MockUuidResolverHandle,
};
#[actix_rt::test]
async fn test_normal() {
let mut rng = rand::thread_rng();
let uuids_num: usize = rng.gen_range(5, 10);
let uuids = (0..uuids_num)
.map(|_| Uuid::new_v4())
.collect::<HashSet<_>>();
let mut uuid_resolver = MockUuidResolverHandle::new();
let uuids_clone = uuids.clone();
uuid_resolver
.expect_snapshot()
.times(1)
.returning(move |_| Box::pin(ok(uuids_clone.clone())));
let uuids_clone = uuids.clone();
let mut index_handle = MockIndexActorHandle::new();
index_handle
.expect_snapshot()
.withf(move |uuid, _path| uuids_clone.contains(uuid))
.times(uuids_num)
.returning(move |_, _| Box::pin(ok(())));
let dir = tempfile::tempdir_in(".").unwrap();
let handle = Arc::new(index_handle);
let update_handle =
UpdateActorHandleImpl::<Vec<u8>>::new(handle.clone(), dir.path(), 4096 * 100).unwrap();
let snapshot_path = tempfile::tempdir_in(".").unwrap();
let snapshot_service = SnapshotService::new(
uuid_resolver,
update_handle,
Duration::from_millis(100),
snapshot_path.path().to_owned(),
"data.ms".to_string(),
);
snapshot_service.perform_snapshot().await.unwrap();
}
#[actix_rt::test]
async fn error_performing_uuid_snapshot() {
let mut uuid_resolver = MockUuidResolverHandle::new();
uuid_resolver
.expect_snapshot()
.times(1)
// arbitrary error
.returning(|_| Box::pin(err(UuidResolverError::NameAlreadyExist)));
let update_handle = MockUpdateActorHandle::new();
let snapshot_path = tempfile::tempdir_in(".").unwrap();
let snapshot_service = SnapshotService::new(
uuid_resolver,
update_handle,
Duration::from_millis(100),
snapshot_path.path().to_owned(),
"data.ms".to_string(),
);
assert!(snapshot_service.perform_snapshot().await.is_err());
// Nothing was written to the file
assert!(!snapshot_path.path().join("data.ms.snapshot").exists());
}
#[actix_rt::test]
async fn error_performing_index_snapshot() {
let uuid = Uuid::new_v4();
let mut uuid_resolver = MockUuidResolverHandle::new();
uuid_resolver
.expect_snapshot()
.times(1)
.returning(move |_| Box::pin(ok(HashSet::from_iter(Some(uuid)))));
let mut update_handle = MockUpdateActorHandle::new();
update_handle
.expect_snapshot()
// arbitrary error
.returning(|_, _| Box::pin(err(UpdateActorError::UnexistingUpdate(0))));
let snapshot_path = tempfile::tempdir_in(".").unwrap();
let snapshot_service = SnapshotService::new(
uuid_resolver,
update_handle,
Duration::from_millis(100),
snapshot_path.path().to_owned(),
"data.ms".to_string(),
);
assert!(snapshot_service.perform_snapshot().await.is_err());
// Nothing was written to the file
assert!(!snapshot_path.path().join("data.ms.snapshot").exists());
}
#[actix_rt::test]
async fn test_loop() {
let mut uuid_resolver = MockUuidResolverHandle::new();
uuid_resolver
.expect_snapshot()
// We expect the function to be called between 2 and 3 times in the given interval.
.times(2..4)
// arbitrary error, to short-circuit the function
.returning(move |_| Box::pin(err(UuidResolverError::NameAlreadyExist)));
let update_handle = MockUpdateActorHandle::new();
let snapshot_path = tempfile::tempdir_in(".").unwrap();
let snapshot_service = SnapshotService::new(
uuid_resolver,
update_handle,
Duration::from_millis(100),
snapshot_path.path().to_owned(),
"data.ms".to_string(),
);
let _ = timeout(Duration::from_millis(300), snapshot_service.run()).await;
}
}


@ -1,274 +0,0 @@
use std::collections::HashSet;
use std::io::SeekFrom;
use std::path::{Path, PathBuf};
use std::sync::atomic::AtomicBool;
use std::sync::Arc;
use async_stream::stream;
use futures::StreamExt;
use log::trace;
use oxidized_json_checker::JsonChecker;
use tokio::fs;
use tokio::io::AsyncWriteExt;
use tokio::sync::mpsc;
use uuid::Uuid;
use super::error::{Result, UpdateActorError};
use super::{PayloadData, UpdateMsg, UpdateStore, UpdateStoreInfo};
use crate::index_controller::index_actor::IndexActorHandle;
use crate::index_controller::{UpdateMeta, UpdateStatus};
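/// Actor owning the `UpdateStore`; it reads `UpdateMsg`s from its inbox and dispatches them to the store.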
pub struct UpdateActor<D, I> {
path: PathBuf,
store: Arc<UpdateStore>,
inbox: Option<mpsc::Receiver<UpdateMsg<D>>>,
index_handle: I,
must_exit: Arc<AtomicBool>,
}
impl<D, I> UpdateActor<D, I>
where
D: AsRef<[u8]> + Sized + 'static,
I: IndexActorHandle + Clone + Send + Sync + 'static,
{
pub fn new(
update_db_size: usize,
inbox: mpsc::Receiver<UpdateMsg<D>>,
path: impl AsRef<Path>,
index_handle: I,
) -> anyhow::Result<Self> {
let path = path.as_ref().join("updates");
std::fs::create_dir_all(&path)?;
let mut options = heed::EnvOpenOptions::new();
options.map_size(update_db_size);
let must_exit = Arc::new(AtomicBool::new(false));
let store = UpdateStore::open(options, &path, index_handle.clone(), must_exit.clone())?;
std::fs::create_dir_all(path.join("update_files"))?;
let inbox = Some(inbox);
Ok(Self {
path,
store,
inbox,
index_handle,
must_exit,
})
}
pub async fn run(mut self) {
use UpdateMsg::*;
trace!("Started update actor.");
let mut inbox = self
.inbox
.take()
.expect("A receiver should be present by now.");
let must_exit = self.must_exit.clone();
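// Turn the inbox into a stream so the messages below can be handled concurrently (up to 10 at a
// time); stop yielding once a fatal error was signalled or every sender has been dropped.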
let stream = stream! {
loop {
let msg = inbox.recv().await;
if must_exit.load(std::sync::atomic::Ordering::Relaxed) {
break;
}
match msg {
Some(msg) => yield msg,
None => break,
}
}
};
stream
.for_each_concurrent(Some(10), |msg| async {
match msg {
Update {
uuid,
meta,
data,
ret,
} => {
let _ = ret.send(self.handle_update(uuid, meta, data).await);
}
ListUpdates { uuid, ret } => {
let _ = ret.send(self.handle_list_updates(uuid).await);
}
GetUpdate { uuid, ret, id } => {
let _ = ret.send(self.handle_get_update(uuid, id).await);
}
Delete { uuid, ret } => {
let _ = ret.send(self.handle_delete(uuid).await);
}
Snapshot { uuids, path, ret } => {
let _ = ret.send(self.handle_snapshot(uuids, path).await);
}
GetInfo { ret } => {
let _ = ret.send(self.handle_get_info().await);
}
Dump { uuids, path, ret } => {
let _ = ret.send(self.handle_dump(uuids, path).await);
}
}
})
.await;
}
async fn handle_update(
&self,
uuid: Uuid,
meta: UpdateMeta,
payload: mpsc::Receiver<PayloadData<D>>,
) -> Result<UpdateStatus> {
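// Only document additions carry a payload: stream it into a file under `update_files/` and keep
// the `(file, update_file_id)` pair so the JSON can be validated below; empty payloads yield `None`.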
let file_path = match meta {
UpdateMeta::DocumentsAddition { .. } => {
let update_file_id = uuid::Uuid::new_v4();
let path = self
.path
.join(format!("update_files/update_{}", update_file_id));
let mut file = fs::OpenOptions::new()
.read(true)
.write(true)
.create(true)
.open(&path)
.await?;
async fn write_to_file<D>(
file: &mut fs::File,
mut payload: mpsc::Receiver<PayloadData<D>>,
) -> Result<usize>
where
D: AsRef<[u8]> + Sized + 'static,
{
let mut file_len = 0;
while let Some(bytes) = payload.recv().await {
let bytes = bytes?;
file_len += bytes.as_ref().len();
file.write_all(bytes.as_ref()).await?;
}
file.flush().await?;
Ok(file_len)
}
let file_len = write_to_file(&mut file, payload).await;
match file_len {
Ok(len) if len > 0 => {
let file = file.into_std().await;
Some((file, update_file_id))
}
Err(e) => {
fs::remove_file(&path).await?;
return Err(e);
}
_ => {
fs::remove_file(&path).await?;
None
}
}
}
_ => None,
};
let update_store = self.store.clone();
tokio::task::spawn_blocking(move || {
use std::io::{copy, sink, BufReader, Seek};
// If the payload is empty, ignore the check.
let update_uuid = if let Some((mut file, uuid)) = file_path {
// set the file back to the beginning
file.seek(SeekFrom::Start(0))?;
// Check that the json payload is valid:
let reader = BufReader::new(&mut file);
let mut checker = JsonChecker::new(reader);
if copy(&mut checker, &mut sink()).is_err() || checker.finish().is_err() {
// The json file is invalid, we use Serde to get a nice error message:
file.seek(SeekFrom::Start(0))?;
let _: serde_json::Value = serde_json::from_reader(file)
.map_err(|e| UpdateActorError::InvalidPayload(Box::new(e)))?;
}
Some(uuid)
} else {
None
};
// The payload is valid; we can register it in the update store.
let status = update_store
.register_update(meta, update_uuid, uuid)
.map(UpdateStatus::Enqueued)?;
Ok(status)
})
.await?
}
async fn handle_list_updates(&self, uuid: Uuid) -> Result<Vec<UpdateStatus>> {
let update_store = self.store.clone();
tokio::task::spawn_blocking(move || {
let result = update_store.list(uuid)?;
Ok(result)
})
.await?
}
async fn handle_get_update(&self, uuid: Uuid, id: u64) -> Result<UpdateStatus> {
let store = self.store.clone();
tokio::task::spawn_blocking(move || {
let result = store
.meta(uuid, id)?
.ok_or(UpdateActorError::UnexistingUpdate(id))?;
Ok(result)
})
.await?
}
async fn handle_delete(&self, uuid: Uuid) -> Result<()> {
let store = self.store.clone();
tokio::task::spawn_blocking(move || store.delete_all(uuid)).await??;
Ok(())
}
async fn handle_snapshot(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()> {
let index_handle = self.index_handle.clone();
let update_store = self.store.clone();
tokio::task::spawn_blocking(move || update_store.snapshot(&uuids, &path, index_handle))
.await??;
Ok(())
}
async fn handle_dump(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()> {
let index_handle = self.index_handle.clone();
let update_store = self.store.clone();
tokio::task::spawn_blocking(move || -> Result<()> {
update_store.dump(&uuids, path.to_path_buf(), index_handle)?;
Ok(())
})
.await??;
Ok(())
}
async fn handle_get_info(&self) -> Result<UpdateStoreInfo> {
let update_store = self.store.clone();
let info = tokio::task::spawn_blocking(move || -> Result<UpdateStoreInfo> {
let info = update_store.get_info()?;
Ok(info)
})
.await??;
Ok(info)
}
}


@ -1,61 +0,0 @@
use std::error::Error;
use meilisearch_error::{Code, ErrorCode};
use crate::index_controller::index_actor::error::IndexActorError;
pub type Result<T> = std::result::Result<T, UpdateActorError>;
#[derive(Debug, thiserror::Error)]
#[allow(clippy::large_enum_variant)]
pub enum UpdateActorError {
#[error("Update {0} not found.")]
UnexistingUpdate(u64),
#[error("Internal error: {0}")]
Internal(Box<dyn Error + Send + Sync + 'static>),
#[error("{0}")]
IndexActor(#[from] IndexActorError),
#[error(
"update store was shut down due to a fatal error, please check your logs for more info."
)]
FatalUpdateStoreError,
#[error("{0}")]
InvalidPayload(Box<dyn Error + Send + Sync + 'static>),
#[error("{0}")]
PayloadError(#[from] actix_web::error::PayloadError),
}
impl<T> From<tokio::sync::mpsc::error::SendError<T>> for UpdateActorError {
fn from(_: tokio::sync::mpsc::error::SendError<T>) -> Self {
Self::FatalUpdateStoreError
}
}
impl From<tokio::sync::oneshot::error::RecvError> for UpdateActorError {
fn from(_: tokio::sync::oneshot::error::RecvError) -> Self {
Self::FatalUpdateStoreError
}
}
internal_error!(
UpdateActorError: heed::Error,
std::io::Error,
serde_json::Error,
tokio::task::JoinError
);
impl ErrorCode for UpdateActorError {
fn error_code(&self) -> Code {
match self {
UpdateActorError::UnexistingUpdate(_) => Code::NotFound,
UpdateActorError::Internal(_) => Code::Internal,
UpdateActorError::IndexActor(e) => e.error_code(),
UpdateActorError::FatalUpdateStoreError => Code::Internal,
UpdateActorError::InvalidPayload(_) => Code::BadRequest,
UpdateActorError::PayloadError(error) => match error {
actix_http::error::PayloadError::Overflow => Code::PayloadTooLarge,
_ => Code::Internal,
},
}
}
}


@ -1,103 +0,0 @@
use std::collections::HashSet;
use std::path::{Path, PathBuf};
use tokio::sync::{mpsc, oneshot};
use uuid::Uuid;
use crate::index_controller::{IndexActorHandle, UpdateStatus};
use super::error::Result;
use super::{PayloadData, UpdateActor, UpdateActorHandle, UpdateMeta, UpdateMsg, UpdateStoreInfo};
#[derive(Clone)]
pub struct UpdateActorHandleImpl<D> {
sender: mpsc::Sender<UpdateMsg<D>>,
}
impl<D> UpdateActorHandleImpl<D>
where
D: AsRef<[u8]> + Sized + 'static + Sync + Send,
{
pub fn new<I>(
index_handle: I,
path: impl AsRef<Path>,
update_store_size: usize,
) -> anyhow::Result<Self>
where
I: IndexActorHandle + Clone + Send + Sync + 'static,
{
let path = path.as_ref().to_owned();
let (sender, receiver) = mpsc::channel(100);
let actor = UpdateActor::new(update_store_size, receiver, path, index_handle)?;
tokio::task::spawn(actor.run());
Ok(Self { sender })
}
}
#[async_trait::async_trait]
impl<D> UpdateActorHandle for UpdateActorHandleImpl<D>
where
D: AsRef<[u8]> + Sized + 'static + Sync + Send,
{
type Data = D;
async fn get_all_updates_status(&self, uuid: Uuid) -> Result<Vec<UpdateStatus>> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::ListUpdates { uuid, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn update_status(&self, uuid: Uuid, id: u64) -> Result<UpdateStatus> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::GetUpdate { uuid, id, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn delete(&self, uuid: Uuid) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::Delete { uuid, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn snapshot(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::Snapshot { uuids, path, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn dump(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::Dump { uuids, path, ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn get_info(&self) -> Result<UpdateStoreInfo> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::GetInfo { ret };
self.sender.send(msg).await?;
receiver.await?
}
async fn update(
&self,
meta: UpdateMeta,
data: mpsc::Receiver<PayloadData<Self::Data>>,
uuid: Uuid,
) -> Result<UpdateStatus> {
let (ret, receiver) = oneshot::channel();
let msg = UpdateMsg::Update {
uuid,
data,
meta,
ret,
};
self.sender.send(msg).await?;
receiver.await?
}
}


@ -1,43 +0,0 @@
use std::collections::HashSet;
use std::path::PathBuf;
use tokio::sync::{mpsc, oneshot};
use uuid::Uuid;
use super::error::Result;
use super::{PayloadData, UpdateMeta, UpdateStatus, UpdateStoreInfo};
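/// Messages exchanged between an `UpdateActorHandle` and its `UpdateActor`. Every variant carries
/// a oneshot `ret` sender so the actor can send the result back to the awaiting handle.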
pub enum UpdateMsg<D> {
Update {
uuid: Uuid,
meta: UpdateMeta,
data: mpsc::Receiver<PayloadData<D>>,
ret: oneshot::Sender<Result<UpdateStatus>>,
},
ListUpdates {
uuid: Uuid,
ret: oneshot::Sender<Result<Vec<UpdateStatus>>>,
},
GetUpdate {
uuid: Uuid,
ret: oneshot::Sender<Result<UpdateStatus>>,
id: u64,
},
Delete {
uuid: Uuid,
ret: oneshot::Sender<Result<()>>,
},
Snapshot {
uuids: HashSet<Uuid>,
path: PathBuf,
ret: oneshot::Sender<Result<()>>,
},
Dump {
uuids: HashSet<Uuid>,
path: PathBuf,
ret: oneshot::Sender<Result<()>>,
},
GetInfo {
ret: oneshot::Sender<Result<UpdateStoreInfo>>,
},
}


@ -1,44 +0,0 @@
use std::{collections::HashSet, path::PathBuf};
use actix_http::error::PayloadError;
use tokio::sync::mpsc;
use uuid::Uuid;
use crate::index_controller::{UpdateMeta, UpdateStatus};
use actor::UpdateActor;
use error::Result;
use message::UpdateMsg;
pub use handle_impl::UpdateActorHandleImpl;
pub use store::{UpdateStore, UpdateStoreInfo};
mod actor;
pub mod error;
mod handle_impl;
mod message;
pub mod store;
type PayloadData<D> = std::result::Result<D, PayloadError>;
#[cfg(test)]
use mockall::automock;
#[async_trait::async_trait]
#[cfg_attr(test, automock(type Data=Vec<u8>;))]
pub trait UpdateActorHandle {
type Data: AsRef<[u8]> + Sized + 'static + Sync + Send;
async fn get_all_updates_status(&self, uuid: Uuid) -> Result<Vec<UpdateStatus>>;
async fn update_status(&self, uuid: Uuid, id: u64) -> Result<UpdateStatus>;
async fn delete(&self, uuid: Uuid) -> Result<()>;
async fn snapshot(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()>;
async fn dump(&self, uuids: HashSet<Uuid>, path: PathBuf) -> Result<()>;
async fn get_info(&self) -> Result<UpdateStoreInfo>;
async fn update(
&self,
meta: UpdateMeta,
data: mpsc::Receiver<PayloadData<Self::Data>>,
uuid: Uuid,
) -> Result<UpdateStatus>;
}


@ -1,86 +0,0 @@
use std::{borrow::Cow, convert::TryInto, mem::size_of};
use heed::{BytesDecode, BytesEncode};
use uuid::Uuid;
pub struct NextIdCodec;
pub enum NextIdKey {
Global,
Index(Uuid),
}
impl<'a> BytesEncode<'a> for NextIdCodec {
type EItem = NextIdKey;
fn bytes_encode(item: &'a Self::EItem) -> Option<Cow<'a, [u8]>> {
match item {
NextIdKey::Global => Some(Cow::Borrowed(b"__global__")),
NextIdKey::Index(ref uuid) => Some(Cow::Borrowed(uuid.as_bytes())),
}
}
}
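/// Encodes pending-queue keys as `| global_update_id (8 bytes BE) | index uuid (16 bytes) | update_id (8 bytes BE) |`.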
pub struct PendingKeyCodec;
impl<'a> BytesEncode<'a> for PendingKeyCodec {
type EItem = (u64, Uuid, u64);
fn bytes_encode((global_id, uuid, update_id): &'a Self::EItem) -> Option<Cow<'a, [u8]>> {
let mut bytes = Vec::with_capacity(size_of::<Self::EItem>());
bytes.extend_from_slice(&global_id.to_be_bytes());
bytes.extend_from_slice(uuid.as_bytes());
bytes.extend_from_slice(&update_id.to_be_bytes());
Some(Cow::Owned(bytes))
}
}
impl<'a> BytesDecode<'a> for PendingKeyCodec {
type DItem = (u64, Uuid, u64);
fn bytes_decode(bytes: &'a [u8]) -> Option<Self::DItem> {
let global_id_bytes = bytes.get(0..size_of::<u64>())?.try_into().ok()?;
let global_id = u64::from_be_bytes(global_id_bytes);
let uuid_bytes = bytes
.get(size_of::<u64>()..(size_of::<u64>() + size_of::<Uuid>()))?
.try_into()
.ok()?;
let uuid = Uuid::from_bytes(uuid_bytes);
let update_id_bytes = bytes
.get((size_of::<u64>() + size_of::<Uuid>())..)?
.try_into()
.ok()?;
let update_id = u64::from_be_bytes(update_id_bytes);
Some((global_id, uuid, update_id))
}
}
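/// Encodes `updates` database keys as `| index uuid (16 bytes) | update_id (8 bytes BE) |`.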
pub struct UpdateKeyCodec;
impl<'a> BytesEncode<'a> for UpdateKeyCodec {
type EItem = (Uuid, u64);
fn bytes_encode((uuid, update_id): &'a Self::EItem) -> Option<Cow<'a, [u8]>> {
let mut bytes = Vec::with_capacity(size_of::<Self::EItem>());
bytes.extend_from_slice(uuid.as_bytes());
bytes.extend_from_slice(&update_id.to_be_bytes());
Some(Cow::Owned(bytes))
}
}
impl<'a> BytesDecode<'a> for UpdateKeyCodec {
type DItem = (Uuid, u64);
fn bytes_decode(bytes: &'a [u8]) -> Option<Self::DItem> {
let uuid_bytes = bytes.get(0..size_of::<Uuid>())?.try_into().ok()?;
let uuid = Uuid::from_bytes(uuid_bytes);
let update_id_bytes = bytes.get(size_of::<Uuid>()..)?.try_into().ok()?;
let update_id = u64::from_be_bytes(update_id_bytes);
Some((uuid, update_id))
}
}


@ -1,184 +0,0 @@
use std::{
collections::HashSet,
fs::{create_dir_all, File},
io::{BufRead, BufReader, Write},
path::{Path, PathBuf},
};
use heed::{EnvOpenOptions, RoTxn};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use super::{Result, State, UpdateStore};
use crate::index_controller::{
index_actor::IndexActorHandle, update_actor::store::update_uuid_to_file_path, Enqueued,
UpdateStatus,
};
#[derive(Serialize, Deserialize)]
struct UpdateEntry {
uuid: Uuid,
update: UpdateStatus,
}
impl UpdateStore {
pub fn dump(
&self,
uuids: &HashSet<Uuid>,
path: PathBuf,
handle: impl IndexActorHandle,
) -> Result<()> {
let state_lock = self.state.write();
state_lock.swap(State::Dumping);
// The txn must *always* be acquired after the state lock, or it will deadlock.
let txn = self.env.write_txn()?;
let dump_path = path.join("updates");
create_dir_all(&dump_path)?;
self.dump_updates(&txn, uuids, &dump_path)?;
let fut = dump_indexes(uuids, handle, &path);
tokio::runtime::Handle::current().block_on(fut)?;
state_lock.swap(State::Idle);
Ok(())
}
fn dump_updates(
&self,
txn: &RoTxn,
uuids: &HashSet<Uuid>,
path: impl AsRef<Path>,
) -> Result<()> {
let dump_data_path = path.as_ref().join("data.jsonl");
let mut dump_data_file = File::create(dump_data_path)?;
let update_files_path = path.as_ref().join(super::UPDATE_DIR);
create_dir_all(&update_files_path)?;
self.dump_pending(&txn, uuids, &mut dump_data_file, &path)?;
self.dump_completed(&txn, uuids, &mut dump_data_file)?;
Ok(())
}
fn dump_pending(
&self,
txn: &RoTxn,
uuids: &HashSet<Uuid>,
mut file: &mut File,
dst_path: impl AsRef<Path>,
) -> Result<()> {
let pendings = self.pending_queue.iter(txn)?.lazily_decode_data();
for pending in pendings {
let ((_, uuid, _), data) = pending?;
if uuids.contains(&uuid) {
let update = data.decode()?;
if let Some(ref update_uuid) = update.content {
let src = super::update_uuid_to_file_path(&self.path, *update_uuid);
let dst = super::update_uuid_to_file_path(&dst_path, *update_uuid);
std::fs::copy(src, dst)?;
}
let update_json = UpdateEntry {
uuid,
update: update.into(),
};
serde_json::to_writer(&mut file, &update_json)?;
file.write_all(b"\n")?;
}
}
Ok(())
}
fn dump_completed(
&self,
txn: &RoTxn,
uuids: &HashSet<Uuid>,
mut file: &mut File,
) -> Result<()> {
let updates = self.updates.iter(txn)?.lazily_decode_data();
for update in updates {
let ((uuid, _), data) = update?;
if uuids.contains(&uuid) {
let update = data.decode()?;
let update_json = UpdateEntry { uuid, update };
serde_json::to_writer(&mut file, &update_json)?;
file.write_all(b"\n")?;
}
}
Ok(())
}
pub fn load_dump(
src: impl AsRef<Path>,
dst: impl AsRef<Path>,
db_size: usize,
) -> anyhow::Result<()> {
let dst_update_path = dst.as_ref().join("updates/");
create_dir_all(&dst_update_path)?;
let mut options = EnvOpenOptions::new();
options.map_size(db_size as usize);
let (store, _) = UpdateStore::new(options, &dst_update_path)?;
let src_update_path = src.as_ref().join("updates");
let update_data = File::open(&src_update_path.join("data.jsonl"))?;
let mut update_data = BufReader::new(update_data);
std::fs::create_dir_all(dst_update_path.join("update_files/"))?;
let mut wtxn = store.env.write_txn()?;
let mut line = String::new();
loop {
match update_data.read_line(&mut line) {
Ok(0) => break,
Ok(_) => {
let UpdateEntry { uuid, update } = serde_json::from_str(&line)?;
store.register_raw_updates(&mut wtxn, &update, uuid)?;
// Copy the associated update file if it exists
if let UpdateStatus::Enqueued(Enqueued {
content: Some(uuid),
..
}) = update
{
let src = update_uuid_to_file_path(&src_update_path, uuid);
let dst = update_uuid_to_file_path(&dst_update_path, uuid);
std::fs::copy(src, dst)?;
}
}
_ => break,
}
line.clear();
}
wtxn.commit()?;
Ok(())
}
}
async fn dump_indexes(
uuids: &HashSet<Uuid>,
handle: impl IndexActorHandle,
path: impl AsRef<Path>,
) -> Result<()> {
for uuid in uuids {
handle.dump(*uuid, path.as_ref().to_owned()).await?;
}
Ok(())
}


@ -1,724 +0,0 @@
mod codec;
pub mod dump;
use std::fs::{copy, create_dir_all, remove_file, File};
use std::path::Path;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::{
collections::{BTreeMap, HashSet},
path::PathBuf,
time::Duration,
};
use arc_swap::ArcSwap;
use futures::StreamExt;
use heed::types::{ByteSlice, OwnedType, SerdeJson};
use heed::zerocopy::U64;
use heed::{CompactionOption, Database, Env, EnvOpenOptions};
use log::error;
use parking_lot::{Mutex, MutexGuard};
use tokio::runtime::Handle;
use tokio::sync::mpsc;
use tokio::sync::mpsc::error::TrySendError;
use tokio::time::timeout;
use uuid::Uuid;
use codec::*;
use super::error::Result;
use super::UpdateMeta;
use crate::helpers::EnvSizer;
use crate::index_controller::{index_actor::CONCURRENT_INDEX_MSG, updates::*, IndexActorHandle};
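// Big-endian u64 keys so that lexicographic byte order matches numeric order and the pending
// queue is iterated in arrival order.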
#[allow(clippy::upper_case_acronyms)]
type BEU64 = U64<heed::byteorder::BE>;
const UPDATE_DIR: &str = "update_files";
pub struct UpdateStoreInfo {
/// Size of the update store in bytes.
pub size: u64,
/// Uuid of the currently processing update if it exists
pub processing: Option<Uuid>,
}
/// A data structure that allows concurrent reads AND exactly one writer.
pub struct StateLock {
lock: Mutex<()>,
data: ArcSwap<State>,
}
pub struct StateLockGuard<'a> {
_lock: MutexGuard<'a, ()>,
state: &'a StateLock,
}
impl StateLockGuard<'_> {
pub fn swap(&self, state: State) -> Arc<State> {
self.state.data.swap(Arc::new(state))
}
}
impl StateLock {
fn from_state(state: State) -> Self {
let lock = Mutex::new(());
let data = ArcSwap::from(Arc::new(state));
Self { lock, data }
}
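/// Loads the current state without taking the writer lock.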
pub fn read(&self) -> Arc<State> {
self.data.load().clone()
}
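/// Acquires the writer mutex; the returned guard is the only one allowed to swap the state.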
pub fn write(&self) -> StateLockGuard {
let _lock = self.lock.lock();
let state = &self;
StateLockGuard { _lock, state }
}
}
#[allow(clippy::large_enum_variant)]
pub enum State {
Idle,
Processing(Uuid, Processing),
Snapshoting,
Dumping,
}
#[derive(Clone)]
pub struct UpdateStore {
pub env: Env,
/// A queue containing the updates to process, ordered by arrival.
/// The keys are built as follows:
/// | global_update_id | index_uuid | update_id |
/// | 8-bytes | 16-bytes | 8-bytes |
pending_queue: Database<PendingKeyCodec, SerdeJson<Enqueued>>,
/// Maps indexes to the next available update id. If `NextIdKey::Global` is queried, the next
/// global update id is returned.
next_update_id: Database<NextIdCodec, OwnedType<BEU64>>,
/// Contains all the performed updates meta, be they failed, aborted, or processed.
/// The keys are built as follows:
/// | Uuid | id |
/// | 16-bytes | 8-bytes |
updates: Database<UpdateKeyCodec, SerdeJson<UpdateStatus>>,
/// Indicates the current state of the update store.
state: Arc<StateLock>,
/// Wake up the loop when a new event occurs.
notification_sender: mpsc::Sender<()>,
path: PathBuf,
}
impl UpdateStore {
fn new(
mut options: EnvOpenOptions,
path: impl AsRef<Path>,
) -> anyhow::Result<(Self, mpsc::Receiver<()>)> {
options.max_dbs(5);
let env = options.open(&path)?;
let pending_queue = env.create_database(Some("pending-queue"))?;
let next_update_id = env.create_database(Some("next-update-id"))?;
let updates = env.create_database(Some("updates"))?;
let state = Arc::new(StateLock::from_state(State::Idle));
let (notification_sender, notification_receiver) = mpsc::channel(1);
Ok((
Self {
env,
pending_queue,
next_update_id,
updates,
state,
notification_sender,
path: path.as_ref().to_owned(),
},
notification_receiver,
))
}
pub fn open(
options: EnvOpenOptions,
path: impl AsRef<Path>,
index_handle: impl IndexActorHandle + Clone + Sync + Send + 'static,
must_exit: Arc<AtomicBool>,
) -> anyhow::Result<Arc<Self>> {
let (update_store, mut notification_receiver) = Self::new(options, path)?;
let update_store = Arc::new(update_store);
// Send a first notification to trigger the process.
if let Err(TrySendError::Closed(())) = update_store.notification_sender.try_send(()) {
panic!("Failed to init update store");
}
// We need a weak reference so we can take ownership of the Arc later when we
// want to close the index.
let duration = Duration::from_secs(10 * 60); // 10 minutes
let update_store_weak = Arc::downgrade(&update_store);
tokio::task::spawn(async move {
// Block and wait for something to process, with a timeout. If the timeout elapses we simply
// loop again; we only exit once the notification channel has been closed (recv returns None).
'outer: while timeout(duration, notification_receiver.recv())
.await
.map_or(true, |o| o.is_some())
{
loop {
match update_store_weak.upgrade() {
Some(update_store) => {
let handler = index_handle.clone();
let res = tokio::task::spawn_blocking(move || {
update_store.process_pending_update(handler)
})
.await
.expect("Fatal error processing update.");
match res {
Ok(Some(_)) => (),
Ok(None) => break,
Err(e) => {
error!("Fatal error while processing an update that requires the update store to shutdown: {}", e);
must_exit.store(true, Ordering::SeqCst);
break 'outer;
}
}
}
// The ownership of the Arc has been taken; we need to exit.
None => break 'outer,
}
}
}
error!("Update store loop exited.");
});
Ok(update_store)
}
/// Returns the next global update id and the next update id for a given `index_uuid`.
fn next_update_id(&self, txn: &mut heed::RwTxn, index_uuid: Uuid) -> heed::Result<(u64, u64)> {
let global_id = self
.next_update_id
.get(txn, &NextIdKey::Global)?
.map(U64::get)
.unwrap_or_default();
self.next_update_id
.put(txn, &NextIdKey::Global, &BEU64::new(global_id + 1))?;
let update_id = self.next_update_id_raw(txn, index_uuid)?;
Ok((global_id, update_id))
}
/// Returns the next update id for a given `index_uuid` without
/// incrementing the global update id. This is useful for dumps.
fn next_update_id_raw(&self, txn: &mut heed::RwTxn, index_uuid: Uuid) -> heed::Result<u64> {
let update_id = self
.next_update_id
.get(txn, &NextIdKey::Index(index_uuid))?
.map(U64::get)
.unwrap_or_default();
self.next_update_id.put(
txn,
&NextIdKey::Index(index_uuid),
&BEU64::new(update_id + 1),
)?;
Ok(update_id)
}
/// Registers the update content in the pending store and the meta
/// into the pending-meta store. Returns the new unique update id.
pub fn register_update(
&self,
meta: UpdateMeta,
content: Option<Uuid>,
index_uuid: Uuid,
) -> heed::Result<Enqueued> {
let mut txn = self.env.write_txn()?;
let (global_id, update_id) = self.next_update_id(&mut txn, index_uuid)?;
let meta = Enqueued::new(meta, update_id, content);
self.pending_queue
.put(&mut txn, &(global_id, index_uuid, update_id), &meta)?;
txn.commit()?;
if let Err(TrySendError::Closed(())) = self.notification_sender.try_send(()) {
panic!("Update store loop exited");
}
Ok(meta)
}
/// Pushes an already processed update into the UpdateStore without triggering the notification
/// process. This is useful for dumps.
pub fn register_raw_updates(
&self,
wtxn: &mut heed::RwTxn,
update: &UpdateStatus,
index_uuid: Uuid,
) -> heed::Result<()> {
match update {
UpdateStatus::Enqueued(enqueued) => {
let (global_id, _update_id) = self.next_update_id(wtxn, index_uuid)?;
self.pending_queue.remap_key_type::<PendingKeyCodec>().put(
wtxn,
&(global_id, index_uuid, enqueued.id()),
&enqueued,
)?;
}
_ => {
let _update_id = self.next_update_id_raw(wtxn, index_uuid)?;
self.updates
.put(wtxn, &(index_uuid, update.id()), &update)?;
}
}
Ok(())
}
/// Executes the user-provided function on the next pending update (the one with the lowest id).
/// This is asynchronous as it lets the user process the update with a read-only txn and
/// only write the result meta to the processed-meta store *after* it has been processed.
fn process_pending_update(&self, index_handle: impl IndexActorHandle) -> Result<Option<()>> {
// Create a read transaction to be able to retrieve the pending update in order.
let rtxn = self.env.read_txn()?;
let first_meta = self.pending_queue.first(&rtxn)?;
drop(rtxn);
// If there is a pending update, we process it while keeping only
// a reader, not a writer.
match first_meta {
Some(((global_id, index_uuid, _), mut pending)) => {
let content = pending.content.take();
let processing = pending.processing();
// Acquire the state lock and set the current state to processing.
// The txn must *always* be acquired after the state lock, or it will deadlock.
let state = self.state.write();
state.swap(State::Processing(index_uuid, processing.clone()));
let result =
self.perform_update(content, processing, index_handle, index_uuid, global_id);
state.swap(State::Idle);
result
}
None => Ok(None),
}
}
fn perform_update(
&self,
content: Option<Uuid>,
processing: Processing,
index_handle: impl IndexActorHandle,
index_uuid: Uuid,
global_id: u64,
) -> Result<Option<()>> {
let content_path = content.map(|uuid| update_uuid_to_file_path(&self.path, uuid));
let update_id = processing.id();
let file = match content_path {
Some(ref path) => {
let file = File::open(path)?;
Some(file)
}
None => None,
};
// Process the pending update using the provided user function.
let handle = Handle::current();
let result =
match handle.block_on(index_handle.update(index_uuid, processing.clone(), file)) {
Ok(result) => result,
Err(e) => Err(processing.fail(e.into())),
};
// Once the pending update has been successfully processed,
// we must remove the content from the pending and processing stores and
// write the *new* meta to the processed-meta store and commit.
let mut wtxn = self.env.write_txn()?;
self.pending_queue
.delete(&mut wtxn, &(global_id, index_uuid, update_id))?;
let result = match result {
Ok(res) => res.into(),
Err(res) => res.into(),
};
self.updates
.put(&mut wtxn, &(index_uuid, update_id), &result)?;
wtxn.commit()?;
if let Some(ref path) = content_path {
remove_file(&path)?;
}
Ok(Some(()))
}
/// List the updates for `index_uuid`.
pub fn list(&self, index_uuid: Uuid) -> Result<Vec<UpdateStatus>> {
let mut update_list = BTreeMap::<u64, UpdateStatus>::new();
let txn = self.env.read_txn()?;
let pendings = self.pending_queue.iter(&txn)?.lazily_decode_data();
for entry in pendings {
let ((_, uuid, id), pending) = entry?;
if uuid == index_uuid {
update_list.insert(id, pending.decode()?.into());
}
}
let updates = self
.updates
.remap_key_type::<ByteSlice>()
.prefix_iter(&txn, index_uuid.as_bytes())?;
for entry in updates {
let (_, update) = entry?;
update_list.insert(update.id(), update);
}
// If the currently processing update is from this index, replace the corresponding pending update with this one.
match *self.state.read() {
State::Processing(uuid, ref processing) if uuid == index_uuid => {
update_list.insert(processing.id(), processing.clone().into());
}
_ => (),
}
Ok(update_list.into_iter().map(|(_, v)| v).collect())
}
/// Returns the update associated meta or `None` if the update doesn't exist.
pub fn meta(&self, index_uuid: Uuid, update_id: u64) -> heed::Result<Option<UpdateStatus>> {
// Check if the update is the one currently processing
match *self.state.read() {
State::Processing(uuid, ref processing)
if uuid == index_uuid && processing.id() == update_id =>
{
return Ok(Some(processing.clone().into()));
}
_ => (),
}
let txn = self.env.read_txn()?;
// Else, check if it is in the updates database:
let update = self.updates.get(&txn, &(index_uuid, update_id))?;
if let Some(update) = update {
return Ok(Some(update));
}
// If nothing was found yet, we fall back to iterating over the pending queue.
let pendings = self.pending_queue.iter(&txn)?.lazily_decode_data();
for entry in pendings {
let ((_, uuid, id), pending) = entry?;
if uuid == index_uuid && id == update_id {
return Ok(Some(pending.decode()?.into()));
}
}
// No update was found.
Ok(None)
}
/// Delete all updates for an index from the update store. If the currently processing update
/// is for `index_uuid`, the call will block until the update is terminated.
pub fn delete_all(&self, index_uuid: Uuid) -> Result<()> {
let mut txn = self.env.write_txn()?;
// Collects the uuids of the content files to remove if the deletion succeeds.
let mut uuids_to_remove = Vec::new();
let mut pendings = self.pending_queue.iter_mut(&mut txn)?.lazily_decode_data();
while let Some(Ok(((_, uuid, _), pending))) = pendings.next() {
if uuid == index_uuid {
unsafe {
pendings.del_current()?;
}
let mut pending = pending.decode()?;
if let Some(update_uuid) = pending.content.take() {
uuids_to_remove.push(update_uuid);
}
}
}
drop(pendings);
let mut updates = self
.updates
.remap_key_type::<ByteSlice>()
.prefix_iter_mut(&mut txn, index_uuid.as_bytes())?
.lazily_decode_data();
while let Some(_) = updates.next() {
unsafe {
updates.del_current()?;
}
}
drop(updates);
txn.commit()?;
uuids_to_remove
.iter()
.map(|uuid| update_uuid_to_file_path(&self.path, *uuid))
.for_each(|path| {
let _ = remove_file(path);
});
// If the currently processing update is from our index, we wait until it is
// finished before returning. This ensures that no write to the index occurs after we delete it.
if let State::Processing(uuid, _) = *self.state.read() {
if uuid == index_uuid {
// wait for a write lock, do nothing with it.
self.state.write();
}
}
Ok(())
}
pub fn snapshot(
&self,
uuids: &HashSet<Uuid>,
path: impl AsRef<Path>,
handle: impl IndexActorHandle + Clone,
) -> Result<()> {
let state_lock = self.state.write();
state_lock.swap(State::Snapshoting);
// Acquire a write txn to prevent further writes during the snapshot.
let txn = self.env.write_txn()?;
let update_path = path.as_ref().join("updates");
create_dir_all(&update_path)?;
let db_path = update_path.join("data.mdb");
// create db snapshot
self.env.copy_to_path(&db_path, CompactionOption::Enabled)?;
let update_files_path = update_path.join(UPDATE_DIR);
create_dir_all(&update_files_path)?;
let pendings = self.pending_queue.iter(&txn)?.lazily_decode_data();
for entry in pendings {
let ((_, uuid, _), pending) = entry?;
if uuids.contains(&uuid) {
if let Enqueued {
content: Some(uuid),
..
} = pending.decode()?
{
let path = update_uuid_to_file_path(&self.path, uuid);
copy(path, &update_files_path)?;
}
}
}
let path = &path.as_ref().to_path_buf();
let handle = &handle;
// Perform the snapshot of each index concurrently. Use only a third of the index actor's
// message capacity at a time so as not to put too much pressure on it.
let mut stream = futures::stream::iter(uuids.iter())
.map(move |uuid| handle.snapshot(*uuid, path.clone()))
.buffer_unordered(CONCURRENT_INDEX_MSG / 3);
Handle::current().block_on(async {
while let Some(res) = stream.next().await {
res?;
}
Ok(()) as Result<()>
})?;
Ok(())
}
pub fn get_info(&self) -> Result<UpdateStoreInfo> {
let mut size = self.env.size();
let txn = self.env.read_txn()?;
for entry in self.pending_queue.iter(&txn)? {
let (_, pending) = entry?;
if let Enqueued {
content: Some(uuid),
..
} = pending
{
let path = update_uuid_to_file_path(&self.path, uuid);
size += File::open(path)?.metadata()?.len();
}
}
let processing = match *self.state.read() {
State::Processing(uuid, _) => Some(uuid),
_ => None,
};
Ok(UpdateStoreInfo { size, processing })
}
}
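/// Builds the on-disk path of an update's content file: `<root>/update_files/update_<uuid>`.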
fn update_uuid_to_file_path(root: impl AsRef<Path>, uuid: Uuid) -> PathBuf {
root.as_ref()
.join(UPDATE_DIR)
.join(format!("update_{}", uuid))
}
#[cfg(test)]
mod test {
use super::*;
use crate::index_controller::{
index_actor::{error::IndexActorError, MockIndexActorHandle},
UpdateResult,
};
use futures::future::ok;
#[actix_rt::test]
async fn test_next_id() {
let dir = tempfile::tempdir_in(".").unwrap();
let mut options = EnvOpenOptions::new();
let handle = Arc::new(MockIndexActorHandle::new());
options.map_size(4096 * 100);
let update_store = UpdateStore::open(
options,
dir.path(),
handle,
Arc::new(AtomicBool::new(false)),
)
.unwrap();
let index1_uuid = Uuid::new_v4();
let index2_uuid = Uuid::new_v4();
let mut txn = update_store.env.write_txn().unwrap();
let ids = update_store.next_update_id(&mut txn, index1_uuid).unwrap();
txn.commit().unwrap();
assert_eq!((0, 0), ids);
let mut txn = update_store.env.write_txn().unwrap();
let ids = update_store.next_update_id(&mut txn, index2_uuid).unwrap();
txn.commit().unwrap();
assert_eq!((1, 0), ids);
let mut txn = update_store.env.write_txn().unwrap();
let ids = update_store.next_update_id(&mut txn, index1_uuid).unwrap();
txn.commit().unwrap();
assert_eq!((2, 1), ids);
}
#[actix_rt::test]
async fn test_register_update() {
let dir = tempfile::tempdir_in(".").unwrap();
let mut options = EnvOpenOptions::new();
let handle = Arc::new(MockIndexActorHandle::new());
options.map_size(4096 * 100);
let update_store = UpdateStore::open(
options,
dir.path(),
handle,
Arc::new(AtomicBool::new(false)),
)
.unwrap();
let meta = UpdateMeta::ClearDocuments;
let uuid = Uuid::new_v4();
let store_clone = update_store.clone();
tokio::task::spawn_blocking(move || {
store_clone.register_update(meta, None, uuid).unwrap();
})
.await
.unwrap();
let txn = update_store.env.read_txn().unwrap();
assert!(update_store
.pending_queue
.get(&txn, &(0, uuid, 0))
.unwrap()
.is_some());
}
#[actix_rt::test]
async fn test_process_update() {
let dir = tempfile::tempdir_in(".").unwrap();
let mut handle = MockIndexActorHandle::new();
handle
.expect_update()
.times(2)
.returning(|_index_uuid, processing, _file| {
if processing.id() == 0 {
Box::pin(ok(Ok(processing.process(UpdateResult::Other))))
} else {
Box::pin(ok(Err(
processing.fail(IndexActorError::ExistingPrimaryKey.into())
)))
}
});
let handle = Arc::new(handle);
let mut options = EnvOpenOptions::new();
options.map_size(4096 * 100);
let store = UpdateStore::open(
options,
dir.path(),
handle.clone(),
Arc::new(AtomicBool::new(false)),
)
.unwrap();
// Wait a bit for the event loop to exit.
tokio::time::sleep(std::time::Duration::from_millis(50)).await;
let mut txn = store.env.write_txn().unwrap();
let update = Enqueued::new(UpdateMeta::ClearDocuments, 0, None);
let uuid = Uuid::new_v4();
store
.pending_queue
.put(&mut txn, &(0, uuid, 0), &update)
.unwrap();
let update = Enqueued::new(UpdateMeta::ClearDocuments, 1, None);
store
.pending_queue
.put(&mut txn, &(1, uuid, 1), &update)
.unwrap();
txn.commit().unwrap();
// Process the pending, and check that it has been moved to the update databases, and
// removed from the pending database.
let store_clone = store.clone();
tokio::task::spawn_blocking(move || {
store_clone.process_pending_update(handle.clone()).unwrap();
store_clone.process_pending_update(handle).unwrap();
})
.await
.unwrap();
let txn = store.env.read_txn().unwrap();
assert!(store.pending_queue.first(&txn).unwrap().is_none());
let update = store.updates.get(&txn, &(uuid, 0)).unwrap().unwrap();
assert!(matches!(update, UpdateStatus::Processed(_)));
let update = store.updates.get(&txn, &(uuid, 1)).unwrap().unwrap();
assert!(matches!(update, UpdateStatus::Failed(_)));
}
}


@ -1,233 +0,0 @@
use chrono::{DateTime, Utc};
use milli::update::{DocumentAdditionResult, IndexDocumentsMethod, UpdateFormat};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use crate::{
error::ResponseError,
index::{Settings, Unchecked},
};
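// Lifecycle of an update: `Enqueued` -> `Processing` -> `Processed` on success or `Failed` on
// error; an update can also be `Aborted` while it is still enqueued.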
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum UpdateResult {
DocumentsAddition(DocumentAdditionResult),
DocumentDeletion { deleted: u64 },
Other,
}
#[allow(clippy::large_enum_variant)]
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum UpdateMeta {
DocumentsAddition {
method: IndexDocumentsMethod,
format: UpdateFormat,
primary_key: Option<String>,
},
ClearDocuments,
DeleteDocuments {
ids: Vec<String>,
},
Settings(Settings<Unchecked>),
}
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct Enqueued {
pub update_id: u64,
pub meta: UpdateMeta,
pub enqueued_at: DateTime<Utc>,
pub content: Option<Uuid>,
}
impl Enqueued {
pub fn new(meta: UpdateMeta, update_id: u64, content: Option<Uuid>) -> Self {
Self {
enqueued_at: Utc::now(),
meta,
update_id,
content,
}
}
pub fn processing(self) -> Processing {
Processing {
from: self,
started_processing_at: Utc::now(),
}
}
pub fn abort(self) -> Aborted {
Aborted {
from: self,
aborted_at: Utc::now(),
}
}
pub fn meta(&self) -> &UpdateMeta {
&self.meta
}
pub fn id(&self) -> u64 {
self.update_id
}
}
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct Processed {
pub success: UpdateResult,
pub processed_at: DateTime<Utc>,
#[serde(flatten)]
pub from: Processing,
}
impl Processed {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct Processing {
#[serde(flatten)]
pub from: Enqueued,
pub started_processing_at: DateTime<Utc>,
}
impl Processing {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
pub fn process(self, success: UpdateResult) -> Processed {
Processed {
success,
from: self,
processed_at: Utc::now(),
}
}
pub fn fail(self, error: ResponseError) -> Failed {
Failed {
from: self,
error,
failed_at: Utc::now(),
}
}
}
#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(rename_all = "camelCase")]
pub struct Aborted {
#[serde(flatten)]
from: Enqueued,
aborted_at: DateTime<Utc>,
}
impl Aborted {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct Failed {
#[serde(flatten)]
pub from: Processing,
pub error: ResponseError,
pub failed_at: DateTime<Utc>,
}
impl Failed {
pub fn id(&self) -> u64 {
self.from.id()
}
pub fn meta(&self) -> &UpdateMeta {
self.from.meta()
}
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "status", rename_all = "camelCase")]
pub enum UpdateStatus {
Processing(Processing),
Enqueued(Enqueued),
Processed(Processed),
Aborted(Aborted),
Failed(Failed),
}
impl UpdateStatus {
pub fn id(&self) -> u64 {
match self {
UpdateStatus::Processing(u) => u.id(),
UpdateStatus::Enqueued(u) => u.id(),
UpdateStatus::Processed(u) => u.id(),
UpdateStatus::Aborted(u) => u.id(),
UpdateStatus::Failed(u) => u.id(),
}
}
pub fn meta(&self) -> &UpdateMeta {
match self {
UpdateStatus::Processing(u) => u.meta(),
UpdateStatus::Enqueued(u) => u.meta(),
UpdateStatus::Processed(u) => u.meta(),
UpdateStatus::Aborted(u) => u.meta(),
UpdateStatus::Failed(u) => u.meta(),
}
}
pub fn processed(&self) -> Option<&Processed> {
match self {
UpdateStatus::Processed(p) => Some(p),
_ => None,
}
}
}
impl From<Enqueued> for UpdateStatus {
fn from(other: Enqueued) -> Self {
Self::Enqueued(other)
}
}
impl From<Aborted> for UpdateStatus {
fn from(other: Aborted) -> Self {
Self::Aborted(other)
}
}
impl From<Processed> for UpdateStatus {
fn from(other: Processed) -> Self {
Self::Processed(other)
}
}
impl From<Processing> for UpdateStatus {
fn from(other: Processing) -> Self {
Self::Processing(other)
}
}
impl From<Failed> for UpdateStatus {
fn from(other: Failed) -> Self {
Self::Failed(other)
}
}


@ -1,98 +0,0 @@
use std::{collections::HashSet, path::PathBuf};
use log::{trace, warn};
use tokio::sync::mpsc;
use uuid::Uuid;
use super::{error::UuidResolverError, Result, UuidResolveMsg, UuidStore};
pub struct UuidResolverActor<S> {
inbox: mpsc::Receiver<UuidResolveMsg>,
store: S,
}
impl<S: UuidStore> UuidResolverActor<S> {
pub fn new(inbox: mpsc::Receiver<UuidResolveMsg>, store: S) -> Self {
Self { inbox, store }
}
pub async fn run(mut self) {
use UuidResolveMsg::*;
trace!("uuid resolver started");
loop {
match self.inbox.recv().await {
Some(Get { uid: name, ret }) => {
let _ = ret.send(self.handle_get(name).await);
}
Some(Delete { uid: name, ret }) => {
let _ = ret.send(self.handle_delete(name).await);
}
Some(List { ret }) => {
let _ = ret.send(self.handle_list().await);
}
Some(Insert { ret, uuid, name }) => {
let _ = ret.send(self.handle_insert(name, uuid).await);
}
Some(SnapshotRequest { path, ret }) => {
let _ = ret.send(self.handle_snapshot(path).await);
}
Some(GetSize { ret }) => {
let _ = ret.send(self.handle_get_size().await);
}
Some(DumpRequest { path, ret }) => {
let _ = ret.send(self.handle_dump(path).await);
}
// all senders have been dropped, need to quit.
None => break,
}
}
warn!("exiting uuid resolver loop");
}
async fn handle_get(&self, uid: String) -> Result<Uuid> {
self.store
.get_uuid(uid.clone())
.await?
.ok_or(UuidResolverError::UnexistingIndex(uid))
}
async fn handle_delete(&self, uid: String) -> Result<Uuid> {
self.store
.delete(uid.clone())
.await?
.ok_or(UuidResolverError::UnexistingIndex(uid))
}
async fn handle_list(&self) -> Result<Vec<(String, Uuid)>> {
let result = self.store.list().await?;
Ok(result)
}
async fn handle_snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
self.store.snapshot(path).await
}
async fn handle_dump(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
self.store.dump(path).await
}
async fn handle_insert(&self, uid: String, uuid: Uuid) -> Result<()> {
if !is_index_uid_valid(&uid) {
return Err(UuidResolverError::BadlyFormatted(uid));
}
self.store.insert(uid, uuid).await?;
Ok(())
}
async fn handle_get_size(&self) -> Result<u64> {
self.store.get_size().await
}
}
fn is_index_uid_valid(uid: &str) -> bool {
uid.chars()
.all(|x| x.is_ascii_alphanumeric() || x == '-' || x == '_')
}


@ -1,34 +0,0 @@
use meilisearch_error::{Code, ErrorCode};
pub type Result<T> = std::result::Result<T, UuidResolverError>;
#[derive(Debug, thiserror::Error)]
pub enum UuidResolverError {
#[error("Index already exists.")]
NameAlreadyExist,
#[error("Index \"{0}\" not found.")]
UnexistingIndex(String),
#[error("Index must have a valid uid; Index uid can be of type integer or string only composed of alphanumeric characters, hyphens (-) and underscores (_).")]
BadlyFormatted(String),
#[error("Internal error: {0}")]
Internal(Box<dyn std::error::Error + Sync + Send + 'static>),
}
internal_error!(
UuidResolverError: heed::Error,
uuid::Error,
std::io::Error,
tokio::task::JoinError,
serde_json::Error
);
impl ErrorCode for UuidResolverError {
fn error_code(&self) -> Code {
match self {
UuidResolverError::NameAlreadyExist => Code::IndexAlreadyExists,
UuidResolverError::UnexistingIndex(_) => Code::IndexNotFound,
UuidResolverError::BadlyFormatted(_) => Code::InvalidIndexUid,
UuidResolverError::Internal(_) => Code::Internal,
}
}
}


@ -1,87 +0,0 @@
use std::collections::HashSet;
use std::path::{Path, PathBuf};
use tokio::sync::{mpsc, oneshot};
use uuid::Uuid;
use super::{HeedUuidStore, Result, UuidResolveMsg, UuidResolverActor, UuidResolverHandle};
#[derive(Clone)]
pub struct UuidResolverHandleImpl {
sender: mpsc::Sender<UuidResolveMsg>,
}
impl UuidResolverHandleImpl {
pub fn new(path: impl AsRef<Path>) -> Result<Self> {
let (sender, reveiver) = mpsc::channel(100);
let store = HeedUuidStore::new(path)?;
let actor = UuidResolverActor::new(reveiver, store);
tokio::spawn(actor.run());
Ok(Self { sender })
}
}
#[async_trait::async_trait]
impl UuidResolverHandle for UuidResolverHandleImpl {
async fn get(&self, name: String) -> Result<Uuid> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::Get { uid: name, ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn delete(&self, name: String) -> Result<Uuid> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::Delete { uid: name, ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn list(&self) -> Result<Vec<(String, Uuid)>> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::List { ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn insert(&self, name: String, uuid: Uuid) -> Result<()> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::Insert { ret, name, uuid };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::SnapshotRequest { path, ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn get_size(&self) -> Result<u64> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::GetSize { ret };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
async fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let (ret, receiver) = oneshot::channel();
let msg = UuidResolveMsg::DumpRequest { ret, path };
let _ = self.sender.send(msg).await;
Ok(receiver
.await
.expect("Uuid resolver actor has been killed")?)
}
}


@ -1,35 +0,0 @@
mod actor;
pub mod error;
mod handle_impl;
mod message;
pub mod store;
use std::collections::HashSet;
use std::path::PathBuf;
use uuid::Uuid;
use actor::UuidResolverActor;
use error::Result;
use message::UuidResolveMsg;
use store::UuidStore;
#[cfg(test)]
use mockall::automock;
pub use handle_impl::UuidResolverHandleImpl;
pub use store::HeedUuidStore;
const UUID_STORE_SIZE: usize = 1_073_741_824; // 1 GiB
#[async_trait::async_trait]
#[cfg_attr(test, automock)]
pub trait UuidResolverHandle {
async fn get(&self, name: String) -> Result<Uuid>;
async fn insert(&self, name: String, uuid: Uuid) -> Result<()>;
async fn delete(&self, name: String) -> Result<Uuid>;
async fn list(&self) -> Result<Vec<(String, Uuid)>>;
async fn snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>>;
async fn get_size(&self) -> Result<u64>;
async fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>>;
}


@ -1,224 +0,0 @@
use std::collections::HashSet;
use std::fs::{create_dir_all, File};
use std::io::{BufRead, BufReader, Write};
use std::path::{Path, PathBuf};
use heed::types::{ByteSlice, Str};
use heed::{CompactionOption, Database, Env, EnvOpenOptions};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use super::{error::UuidResolverError, Result, UUID_STORE_SIZE};
use crate::helpers::EnvSizer;
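/// One line of the `index_uuids/data.jsonl` dump file, e.g. `{"uuid":"...","uid":"movies"}`
/// (the values shown here are illustrative).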
#[derive(Serialize, Deserialize)]
struct DumpEntry {
uuid: Uuid,
uid: String,
}
const UUIDS_DB_PATH: &str = "index_uuids";
#[async_trait::async_trait]
pub trait UuidStore: Sized {
// Returns the uuid associated with `uid` if it exists.
async fn get_uuid(&self, uid: String) -> Result<Option<Uuid>>;
async fn delete(&self, uid: String) -> Result<Option<Uuid>>;
async fn list(&self) -> Result<Vec<(String, Uuid)>>;
async fn insert(&self, name: String, uuid: Uuid) -> Result<()>;
async fn snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>>;
async fn get_size(&self) -> Result<u64>;
async fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>>;
}
#[derive(Clone)]
pub struct HeedUuidStore {
env: Env,
db: Database<Str, ByteSlice>,
}
impl HeedUuidStore {
pub fn new(path: impl AsRef<Path>) -> Result<Self> {
let path = path.as_ref().join(UUIDS_DB_PATH);
create_dir_all(&path)?;
let mut options = EnvOpenOptions::new();
options.map_size(UUID_STORE_SIZE); // 1 GiB
let env = options.open(path)?;
let db = env.create_database(None)?;
Ok(Self { env, db })
}
pub fn get_uuid(&self, name: String) -> Result<Option<Uuid>> {
let env = self.env.clone();
let db = self.db;
let txn = env.read_txn()?;
match db.get(&txn, &name)? {
Some(uuid) => {
let uuid = Uuid::from_slice(uuid)?;
Ok(Some(uuid))
}
None => Ok(None),
}
}
pub fn delete(&self, uid: String) -> Result<Option<Uuid>> {
let env = self.env.clone();
let db = self.db;
let mut txn = env.write_txn()?;
match db.get(&txn, &uid)? {
Some(uuid) => {
let uuid = Uuid::from_slice(uuid)?;
db.delete(&mut txn, &uid)?;
txn.commit()?;
Ok(Some(uuid))
}
None => Ok(None),
}
}
pub fn list(&self) -> Result<Vec<(String, Uuid)>> {
let env = self.env.clone();
let db = self.db;
let txn = env.read_txn()?;
let mut entries = Vec::new();
for entry in db.iter(&txn)? {
let (name, uuid) = entry?;
let uuid = Uuid::from_slice(uuid)?;
entries.push((name.to_owned(), uuid))
}
Ok(entries)
}
pub fn insert(&self, name: String, uuid: Uuid) -> Result<()> {
let env = self.env.clone();
let db = self.db;
let mut txn = env.write_txn()?;
if db.get(&txn, &name)?.is_some() {
return Err(UuidResolverError::NameAlreadyExist);
}
db.put(&mut txn, &name, uuid.as_bytes())?;
txn.commit()?;
Ok(())
}
pub fn snapshot(&self, mut path: PathBuf) -> Result<HashSet<Uuid>> {
let env = self.env.clone();
let db = self.db;
// Write transaction to acquire a lock on the database.
let txn = env.write_txn()?;
let mut entries = HashSet::new();
for entry in db.iter(&txn)? {
let (_, uuid) = entry?;
let uuid = Uuid::from_slice(uuid)?;
entries.insert(uuid);
}
// only perform snapshot if there are indexes
if !entries.is_empty() {
path.push(UUIDS_DB_PATH);
create_dir_all(&path).unwrap();
path.push("data.mdb");
env.copy_to_path(path, CompactionOption::Enabled)?;
}
Ok(entries)
}
pub fn get_size(&self) -> Result<u64> {
Ok(self.env.size())
}
pub fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let dump_path = path.join(UUIDS_DB_PATH);
create_dir_all(&dump_path)?;
let dump_file_path = dump_path.join("data.jsonl");
let mut dump_file = File::create(&dump_file_path)?;
let mut uuids = HashSet::new();
let txn = self.env.read_txn()?;
for entry in self.db.iter(&txn)? {
let (uid, uuid) = entry?;
let uid = uid.to_string();
let uuid = Uuid::from_slice(uuid)?;
let entry = DumpEntry { uuid, uid };
serde_json::to_writer(&mut dump_file, &entry)?;
dump_file.write_all(b"\n").unwrap();
uuids.insert(uuid);
}
Ok(uuids)
}
pub fn load_dump(src: impl AsRef<Path>, dst: impl AsRef<Path>) -> Result<()> {
let uuid_resolver_path = dst.as_ref().join(UUIDS_DB_PATH);
std::fs::create_dir_all(&uuid_resolver_path)?;
let src_indexes = src.as_ref().join(UUIDS_DB_PATH).join("data.jsonl");
let indexes = File::open(&src_indexes)?;
let mut indexes = BufReader::new(indexes);
let mut line = String::new();
let db = Self::new(dst)?;
let mut txn = db.env.write_txn()?;
loop {
match indexes.read_line(&mut line) {
Ok(0) => break,
Ok(_) => {
let DumpEntry { uuid, uid } = serde_json::from_str(&line)?;
println!("importing {} {}", uid, uuid);
db.db.put(&mut txn, &uid, uuid.as_bytes())?;
}
Err(e) => return Err(e.into()),
}
line.clear();
}
txn.commit()?;
db.env.prepare_for_closing().wait();
Ok(())
}
}
#[async_trait::async_trait]
impl UuidStore for HeedUuidStore {
async fn get_uuid(&self, name: String) -> Result<Option<Uuid>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.get_uuid(name)).await?
}
async fn delete(&self, uid: String) -> Result<Option<Uuid>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.delete(uid)).await?
}
async fn list(&self) -> Result<Vec<(String, Uuid)>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.list()).await?
}
async fn insert(&self, name: String, uuid: Uuid) -> Result<()> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.insert(name, uuid)).await?
}
async fn snapshot(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.snapshot(path)).await?
}
async fn get_size(&self) -> Result<u64> {
self.get_size()
}
async fn dump(&self, path: PathBuf) -> Result<HashSet<Uuid>> {
let this = self.clone();
tokio::task::spawn_blocking(move || this.dump(path)).await?
}
}
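The dump written by `HeedUuidStore::dump` and consumed by `load_dump` is plain JSON Lines: one serialized `DumpEntry` per line in `index_uuids/data.jsonl`. A rough sketch of reading and writing that layout, assuming `serde`, `serde_json`, and `anyhow` as dependencies, with the uuid simplified to a `String` to stay dependency-light:

```
use std::io::{BufRead, BufReader, Write};

use serde::{Deserialize, Serialize};

// Simplified stand-in for the real DumpEntry, which stores a `uuid::Uuid`.
#[derive(Serialize, Deserialize, Debug)]
struct DumpEntry {
    uuid: String,
    uid: String,
}

fn main() -> anyhow::Result<()> {
    let entries = vec![
        DumpEntry { uuid: "uuid-for-movies".into(), uid: "movies".into() },
        DumpEntry { uuid: "uuid-for-books".into(), uid: "books".into() },
    ];

    // Write: one JSON object per line, as `dump` does with serde_json::to_writer.
    let mut buffer = Vec::new();
    for entry in &entries {
        serde_json::to_writer(&mut buffer, entry)?;
        buffer.write_all(b"\n")?;
    }

    // Read: consume the data line by line, as `load_dump` does with read_line.
    for line in BufReader::new(buffer.as_slice()).lines() {
        let entry: DumpEntry = serde_json::from_str(&line?)?;
        println!("importing {} {}", entry.uid, entry.uuid);
    }
    Ok(())
}
```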

View File

@ -1,69 +1,113 @@
pub mod data;
#![allow(rustdoc::private_intra_doc_links)]
#[macro_use]
pub mod error;
pub mod analytics;
mod task;
#[macro_use]
pub mod extractors;
pub mod helpers;
mod index;
mod index_controller;
pub mod option;
pub mod routes;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
pub mod analytics;
use std::sync::{atomic::AtomicBool, Arc};
use std::time::Duration;
use crate::extractors::authentication::AuthConfig;
pub use self::data::Data;
use crate::error::MeilisearchHttpError;
use actix_web::error::JsonPayloadError;
use analytics::Analytics;
use error::PayloadError;
use http::header::CONTENT_TYPE;
pub use option::Opt;
use actix_web::web;
use actix_web::{web, HttpRequest};
use extractors::authentication::policies::*;
use extractors::payload::PayloadConfig;
use meilisearch_auth::AuthController;
use meilisearch_lib::MeiliSearch;
pub fn configure_data(config: &mut web::ServiceConfig, data: Data) {
let http_payload_size_limit = data.http_payload_size_limit();
pub static AUTOBATCHING_ENABLED: AtomicBool = AtomicBool::new(false);
pub fn setup_meilisearch(opt: &Opt) -> anyhow::Result<MeiliSearch> {
let mut meilisearch = MeiliSearch::builder();
// enable autobatching?
let _ = AUTOBATCHING_ENABLED.store(
opt.scheduler_options.enable_auto_batching,
std::sync::atomic::Ordering::Relaxed,
);
meilisearch
.set_max_index_size(opt.max_index_size.get_bytes() as usize)
.set_max_task_store_size(opt.max_task_db_size.get_bytes() as usize)
// snapshot
.set_ignore_missing_snapshot(opt.ignore_missing_snapshot)
.set_ignore_snapshot_if_db_exists(opt.ignore_snapshot_if_db_exists)
.set_snapshot_interval(Duration::from_secs(opt.snapshot_interval_sec))
.set_snapshot_dir(opt.snapshot_dir.clone())
// dump
.set_ignore_missing_dump(opt.ignore_missing_dump)
.set_ignore_dump_if_db_exists(opt.ignore_dump_if_db_exists)
.set_dump_dst(opt.dumps_dir.clone());
if let Some(ref path) = opt.import_snapshot {
meilisearch.set_import_snapshot(path.clone());
}
if let Some(ref path) = opt.import_dump {
meilisearch.set_dump_src(path.clone());
}
if opt.schedule_snapshot {
meilisearch.set_schedule_snapshot();
}
meilisearch.build(
opt.db_path.clone(),
opt.indexer_options.clone(),
opt.scheduler_options.clone(),
)
}
pub fn configure_data(
config: &mut web::ServiceConfig,
data: MeiliSearch,
auth: AuthController,
opt: &Opt,
analytics: Arc<dyn Analytics>,
) {
let http_payload_size_limit = opt.http_payload_size_limit.get_bytes() as usize;
config
.data(data.clone())
.app_data(data)
.app_data(auth)
.app_data(web::Data::from(analytics))
.app_data(
web::JsonConfig::default()
.limit(http_payload_size_limit)
.content_type(|_mime| true) // Accept all mime types
.error_handler(|err, _req| error::payload_error_handler(err).into()),
.content_type(|mime| mime == mime::APPLICATION_JSON)
.error_handler(|err, req: &HttpRequest| match err {
JsonPayloadError::ContentType => match req.headers().get(CONTENT_TYPE) {
Some(content_type) => MeilisearchHttpError::InvalidContentType(
content_type.to_str().unwrap_or("unknown").to_string(),
vec![mime::APPLICATION_JSON.to_string()],
)
.into(),
None => MeilisearchHttpError::MissingContentType(vec![
mime::APPLICATION_JSON.to_string(),
])
.into(),
},
err => PayloadError::from(err).into(),
}),
)
.app_data(PayloadConfig::new(http_payload_size_limit))
.app_data(
web::QueryConfig::default()
.error_handler(|err, _req| error::payload_error_handler(err).into()),
web::QueryConfig::default().error_handler(|err, _req| PayloadError::from(err).into()),
);
}
pub fn configure_auth(config: &mut web::ServiceConfig, data: &Data) {
let keys = data.api_keys();
let auth_config = if let Some(ref master_key) = keys.master {
let private_key = keys.private.as_ref().unwrap();
let public_key = keys.public.as_ref().unwrap();
let mut policies = init_policies!(Public, Private, Admin);
create_users!(
policies,
master_key.as_bytes() => { Admin, Private, Public },
private_key.as_bytes() => { Private, Public },
public_key.as_bytes() => { Public }
);
AuthConfig::Auth(policies)
} else {
AuthConfig::NoAuth
};
config.app_data(auth_config);
}
#[cfg(feature = "mini-dashboard")]
pub fn dashboard(config: &mut web::ServiceConfig, enable_frontend: bool) {
use actix_web::HttpResponse;
use actix_web_static_files::Resource;
use static_files::Resource;
mod generated {
include!(concat!(env!("OUT_DIR"), "/generated.rs"));
@ -71,7 +115,6 @@ pub fn dashboard(config: &mut web::ServiceConfig, enable_frontend: bool) {
if enable_frontend {
let generated = generated::generate();
let mut scope = web::scope("/");
// Generate routes for mini-dashboard assets
for (path, resource) in generated.into_iter() {
let Resource {
@ -79,16 +122,15 @@ pub fn dashboard(config: &mut web::ServiceConfig, enable_frontend: bool) {
} = resource;
// Redirect index.html to /
if path == "index.html" {
config.service(web::resource("/").route(
web::get().to(move || HttpResponse::Ok().content_type(mime_type).body(data)),
));
config.service(web::resource("/").route(web::get().to(move || async move {
HttpResponse::Ok().content_type(mime_type).body(data)
})));
} else {
scope = scope.service(web::resource(path).route(
web::get().to(move || HttpResponse::Ok().content_type(mime_type).body(data)),
));
config.service(web::resource(path).route(web::get().to(move || async move {
HttpResponse::Ok().content_type(mime_type).body(data)
})));
}
}
config.service(scope);
} else {
config.service(web::resource("/").route(web::get().to(routes::running)));
}
@ -101,30 +143,24 @@ pub fn dashboard(config: &mut web::ServiceConfig, _enable_frontend: bool) {
#[macro_export]
macro_rules! create_app {
($data:expr, $enable_frontend:expr) => {{
($data:expr, $auth:expr, $enable_frontend:expr, $opt:expr, $analytics:expr) => {{
use actix_cors::Cors;
use actix_web::middleware::TrailingSlash;
use actix_web::App;
use actix_web::{middleware, web};
use meilisearch_http::routes::*;
use meilisearch_http::{configure_auth, configure_data, dashboard};
use meilisearch_error::ResponseError;
use meilisearch_http::error::MeilisearchHttpError;
use meilisearch_http::routes;
use meilisearch_http::{configure_data, dashboard};
App::new()
.configure(|s| configure_data(s, $data.clone()))
.configure(|s| configure_auth(s, &$data))
.configure(document::services)
.configure(index::services)
.configure(search::services)
.configure(settings::services)
.configure(health::services)
.configure(stats::services)
.configure(key::services)
.configure(dump::services)
.configure(|s| configure_data(s, $data.clone(), $auth.clone(), &$opt, $analytics))
.configure(routes::configure)
.configure(|s| dashboard(s, $enable_frontend))
.wrap(
Cors::default()
.send_wildcard()
.allowed_headers(vec!["content-type", "x-meili-api-key"])
.allow_any_header()
.allow_any_origin()
.allow_any_method()
.max_age(86_400), // 24h
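Note on the new `AUTOBATCHING_ENABLED` static introduced above: the autobatching toggle is stored in a process-wide `AtomicBool` instead of being threaded through every call site. A minimal sketch of that set-once/read-anywhere pattern (std only, function names are illustrative):

```
use std::sync::atomic::{AtomicBool, Ordering};

// Process-wide flag, like AUTOBATCHING_ENABLED: written once at startup, read anywhere.
pub static AUTOBATCHING_ENABLED: AtomicBool = AtomicBool::new(false);

fn setup(enable_auto_batching: bool) {
    AUTOBATCHING_ENABLED.store(enable_auto_batching, Ordering::Relaxed);
}

fn scheduler_tick() {
    if AUTOBATCHING_ENABLED.load(Ordering::Relaxed) {
        println!("batching consecutive tasks together");
    } else {
        println!("processing tasks one by one");
    }
}

fn main() {
    setup(true);
    scheduler_tick();
}
```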

View File

@ -1,26 +1,20 @@
use std::env;
use std::sync::Arc;
use actix_web::HttpServer;
use main_error::MainError;
use meilisearch_http::{create_app, Data, Opt};
use structopt::StructOpt;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
use clap::Parser;
use meilisearch_auth::AuthController;
use meilisearch_http::analytics;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
use std::sync::Arc;
use meilisearch_http::analytics::Analytics;
use meilisearch_http::{create_app, setup_meilisearch, Opt};
use meilisearch_lib::MeiliSearch;
#[cfg(target_os = "linux")]
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
const SENTRY_DSN: &str = "https://5ddfa22b95f241198be2271aaf028653@sentry.io/3060337";
#[actix_web::main]
async fn main() -> Result<(), MainError> {
let opt = Opt::from_args();
static ALLOC: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
/// Does all the setup before Meilisearch is launched.
fn setup(opt: &Opt) -> anyhow::Result<()> {
let mut log_builder = env_logger::Builder::new();
log_builder.parse_filters(&opt.log_level);
if opt.log_level == "info" {
@ -28,69 +22,68 @@ async fn main() -> Result<(), MainError> {
log_builder.filter_module("milli", log::LevelFilter::Warn);
}
match opt.env.as_ref() {
"production" => {
if opt.master_key.is_none() {
return Err(
"In production mode, the environment variable MEILI_MASTER_KEY is mandatory"
.into(),
);
}
#[cfg(all(not(debug_assertions), feature = "analytics"))]
if !opt.no_analytics {
let logger =
sentry::integrations::log::SentryLogger::with_dest(log_builder.build());
log::set_boxed_logger(Box::new(logger))
.map(|()| log::set_max_level(log::LevelFilter::Info))
.unwrap();
let sentry = sentry::init(sentry::ClientOptions {
release: sentry::release_name!(),
dsn: Some(SENTRY_DSN.parse()?),
before_send: Some(Arc::new(|event| {
event
.message
.as_ref()
.map(|msg| msg.to_lowercase().contains("no space left on device"))
.unwrap_or(false)
.then(|| event)
})),
..Default::default()
});
// sentry must stay alive as long as possible
std::mem::forget(sentry);
} else {
log_builder.init();
}
}
"development" => {
log_builder.init();
}
_ => unreachable!(),
}
let data = Data::new(opt.clone())?;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
if !opt.no_analytics {
let analytics_data = data.clone();
let analytics_opt = opt.clone();
tokio::task::spawn(analytics::analytics_sender(analytics_data, analytics_opt));
}
print_launch_resume(&opt, &data);
run_http(data, opt).await?;
log_builder.init();
Ok(())
}
async fn run_http(data: Data, opt: Opt) -> Result<(), Box<dyn std::error::Error>> {
#[actix_web::main]
async fn main() -> anyhow::Result<()> {
let opt = Opt::parse();
setup(&opt)?;
match opt.env.as_ref() {
"production" => {
if opt.master_key.is_none() {
anyhow::bail!(
"In production mode, the environment variable MEILI_MASTER_KEY is mandatory"
)
}
}
"development" => (),
_ => unreachable!(),
}
let meilisearch = setup_meilisearch(&opt)?;
let auth_controller = AuthController::new(&opt.db_path, &opt.master_key)?;
#[cfg(all(not(debug_assertions), feature = "analytics"))]
let (analytics, user) = if !opt.no_analytics {
analytics::SegmentAnalytics::new(&opt, &meilisearch).await
} else {
analytics::MockAnalytics::new(&opt)
};
#[cfg(any(debug_assertions, not(feature = "analytics")))]
let (analytics, user) = analytics::MockAnalytics::new(&opt);
print_launch_resume(&opt, &user);
run_http(meilisearch, auth_controller, opt, analytics).await?;
Ok(())
}
async fn run_http(
data: MeiliSearch,
auth_controller: AuthController,
opt: Opt,
analytics: Arc<dyn Analytics>,
) -> anyhow::Result<()> {
let _enable_dashboard = &opt.env == "development";
let http_server = HttpServer::new(move || create_app!(data, _enable_dashboard))
// Disabling signals allows the server to terminate immediately when a user enters CTRL-C
.disable_signals();
let opt_clone = opt.clone();
let http_server = HttpServer::new(move || {
create_app!(
data,
auth_controller,
_enable_dashboard,
opt_clone,
analytics.clone()
)
})
// Disabling signals allows the server to terminate immediately when a user enters CTRL-C
.disable_signals();
if let Some(config) = opt.get_ssl_config()? {
http_server
@ -98,30 +91,24 @@ async fn run_http(data: Data, opt: Opt) -> Result<(), Box<dyn std::error::Error>
.run()
.await?;
} else {
http_server.bind(opt.http_addr)?.run().await?;
http_server.bind(&opt.http_addr)?.run().await?;
}
Ok(())
}
pub fn print_launch_resume(opt: &Opt, data: &Data) {
let commit_sha = match option_env!("COMMIT_SHA") {
Some("") | None => env!("VERGEN_SHA"),
Some(commit_sha) => commit_sha,
};
let commit_date = match option_env!("COMMIT_DATE") {
Some("") | None => env!("VERGEN_COMMIT_DATE"),
Some(commit_date) => commit_date,
};
pub fn print_launch_resume(opt: &Opt, user: &str) {
let commit_sha = option_env!("VERGEN_GIT_SHA").unwrap_or("unknown");
let commit_date = option_env!("VERGEN_GIT_COMMIT_TIMESTAMP").unwrap_or("unknown");
let ascii_name = r#"
888b d888 d8b 888 d8b .d8888b. 888
8888b d8888 Y8P 888 Y8P d88P Y88b 888
88888b.d88888 888 Y88b. 888
888Y88888P888 .d88b. 888 888 888 "Y888b. .d88b. 8888b. 888d888 .d8888b 88888b.
888 Y888P 888 d8P Y8b 888 888 888 "Y88b. d8P Y8b "88b 888P" d88P" 888 "88b
888 Y8P 888 88888888 888 888 888 "888 88888888 .d888888 888 888 888 888
888 " 888 Y8b. 888 888 888 Y88b d88P Y8b. 888 888 888 Y88b. 888 888
888 888 "Y8888 888 888 888 "Y8888P" "Y8888 "Y888888 888 "Y8888P 888 888
888b d888 d8b 888 d8b 888
8888b d8888 Y8P 888 Y8P 888
88888b.d88888 888 888
888Y88888P888 .d88b. 888 888 888 .d8888b .d88b. 8888b. 888d888 .d8888b 88888b.
888 Y888P 888 d8P Y8b 888 888 888 88K d8P Y8b "88b 888P" d88P" 888 "88b
888 Y8P 888 88888888 888 888 888 "Y8888b. 88888888 .d888888 888 888 888 888
888 " 888 Y8b. 888 888 888 X88 Y8b. 888 888 888 Y88b. 888 888
888 888 "Y8888 888 888 888 88888P' "Y8888 "Y888888 888 "Y8888P 888 888
"#;
eprintln!("{}", ascii_name);
@ -138,24 +125,28 @@ pub fn print_launch_resume(opt: &Opt, data: &Data) {
#[cfg(all(not(debug_assertions), feature = "analytics"))]
{
if opt.no_analytics {
eprintln!("Anonymous telemetry:\t\"Disabled\"");
} else {
if !opt.no_analytics {
eprintln!(
"
Thank you for using MeiliSearch!
Thank you for using Meilisearch!
We collect anonymized analytics to improve our product and your experience. To learn more, including how to turn off analytics, visit our dedicated documentation page: https://docs.meilisearch.com/reference/features/configuration.html#analytics
We collect anonymized analytics to improve our product and your experience. To learn more, including how to turn off analytics, visit our dedicated documentation page: https://docs.meilisearch.com/learn/what_is_meilisearch/telemetry.html
Anonymous telemetry: \"Enabled\""
Anonymous telemetry:\t\"Enabled\""
);
} else {
eprintln!("Anonymous telemetry:\t\"Disabled\"");
}
}
if !user.is_empty() {
eprintln!("Instance UID:\t\t\"{}\"", user);
}
eprintln!();
if data.api_keys().master.is_some() {
eprintln!("A Master Key has been set. Requests to MeiliSearch won't be authorized unless you provide an authentication key.");
if opt.master_key.is_some() {
eprintln!("A Master Key has been set. Requests to Meilisearch won't be authorized unless you provide an authentication key.");
} else {
eprintln!("No master key found; The server will accept unidentified requests. \
If you need some protection in development mode, please export a key: export MEILI_MASTER_KEY=xxx");
@ -164,6 +155,6 @@ Anonymous telemetry: \"Enabled\""
eprintln!();
eprintln!("Documentation:\t\thttps://docs.meilisearch.com");
eprintln!("Source code:\t\thttps://github.com/meilisearch/meilisearch");
eprintln!("Contact:\t\thttps://docs.meilisearch.com/resources/contact.html or bonjour@meilisearch.com");
eprintln!("Contact:\t\thttps://docs.meilisearch.com/resources/contact.html");
eprintln!();
}
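In the new `main`, the analytics implementation is picked at compile time: the Segment-based implementation only exists in release builds with the `analytics` feature, and `MockAnalytics` is used everywhere else. A stripped-down sketch of that `cfg`-gating pattern (the feature name mirrors the diff, the return values are stand-ins):

```
// Only compiled in release builds with the `analytics` feature enabled.
#[cfg(all(not(debug_assertions), feature = "analytics"))]
fn build_analytics() -> &'static str {
    "segment analytics"
}

// Fallback used in debug builds or when the feature is disabled.
#[cfg(any(debug_assertions, not(feature = "analytics")))]
fn build_analytics() -> &'static str {
    "mock analytics"
}

fn main() {
    // The two cfg conditions are complementary, so exactly one definition exists per build.
    println!("using {}", build_analytics());
}
```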

View File

@ -1,205 +1,171 @@
use std::fs;
use std::io::{BufReader, Read};
use std::path::PathBuf;
use std::sync::Arc;
use std::{error, fs};
use byte_unit::Byte;
use grenad::CompressionType;
use rustls::internal::pemfile::{certs, pkcs8_private_keys, rsa_private_keys};
use clap::Parser;
use meilisearch_lib::options::{IndexerOpts, SchedulerConfig};
use rustls::{
AllowAnyAnonymousOrAuthenticatedClient, AllowAnyAuthenticatedClient, NoClientAuth,
server::{
AllowAnyAnonymousOrAuthenticatedClient, AllowAnyAuthenticatedClient,
ServerSessionMemoryCache,
},
RootCertStore,
};
use structopt::StructOpt;
#[derive(Debug, Clone, StructOpt)]
pub struct IndexerOpts {
/// The number of documents to index before printing
/// a log regarding the indexing progress.
#[structopt(long, default_value = "100000")] // 100k
pub log_every_n: usize,
/// Grenad max number of chunks in bytes.
#[structopt(long)]
pub max_nb_chunks: Option<usize>,
/// The maximum amount of memory to use for the Grenad buffer. It is recommended
/// to use something like 80%-90% of the available memory.
///
/// It is automatically split by the number of jobs e.g. if you use 7 jobs
/// and 7 GB of max memory, each thread will use a maximum of 1 GB.
#[structopt(long, default_value = "7 GiB")]
pub max_memory: Byte,
/// Size of the linked hash map cache when indexing.
/// The bigger it is, the faster the indexing is but the more memory it takes.
#[structopt(long, default_value = "500")]
pub linked_hash_map_size: usize,
/// The name of the compression algorithm to use when compressing intermediate
/// Grenad chunks while indexing documents.
///
/// Choosing a fast algorithm will make the indexing faster but may consume more memory.
#[structopt(long, default_value = "snappy", possible_values = &["snappy", "zlib", "lz4", "lz4hc", "zstd"])]
pub chunk_compression_type: CompressionType,
/// The level of compression of the chosen algorithm.
#[structopt(long, requires = "chunk-compression-type")]
pub chunk_compression_level: Option<u32>,
/// The number of bytes to remove from the beginning of the chunks while reading/sorting
/// or merging them.
///
/// File fusing must only be enabled on file systems that support `FALLOC_FL_COLLAPSE_RANGE`
/// (i.e. ext4 and XFS). File fusing will only work if `enable-chunk-fusing` is set.
#[structopt(long, default_value = "4 GiB")]
pub chunk_fusing_shrink_size: Byte,
/// Enable the chunk fusing or not, this reduces the amount of disk space used.
#[structopt(long)]
pub enable_chunk_fusing: bool,
/// Number of parallel jobs for indexing, defaults to # of CPUs.
#[structopt(long)]
pub indexing_jobs: Option<usize>,
}
impl Default for IndexerOpts {
fn default() -> Self {
Self {
log_every_n: 100_000,
max_nb_chunks: None,
max_memory: Byte::from_str("1GiB").unwrap(),
linked_hash_map_size: 500,
chunk_compression_type: CompressionType::None,
chunk_compression_level: None,
chunk_fusing_shrink_size: Byte::from_str("4GiB").unwrap(),
enable_chunk_fusing: false,
indexing_jobs: None,
}
}
}
use rustls_pemfile::{certs, pkcs8_private_keys, rsa_private_keys};
use serde::Serialize;
const POSSIBLE_ENV: [&str; 2] = ["development", "production"];
#[derive(Debug, Clone, StructOpt)]
#[derive(Debug, Clone, Parser, Serialize)]
pub struct Opt {
/// The destination where the database must be created.
#[structopt(long, env = "MEILI_DB_PATH", default_value = "./data.ms")]
#[clap(long, env = "MEILI_DB_PATH", default_value = "./data.ms")]
pub db_path: PathBuf,
/// The address on which the http server will listen.
#[structopt(long, env = "MEILI_HTTP_ADDR", default_value = "127.0.0.1:7700")]
#[clap(long, env = "MEILI_HTTP_ADDR", default_value = "127.0.0.1:7700")]
pub http_addr: String,
/// The master key allowing you to do everything on the server.
#[structopt(long, env = "MEILI_MASTER_KEY")]
#[serde(skip)]
#[clap(long, env = "MEILI_MASTER_KEY")]
pub master_key: Option<String>,
/// This environment variable must be set to `production` if you are running in production.
/// If the server is running in development mode more logs will be displayed,
/// and the master key can be omitted, which means there is no security on the update routes.
/// This is useful to debug when integrating the engine with another service.
#[structopt(long, env = "MEILI_ENV", default_value = "development", possible_values = &POSSIBLE_ENV)]
#[clap(long, env = "MEILI_ENV", default_value = "development", possible_values = &POSSIBLE_ENV)]
pub env: String,
/// Do not send analytics to Meili.
#[cfg(all(not(debug_assertions), feature = "analytics"))]
#[structopt(long, env = "MEILI_NO_ANALYTICS")]
#[serde(skip)] // we can't send true
#[clap(long, env = "MEILI_NO_ANALYTICS")]
pub no_analytics: bool,
/// The maximum size, in bytes, of the main lmdb database directory
#[structopt(long, env = "MEILI_MAX_INDEX_SIZE", default_value = "100 GiB")]
#[clap(long, env = "MEILI_MAX_INDEX_SIZE", default_value = "100 GiB")]
pub max_index_size: Byte,
/// The maximum size, in bytes, of the update lmdb database directory
#[structopt(long, env = "MEILI_MAX_UDB_SIZE", default_value = "100 GiB")]
pub max_udb_size: Byte,
#[clap(long, env = "MEILI_MAX_TASK_DB_SIZE", default_value = "100 GiB")]
pub max_task_db_size: Byte,
/// The maximum size, in bytes, of accepted JSON payloads
#[structopt(long, env = "MEILI_HTTP_PAYLOAD_SIZE_LIMIT", default_value = "100 MB")]
#[clap(long, env = "MEILI_HTTP_PAYLOAD_SIZE_LIMIT", default_value = "100 MB")]
pub http_payload_size_limit: Byte,
/// Read server certificates from CERTFILE.
/// This should contain PEM-format certificates
/// in the right order (the first certificate should
/// certify KEYFILE, the last should be a root CA).
#[structopt(long, env = "MEILI_SSL_CERT_PATH", parse(from_os_str))]
#[serde(skip)]
#[clap(long, env = "MEILI_SSL_CERT_PATH", parse(from_os_str))]
pub ssl_cert_path: Option<PathBuf>,
/// Read private key from KEYFILE. This should be an RSA
/// private key or PKCS8-encoded private key, in PEM format.
#[structopt(long, env = "MEILI_SSL_KEY_PATH", parse(from_os_str))]
#[serde(skip)]
#[clap(long, env = "MEILI_SSL_KEY_PATH", parse(from_os_str))]
pub ssl_key_path: Option<PathBuf>,
/// Enable client authentication, and accept certificates
/// signed by those roots provided in CERTFILE.
#[structopt(long, env = "MEILI_SSL_AUTH_PATH", parse(from_os_str))]
#[clap(long, env = "MEILI_SSL_AUTH_PATH", parse(from_os_str))]
#[serde(skip)]
pub ssl_auth_path: Option<PathBuf>,
/// Read DER-encoded OCSP response from OCSPFILE and staple to certificate.
/// Optional
#[structopt(long, env = "MEILI_SSL_OCSP_PATH", parse(from_os_str))]
#[serde(skip)]
#[clap(long, env = "MEILI_SSL_OCSP_PATH", parse(from_os_str))]
pub ssl_ocsp_path: Option<PathBuf>,
/// Send a fatal alert if the client does not complete client authentication.
#[structopt(long, env = "MEILI_SSL_REQUIRE_AUTH")]
#[serde(skip)]
#[clap(long, env = "MEILI_SSL_REQUIRE_AUTH")]
pub ssl_require_auth: bool,
/// Enable SSL session resumption.
#[structopt(long, env = "MEILI_SSL_RESUMPTION")]
#[serde(skip)]
#[clap(long, env = "MEILI_SSL_RESUMPTION")]
pub ssl_resumption: bool,
/// Enable SSL session tickets.
#[structopt(long, env = "MEILI_SSL_TICKETS")]
#[serde(skip)]
#[clap(long, env = "MEILI_SSL_TICKETS")]
pub ssl_tickets: bool,
/// Defines the path of the snapshot file to import.
/// This option will, by default, stop the process if a database already exists or if no snapshot exists at
/// the given path. If this option is not specified, no snapshot is imported.
#[structopt(long)]
#[clap(long)]
pub import_snapshot: Option<PathBuf>,
/// The engine will ignore a missing snapshot and not return an error in such case.
#[structopt(long, requires = "import-snapshot")]
#[clap(long, requires = "import-snapshot")]
pub ignore_missing_snapshot: bool,
/// The engine will skip snapshot importation and not return an error in such case.
#[structopt(long, requires = "import-snapshot")]
#[clap(long, requires = "import-snapshot")]
pub ignore_snapshot_if_db_exists: bool,
/// Defines the directory path where Meilisearch will create a snapshot every snapshot_interval_sec seconds.
#[structopt(long, env = "MEILI_SNAPSHOT_DIR", default_value = "snapshots/")]
#[clap(long, env = "MEILI_SNAPSHOT_DIR", default_value = "snapshots/")]
pub snapshot_dir: PathBuf,
/// Activate snapshot scheduling.
#[structopt(long, env = "MEILI_SCHEDULE_SNAPSHOT")]
#[clap(long, env = "MEILI_SCHEDULE_SNAPSHOT")]
pub schedule_snapshot: bool,
/// Defines the time interval, in seconds, between each snapshot creation.
#[structopt(long, env = "MEILI_SNAPSHOT_INTERVAL_SEC", default_value = "86400")] // 24h
#[clap(long, env = "MEILI_SNAPSHOT_INTERVAL_SEC", default_value = "86400")] // 24h
pub snapshot_interval_sec: u64,
/// Folder where dumps are created when the dump route is called.
#[structopt(long, env = "MEILI_DUMPS_DIR", default_value = "dumps/")]
pub dumps_dir: PathBuf,
/// Import a dump from the specified path, must be a `.dump` file.
#[structopt(long, conflicts_with = "import-snapshot")]
#[clap(long, conflicts_with = "import-snapshot")]
pub import_dump: Option<PathBuf>,
/// If the dump doesn't exist, load or create the database specified by `db-path` instead.
#[clap(long, requires = "import-dump")]
pub ignore_missing_dump: bool,
/// Ignore the dump if a database already exists, and load that database instead.
#[clap(long, requires = "import-dump")]
pub ignore_dump_if_db_exists: bool,
/// Folder where dumps are created when the dump route is called.
#[clap(long, env = "MEILI_DUMPS_DIR", default_value = "dumps/")]
pub dumps_dir: PathBuf,
/// Set the log level
#[structopt(long, env = "MEILI_LOG_LEVEL", default_value = "info")]
#[clap(long, env = "MEILI_LOG_LEVEL", default_value = "info")]
pub log_level: String,
#[structopt(skip)]
#[serde(flatten)]
#[clap(flatten)]
pub indexer_options: IndexerOpts,
#[serde(flatten)]
#[clap(flatten)]
pub scheduler_options: SchedulerConfig,
}
impl Opt {
pub fn get_ssl_config(&self) -> Result<Option<rustls::ServerConfig>, Box<dyn error::Error>> {
/// Whether analytics should be enabled or not.
#[cfg(all(not(debug_assertions), feature = "analytics"))]
pub fn analytics(&self) -> bool {
!self.no_analytics
}
pub fn get_ssl_config(&self) -> anyhow::Result<Option<rustls::ServerConfig>> {
if let (Some(cert_path), Some(key_path)) = (&self.ssl_cert_path, &self.ssl_key_path) {
let client_auth = match &self.ssl_auth_path {
let config = rustls::ServerConfig::builder().with_safe_defaults();
let config = match &self.ssl_auth_path {
Some(auth_path) => {
let roots = load_certs(auth_path.to_path_buf())?;
let mut client_auth_roots = RootCertStore::empty();
@ -207,30 +173,32 @@ impl Opt {
client_auth_roots.add(&root).unwrap();
}
if self.ssl_require_auth {
AllowAnyAuthenticatedClient::new(client_auth_roots)
let verifier = AllowAnyAuthenticatedClient::new(client_auth_roots);
config.with_client_cert_verifier(verifier)
} else {
AllowAnyAnonymousOrAuthenticatedClient::new(client_auth_roots)
let verifier =
AllowAnyAnonymousOrAuthenticatedClient::new(client_auth_roots);
config.with_client_cert_verifier(verifier)
}
}
None => NoClientAuth::new(),
None => config.with_no_client_auth(),
};
let mut config = rustls::ServerConfig::new(client_auth);
config.key_log = Arc::new(rustls::KeyLogFile::new());
let certs = load_certs(cert_path.to_path_buf())?;
let privkey = load_private_key(key_path.to_path_buf())?;
let ocsp = load_ocsp(&self.ssl_ocsp_path)?;
config
.set_single_cert_with_ocsp_and_sct(certs, privkey, ocsp, vec![])
.map_err(|_| "bad certificates/private key")?;
let mut config = config
.with_single_cert_with_ocsp_and_sct(certs, privkey, ocsp, vec![])
.map_err(|_| anyhow::anyhow!("bad certificates/private key"))?;
config.key_log = Arc::new(rustls::KeyLogFile::new());
if self.ssl_resumption {
config.set_persistence(rustls::ServerSessionMemoryCache::new(256));
config.session_storage = ServerSessionMemoryCache::new(256);
}
if self.ssl_tickets {
config.ticketer = rustls::Ticketer::new();
config.ticketer = rustls::Ticketer::new().unwrap();
}
Ok(Some(config))
@ -240,45 +208,63 @@ impl Opt {
}
}
fn load_certs(filename: PathBuf) -> Result<Vec<rustls::Certificate>, Box<dyn error::Error>> {
let certfile = fs::File::open(filename).map_err(|_| "cannot open certificate file")?;
fn load_certs(filename: PathBuf) -> anyhow::Result<Vec<rustls::Certificate>> {
let certfile =
fs::File::open(filename).map_err(|_| anyhow::anyhow!("cannot open certificate file"))?;
let mut reader = BufReader::new(certfile);
Ok(certs(&mut reader).map_err(|_| "cannot read certificate file")?)
certs(&mut reader)
.map(|certs| certs.into_iter().map(rustls::Certificate).collect())
.map_err(|_| anyhow::anyhow!("cannot read certificate file"))
}
fn load_private_key(filename: PathBuf) -> Result<rustls::PrivateKey, Box<dyn error::Error>> {
fn load_private_key(filename: PathBuf) -> anyhow::Result<rustls::PrivateKey> {
let rsa_keys = {
let keyfile =
fs::File::open(filename.clone()).map_err(|_| "cannot open private key file")?;
let keyfile = fs::File::open(filename.clone())
.map_err(|_| anyhow::anyhow!("cannot open private key file"))?;
let mut reader = BufReader::new(keyfile);
rsa_private_keys(&mut reader).map_err(|_| "file contains invalid rsa private key")?
rsa_private_keys(&mut reader)
.map_err(|_| anyhow::anyhow!("file contains invalid rsa private key"))?
};
let pkcs8_keys = {
let keyfile = fs::File::open(filename).map_err(|_| "cannot open private key file")?;
let keyfile = fs::File::open(filename)
.map_err(|_| anyhow::anyhow!("cannot open private key file"))?;
let mut reader = BufReader::new(keyfile);
pkcs8_private_keys(&mut reader)
.map_err(|_| "file contains invalid pkcs8 private key (encrypted keys not supported)")?
pkcs8_private_keys(&mut reader).map_err(|_| {
anyhow::anyhow!(
"file contains invalid pkcs8 private key (encrypted keys not supported)"
)
})?
};
// prefer to load pkcs8 keys
if !pkcs8_keys.is_empty() {
Ok(pkcs8_keys[0].clone())
Ok(rustls::PrivateKey(pkcs8_keys[0].clone()))
} else {
assert!(!rsa_keys.is_empty());
Ok(rsa_keys[0].clone())
Ok(rustls::PrivateKey(rsa_keys[0].clone()))
}
}
fn load_ocsp(filename: &Option<PathBuf>) -> Result<Vec<u8>, Box<dyn error::Error>> {
fn load_ocsp(filename: &Option<PathBuf>) -> anyhow::Result<Vec<u8>> {
let mut ret = Vec::new();
if let Some(ref name) = filename {
fs::File::open(name)
.map_err(|_| "cannot open ocsp file")?
.map_err(|_| anyhow::anyhow!("cannot open ocsp file"))?
.read_to_end(&mut ret)
.map_err(|_| "cannot read oscp file")?;
.map_err(|_| anyhow::anyhow!("cannot read oscp file"))?;
}
Ok(ret)
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_valid_opt() {
assert!(Opt::try_parse_from(Some("")).is_ok());
}
}
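The options struct moves from `structopt` to the `clap` derive: attributes become `#[clap(...)]`, each flag can be fed from the CLI or an environment variable, and `Opt::try_parse_from` (used by the new `test_valid_opt` test) parses from an explicit argument list. A minimal sketch, assuming clap 3.x with the `derive` and `env` features:

```
use clap::Parser;

#[derive(Debug, Parser)]
struct Opt {
    /// The destination where the database must be created.
    #[clap(long, env = "MEILI_DB_PATH", default_value = "./data.ms")]
    db_path: std::path::PathBuf,

    /// The address on which the http server will listen.
    #[clap(long, env = "MEILI_HTTP_ADDR", default_value = "127.0.0.1:7700")]
    http_addr: String,
}

fn main() {
    // Parse from an explicit argument list instead of std::env::args().
    let opt = Opt::try_parse_from(["meilisearch", "--http-addr", "0.0.0.0:7700"])
        .expect("arguments should parse");
    println!("db: {:?}, listening on {}", opt.db_path, opt.http_addr);
}
```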

View File

@ -0,0 +1,154 @@
use std::str;
use actix_web::{web, HttpRequest, HttpResponse};
use meilisearch_auth::{error::AuthControllerError, Action, AuthController, Key};
use serde::{Deserialize, Serialize};
use serde_json::Value;
use time::OffsetDateTime;
use crate::extractors::{
authentication::{policies::*, GuardedData},
sequential_extractor::SeqHandler,
};
use meilisearch_error::{Code, ResponseError};
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
web::resource("")
.route(web::post().to(SeqHandler(create_api_key)))
.route(web::get().to(SeqHandler(list_api_keys))),
)
.service(
web::resource("/{api_key}")
.route(web::get().to(SeqHandler(get_api_key)))
.route(web::patch().to(SeqHandler(patch_api_key)))
.route(web::delete().to(SeqHandler(delete_api_key))),
);
}
pub async fn create_api_key(
auth_controller: GuardedData<MasterPolicy, AuthController>,
body: web::Json<Value>,
_req: HttpRequest,
) -> Result<HttpResponse, ResponseError> {
let v = body.into_inner();
let res = tokio::task::spawn_blocking(move || -> Result<_, AuthControllerError> {
let key = auth_controller.create_key(v)?;
Ok(KeyView::from_key(key, &auth_controller))
})
.await
.map_err(|e| ResponseError::from_msg(e.to_string(), Code::Internal))??;
Ok(HttpResponse::Created().json(res))
}
pub async fn list_api_keys(
auth_controller: GuardedData<MasterPolicy, AuthController>,
_req: HttpRequest,
) -> Result<HttpResponse, ResponseError> {
let res = tokio::task::spawn_blocking(move || -> Result<_, AuthControllerError> {
let keys = auth_controller.list_keys()?;
let res: Vec<_> = keys
.into_iter()
.map(|k| KeyView::from_key(k, &auth_controller))
.collect();
Ok(res)
})
.await
.map_err(|e| ResponseError::from_msg(e.to_string(), Code::Internal))??;
Ok(HttpResponse::Ok().json(KeyListView::from(res)))
}
pub async fn get_api_key(
auth_controller: GuardedData<MasterPolicy, AuthController>,
path: web::Path<AuthParam>,
) -> Result<HttpResponse, ResponseError> {
let api_key = path.into_inner().api_key;
let res = tokio::task::spawn_blocking(move || -> Result<_, AuthControllerError> {
let key = auth_controller.get_key(&api_key)?;
Ok(KeyView::from_key(key, &auth_controller))
})
.await
.map_err(|e| ResponseError::from_msg(e.to_string(), Code::Internal))??;
Ok(HttpResponse::Ok().json(res))
}
pub async fn patch_api_key(
auth_controller: GuardedData<MasterPolicy, AuthController>,
body: web::Json<Value>,
path: web::Path<AuthParam>,
) -> Result<HttpResponse, ResponseError> {
let api_key = path.into_inner().api_key;
let body = body.into_inner();
let res = tokio::task::spawn_blocking(move || -> Result<_, AuthControllerError> {
let key = auth_controller.update_key(&api_key, body)?;
Ok(KeyView::from_key(key, &auth_controller))
})
.await
.map_err(|e| ResponseError::from_msg(e.to_string(), Code::Internal))??;
Ok(HttpResponse::Ok().json(res))
}
pub async fn delete_api_key(
auth_controller: GuardedData<MasterPolicy, AuthController>,
path: web::Path<AuthParam>,
) -> Result<HttpResponse, ResponseError> {
let api_key = path.into_inner().api_key;
tokio::task::spawn_blocking(move || auth_controller.delete_key(&api_key))
.await
.map_err(|e| ResponseError::from_msg(e.to_string(), Code::Internal))??;
Ok(HttpResponse::NoContent().finish())
}
#[derive(Deserialize)]
pub struct AuthParam {
api_key: String,
}
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
struct KeyView {
description: Option<String>,
key: String,
actions: Vec<Action>,
indexes: Vec<String>,
#[serde(serialize_with = "time::serde::rfc3339::option::serialize")]
expires_at: Option<OffsetDateTime>,
#[serde(serialize_with = "time::serde::rfc3339::serialize")]
created_at: OffsetDateTime,
#[serde(serialize_with = "time::serde::rfc3339::serialize")]
updated_at: OffsetDateTime,
}
impl KeyView {
fn from_key(key: Key, auth: &AuthController) -> Self {
let key_id = str::from_utf8(&key.id).unwrap();
let generated_key = auth.generate_key(key_id).unwrap_or_default();
KeyView {
description: key.description,
key: generated_key,
actions: key.actions,
indexes: key.indexes,
expires_at: key.expires_at,
created_at: key.created_at,
updated_at: key.updated_at,
}
}
}
#[derive(Debug, Serialize)]
struct KeyListView {
results: Vec<KeyView>,
}
impl From<Vec<KeyView>> for KeyListView {
fn from(results: Vec<KeyView>) -> Self {
Self { results }
}
}
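Every key route above has the same shape: the synchronous `AuthController` call runs on the blocking thread pool via `tokio::task::spawn_blocking`, and the result is unwrapped twice, once for the `JoinError` of the spawned task and once for the controller's own error. A toy sketch of why the `.await.map_err(..)??` double question mark is needed (`AppError` and `create_key_blocking` are stand-ins, not the real types):

```
use tokio::task::spawn_blocking;

#[derive(Debug)]
struct AppError(String);

// Stand-in for a synchronous AuthController call such as create_key.
fn create_key_blocking(description: &str) -> Result<String, AppError> {
    if description.is_empty() {
        Err(AppError("description must not be empty".into()))
    } else {
        Ok(format!("key for `{description}`"))
    }
}

#[tokio::main]
async fn main() -> Result<(), AppError> {
    let description = "search-only".to_string();
    // First `?`: the JoinError from spawn_blocking, mapped into AppError.
    // Second `?`: the Result returned by the blocking closure itself.
    let key = spawn_blocking(move || create_key_blocking(&description))
        .await
        .map_err(|e| AppError(format!("blocking task failed: {e}")))??;
    println!("{key}");
    Ok(())
}
```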

View File

@ -1,217 +0,0 @@
use actix_web::{web, HttpResponse};
use log::debug;
use milli::update::{IndexDocumentsMethod, UpdateFormat};
use serde::Deserialize;
use serde_json::Value;
use crate::error::ResponseError;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::extractors::payload::Payload;
use crate::routes::IndexParam;
use crate::Data;
const DEFAULT_RETRIEVE_DOCUMENTS_OFFSET: usize = 0;
const DEFAULT_RETRIEVE_DOCUMENTS_LIMIT: usize = 20;
/*
macro_rules! guard_content_type {
($fn_name:ident, $guard_value:literal) => {
fn $fn_name(head: &actix_web::dev::RequestHead) -> bool {
if let Some(content_type) = head.headers.get("Content-Type") {
content_type
.to_str()
.map(|v| v.contains($guard_value))
.unwrap_or(false)
} else {
false
}
}
};
}
guard_content_type!(guard_json, "application/json");
*/
fn guard_json(head: &actix_web::dev::RequestHead) -> bool {
if let Some(_content_type) = head.headers.get("Content-Type") {
// CURRENTLY AND FOR THIS RELEASE ONLY WE DECIDED TO INTERPRET ALL CONTENT-TYPES AS JSON
true
/*
content_type
.to_str()
.map(|v| v.contains("application/json"))
.unwrap_or(false)
*/
} else {
// if no content-type is specified we still accept the data as json!
true
}
}
#[derive(Deserialize)]
struct DocumentParam {
index_uid: String,
document_id: String,
}
pub fn services(cfg: &mut web::ServiceConfig) {
cfg.service(
web::scope("/indexes/{index_uid}/documents")
.service(
web::resource("")
.route(web::get().to(get_all_documents))
.route(web::post().guard(guard_json).to(add_documents))
.route(web::put().guard(guard_json).to(update_documents))
.route(web::delete().to(clear_all_documents)),
)
// this route needs to be registered before the /documents/{document_id} route to match properly
.service(web::resource("/delete-batch").route(web::post().to(delete_documents)))
.service(
web::resource("/{document_id}")
.route(web::get().to(get_document))
.route(web::delete().to(delete_document)),
),
);
}
async fn get_document(
data: GuardedData<Public, Data>,
path: web::Path<DocumentParam>,
) -> Result<HttpResponse, ResponseError> {
let index = path.index_uid.clone();
let id = path.document_id.clone();
let document = data
.retrieve_document(index, id, None as Option<Vec<String>>)
.await?;
debug!("returns: {:?}", document);
Ok(HttpResponse::Ok().json(document))
}
async fn delete_document(
data: GuardedData<Private, Data>,
path: web::Path<DocumentParam>,
) -> Result<HttpResponse, ResponseError> {
let update_status = data
.delete_documents(path.index_uid.clone(), vec![path.document_id.clone()])
.await?;
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
}
#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
struct BrowseQuery {
offset: Option<usize>,
limit: Option<usize>,
attributes_to_retrieve: Option<String>,
}
async fn get_all_documents(
data: GuardedData<Public, Data>,
path: web::Path<IndexParam>,
params: web::Query<BrowseQuery>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let attributes_to_retrieve = params.attributes_to_retrieve.as_ref().and_then(|attrs| {
let mut names = Vec::new();
for name in attrs.split(',').map(String::from) {
if name == "*" {
return None;
}
names.push(name);
}
Some(names)
});
let documents = data
.retrieve_documents(
path.index_uid.clone(),
params.offset.unwrap_or(DEFAULT_RETRIEVE_DOCUMENTS_OFFSET),
params.limit.unwrap_or(DEFAULT_RETRIEVE_DOCUMENTS_LIMIT),
attributes_to_retrieve,
)
.await?;
debug!("returns: {:?}", documents);
Ok(HttpResponse::Ok().json(documents))
}
#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
struct UpdateDocumentsQuery {
primary_key: Option<String>,
}
/// Route used when the payload type is "application/json"
/// Used to add or replace documents
async fn add_documents(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
params: web::Query<UpdateDocumentsQuery>,
body: Payload,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let update_status = data
.add_documents(
path.into_inner().index_uid,
IndexDocumentsMethod::ReplaceDocuments,
UpdateFormat::Json,
body,
params.primary_key.clone(),
)
.await?;
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
}
/// Route used when the payload type is "application/json"
/// Used to add or update documents
async fn update_documents(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
params: web::Query<UpdateDocumentsQuery>,
body: Payload,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let update = data
.add_documents(
path.into_inner().index_uid,
IndexDocumentsMethod::UpdateDocuments,
UpdateFormat::Json,
body,
params.primary_key.clone(),
)
.await?;
debug!("returns: {:?}", update);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update.id() })))
}
async fn delete_documents(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
body: web::Json<Vec<Value>>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", body);
let ids = body
.iter()
.map(|v| {
v.as_str()
.map(String::from)
.unwrap_or_else(|| v.to_string())
})
.collect();
let update_status = data.delete_documents(path.index_uid.clone(), ids).await?;
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
}
async fn clear_all_documents(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
) -> Result<HttpResponse, ResponseError> {
let update_status = data.clear_documents(path.index_uid.clone()).await?;
debug!("returns: {:?}", update_status);
Ok(HttpResponse::Accepted().json(serde_json::json!({ "updateId": update_status.id() })))
}

View File

@ -1,18 +1,29 @@
use actix_web::{web, HttpResponse};
use actix_web::{web, HttpRequest, HttpResponse};
use log::debug;
use meilisearch_error::ResponseError;
use meilisearch_lib::MeiliSearch;
use serde::{Deserialize, Serialize};
use serde_json::json;
use crate::error::ResponseError;
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::Data;
use crate::extractors::sequential_extractor::SeqHandler;
pub fn services(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("/dumps").route(web::post().to(create_dump)))
.service(web::resource("/dumps/{dump_uid}/status").route(web::get().to(get_dump_status)));
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("").route(web::post().to(SeqHandler(create_dump))))
.service(
web::resource("/{dump_uid}/status").route(web::get().to(SeqHandler(get_dump_status))),
);
}
async fn create_dump(data: GuardedData<Private, Data>) -> Result<HttpResponse, ResponseError> {
let res = data.create_dump().await?;
pub async fn create_dump(
meilisearch: GuardedData<ActionPolicy<{ actions::DUMPS_CREATE }>, MeiliSearch>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
analytics.publish("Dump Created".to_string(), json!({}), Some(&req));
let res = meilisearch.create_dump().await?;
debug!("returns: {:?}", res);
Ok(HttpResponse::Accepted().json(res))
@ -30,10 +41,10 @@ struct DumpParam {
}
async fn get_dump_status(
data: GuardedData<Private, Data>,
meilisearch: GuardedData<ActionPolicy<{ actions::DUMPS_GET }>, MeiliSearch>,
path: web::Path<DumpParam>,
) -> Result<HttpResponse, ResponseError> {
let res = data.dump_status(path.dump_uid.clone()).await?;
let res = meilisearch.dump_info(path.dump_uid.clone()).await?;
debug!("returns: {:?}", res);
Ok(HttpResponse::Ok().json(res))

View File

@ -1,11 +0,0 @@
use actix_web::{web, HttpResponse};
use crate::error::ResponseError;
pub fn services(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("/health").route(web::get().to(get_health)));
}
async fn get_health() -> Result<HttpResponse, ResponseError> {
Ok(HttpResponse::Ok().json(serde_json::json!({ "status": "available" })))
}

View File

@ -1,133 +0,0 @@
use actix_web::{web, HttpResponse};
use chrono::{DateTime, Utc};
use log::debug;
use serde::{Deserialize, Serialize};
use super::{IndexParam, UpdateStatusResponse};
use crate::error::ResponseError;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::Data;
pub fn services(cfg: &mut web::ServiceConfig) {
cfg.service(
web::resource("indexes")
.route(web::get().to(list_indexes))
.route(web::post().to(create_index)),
)
.service(
web::resource("/indexes/{index_uid}")
.route(web::get().to(get_index))
.route(web::put().to(update_index))
.route(web::delete().to(delete_index)),
)
.service(
web::resource("/indexes/{index_uid}/updates").route(web::get().to(get_all_updates_status)),
)
.service(
web::resource("/indexes/{index_uid}/updates/{update_id}")
.route(web::get().to(get_update_status)),
);
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
struct IndexCreateRequest {
uid: String,
primary_key: Option<String>,
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
struct UpdateIndexRequest {
uid: Option<String>,
primary_key: Option<String>,
}
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct UpdateIndexResponse {
name: String,
uid: String,
created_at: DateTime<Utc>,
updated_at: DateTime<Utc>,
primary_key: Option<String>,
}
async fn list_indexes(data: GuardedData<Private, Data>) -> Result<HttpResponse, ResponseError> {
let indexes = data.list_indexes().await?;
debug!("returns: {:?}", indexes);
Ok(HttpResponse::Ok().json(indexes))
}
async fn create_index(
data: GuardedData<Private, Data>,
body: web::Json<IndexCreateRequest>,
) -> Result<HttpResponse, ResponseError> {
let body = body.into_inner();
let meta = data.create_index(body.uid, body.primary_key).await?;
Ok(HttpResponse::Ok().json(meta))
}
async fn get_index(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
) -> Result<HttpResponse, ResponseError> {
let meta = data.index(path.index_uid.clone()).await?;
debug!("returns: {:?}", meta);
Ok(HttpResponse::Ok().json(meta))
}
async fn update_index(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
body: web::Json<UpdateIndexRequest>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", body);
let body = body.into_inner();
let meta = data
.update_index(path.into_inner().index_uid, body.primary_key, body.uid)
.await?;
debug!("returns: {:?}", meta);
Ok(HttpResponse::Ok().json(meta))
}
async fn delete_index(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
) -> Result<HttpResponse, ResponseError> {
data.delete_index(path.index_uid.clone()).await?;
Ok(HttpResponse::NoContent().finish())
}
#[derive(Deserialize)]
struct UpdateParam {
index_uid: String,
update_id: u64,
}
async fn get_update_status(
data: GuardedData<Private, Data>,
path: web::Path<UpdateParam>,
) -> Result<HttpResponse, ResponseError> {
let params = path.into_inner();
let meta = data
.get_update_status(params.index_uid, params.update_id)
.await?;
let meta = UpdateStatusResponse::from(meta);
debug!("returns: {:?}", meta);
Ok(HttpResponse::Ok().json(meta))
}
async fn get_all_updates_status(
data: GuardedData<Private, Data>,
path: web::Path<IndexParam>,
) -> Result<HttpResponse, ResponseError> {
let metas = data.get_updates_status(path.into_inner().index_uid).await?;
let metas = metas
.into_iter()
.map(UpdateStatusResponse::from)
.collect::<Vec<_>>();
debug!("returns: {:?}", metas);
Ok(HttpResponse::Ok().json(metas))
}

View File

@ -0,0 +1,305 @@
use actix_web::error::PayloadError;
use actix_web::http::header::CONTENT_TYPE;
use actix_web::web::Bytes;
use actix_web::HttpMessage;
use actix_web::{web, HttpRequest, HttpResponse};
use bstr::ByteSlice;
use futures::{Stream, StreamExt};
use log::debug;
use meilisearch_error::ResponseError;
use meilisearch_lib::index_controller::{DocumentAdditionFormat, Update};
use meilisearch_lib::milli::update::IndexDocumentsMethod;
use meilisearch_lib::MeiliSearch;
use mime::Mime;
use once_cell::sync::Lazy;
use serde::Deserialize;
use serde_json::Value;
use tokio::sync::mpsc;
use crate::analytics::Analytics;
use crate::error::MeilisearchHttpError;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::extractors::payload::Payload;
use crate::extractors::sequential_extractor::SeqHandler;
use crate::task::SummarizedTaskView;
const DEFAULT_RETRIEVE_DOCUMENTS_OFFSET: usize = 0;
const DEFAULT_RETRIEVE_DOCUMENTS_LIMIT: usize = 20;
static ACCEPTED_CONTENT_TYPE: Lazy<Vec<String>> = Lazy::new(|| {
vec![
"application/json".to_string(),
"application/x-ndjson".to_string(),
"text/csv".to_string(),
]
});
/// This is required because Payload is neither Sync nor Send
fn payload_to_stream(mut payload: Payload) -> impl Stream<Item = Result<Bytes, PayloadError>> {
let (snd, recv) = mpsc::channel(1);
tokio::task::spawn_local(async move {
while let Some(data) = payload.next().await {
let _ = snd.send(data).await;
}
});
tokio_stream::wrappers::ReceiverStream::new(recv)
}
/// Extracts the mime type from the content type and returns
/// a meilisearch error if anything bad happens.
fn extract_mime_type(req: &HttpRequest) -> Result<Option<Mime>, MeilisearchHttpError> {
match req.mime_type() {
Ok(Some(mime)) => Ok(Some(mime)),
Ok(None) => Ok(None),
Err(_) => match req.headers().get(CONTENT_TYPE) {
Some(content_type) => Err(MeilisearchHttpError::InvalidContentType(
content_type.as_bytes().as_bstr().to_string(),
ACCEPTED_CONTENT_TYPE.clone(),
)),
None => Err(MeilisearchHttpError::MissingContentType(
ACCEPTED_CONTENT_TYPE.clone(),
)),
},
}
}
#[derive(Deserialize)]
pub struct DocumentParam {
index_uid: String,
document_id: String,
}
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
web::resource("")
.route(web::get().to(SeqHandler(get_all_documents)))
.route(web::post().to(SeqHandler(add_documents)))
.route(web::put().to(SeqHandler(update_documents)))
.route(web::delete().to(SeqHandler(clear_all_documents))),
)
// this route needs to be registered before the /documents/{document_id} route to match properly
.service(web::resource("/delete-batch").route(web::post().to(SeqHandler(delete_documents))))
.service(
web::resource("/{document_id}")
.route(web::get().to(SeqHandler(get_document)))
.route(web::delete().to(SeqHandler(delete_document))),
);
}
pub async fn get_document(
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_GET }>, MeiliSearch>,
path: web::Path<DocumentParam>,
) -> Result<HttpResponse, ResponseError> {
let index = path.index_uid.clone();
let id = path.document_id.clone();
let document = meilisearch
.document(index, id, None as Option<Vec<String>>)
.await?;
debug!("returns: {:?}", document);
Ok(HttpResponse::Ok().json(document))
}
pub async fn delete_document(
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_DELETE }>, MeiliSearch>,
path: web::Path<DocumentParam>,
) -> Result<HttpResponse, ResponseError> {
let DocumentParam {
document_id,
index_uid,
} = path.into_inner();
let update = Update::DeleteDocuments(vec![document_id]);
let task: SummarizedTaskView = meilisearch.register_update(index_uid, update).await?.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
pub struct BrowseQuery {
offset: Option<usize>,
limit: Option<usize>,
attributes_to_retrieve: Option<String>,
}
pub async fn get_all_documents(
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_GET }>, MeiliSearch>,
path: web::Path<String>,
params: web::Query<BrowseQuery>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let attributes_to_retrieve = params.attributes_to_retrieve.as_ref().and_then(|attrs| {
let mut names = Vec::new();
for name in attrs.split(',').map(String::from) {
if name == "*" {
return None;
}
names.push(name);
}
Some(names)
});
let documents = meilisearch
.documents(
path.into_inner(),
params.offset.unwrap_or(DEFAULT_RETRIEVE_DOCUMENTS_OFFSET),
params.limit.unwrap_or(DEFAULT_RETRIEVE_DOCUMENTS_LIMIT),
attributes_to_retrieve,
)
.await?;
debug!("returns: {:?}", documents);
Ok(HttpResponse::Ok().json(documents))
}
#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
pub struct UpdateDocumentsQuery {
pub primary_key: Option<String>,
}
pub async fn add_documents(
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_ADD }>, MeiliSearch>,
path: web::Path<String>,
params: web::Query<UpdateDocumentsQuery>,
body: Payload,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let params = params.into_inner();
let index_uid = path.into_inner();
analytics.add_documents(
&params,
meilisearch.get_index(index_uid.clone()).await.is_err(),
&req,
);
let allow_index_creation = meilisearch.filters().allow_index_creation;
let task = document_addition(
extract_mime_type(&req)?,
meilisearch,
index_uid,
params.primary_key,
body,
IndexDocumentsMethod::ReplaceDocuments,
allow_index_creation,
)
.await?;
Ok(HttpResponse::Accepted().json(task))
}
pub async fn update_documents(
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_ADD }>, MeiliSearch>,
path: web::Path<String>,
params: web::Query<UpdateDocumentsQuery>,
body: Payload,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let index_uid = path.into_inner();
analytics.update_documents(
&params,
meilisearch.get_index(index_uid.clone()).await.is_err(),
&req,
);
let allow_index_creation = meilisearch.filters().allow_index_creation;
let task = document_addition(
extract_mime_type(&req)?,
meilisearch,
index_uid,
params.into_inner().primary_key,
body,
IndexDocumentsMethod::UpdateDocuments,
allow_index_creation,
)
.await?;
Ok(HttpResponse::Accepted().json(task))
}
async fn document_addition(
mime_type: Option<Mime>,
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_ADD }>, MeiliSearch>,
index_uid: String,
primary_key: Option<String>,
body: Payload,
method: IndexDocumentsMethod,
allow_index_creation: bool,
) -> Result<SummarizedTaskView, ResponseError> {
let format = match mime_type
.as_ref()
.map(|m| (m.type_().as_str(), m.subtype().as_str()))
{
Some(("application", "json")) => DocumentAdditionFormat::Json,
Some(("application", "x-ndjson")) => DocumentAdditionFormat::Ndjson,
Some(("text", "csv")) => DocumentAdditionFormat::Csv,
Some((type_, subtype)) => {
return Err(MeilisearchHttpError::InvalidContentType(
format!("{}/{}", type_, subtype),
ACCEPTED_CONTENT_TYPE.clone(),
)
.into())
}
None => {
return Err(
MeilisearchHttpError::MissingContentType(ACCEPTED_CONTENT_TYPE.clone()).into(),
)
}
};
let update = Update::DocumentAddition {
payload: Box::new(payload_to_stream(body)),
primary_key,
method,
format,
allow_index_creation,
};
let task = meilisearch.register_update(index_uid, update).await?.into();
debug!("returns: {:?}", task);
Ok(task)
}
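
The content-type negotiation above accepts exactly three payload formats (JSON, NDJSON, CSV) and turns anything else, or a missing header, into an error before the payload is streamed. A standalone sketch of the same dispatch, with plain strings standing in for the `Mime` and `DocumentAdditionFormat` types (the `format_for` helper is hypothetical):

```rust
// Hypothetical, simplified version of the content-type dispatch in
// `document_addition`; strings stand in for Mime / DocumentAdditionFormat.
fn format_for(content_type: Option<&str>) -> Result<&'static str, String> {
    match content_type {
        Some("application/json") => Ok("json"),
        Some("application/x-ndjson") => Ok("ndjson"),
        Some("text/csv") => Ok("csv"),
        Some(other) => Err(format!("invalid content type: {}", other)),
        None => Err("missing content type".to_string()),
    }
}

fn main() {
    assert_eq!(format_for(Some("text/csv")), Ok("csv"));
    assert!(format_for(Some("text/plain")).is_err());
    assert!(format_for(None).is_err());
}
```
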
pub async fn delete_documents(
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_DELETE }>, MeiliSearch>,
path: web::Path<String>,
body: web::Json<Vec<Value>>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", body);
let ids = body
.iter()
.map(|v| {
v.as_str()
.map(String::from)
.unwrap_or_else(|| v.to_string())
})
.collect();
let update = Update::DeleteDocuments(ids);
let task: SummarizedTaskView = meilisearch
.register_update(path.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
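
Note that the ids in the delete payload do not have to be strings: a string is kept verbatim, and any other JSON value (a number, for instance) is converted with `Value::to_string()`, so `[1, "two"]` becomes `["1", "two"]`. A standalone sketch of that normalization:

```rust
// Standalone sketch of the id normalization in `delete_documents`:
// string values are kept verbatim, everything else is stringified.
use serde_json::{json, Value};

fn normalize_ids(body: &[Value]) -> Vec<String> {
    body.iter()
        .map(|v| v.as_str().map(String::from).unwrap_or_else(|| v.to_string()))
        .collect()
}

fn main() {
    let body = vec![json!(1), json!("two"), json!(3.5)];
    assert_eq!(normalize_ids(&body), vec!["1", "two", "3.5"]);
}
```
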
pub async fn clear_all_documents(
meilisearch: GuardedData<ActionPolicy<{ actions::DOCUMENTS_DELETE }>, MeiliSearch>,
path: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let update = Update::ClearDocuments;
let task: SummarizedTaskView = meilisearch
.register_update(path.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}


@@ -0,0 +1,163 @@
use actix_web::{web, HttpRequest, HttpResponse};
use log::debug;
use meilisearch_error::ResponseError;
use meilisearch_lib::index_controller::Update;
use meilisearch_lib::MeiliSearch;
use serde::{Deserialize, Serialize};
use serde_json::json;
use time::OffsetDateTime;
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::extractors::sequential_extractor::SeqHandler;
use crate::task::SummarizedTaskView;
pub mod documents;
pub mod search;
pub mod settings;
pub mod tasks;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
web::resource("")
.route(web::get().to(list_indexes))
.route(web::post().to(SeqHandler(create_index))),
)
.service(
web::scope("/{index_uid}")
.service(
web::resource("")
.route(web::get().to(SeqHandler(get_index)))
.route(web::put().to(SeqHandler(update_index)))
.route(web::delete().to(SeqHandler(delete_index))),
)
.service(web::resource("/stats").route(web::get().to(SeqHandler(get_index_stats))))
.service(web::scope("/documents").configure(documents::configure))
.service(web::scope("/search").configure(search::configure))
.service(web::scope("/tasks").configure(tasks::configure))
.service(web::scope("/settings").configure(settings::configure)),
);
}
pub async fn list_indexes(
data: GuardedData<ActionPolicy<{ actions::INDEXES_GET }>, MeiliSearch>,
) -> Result<HttpResponse, ResponseError> {
let search_rules = &data.filters().search_rules;
let indexes: Vec<_> = data
.list_indexes()
.await?
.into_iter()
.filter(|i| search_rules.is_index_authorized(&i.uid))
.collect();
debug!("returns: {:?}", indexes);
Ok(HttpResponse::Ok().json(indexes))
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
pub struct IndexCreateRequest {
uid: String,
primary_key: Option<String>,
}
pub async fn create_index(
meilisearch: GuardedData<ActionPolicy<{ actions::INDEXES_CREATE }>, MeiliSearch>,
body: web::Json<IndexCreateRequest>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let IndexCreateRequest {
primary_key, uid, ..
} = body.into_inner();
analytics.publish(
"Index Created".to_string(),
json!({ "primary_key": primary_key }),
Some(&req),
);
let update = Update::CreateIndex { primary_key };
let task: SummarizedTaskView = meilisearch.register_update(uid, update).await?.into();
Ok(HttpResponse::Accepted().json(task))
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
#[allow(dead_code)]
pub struct UpdateIndexRequest {
uid: Option<String>,
primary_key: Option<String>,
}
#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct UpdateIndexResponse {
name: String,
uid: String,
#[serde(serialize_with = "time::serde::rfc3339::serialize")]
created_at: OffsetDateTime,
#[serde(serialize_with = "time::serde::rfc3339::serialize")]
updated_at: OffsetDateTime,
primary_key: Option<String>,
}
pub async fn get_index(
meilisearch: GuardedData<ActionPolicy<{ actions::INDEXES_GET }>, MeiliSearch>,
path: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let meta = meilisearch.get_index(path.into_inner()).await?;
debug!("returns: {:?}", meta);
Ok(HttpResponse::Ok().json(meta))
}
pub async fn update_index(
meilisearch: GuardedData<ActionPolicy<{ actions::INDEXES_UPDATE }>, MeiliSearch>,
path: web::Path<String>,
body: web::Json<UpdateIndexRequest>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", body);
let body = body.into_inner();
analytics.publish(
"Index Updated".to_string(),
json!({ "primary_key": body.primary_key}),
Some(&req),
);
let update = Update::UpdateIndex {
primary_key: body.primary_key,
};
let task: SummarizedTaskView = meilisearch
.register_update(path.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
pub async fn delete_index(
meilisearch: GuardedData<ActionPolicy<{ actions::INDEXES_DELETE }>, MeiliSearch>,
path: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let uid = path.into_inner();
let update = Update::DeleteIndex;
let task: SummarizedTaskView = meilisearch.register_update(uid, update).await?.into();
Ok(HttpResponse::Accepted().json(task))
}
pub async fn get_index_stats(
meilisearch: GuardedData<ActionPolicy<{ actions::STATS_GET }>, MeiliSearch>,
path: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let response = meilisearch.get_index_stats(path.into_inner()).await?;
debug!("returns: {:?}", response);
Ok(HttpResponse::Ok().json(response))
}


@@ -0,0 +1,255 @@
use actix_web::{web, HttpRequest, HttpResponse};
use log::debug;
use meilisearch_auth::IndexSearchRules;
use meilisearch_error::ResponseError;
use meilisearch_lib::index::{
default_crop_length, default_crop_marker, default_highlight_post_tag,
default_highlight_pre_tag, SearchQuery, DEFAULT_SEARCH_LIMIT,
};
use meilisearch_lib::MeiliSearch;
use serde::Deserialize;
use serde_json::Value;
use crate::analytics::{Analytics, SearchAggregator};
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::extractors::sequential_extractor::SeqHandler;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
web::resource("")
.route(web::get().to(SeqHandler(search_with_url_query)))
.route(web::post().to(SeqHandler(search_with_post))),
);
}
#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
pub struct SearchQueryGet {
q: Option<String>,
offset: Option<usize>,
limit: Option<usize>,
attributes_to_retrieve: Option<String>,
attributes_to_crop: Option<String>,
#[serde(default = "default_crop_length")]
crop_length: usize,
attributes_to_highlight: Option<String>,
filter: Option<String>,
sort: Option<String>,
#[serde(default = "Default::default")]
matches: bool,
facets_distribution: Option<String>,
#[serde(default = "default_highlight_pre_tag")]
highlight_pre_tag: String,
#[serde(default = "default_highlight_post_tag")]
highlight_post_tag: String,
#[serde(default = "default_crop_marker")]
crop_marker: String,
}
impl From<SearchQueryGet> for SearchQuery {
fn from(other: SearchQueryGet) -> Self {
let attributes_to_retrieve = other
.attributes_to_retrieve
.map(|attrs| attrs.split(',').map(String::from).collect());
let attributes_to_crop = other
.attributes_to_crop
.map(|attrs| attrs.split(',').map(String::from).collect());
let attributes_to_highlight = other
.attributes_to_highlight
.map(|attrs| attrs.split(',').map(String::from).collect());
let facets_distribution = other
.facets_distribution
.map(|attrs| attrs.split(',').map(String::from).collect());
let filter = match other.filter {
Some(f) => match serde_json::from_str(&f) {
Ok(v) => Some(v),
_ => Some(Value::String(f)),
},
None => None,
};
let sort = other.sort.map(|attr| fix_sort_query_parameters(&attr));
Self {
q: other.q,
offset: other.offset,
limit: other.limit.unwrap_or(DEFAULT_SEARCH_LIMIT),
attributes_to_retrieve,
attributes_to_crop,
crop_length: other.crop_length,
attributes_to_highlight,
filter,
sort,
matches: other.matches,
facets_distribution,
highlight_pre_tag: other.highlight_pre_tag,
highlight_post_tag: other.highlight_post_tag,
crop_marker: other.crop_marker,
}
}
}
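
One subtlety in the conversion above: the `filter` query parameter is first parsed as JSON, and only falls back to a plain string filter when that fails, so both `filter=["genre = action"]` and `filter=genre = action` are accepted on the GET route. A standalone sketch of that fallback (hypothetical `parse_filter` helper):

```rust
// Standalone sketch of the GET-route filter handling: try JSON first,
// otherwise keep the raw text as a string filter expression.
use serde_json::Value;

fn parse_filter(raw: Option<String>) -> Option<Value> {
    match raw {
        Some(f) => match serde_json::from_str::<Value>(&f) {
            Ok(v) => Some(v),
            Err(_) => Some(Value::String(f)),
        },
        None => None,
    }
}

fn main() {
    assert_eq!(
        parse_filter(Some(r#"["genre = action"]"#.to_string())),
        Some(serde_json::json!(["genre = action"]))
    );
    assert_eq!(
        parse_filter(Some("genre = action".to_string())),
        Some(Value::String("genre = action".to_string()))
    );
}
```
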
/// Incorporate the tenant-token search rules into the search query.
fn add_search_rules(query: &mut SearchQuery, rules: IndexSearchRules) {
query.filter = match (query.filter.take(), rules.filter) {
(None, rules_filter) => rules_filter,
(filter, None) => filter,
(Some(filter), Some(rules_filter)) => {
let filter = match filter {
Value::Array(filter) => filter,
filter => vec![filter],
};
let rules_filter = match rules_filter {
Value::Array(rules_filter) => rules_filter,
rules_filter => vec![rules_filter],
};
Some(Value::Array([filter, rules_filter].concat()))
}
}
}
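
When both the incoming query and the tenant token carry a filter, neither one wins: the two are coerced to arrays and concatenated, and in Meilisearch's filter syntax the elements of that outer array are combined as an AND, so the token's restriction always applies on top of the user's filter. A standalone sketch of the merge over plain `serde_json::Value`s (hypothetical `merge_filters` helper):

```rust
// Standalone sketch of the filter merge done by `add_search_rules`:
// both sides are coerced to arrays and concatenated.
use serde_json::{json, Value};

fn merge_filters(query: Option<Value>, rules: Option<Value>) -> Option<Value> {
    match (query, rules) {
        (None, rules_filter) => rules_filter,
        (filter, None) => filter,
        (Some(filter), Some(rules_filter)) => {
            let filter = match filter {
                Value::Array(f) => f,
                f => vec![f],
            };
            let rules_filter = match rules_filter {
                Value::Array(f) => f,
                f => vec![f],
            };
            Some(Value::Array([filter, rules_filter].concat()))
        }
    }
}

fn main() {
    let merged = merge_filters(
        Some(json!("genre = action")),
        Some(json!(["user_id = 12"])),
    );
    assert_eq!(merged, Some(json!(["genre = action", "user_id = 12"])));
}
```
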
// TODO: TAMO: split on :asc, and :desc, instead of doing some weird things
/// Transform the sort query parameter into the format expected by the POST search route.
fn fix_sort_query_parameters(sort_query: &str) -> Vec<String> {
let mut sort_parameters = Vec::new();
let mut merge = false;
for current_sort in sort_query.trim_matches('"').split(',').map(|s| s.trim()) {
if current_sort.starts_with("_geoPoint(") {
sort_parameters.push(current_sort.to_string());
merge = true;
} else if merge && !sort_parameters.is_empty() {
sort_parameters
.last_mut()
.unwrap()
.push_str(&format!(",{}", current_sort));
if current_sort.ends_with("):desc") || current_sort.ends_with("):asc") {
merge = false;
}
} else {
sort_parameters.push(current_sort.to_string());
merge = false;
}
}
sort_parameters
}
pub async fn search_with_url_query(
meilisearch: GuardedData<ActionPolicy<{ actions::SEARCH }>, MeiliSearch>,
path: web::Path<String>,
params: web::Query<SearchQueryGet>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
debug!("called with params: {:?}", params);
let mut query: SearchQuery = params.into_inner().into();
let index_uid = path.into_inner();
// Tenant token search_rules.
if let Some(search_rules) = meilisearch
.filters()
.search_rules
.get_index_search_rules(&index_uid)
{
add_search_rules(&mut query, search_rules);
}
let mut aggregate = SearchAggregator::from_query(&query, &req);
let search_result = meilisearch.search(index_uid, query).await;
if let Ok(ref search_result) = search_result {
aggregate.succeed(search_result);
}
analytics.get_search(aggregate);
let search_result = search_result?;
// In tests, assert that `exhaustive_nb_hits` is always false.
#[cfg(test)]
assert!(!search_result.exhaustive_nb_hits);
debug!("returns: {:?}", search_result);
Ok(HttpResponse::Ok().json(search_result))
}
pub async fn search_with_post(
meilisearch: GuardedData<ActionPolicy<{ actions::SEARCH }>, MeiliSearch>,
path: web::Path<String>,
params: web::Json<SearchQuery>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let mut query = params.into_inner();
debug!("search called with params: {:?}", query);
let index_uid = path.into_inner();
// Tenant token search_rules.
if let Some(search_rules) = meilisearch
.filters()
.search_rules
.get_index_search_rules(&index_uid)
{
add_search_rules(&mut query, search_rules);
}
let mut aggregate = SearchAggregator::from_query(&query, &req);
let search_result = meilisearch.search(index_uid, query).await;
if let Ok(ref search_result) = search_result {
aggregate.succeed(search_result);
}
analytics.post_search(aggregate);
let search_result = search_result?;
// In tests, assert that `exhaustive_nb_hits` is always false.
#[cfg(test)]
assert!(!search_result.exhaustive_nb_hits);
debug!("returns: {:?}", search_result);
Ok(HttpResponse::Ok().json(search_result))
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_fix_sort_query_parameters() {
let sort = fix_sort_query_parameters("_geoPoint(12, 13):asc");
assert_eq!(sort, vec!["_geoPoint(12,13):asc".to_string()]);
let sort = fix_sort_query_parameters("doggo:asc,_geoPoint(12.45,13.56):desc");
assert_eq!(
sort,
vec![
"doggo:asc".to_string(),
"_geoPoint(12.45,13.56):desc".to_string(),
]
);
let sort = fix_sort_query_parameters(
"doggo:asc , _geoPoint(12.45, 13.56, 2590352):desc , catto:desc",
);
assert_eq!(
sort,
vec![
"doggo:asc".to_string(),
"_geoPoint(12.45,13.56,2590352):desc".to_string(),
"catto:desc".to_string(),
]
);
let sort = fix_sort_query_parameters("doggo:asc , _geoPoint(1, 2), catto:desc");
// This is ugly but eh, I don't want to write a full parser just for this unused route
assert_eq!(
sort,
vec![
"doggo:asc".to_string(),
"_geoPoint(1,2),catto:desc".to_string(),
]
);
}
}


@@ -0,0 +1,333 @@
use log::debug;
use actix_web::{web, HttpRequest, HttpResponse};
use meilisearch_error::ResponseError;
use meilisearch_lib::index::{Settings, Unchecked};
use meilisearch_lib::index_controller::Update;
use meilisearch_lib::MeiliSearch;
use serde_json::json;
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::task::SummarizedTaskView;
#[macro_export]
macro_rules! make_setting_route {
($route:literal, $type:ty, $attr:ident, $camelcase_attr:literal, $analytics_var:ident, $analytics:expr) => {
pub mod $attr {
use actix_web::{web, HttpRequest, HttpResponse, Resource};
use log::debug;
use meilisearch_lib::milli::update::Setting;
use meilisearch_lib::{index::Settings, index_controller::Update, MeiliSearch};
use crate::analytics::Analytics;
use crate::extractors::authentication::{policies::*, GuardedData};
use crate::extractors::sequential_extractor::SeqHandler;
use crate::task::SummarizedTaskView;
use meilisearch_error::ResponseError;
pub async fn delete(
meilisearch: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, MeiliSearch>,
index_uid: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let settings = Settings {
$attr: Setting::Reset,
..Default::default()
};
let allow_index_creation = meilisearch.filters().allow_index_creation;
let update = Update::Settings {
settings,
is_deletion: true,
allow_index_creation,
};
let task: SummarizedTaskView = meilisearch
.register_update(index_uid.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
pub async fn update(
meilisearch: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, MeiliSearch>,
index_uid: actix_web::web::Path<String>,
body: actix_web::web::Json<Option<$type>>,
req: HttpRequest,
$analytics_var: web::Data<dyn Analytics>,
) -> std::result::Result<HttpResponse, ResponseError> {
let body = body.into_inner();
$analytics(&body, &req);
let settings = Settings {
$attr: match body {
Some(inner_body) => Setting::Set(inner_body),
None => Setting::Reset,
},
..Default::default()
};
let allow_index_creation = meilisearch.filters().allow_index_creation;
let update = Update::Settings {
settings,
is_deletion: false,
allow_index_creation,
};
let task: SummarizedTaskView = meilisearch
.register_update(index_uid.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
pub async fn get(
meilisearch: GuardedData<ActionPolicy<{ actions::SETTINGS_GET }>, MeiliSearch>,
index_uid: actix_web::web::Path<String>,
) -> std::result::Result<HttpResponse, ResponseError> {
let settings = meilisearch.settings(index_uid.into_inner()).await?;
debug!("returns: {:?}", settings);
let mut json = serde_json::json!(&settings);
let val = json[$camelcase_attr].take();
Ok(HttpResponse::Ok().json(val))
}
pub fn resources() -> Resource {
Resource::new($route)
.route(web::get().to(SeqHandler(get)))
.route(web::post().to(SeqHandler(update)))
.route(web::delete().to(SeqHandler(delete)))
}
}
};
($route:literal, $type:ty, $attr:ident, $camelcase_attr:literal) => {
make_setting_route!($route, $type, $attr, $camelcase_attr, _analytics, |_, _| {});
};
}
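
Each invocation of `make_setting_route!` below stamps out a module named after the setting, containing `get`, `update`, and `delete` handlers plus a `resources()` function that binds them to GET/POST/DELETE on the given route. For readers less used to this pattern, here is a much smaller, self-contained macro in the same spirit, with no actix or Meilisearch types involved (names are illustrative only):

```rust
// Minimal illustration of the "one module per setting" macro pattern:
// each invocation generates a module with a route constant and a helper.
macro_rules! make_setting {
    ($route:literal, $attr:ident) => {
        pub mod $attr {
            pub const ROUTE: &str = $route;
            pub fn describe() -> String {
                format!("GET/POST/DELETE {}", ROUTE)
            }
        }
    };
}

make_setting!("/stop-words", stop_words);
make_setting!("/synonyms", synonyms);

fn main() {
    assert_eq!(stop_words::ROUTE, "/stop-words");
    println!("{}", synonyms::describe());
}
```
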
make_setting_route!(
"/filterable-attributes",
std::collections::BTreeSet<String>,
filterable_attributes,
"filterableAttributes",
analytics,
|setting: &Option<std::collections::BTreeSet<String>>, req: &HttpRequest| {
use serde_json::json;
analytics.publish(
"FilterableAttributes Updated".to_string(),
json!({
"filterable_attributes": {
"total": setting.as_ref().map(|filter| filter.len()).unwrap_or(0),
"has_geo": setting.as_ref().map(|filter| filter.contains("_geo")).unwrap_or(false),
}
}),
Some(req),
);
}
);
make_setting_route!(
"/sortable-attributes",
std::collections::BTreeSet<String>,
sortable_attributes,
"sortableAttributes",
analytics,
|setting: &Option<std::collections::BTreeSet<String>>, req: &HttpRequest| {
use serde_json::json;
analytics.publish(
"SortableAttributes Updated".to_string(),
json!({
"sortable_attributes": {
"total": setting.as_ref().map(|sort| sort.len()).unwrap_or(0),
"has_geo": setting.as_ref().map(|sort| sort.contains("_geo")).unwrap_or(false),
},
}),
Some(req),
);
}
);
make_setting_route!(
"/displayed-attributes",
Vec<String>,
displayed_attributes,
"displayedAttributes"
);
make_setting_route!(
"/typo",
meilisearch_lib::index::updates::TypoSettings,
typo,
"typo"
);
make_setting_route!(
"/searchable-attributes",
Vec<String>,
searchable_attributes,
"searchableAttributes",
analytics,
|setting: &Option<Vec<String>>, req: &HttpRequest| {
use serde_json::json;
analytics.publish(
"SearchableAttributes Updated".to_string(),
json!({
"searchable_attributes": {
"total": setting.as_ref().map(|searchable| searchable.len()).unwrap_or(0),
},
}),
Some(req),
);
}
);
make_setting_route!(
"/stop-words",
std::collections::BTreeSet<String>,
stop_words,
"stopWords"
);
make_setting_route!(
"/synonyms",
std::collections::BTreeMap<String, Vec<String>>,
synonyms,
"synonyms"
);
make_setting_route!(
"/distinct-attribute",
String,
distinct_attribute,
"distinctAttribute"
);
make_setting_route!(
"/ranking-rules",
Vec<String>,
ranking_rules,
"rankingRules",
analytics,
|setting: &Option<Vec<String>>, req: &HttpRequest| {
use serde_json::json;
analytics.publish(
"RankingRules Updated".to_string(),
json!({
"ranking_rules": {
"sort_position": setting.as_ref().map(|sort| sort.iter().position(|s| s == "sort")),
}
}),
Some(req),
);
}
);
macro_rules! generate_configure {
($($mod:ident),*) => {
pub fn configure(cfg: &mut web::ServiceConfig) {
use crate::extractors::sequential_extractor::SeqHandler;
cfg.service(
web::resource("")
.route(web::post().to(SeqHandler(update_all)))
.route(web::get().to(SeqHandler(get_all)))
.route(web::delete().to(SeqHandler(delete_all))))
$(.service($mod::resources()))*;
}
};
}
generate_configure!(
filterable_attributes,
sortable_attributes,
displayed_attributes,
searchable_attributes,
distinct_attribute,
stop_words,
synonyms,
ranking_rules,
typo
);
pub async fn update_all(
meilisearch: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, MeiliSearch>,
index_uid: web::Path<String>,
body: web::Json<Settings<Unchecked>>,
req: HttpRequest,
analytics: web::Data<dyn Analytics>,
) -> Result<HttpResponse, ResponseError> {
let settings = body.into_inner();
analytics.publish(
"Settings Updated".to_string(),
json!({
"ranking_rules": {
"sort_position": settings.ranking_rules.as_ref().set().map(|sort| sort.iter().position(|s| s == "sort")),
},
"searchable_attributes": {
"total": settings.searchable_attributes.as_ref().set().map(|searchable| searchable.len()).unwrap_or(0),
},
"sortable_attributes": {
"total": settings.sortable_attributes.as_ref().set().map(|sort| sort.len()).unwrap_or(0),
"has_geo": settings.sortable_attributes.as_ref().set().map(|sort| sort.iter().any(|s| s == "_geo")).unwrap_or(false),
},
"filterable_attributes": {
"total": settings.filterable_attributes.as_ref().set().map(|filter| filter.len()).unwrap_or(0),
"has_geo": settings.filterable_attributes.as_ref().set().map(|filter| filter.iter().any(|s| s == "_geo")).unwrap_or(false),
},
}),
Some(&req),
);
let allow_index_creation = meilisearch.filters().allow_index_creation;
let update = Update::Settings {
settings,
is_deletion: false,
allow_index_creation,
};
let task: SummarizedTaskView = meilisearch
.register_update(index_uid.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}
pub async fn get_all(
data: GuardedData<ActionPolicy<{ actions::SETTINGS_GET }>, MeiliSearch>,
index_uid: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let settings = data.settings(index_uid.into_inner()).await?;
debug!("returns: {:?}", settings);
Ok(HttpResponse::Ok().json(settings))
}
pub async fn delete_all(
data: GuardedData<ActionPolicy<{ actions::SETTINGS_UPDATE }>, MeiliSearch>,
index_uid: web::Path<String>,
) -> Result<HttpResponse, ResponseError> {
let settings = Settings::cleared().into_unchecked();
let allow_index_creation = data.filters().allow_index_creation;
let update = Update::Settings {
settings,
is_deletion: true,
allow_index_creation,
};
let task: SummarizedTaskView = data
.register_update(index_uid.into_inner(), update)
.await?
.into();
debug!("returns: {:?}", task);
Ok(HttpResponse::Accepted().json(task))
}

Some files were not shown because too many files have changed in this diff.