Commit Graph

7200 Commits

Author SHA1 Message Date
58e2903177 first try 2022-03-04 10:46:59 +08:00
af8a5f2c21 async auth 2022-03-02 19:25:51 +01:00
d6400aef27 remove async from meilsearch-authentication 2022-03-02 18:22:34 +01:00
81fe65afed Merge #2200
2200: Fix(dumps): Explicitly define serde for time r=ManyTheFish a=ManyTheFish

fix #2199 


Co-authored-by: ManyTheFish <many@meilisearch.com>
v0.26.0rc4
2022-03-02 10:39:24 +00:00
c2b58720d1 Fix(dumps): Explicitly define serde for time 2022-03-02 11:37:48 +01:00
df518d8b0b Merge #459
459: Update heed link in cargo toml r=Kerollmops a=curquiza

Since grenad and heed have been moved to the meilisearch orga, this PR changes the link.
This is a minor change since GitHub handles automatically the redirection. This PR is only for consisitency.

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-03-01 19:47:14 +00:00
d9ed9de2b0 Update heed link in cargo toml 2022-03-01 19:45:29 +01:00
51cf44d6fd Merge #456
456: Remove useless grenad merging r=Kerollmops a=Kerollmops

This PR must be merged after #454.

This PR removes the part of code that was merging all of the grenad Readers merging that we don't need as the indexer should have merged them and, therefore, we should only have one final grenad Reader. We reduce the amount of CPU usage and memory pressure we were doing uselessly.

`@ManyTheFish` are you sure I can skip merging the `word_docids` database?

Here is the benchmark comparison with the previously merged PR #454:
```
group                                              indexing_reintroduce-appending-sorted-values_c05e42a8    indexing_remove-useless-grenad-merging_d5b8b5a2
-----                                              -----------------------------------------------------    -----------------------------------------------
indexing/Indexing movies with default settings     1.06      16.6±1.04s        ? ?/sec                      1.00      15.7±0.93s        ? ?/sec
indexing/Indexing songs with default settings      1.16      60.1±7.07s        ? ?/sec                      1.00      51.7±5.98s        ? ?/sec
indexing/Indexing songs without faceted numbers    1.06      55.4±6.14s        ? ?/sec                      1.00      52.2±4.13s        ? ?/sec
```

And the comparison with multi-batch indexing before #436, we can see that we gain time for benchmarks that index datasets in multiple batches but there is _so much_ variance that it's not clear.

```
group                                                             indexing_benchmark-multi-batch-indexing-before-speed-up_45f52620    indexing_remove-useless-grenad-merging_d5b8b5a2
-----                                                             ----------------------------------------------------------------    -----------------------------------------------
indexing/Indexing geo_point                                       1.07       6.6±0.08s        ? ?/sec                                 1.00       6.2±0.11s        ? ?/sec
indexing/Indexing songs in three batches with default settings    1.12      57.7±2.14s        ? ?/sec                                 1.00      51.5±3.80s        ? ?/sec
indexing/Indexing songs with default settings                     1.00      47.5±2.52s        ? ?/sec                                 1.09      51.7±5.98s        ? ?/sec
indexing/Indexing songs without any facets                        1.00      43.5±1.43s        ? ?/sec                                 1.12      48.8±3.73s        ? ?/sec
indexing/Indexing songs without faceted numbers                   1.00      47.1±2.23s        ? ?/sec                                 1.11      52.2±4.13s        ? ?/sec
indexing/Indexing wiki                                            1.00    917.3±30.01s        ? ?/sec                                 1.09    998.7±38.92s        ? ?/sec
indexing/Indexing wiki in three batches                           1.09   1091.2±32.73s        ? ?/sec                                 1.00    996.5±15.70s        ? ?/sec
```

What do you think `@irevoire?` Should we change the benchmarks to make them do more runs?

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-03-01 16:48:08 +00:00
5515aa5045 Merge #2197
2197: Additions to 0.26 (Update actix-web dependency to 4.0) r=curquiza a=MarinPostma

- `@robjtede`
`@MarinPostma`
[update actix-web dependency to 4.0](3b2e467ca6)

From main to release-v0.26.0


Co-authored-by: Rob Ede <robjtede@icloud.com>
v0.26.0rc3
2022-02-28 18:09:43 +00:00
15150db957 clippy 2022-02-28 19:03:38 +01:00
3b2e467ca6 update actix-web dependency to 4.0 2022-02-28 19:03:37 +01:00
d5b8b5a2f8 Replace the ugly unwraps by clean if let Somes 2022-02-28 16:31:33 +01:00
8d26f3040c Remove a useless grenad file merging 2022-02-28 16:31:33 +01:00
0c9e8cdf8d Merge #2194
2194: update actix-web dependency to 4.0 r=irevoire a=robjtede

# Pull Request

## What does this PR do?
Updates Actix Web ecosystem crates to 4.0 stable.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] ~~Does this PR fix an existing issue?~~
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?




Co-authored-by: Rob Ede <robjtede@icloud.com>
2022-02-28 15:23:42 +00:00
21898ffc60 Merge #454
454: Reintroduce appending sorted entries when possible r=Kerollmops a=Kerollmops

This PR modifies the `sorter_into_lmdb_database` function to append values into the database instead of get-put-merging them, it should improve the indexation speed for when the database is empty.

```txt
group                                             indexing_main_25123af3                 indexing_reintroduce-appending-sorted-values_c05e42a8
-----                                             ----------------------                 -----------------------------------------------------
indexing/Indexing movies with default settings    1.07      17.8±0.99s        ? ?/sec    1.00      16.6±1.04s        ? ?/sec
indexing/Indexing songs with default settings     1.00      57.0±6.01s        ? ?/sec    1.05      60.1±7.07s        ? ?/sec
indexing/Indexing songs without any facets        1.10      51.8±5.36s        ? ?/sec    1.00      47.3±3.30s        ? ?/sec
```

Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-02-28 14:55:37 +00:00
8d624b3800 clippy 2022-02-28 13:43:22 +00:00
4fbb83a34d bug(snapshot): Correctly open environments in snapshots 2022-02-28 12:37:30 +01:00
961e22493c update actix-web dependency to 4.0 2022-02-25 23:28:55 +00:00
04b1bbf932 Reintroduce appending sorted entries when possible 2022-02-24 14:50:45 +01:00
382be56d36 Merge #453
453: Benchmark multi batch indexing r=Kerollmops a=Kerollmops

Hey `@irevoire,` could you please add the new benchmarks into influx?

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-02-24 12:33:13 +00:00
09ee8e34a5 Merge #2192
2192: Fix max dbs error r=Kerollmops a=MarinPostma

Factor the way we open environments to make sure they are always opened with the same options.


The issue was that indexes were first opened in snapshots with incorrect options, and heed cache returned an environment with incorrect open options on subsequent index open.

fix #2190


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-02-23 16:18:23 +00:00
7e832105d7 bug(snapshot): Correctly open environments in snapshots 2022-02-23 17:12:08 +01:00
acfc96525c Apply GitHub suggestions 2022-02-23 16:20:29 +01:00
a820aa11e6 Add a new movies benchmark to test multi batch indexing 2022-02-23 16:20:29 +01:00
8d2e3e4aba Add a new wiki benchmark to test multi batch indexing 2022-02-23 16:20:29 +01:00
ab5247dc64 Add a new songs benchmark to test multi batch indexing 2022-02-23 16:20:28 +01:00
acd9535588 Merge #455
455: Raise the GitHub CI timeout limit to 72h r=irevoire a=Kerollmops



Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-02-23 14:33:31 +00:00
19bfb2649b Raise the GitHub CI timeout limit to 72h 2022-02-23 15:27:51 +01:00
ff6a7b6007 Merge #2191
2191: fix(analytics): flatten the scheduler options r=curquiza a=irevoire

Implement missing part of [this spec](https://github.com/meilisearch/specifications/blob/develop/text/0034-telemetry-policies.md) by flattening the scheduler options.

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-02-22 16:19:07 +00:00
6312e7f1f3 fix(analytics): flatten the scheduler options 2022-02-22 15:55:50 +01:00
bfb375ac87 Merge #2189
2189: fix(all): fix two dates that were wrongly formatted r=irevoire a=irevoire

Fix #2188 and fix #2187 

Co-authored-by: Tamo <tamo@meilisearch.com>
v0.26.0rc2
2022-02-22 10:48:45 +00:00
21d277a0ef fix(all): fix two dates that were wrongly formatted 2022-02-22 11:29:11 +01:00
c3e3c900f2 Merge #2173
2173: chore(all): replace chrono with time r=irevoire a=irevoire

Chrono has been unmaintained for a few month now and there is a CVE on it.

Also I updated all the error messages related to the API key as you can see here: https://github.com/meilisearch/specifications/pull/114

fix #2172

Co-authored-by: Irevoire <tamo@meilisearch.com>
v0.26.0rc1
2022-02-17 14:12:23 +00:00
05c8d81e65 chore: get rid of chrono in favor of time
Chrono has been unmaintened for a few month now and there is a CVE on it.

make clippy happy

bump milli
2022-02-16 18:14:29 +01:00
cd6276eef9 Merge #2174
2174: Update dashboard with v0.1.8 r=curquiza a=mdubus



Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2022-02-16 16:15:50 +00:00
25123af3b8 Merge #436
436: Speed up the word prefix databases computation time r=Kerollmops a=Kerollmops

This PR depends on the fixes done in #431 and must be merged after it.

In this PR we will bring the `WordPrefixPairProximityDocids`, `WordPrefixDocids` and, `WordPrefixPositionDocids` update structures to a new era, a better era, where computing the word prefix pair proximities costs much fewer CPU cycles, an era where this update structure can use the, previously computed, set of new word docids from the newly indexed batch of documents.

---

The `WordPrefixPairProximityDocids` is an update structure, which means that it is an object that we feed with some parameters and which modifies the LMDB database of an index when asked for. This structure specifically computes the list of word prefix pair proximities, which correspond to a list of pairs of words associated with a proximity (the distance between both words) where the second word is not a word but a prefix e.g. `s`, `se`, `a`. This word prefix pair proximity is associated with the list of documents ids which contains the pair of words and prefix at the given proximity.

The origin of the performances issue that this struct brings is related to the fact that it starts its job from the beginning, it clears the LMDB database before rewriting everything from scratch, using the other LMDB databases to achieve that. I hope you understand that this is absolutely not an optimized way of doing things.

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-02-16 15:41:14 +00:00
7bcaa2fd13 Update dashboard to v0.1.9 2022-02-16 15:53:15 +01:00
ff8d7a810d Change the behavior of the as_cloneable_grenad by taking a ref 2022-02-16 15:40:08 +01:00
f367cc2e75 Finally bump grenad to v0.4.1 2022-02-16 15:28:48 +01:00
f2984f66e6 Merge #452
452: bump milli r=curquiza a=irevoire



Co-authored-by: Irevoire <tamo@meilisearch.com>
2022-02-16 13:49:14 +00:00
67ecd7c147 Update dashboard with v0.1.8 2022-02-16 14:10:47 +01:00
0defeb268c bump milli 2022-02-16 13:27:41 +01:00
ce6ff294cf Merge #2171
2171: Update LICENSE with Meili SAS r=curquiza a=curquiza

Check with thomas, we must put the real name of the company

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-02-15 18:29:03 +00:00
030064da25 Merge #451
451: Update LICENSE with Meili SAS name r=Kerollmops a=curquiza

Check with thomas, we must put the real name of the company

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-02-15 16:18:47 +00:00
b318ab46cb Update LICENSE 2022-02-15 15:54:45 +01:00
84035a27f5 Update LICENSE 2022-02-15 15:52:50 +01:00
0885fcf973 Merge #450
450: Get rid of chrono in favor of time r=Kerollmops a=irevoire

We only use `chrono` as a wrapper around `time`, and since there has been an [open CVE on `chrono` for at least 3 months now](https://github.com/chronotope/chrono/pull/632) and the repo seems to be [struggling with maintenance](https://github.com/chronotope/chrono/pull/639), I think we should use `time` directly which is way more active and sufficient for our use case.

EDIT: Actually the CVE status has been known for more than 6 months: https://github.com/chronotope/chrono/issues/602

Co-authored-by: Irevoire <tamo@meilisearch.com>
2022-02-15 10:54:46 +00:00
48542ac8fd get rid of chrono in favor of time 2022-02-15 11:41:55 +01:00
5890600101 Merge #2170
2170: Update version (v0.26.0) r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
v0.26.0rc0
2022-02-14 15:22:09 +00:00
e2a9414c7a Update version (v0.26.0) 2022-02-14 16:11:07 +01:00