Commit Graph

1627 Commits

Author SHA1 Message Date
bors[bot]
5cbe879325 Merge #308
308: Implement a better parallel indexer r=Kerollmops a=ManyTheFish

Rewrite the indexer:
- enhance memory consumption control
- optimize parallelism using rayon and crossbeam channel
- factorize the different parts and make new DB implementation easier
- optimize and fix prefix databases


Co-authored-by: many <maxime@meilisearch.com>
2021-09-02 15:03:52 +00:00
many
741a4444a9 Remove log in chunk generator 2021-09-02 16:57:46 +02:00
many
7f7fafb857 Make document_chunk_size settable from update builder 2021-09-02 15:25:39 +02:00
many
db0c681bae Fix Pr comments 2021-09-02 15:17:52 +02:00
bors[bot]
46f7df232a Merge #337
337: Update version for the next release (v0.12.0) r=Kerollmops a=curquiza

Breaking because of the new indexer that implies DB changes #308 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-02 10:13:31 +00:00
Clémentine Urquizar
285849e3a6 Update version for the next release (v0.12.0) 2021-09-02 10:08:41 +02:00
bors[bot]
a589f6c60b Merge #335
335: Get sortable_fields from index only if criteria present in query r=Kerollmops a=shekhirin

Seems like we don't need to retrieve `sortable_fields` from the index if there's no any `sort_criteria` in the query.

Small 🤏  optimization opportunity out there.

Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
2021-09-01 16:01:00 +00:00
bors[bot]
3e0a78acf3 Merge #329
329: Run all benchmarks once every friday r=irevoire a=irevoire

All the benchmarks run every Friday on the `main` branch.
To avoid having pending benchmarks everywhere, we execute one benchmark every 8 hours.
Then the results are uploaded as if it was a normal user-run benchmark.

This PR closes #314 and #321 

Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-01 15:20:49 +00:00
many
4860fd4529 Ignore empty facet values 2021-09-01 16:48:40 +02:00
many
b3a22f31f6 Fix memory consuption in word pair proximity extractor 2021-09-01 16:48:40 +02:00
many
9452fabfb2 Optimize cbo roaring bitmaps merge 2021-09-01 16:48:40 +02:00
many
8f702828ca Ignore errors comming from crossbeam channel senders 2021-09-01 16:48:40 +02:00
many
e09eec37bc Handle distance addition with hard separators 2021-09-01 16:48:40 +02:00
many
fc7cc770d4 Add logging timers 2021-09-01 16:48:40 +02:00
many
a2f59a28f7 Remove unwrap sending errors in channel 2021-09-01 16:48:40 +02:00
many
5c962c03dd Fix and optimize word_prefix_pair_proximity_docids database 2021-09-01 16:48:40 +02:00
many
2d1727697d Take stop word in account 2021-09-01 16:48:40 +02:00
many
823da19745 Fix test and use progress callback 2021-09-01 16:48:39 +02:00
many
1d314328f0 Plug new indexer 2021-09-01 16:48:36 +02:00
many
3aaf1d62f3 Publish grenad CompressionType type in milli 2021-09-01 16:42:08 +02:00
Alexey Shekhirin
0e379558a1 fix(search): get sortable_fields only if criteria present 2021-08-31 21:35:41 +03:00
bors[bot]
d6bba0663a Merge #334
334: Wrap long values into BStr for warn logs r=Kerollmops a=shekhirin

Resolves https://github.com/meilisearch/milli/issues/263

Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
2021-08-31 17:38:54 +00:00
Alexey Shekhirin
0b02eb456c chore(update): wrap long values into BStr for warn logs 2021-08-31 20:28:16 +03:00
bors[bot]
df38794c7d Merge #330
330: Introduce the reset_sortable_fields Settings method r=irevoire a=Kerollmops

I forgot to add the `reset_sortable_fields` method on the `Settings` builder, it is no big deal as the library user (like MeiliSearch) can always call `set_sortable_fields` with an empty list of fields, it is equivalent.

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
2021-08-30 23:26:54 +00:00
bors[bot]
6cdb6722d1 Merge #332
332: Sortable attributes in http-ui r=Kerollmops a=irevoire

- Add a `reset_sortable_attribute` method
- Add the `sortable_attributes` to http-ui
- Fix some broken test in http-ui

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-08-30 15:31:05 +00:00
Tamo
d106eb5b90 add the sortable attributes to http-ui and fix the tests 2021-08-30 16:25:10 +02:00
Tamo
5e639bc0c1 postfix all action name with (cron) 2021-08-30 13:55:00 +02:00
Irevoire
49a6d2d5f1 run all benchmarks once every friday 2021-08-30 13:55:00 +02:00
Kerollmops
f230ae6fd5 Introduce the reset_sortable_fields Settings method 2021-08-25 17:44:16 +02:00
bors[bot]
c8930781eb Merge #328
328: Remove `beta` compilation in CI r=Kerollmops a=shekhirin

Resolves https://github.com/meilisearch/milli/issues/326

Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
2021-08-25 08:45:18 +00:00
Alexey Shekhirin
01461af333 chore(ci): remove Rust beta from tests job 2021-08-24 22:18:13 +03:00
bors[bot]
c51bb6789c Merge #325
325: Update milli version to v0.11.0 r=curquiza a=Kerollmops

This PR also clean-up some dependencies in the Cargo.toml.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-08-24 16:18:49 +00:00
Kerollmops
af65485ba7 Reexport the grenad CompressionType from milli 2021-08-24 18:15:31 +02:00
Kerollmops
f2e1591826 Remove the unused tinytemplate dependency 2021-08-24 18:10:58 +02:00
Kerollmops
2f20257070 Update milli to the v0.11.0 2021-08-24 18:10:11 +02:00
bors[bot]
794c0f64a9 Merge #315
315: Rewrite the indexing benchmarks r=Kerollmops a=irevoire

There was a panic on the benchmark and while I was trying to understand what was happening I decided to rewrite the way the benchmarks were working.

Before we were creating a database with the good setting, and then for each benchmarks we were:
1. Deleting all documents in the database
2. Indexing a batch of documents

Now for each iteration we recreate entirely a new database from scratch.
Since deleting all the documents in a database may not be the same as starting with a fresh new database I prefer this solution.

Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-08-24 15:34:50 +00:00
bors[bot]
731e0e5321 Merge #320
320: Sort at query time r=Kerollmops a=Kerollmops

Re-introduce the Sort at the query time (https://github.com/meilisearch/milli/issues/305)

Co-authored-by: Clément Renault <renault.cle@gmail.com>
2021-08-24 14:19:43 +00:00
Clément Renault
89d0758713 Revert "Revert "Sort at query time"" 2021-08-24 11:55:16 +02:00
bors[bot]
879d5e8799 Merge #319
319: Update version for the next release (v0.10.2) r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-23 10:03:23 +00:00
Clémentine Urquizar
88f6c18665 Update version for the next release (v0.10.2) 2021-08-23 11:33:30 +02:00
bors[bot]
aa1ce97748 Merge #317
317: Fix the facet string docids filterable deletion bug r=Kerollmops a=Kerollmops

Fixes a bug where the deletion of documents was returning a decoding error. But only when the settings are set with filterable attributes.

This bug was introduced in #254 in which we made the engine faster in returning the facet distribution. We changed the way we were storing the inverted index, we were no more storing only documents ids with the original values but also groups identified with integers, depending on the facet level we were using. This is similar to how facet numbers are already stored.

⚠️ As `@curquiza` already said, we must first revert #309 before merging this!

Related to https://github.com/meilisearch/MeiliSearch/issues/1601.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-08-23 08:57:16 +00:00
Clément Renault
c084f7f731 Fix the facet string docids filterable deletion bug 2021-08-23 10:50:39 +02:00
bors[bot]
0d1f83ba4b Merge #318
318: Revert "Sort at query time" r=Kerollmops a=curquiza

Reverts meilisearch/milli#309

We revert this from `main` not because this leads to a bug, but because we don't want to release it now and we have to merge and release an hotfix on `main`.
Cf:
- https://github.com/meilisearch/milli/issues/316
- https://github.com/meilisearch/milli/pull/317

Once the v0.21.0 is released, we should merge again this awesome addition 👌 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-21 08:25:17 +00:00
Clémentine Urquizar
922f9fd4d5 Revert "Sort at query time" 2021-08-20 18:09:17 +02:00
Irevoire
4b99d8cb91 rewrite the indexing benchmarks 2021-08-19 15:02:43 +02:00
bors[bot]
41fc0dcb62 Merge #309
309: Sort at query time r=Kerollmops a=Kerollmops

This PR:
 - Makes the `Asc/Desc` criteria work with strings too, it first returns documents ordered by numbers then by strings, and finally the documents that can't be ordered. Note that it is lexicographically ordered and not ordered by character, which means that it doesn't know about wide and short characters i.e. `a`, `丹`, `▲`.
 - Changes the syntax for the `Asc/Desc` criterion by now using a colon to separate the name and the order i.e. `title:asc`, `price:desc`.
 - Add the `Sort` criterion at the third position in the ranking rules by default.
 - Add the `sort_criteria` method to the `Search` builder struct to let the users define the `Asc/Desc` sortable attributes they want to use at query time. Note that we need to check that the fields are registered in the sortable attributes before performing the search.
 - Introduce a new `InvalidSortableAttribute` user error that is raised when the sort criteria declared at query time are not part of the sortable attributes.
 - `@ManyTheFish` introduced integration tests for the dynamic Sort criterion.

Fixes #305.

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: many <maxime@meilisearch.com>
2021-08-18 16:55:32 +00:00
many
d1df0d20f9 Add integration test of SortBy criterion 2021-08-18 16:21:51 +02:00
Kerollmops
1b7f6ea1e7 Return a new error when the sort criteria is not sortable 2021-08-18 15:04:07 +02:00
Kerollmops
71602e0f1b Add the sortable fields into the settings and in the index 2021-08-18 15:04:07 +02:00
Kerollmops
407f53872a Add a sort_criteria method to the Search builder struct 2021-08-18 15:04:07 +02:00