Commit Graph

835 Commits

Author SHA1 Message Date
many
2d1727697d Take stop word in account 2021-09-01 16:48:40 +02:00
many
823da19745 Fix test and use progress callback 2021-09-01 16:48:39 +02:00
many
1d314328f0 Plug new indexer 2021-09-01 16:48:36 +02:00
many
3aaf1d62f3 Publish grenad CompressionType type in milli 2021-09-01 16:42:08 +02:00
Alexey Shekhirin
0e379558a1 fix(search): get sortable_fields only if criteria present 2021-08-31 21:35:41 +03:00
bors[bot]
d6bba0663a Merge #334
334: Wrap long values into BStr for warn logs r=Kerollmops a=shekhirin

Resolves https://github.com/meilisearch/milli/issues/263

Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
2021-08-31 17:38:54 +00:00
Alexey Shekhirin
0b02eb456c chore(update): wrap long values into BStr for warn logs 2021-08-31 20:28:16 +03:00
Kerollmops
f230ae6fd5 Introduce the reset_sortable_fields Settings method 2021-08-25 17:44:16 +02:00
Kerollmops
af65485ba7 Reexport the grenad CompressionType from milli 2021-08-24 18:15:31 +02:00
Clément Renault
89d0758713 Revert "Revert "Sort at query time"" 2021-08-24 11:55:16 +02:00
Clément Renault
c084f7f731 Fix the facet string docids filterable deletion bug 2021-08-23 10:50:39 +02:00
Clémentine Urquizar
922f9fd4d5 Revert "Sort at query time" 2021-08-20 18:09:17 +02:00
many
d1df0d20f9 Add integration test of SortBy criterion 2021-08-18 16:21:51 +02:00
Kerollmops
1b7f6ea1e7 Return a new error when the sort criteria is not sortable 2021-08-18 15:04:07 +02:00
Kerollmops
71602e0f1b Add the sortable fields into the settings and in the index 2021-08-18 15:04:07 +02:00
Kerollmops
407f53872a Add a sort_criteria method to the Search builder struct 2021-08-18 15:04:07 +02:00
Kerollmops
687cd2e205 Introduce the new Sort criterion and AscDesc enum 2021-08-18 15:04:07 +02:00
Kerollmops
5b88df508e Use the new Asc/Desc syntax everywhere 2021-08-17 14:15:22 +02:00
Kerollmops
fcedff95e8 Change the Asc/Desc criterion syntax to use a colon (:) 2021-08-17 14:03:21 +02:00
Kerollmops
e9ada44509 AscDesc criterion returns documents ordered by numbers then by strings 2021-08-17 13:21:31 +02:00
Kerollmops
110bf6b778 Make the FacetStringIter work in both, ascending and descending orders 2021-08-17 11:18:40 +02:00
Kerollmops
22ebd2658f Introduce the EitherString/RevRange private aliases 2021-08-17 10:47:15 +02:00
Kerollmops
7a5889bc5a Introduce the highest_reverse_iter private method 2021-08-17 10:45:26 +02:00
Kerollmops
ad0d311f8a Introduce the FacetStringLevelZeroRevRange struct 2021-08-17 10:44:43 +02:00
Kerollmops
6214c38da9 Introduce the FacetStringGroupRevRange struct 2021-08-17 10:44:27 +02:00
Kerollmops
1c604de158 Introduce the highest_iter private method on the FacetStringIter struct 2021-08-17 10:41:11 +02:00
Kerollmops
64df159057 Introduce the new_reducing constructor on the FacetStringIter struct 2021-08-17 10:35:06 +02:00
Kerollmops
01a4052828 Move the FacetStringIter creation logic into a private new method 2021-08-17 10:29:43 +02:00
many
7dbefae1e3 Make facet string iterator non reducing 2021-08-12 17:23:39 +02:00
many
8fdf860c17 Remove max values by facet limit for facet distribution 2021-08-12 11:29:20 +02:00
bors[bot]
89b9b61840 Merge #300
300: Fix prefix level position docids database r=curquiza a=ManyTheFish

The prefix search was inverted when we generated the DB.
Instead of searching if word had a prefix in prefix fst,
we were searching if the word was a prefix of a prefix contained in the prefix fst.
The indexer, now, iterate over prefix contained in the fst
and search them by prefix in the word-level-position-docids database,
aggregating matches in a sorter.

Fix #299

Co-authored-by: many <maxime@meilisearch.com>
2021-08-04 16:52:09 +00:00
many
cdeb07f0fd Fix prefix level position docids database
The prefix search was inverted when we generated the DB.
Instead of searching if word had a prefix in prefix fst,
we were searching if the word was a prefix of a prefix contained in the prefix fst.
The indexer, now, iterate over prefix contained in the fst
and search them by prefix in the word-level-position-docids database,
aggregating matches in a sorter.

Fix #299
2021-08-04 14:11:49 +02:00
Kerollmops
90514e03d1 Fix invalid faceted documents ids buffer size 2021-07-29 15:49:23 +02:00
bors[bot]
200e98c211 Merge #293
293: Make sure that the relevancy is not impacted by other settings r=Kerollmops a=Kerollmops

Fix https://github.com/meilisearch/meilisearch/issues/1505.

fix https://github.com/meilisearch/MeiliSearch/issues/1529

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-07-27 16:04:52 +00:00
Kerollmops
dc2b63abdf Introduce an empty FilterCondition variant to support unknown fields 2021-07-27 16:34:04 +02:00
Kerollmops
b12738cfe9 Use the right DB prefixes to store the faceted fields 2021-07-22 19:18:22 +02:00
Kerollmops
7aa6cc9b04 Do not insert fields in the map when changing the settings 2021-07-22 18:40:12 +02:00
bors[bot]
ee3a49cfba Merge #291
291: Fix a bug about zero bytes in the inputs r=irevoire a=Kerollmops

Ok, good news, after a little session of debugging with `@irevoire` we found out that the bug seems to be related to zeroes in the input update. The engine wasn't designed to accept those. The chosen solution is to update the tokenizer to remove those zeroes. We are waiting on https://github.com/meilisearch/tokenizer/pull/52 to be merged and a new version to be released.

It is not an undefined behavior, I repeat: it is a "normal" bug 🎉 👏

----

This PR tries to fix a bug where we use LMDB in the wrong way, leading to panic due to an undefined behavior on the Rust side. I thought [we fixed it in a previous PR](https://github.com/meilisearch/milli/pull/264) but we found out that _a similar_ bug was still present. `@bb` found a way to trigger this bug and helped us find the origin of it.

As I don't have a minimal reproducible example of this bug I bet on the unsafe `put_current` calls when we index new documents as the bug was trigger after a big indexation on a clean database, thus not triggering a deletion update. I only replaced the unsafe `put_current` with two safe calls to `get`/`put`.

I hope it helps and fixes the bug, only `@bb` can help us check that. I am not even sure how I can create a custom Docker image and expose it for testing purposes.

<details>
  <summary>The backtrace leading us to a panic in grenad.</summary>

```
meilisearch_1    | thread 'tokio-runtime-worker' panicked at 'assertion failed: key > &last_key', /root/.cargo/git/checkouts/grenad-e2cb77f65d31bb02/3adcb26/src/block_builder.rs:38:17
meilisearch_1    | stack backtrace:
meilisearch_1    |    0: rust_begin_unwind
meilisearch_1    |              at ./rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:493:5
meilisearch_1    |    1: core::panicking::panic_fmt
meilisearch_1    |              at ./rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/core/src/panicking.rs:92:14
meilisearch_1    |    2: core::panicking::panic
meilisearch_1    |              at ./rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/core/src/panicking.rs:50:5
meilisearch_1    |    3: grenad::block_builder::BlockBuilder::insert
meilisearch_1    |              at ./root/.cargo/git/checkouts/grenad-e2cb77f65d31bb02/3adcb26/src/block_builder.rs:38:17
meilisearch_1    |    4: grenad::writer::Writer<W>::insert
meilisearch_1    |              at ./root/.cargo/git/checkouts/grenad-e2cb77f65d31bb02/3adcb26/src/writer.rs:92:12
meilisearch_1    |    5: milli::update::words_level_positions::write_level_entry
meilisearch_1    |              at ./root/.cargo/git/checkouts/milli-00376cd5db949a15/007fec2/milli/src/update/words_level_positions.rs:262:5
meilisearch_1    |    6: milli::update::words_level_positions::compute_positions_levels
meilisearch_1    |              at ./root/.cargo/git/checkouts/milli-00376cd5db949a15/007fec2/milli/src/update/words_level_positions.rs:211:13
meilisearch_1    |    7: milli::update::words_level_positions::WordsLevelPositions::execute
meilisearch_1    |              at ./root/.cargo/git/checkouts/milli-00376cd5db949a15/007fec2/milli/src/update/words_level_positions.rs:65:23
meilisearch_1    |    8: milli::update::index_documents::IndexDocuments::execute_raw
meilisearch_1    |              at ./root/.cargo/git/checkouts/milli-00376cd5db949a15/007fec2/milli/src/update/index_documents/mod.rs:831:9
meilisearch_1    |    9: milli::update::index_documents::IndexDocuments::execute
meilisearch_1    |              at ./root/.cargo/git/checkouts/milli-00376cd5db949a15/007fec2/milli/src/update/index_documents/mod.rs:372:9
meilisearch_1    |   10: meilisearch_http::index::updates::<impl meilisearch_http::index::Index>::update_documents_txn
meilisearch_1    |              at ./meilisearch/meilisearch-http/src/index/updates.rs:225:30
meilisearch_1    |   11: meilisearch_http::index::updates::<impl meilisearch_http::index::Index>::update_documents
meilisearch_1    |              at ./meilisearch/meilisearch-http/src/index/updates.rs:183:22
meilisearch_1    |   12: meilisearch_http::index::update_handler::UpdateHandler::handle_update
meilisearch_1    |              at ./meilisearch/meilisearch-http/src/index/update_handler.rs:75:18
meilisearch_1    |   13: meilisearch_http::index_controller::index_actor::actor::IndexActor<S>::handle_update::{{closure}}::{{closure}}
meilisearch_1    |              at ./meilisearch/meilisearch-http/src/index_controller/index_actor/actor.rs:174:35
meilisearch_1    |   14: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/blocking/task.rs:42:21
meilisearch_1    |   15: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/task/core.rs:243:17
meilisearch_1    |   16: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/loom/std/unsafe_cell.rs:14:9
meilisearch_1    |   17: tokio::runtime::task::core::CoreStage<T>::poll
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/task/core.rs:233:13
meilisearch_1    |   18: tokio::runtime::task::harness::poll_future::{{closure}}
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/task/harness.rs:427:23
meilisearch_1    |   19: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
meilisearch_1    |              at ./rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panic.rs:344:9
meilisearch_1    |   20: std::panicking::try::do_call
meilisearch_1    |              at ./rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:379:40
meilisearch_1    |   21: std::panicking::try
meilisearch_1    |              at ./rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:343:19
meilisearch_1    |   22: std::panic::catch_unwind
meilisearch_1    |              at ./rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panic.rs:431:14
meilisearch_1    |   23: tokio::runtime::task::harness::poll_future
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/task/harness.rs:414:19
meilisearch_1    |   24: tokio::runtime::task::harness::Harness<T,S>::poll_inner
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/task/harness.rs:89:9
meilisearch_1    |   25: tokio::runtime::task::harness::Harness<T,S>::poll
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/task/harness.rs:59:15
meilisearch_1    |   26: tokio::runtime::task::raw::RawTask::poll
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/task/raw.rs:66:18
meilisearch_1    |   27: tokio::runtime::task::Notified<S>::run
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/task/mod.rs:171:9
meilisearch_1    |   28: tokio::runtime::blocking::pool::Inner::run
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/blocking/pool.rs:265:17
meilisearch_1    |   29: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
meilisearch_1    |              at ./root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.1/src/runtime/blocking/pool.rs:245:17
meilisearch_1    | note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```

</details>

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-07-22 16:14:35 +00:00
Kerollmops
92c0a2cdc1 Add a test that triggers a panic when indexing zeroes 2021-07-22 17:14:44 +02:00
Kerollmops
aa02a7fdd8 Add a test to check that we indeed impact the relevancy 2021-07-22 17:04:38 +02:00
Clément Renault
0227254a65 Return the original string values for the inverted facet index database 2021-07-21 16:59:39 +02:00
Kerollmops
03a01166ba Display the original facet string value from the linear facet database 2021-07-21 16:59:39 +02:00
Clément Renault
d23c250ad5 Fix a bound error in the facet string range construction 2021-07-21 16:59:39 +02:00
Clément Renault
081278dfd6 Use the facet string levels when computing the facet distribution 2021-07-21 16:59:39 +02:00
Clément Renault
5676b204dd Fix the facet string levels codecs 2021-07-21 16:59:38 +02:00
Kerollmops
8c86348119 Indexing the facet strings levels 2021-07-21 16:59:38 +02:00
Kerollmops
a7ae552ba7 Fix the FacetStringLevelZeroRange range when unbounded 2021-07-21 16:59:38 +02:00
Kerollmops
757b2b502a Remove the FacetValueStringCodec 2021-07-21 16:59:38 +02:00
Kerollmops
adfd4da24c Introduce the FacetStringIter iterator 2021-07-21 16:59:38 +02:00
Kerollmops
a79661c6dc Introduce a lot of facet string helper iterators 2021-07-21 16:59:38 +02:00