Commit Graph

11460 Commits

Author SHA1 Message Date
ad hoc
6b2c2509b2 fix bug in exact search 2022-04-04 20:54:03 +02:00
ad hoc
56b4f5dce2 add exact prefix to query_docids 2022-04-04 20:54:03 +02:00
ad hoc
21ae4143b1 add exact_word_prefix to Context 2022-04-04 20:54:03 +02:00
ad hoc
e8f06f6c06 extract exact_word_prefix_docids 2022-04-04 20:54:03 +02:00
ad hoc
6dd2e4ffbd introduce exact_word_prefix database in index 2022-04-04 20:54:03 +02:00
ad hoc
ba0bb29cd8 refactor WordPrefixDocids to take dbs instead of indexes 2022-04-04 20:54:02 +02:00
ad hoc
c4c6e35352 query exact_word_docids in resolve_query_tree 2022-04-04 20:54:02 +02:00
ad hoc
8d46a5b0b5 extract exact word docids 2022-04-04 20:54:02 +02:00
ad hoc
5451c64d5d increase criteria asc desc test map size 2022-04-04 20:54:02 +02:00
ad hoc
0a77be4ec0 introduce exact_word_docids db 2022-04-04 20:54:02 +02:00
ad hoc
5f9f82757d refactor spawn_extraction_task 2022-04-04 20:54:02 +02:00
ad hoc
f82d4b36eb introduce exact attribute setting 2022-04-04 20:54:02 +02:00
ad hoc
c882d8daf0 add test for exact words 2022-04-04 20:54:01 +02:00
ad hoc
7e9d56a9e7 disable typos on exact words 2022-04-04 20:54:01 +02:00
bors[bot]
900825bac0 Merge #474
474: Disable typos on exact word r=MarinPostma a=MarinPostma

This PR introduces the `exact_word` setting to disable typo tolerance on custom words.

If a user query contains a word from `exact_words`, no typo derivation will be made for that particular word.

I have chosen to store the words in a FST, to save on deserialization, and allow for fast lookups.

I had some trouble with the `serde` module, and had to rename it `serde_impl`.

## steps:
- [x] introduce new settings to register words to disable typos on
- [x] in `typos`, return exact match is the current word is part of the word to disable typos for.
- [x] update `Context` to return the exact words dictionary.
- [x] merge #473 


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-04 18:39:43 +00:00
ad hoc
3e67d8818c fix typo in test comment 2022-04-04 20:34:23 +02:00
ad hoc
284d8a24e0 add intergration test for disabled typon on word 2022-04-04 20:15:51 +02:00
ad hoc
30a2711bac rename serde module to serde_impl module
needed because of issues with rustfmt
2022-04-04 20:10:55 +02:00
ad hoc
0fd55db21c fmt 2022-04-04 20:10:55 +02:00
ad hoc
559e46be5e fix bad rebase bug 2022-04-04 20:10:55 +02:00
ad hoc
8b1e5d9c6d add test for exact words 2022-04-04 20:10:55 +02:00
ad hoc
774fa8f065 disable typos on exact words 2022-04-04 20:10:55 +02:00
ad hoc
9bbffb8fee add exact words setting 2022-04-04 20:10:54 +02:00
bors[bot]
48a5ce7434 Merge #473
473: set minimum word len for typos r=MarinPostma a=MarinPostma

this PR allows the configuration on the minimum word length for typos.

The default values are the same as previously.

## steps
- [x] introduce settings for the minimum word length for 1 and 2 typos
- [x] update the settings update flow to set this setting
- [x] create a structure `TypoConfig` to configure typo tolerance in the query builder
- [x] in `typo`, use the configuration to create the appropriate query tree node.
- [x] extend `Context` to return the setting for minimum word length for typos
- [x] return correct error message for wrong settings.
- [x] merge #469 

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-04 17:53:14 +00:00
bors[bot]
9e344f6576 Merge #2207
2207: Fix: avoid embedding the user input into the error response. r=Kerollmops a=CNLHC

# Pull Request

## What does this PR do?
Fix #2107. 

The problem is meilisearch embeds the user input to the error message. 

The reason for this problem is `milli` throws a `serde_json: Error` whose `Display` implementation will do this embedding.  

I tried to solve this problem in this PR by manually implementing the `Display` trait for `DocumentFormatError` instead of deriving automatically.

<!-- Please link the issue you're trying to fix with this PR, if none then please create an issue first. -->

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Liu Hancheng <cn_lhc@qq.com>
Co-authored-by: LiuHanCheng <2463765697@qq.com>
2022-04-04 17:35:17 +00:00
bors[bot]
09a72cee03 Merge #2281
2281: Hard limit the number of results returned by a search r=Kerollmops a=Kerollmops

This PR fixes #2133 by hard-limiting the number of results that a search request can return at any time. I would like the guidance of `@MarinPostma` to test that, should I use a mocking test here? Or should I do anything else?

I talked about touching the _nb_hits_ value with `@qdequele` and we concluded that it was not correct to do so.

Could you please confirm that it is the right place to change that?

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-04-04 17:19:05 +00:00
bors[bot]
6bf9824fec Merge #485
485: fix bug on 2 typos derivation r=Kerollmops a=MarinPostma

I found a bug while working on #473. This pr fixes it and add the missing tests on word derivations.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-04 17:17:53 +00:00
ad hoc
853b4a520f fmt 2022-04-04 10:41:46 +02:00
ad hoc
2cb71dff4a add typo integration tests 2022-04-04 10:41:46 +02:00
ad hoc
1941072bb2 implement Copy on Setting 2022-04-04 10:41:46 +02:00
ad hoc
fdaf45aab2 replace hardcoded value with constant in TestContext 2022-04-04 10:41:46 +02:00
ad hoc
950a740bd4 refactor typos for readability 2022-04-04 10:41:46 +02:00
ad hoc
66020cd923 rename min_word_len* to use plain letter numbers 2022-04-04 10:41:46 +02:00
ad hoc
4c4b336ecb rename min word len for typo error 2022-04-01 11:17:03 +02:00
ad hoc
286dd7b2e4 rename min_word_len_2_typo 2022-04-01 11:17:03 +02:00
ad hoc
55af85db3c add tests for min_word_len_for_typo 2022-04-01 11:17:02 +02:00
ad hoc
9102de5500 fix error message 2022-04-01 11:17:02 +02:00
ad hoc
a1a3a49bc9 dynamic minimum word len for typos in query tree builder 2022-04-01 11:17:02 +02:00
ad hoc
5a24e60572 introduce word len for typo setting 2022-04-01 11:17:02 +02:00
ad hoc
9fe40df960 add word derivations tests 2022-04-01 11:05:18 +02:00
ad hoc
d5ddc6b080 fix 2 typos word derivation bug 2022-04-01 10:51:22 +02:00
LiuHanCheng
6fc6b83632 Update meilisearch-http/tests/documents/add_documents.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-04-01 09:30:40 +08:00
LiuHanCheng
eee2cd5abf Update meilisearch-http/tests/documents/add_documents.rs
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2022-04-01 09:30:32 +08:00
bors[bot]
87e4125875 Merge #2267
2267: Add instance options for RAM and CPU usage r=Kerollmops a=2shiori17

# Pull Request

## What does this PR do?
Fixes #2212 
<!-- Please link the issue you're trying to fix with this PR, if none then please create an issue first. -->

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: 2shiori17 <98276492+2shiori17@users.noreply.github.com>
Co-authored-by: shiori <98276492+2shiori17@users.noreply.github.com>
2022-03-31 15:29:18 +00:00
bors[bot]
d2d930dd3f Merge #469
469: add authorize typo setting r=Kerollmops a=MarinPostma

This PR adds support for an authorize typo settings. This makes is possible to disable typos for a whole index. Typos are enabled by default.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-03-31 15:18:08 +00:00
ad hoc
3e34981d9b add test for authorize_typos in update 2022-03-31 14:12:00 +02:00
ad hoc
6ef3bb9d83 fmt 2022-03-31 14:06:23 +02:00
ad hoc
f782fe2062 add authorize_typo_test 2022-03-31 10:08:39 +02:00
ad hoc
c4653347fd add authorize typo setting 2022-03-31 10:05:44 +02:00
Liu Hancheng
7ece7a9d9e change truncate strategy and coresponding test 2022-03-31 10:39:21 +08:00