Commit Graph

353 Commits

Author SHA1 Message Date
mpostma
8760beed1c bump meilisearch 2021-02-02 14:23:33 +01:00
bors[bot]
c984fa1071 Merge #1176
1176: fix race condition in document addition r=Kerollmops a=MarinPostma

As described in #1160, there was a race condition when updating settings and adding documents simultaneously. This was due to the schema update and the document addition being processed in two different transactions. This PR moves the schema update logic for the primary key into the same transaction as the document addition, while keeping the input checks for the validity of the primary key in the HTTP route, so as not to break the error reporting for the document addition route.
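The idea of doing both writes in one transaction can be sketched as follows. This is only an illustration under stated assumptions: `Index`, `IndexState`, `write_txn` and `add_documents` are hypothetical names, and a `Mutex` over an in-memory struct stands in for the real storage layer, which this sketch does not model.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Hypothetical in-memory stand-in for an index; not the project's real types.
#[derive(Clone, Default, Debug, PartialEq)]
pub struct IndexState {
    pub primary_key: Option<String>,
    /// Documents keyed by the value of their primary-key field.
    pub documents: BTreeMap<String, BTreeMap<String, String>>,
}

#[derive(Default)]
pub struct Index {
    state: Mutex<IndexState>,
}

impl Index {
    /// Runs `f` on a private copy of the state and commits the copy only if
    /// `f` succeeds. The lock is held for the whole call, so writers are
    /// serialized and a failed write leaves the state untouched.
    pub fn write_txn<T, E>(
        &self,
        f: impl FnOnce(&mut IndexState) -> Result<T, E>,
    ) -> Result<T, E> {
        let mut guard = self.state.lock().unwrap();
        let mut working = guard.clone();
        let out = f(&mut working)?;
        *guard = working;
        Ok(out)
    }

    /// Returns a copy of the committed state.
    pub fn snapshot(&self) -> IndexState {
        self.state.lock().unwrap().clone()
    }
}

/// Sets the primary key (if none is set yet) and inserts the documents
/// inside the same write transaction, so no other writer can observe or
/// change the schema between the two steps.
pub fn add_documents(
    index: &Index,
    candidate_key: &str,
    docs: Vec<BTreeMap<String, String>>,
) -> Result<usize, String> {
    index.write_txn(|state| {
        let key = state
            .primary_key
            .get_or_insert_with(|| candidate_key.to_string())
            .clone();
        for doc in docs {
            let id = doc
                .get(&key)
                .ok_or_else(|| format!("document is missing primary key `{key}`"))?
                .clone();
            state.documents.insert(id, doc);
        }
        Ok(state.documents.len())
    })
}
```

In this sketch, a document that lacks the primary key makes the whole transaction fail, so the primary key chosen in the same call is rolled back together with the documents.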

close #1160.

Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: marin <postma.marin@protonmail.com>
2021-02-02 09:26:32 +00:00
bors[bot]
81e9fd8933 Merge #1184
1184: normalize synonyms during indexation r=MarinPostma a=LegendreM

fix #1135 #964

Normalizes the synonyms before indexing them, so they are no longer case sensitive. The normalization also involves deunicoding in some cases, such as accents, so `été` and `ete` are considered equivalent in a search for synonyms.
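A rough sketch of what such a normalization step could look like is given below. It is an assumption-laden illustration, not the code from this PR: `normalize`, `fold_accent` and `normalize_synonyms` are hypothetical names, and the small hand-written accent table only covers a few precomposed Latin-1 letters, whereas a real deunicoding step handles far more.

```rust
use std::collections::BTreeMap;

/// Maps a few precomposed lowercase accented letters to their ASCII base
/// letter. Simplified stand-in for a real deunicoding step.
fn fold_accent(c: char) -> char {
    match c {
        '\u{e0}'..='\u{e5}' => 'a', // a with grave, acute, circumflex, tilde, diaeresis, ring
        '\u{e7}' => 'c',            // c with cedilla
        '\u{e8}'..='\u{eb}' => 'e', // e with grave, acute, circumflex, diaeresis
        '\u{ec}'..='\u{ef}' => 'i', // i with grave, acute, circumflex, diaeresis
        '\u{f2}'..='\u{f6}' => 'o', // o with grave, acute, circumflex, tilde, diaeresis
        '\u{f9}'..='\u{fc}' => 'u', // u with grave, acute, circumflex, diaeresis
        other => other,
    }
}

/// Lowercases a word, then strips the accents known to `fold_accent`.
pub fn normalize(word: &str) -> String {
    word.to_lowercase().chars().map(fold_accent).collect()
}

/// Normalizes both the keys and the alternatives of a synonym map,
/// merging entries that become identical and dropping duplicates.
pub fn normalize_synonyms(
    synonyms: &BTreeMap<String, Vec<String>>,
) -> BTreeMap<String, Vec<String>> {
    let mut out: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (word, alternatives) in synonyms {
        let entry = out.entry(normalize(word)).or_default();
        for alternative in alternatives {
            let alternative = normalize(alternative);
            if !entry.contains(&alternative) {
                entry.push(alternative);
            }
        }
    }
    out
}
```

With this sketch, `Été`, `été` and `ETE` all normalize to the same key, so a synonym registered under one spelling is found under the others.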

Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-02-01 14:12:57 +00:00
Many
940f83698c Update meilisearch-core/src/update/settings_update.rs
Co-authored-by: marin <postma.marin@protonmail.com>
2021-02-01 12:06:48 +01:00
bors[bot]
f37a420a04 Merge #1174
1174: Limit query words number r=MarinPostma a=MarinPostma

This PR adds a limit to the number of words taken into account in a search query. Query strings that are too long lead to huge performance hits and resource consumption, which occasionally crashes the machine. The limit has been hard-coded to 10, and tests have been added to make sure that it is taken into account.
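As a minimal illustration of such a cap, the sketch below keeps only the first 10 words of a query. The names `MAX_QUERY_WORDS` and `limit_query_words` are hypothetical, and splitting on whitespace is an assumption made for brevity; the real query pipeline goes through the tokenizer.

```rust
/// Limit taken from the PR description; the constant name is hypothetical.
pub const MAX_QUERY_WORDS: usize = 10;

/// Returns at most `max_words` words of `query`, in their original order.
/// Words beyond the limit are ignored instead of being searched.
pub fn limit_query_words(query: &str, max_words: usize) -> Vec<&str> {
    query.split_whitespace().take(max_words).collect()
}
```

For example, calling `limit_query_words` on a 12-word query with `MAX_QUERY_WORDS` yields the first 10 words, while a shorter query is returned unchanged.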

close #941

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-01-28 17:38:34 +00:00
many
eeccdce33a update tokenizer to v0.1.3 2021-01-28 10:33:44 +01:00
marin
1d910dbb42 Update meilisearch-core/src/update/documents_addition.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-01-15 00:55:31 +01:00
many
7a7cb9bcbf update dependencies 2021-01-13 15:48:53 +01:00
many
9b47bbc1ac bump meilisearch 2021-01-13 15:37:15 +01:00
mpostma
430a5f902b fix race condition in document addition 2021-01-13 13:17:52 +01:00
Many
bc0d53e819 Update meilisearch-core/src/update/settings_update.rs
Co-authored-by: marin <postma.marin@protonmail.com>
2021-01-13 13:17:19 +01:00
many
06b2a587af normalize synonyms during indexation 2021-01-12 13:53:32 +01:00
mpostma
81f343a46a add word limit to search queries 2021-01-08 16:23:23 +01:00
mpostma
948c89c26f bump meilisearch 2021-01-06 11:41:44 +01:00
many
677627586c fix test set
fix dump tests
2021-01-05 21:37:05 +01:00
mpostma
0731971300 fix style 2021-01-05 15:21:06 +01:00
mpostma
c290719984 remove byte offset in index_seq 2021-01-05 15:21:06 +01:00
mpostma
2a145e288c fix style 2021-01-05 15:21:06 +01:00
many
aeb676e757 skip indexation while token is not a word 2021-01-05 15:21:06 +01:00
many
2852349e68 update tokenizer version 2021-01-05 15:21:06 +01:00
many
748a8240dd fix highlight shifting bug 2021-01-05 15:21:05 +01:00
mpostma
808be4678a fix style 2021-01-05 15:21:05 +01:00
mpostma
398577f116 bump tokenizer 2021-01-05 15:21:05 +01:00
mpostma
8e64a24d19 fix suggestions 2021-01-05 15:21:05 +01:00
mpostma
8b149c9aa3 update tokenizer dep to release 2021-01-05 15:21:05 +01:00
mpostma
a7c88c7951 restore synonyms tests 2021-01-05 15:21:05 +01:00
mpostma
db64e19b8d all tests pass 2021-01-05 15:21:05 +01:00
mpostma
b574960755 fix split_query_string 2021-01-05 15:21:05 +01:00
mpostma
c6434f609c fix indexing length 2021-01-05 15:21:05 +01:00
mpostma
206308c1aa replace hashset with fst::Set 2021-01-05 15:21:05 +01:00
mpostma
6527d3e492 better separator handling 2021-01-05 15:21:05 +01:00
mpostma
e616b1e356 hard separator offset 2021-01-05 15:21:05 +01:00
mpostma
8843062604 fix indexer tests 2021-01-05 15:21:05 +01:00
mpostma
5e00842087 integration with new tokenizer wip 2021-01-05 15:21:05 +01:00
mpostma
8a4d05b7bb remove meilisearch tokenizer 2021-01-05 15:21:05 +01:00
bors[bot]
061832af7f Merge #1163
1163: remove benches r=LegendreM a=MarinPostma

Remove unused benches, which did not compile anyway.


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-01-05 13:27:42 +00:00
mpostma
0e04c90abe remove benches 2021-01-05 10:54:19 +01:00
mpostma
48eb78b14d bump deps 2021-01-04 16:56:28 +01:00
mpostma
5fe0e06342 fix clippy warnings 2020-12-15 12:42:19 +01:00
mpostma
2904ca7f57 update codebase with schema refactor 2020-12-15 12:04:51 +01:00
mpostma
d45c794a9e bump rustyline 2020-12-09 12:41:36 +01:00
mpostma
c9dd7e10b9 bump ordered floats 2020-12-09 12:40:24 +01:00
mpostma
56ad400c49 update heed 2020-12-09 11:27:38 +01:00
mpostma
e2b0402cf5 bump regex 2020-12-09 10:28:22 +01:00
mpostma
0c7fffeaf6 update env-logger 2020-12-09 10:25:17 +01:00
mpostma
5f8dc21dd2 bump once-cell 2020-12-09 10:22:14 +01:00
mpostma
3ec76ac33d bump meilisearch 2020-11-30 16:35:56 +01:00
bors[bot]
b8e677efd2 Merge #1100
1100: [fix] Remove some clippy warnings r=MarinPostma a=woshilapin

fix #1099 

I'm also wondering if I should add `-- --deny warnings` to the modified line in `test.yml`.

Co-authored-by: Jean SIMARD <woshilapin@tuziwo.info>
2020-11-30 15:02:26 +00:00
bors[bot]
f564a9ce51 Merge #849
849: Update nbHits count with filtered documents r=MarinPostma a=balajisivaraman

Closes #764 
close #1039

After discussing with @MarinPostma on Slack, this is my first attempt at implementing this for the basic flow that will go through `bucket_sort_with_distinct`.

A few thoughts here: 

- To get the count of filtered documents alone, I originally thought of using `filter_map.values().filter(|&&v| !v).count()`. In a few cases, this gave the same result as what I have now implemented. But I realised I couldn't do something similar for `distinct`. So, to be consistent, I have implemented both in a similar fashion.
- I also needed the `contains_key` check to ensure we're not counting the same document ID twice.

@MarinPostma also mentioned that this will be an approximation since the sort is lazy. In the test example that I've updated, the actual filtered count will be just 19 (for `male` records), but due to the `limit` in play, it returns 32 (filtering out 11 records overall).
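The counting approach described above could look roughly like the sketch below. It is not the code from this PR: `count_filtered` is a hypothetical function, document IDs are plain `u64` values, and the `keep` closure stands in for the real filter. It shows a running counter guarded by a `contains_key` check, and returns the cache so the result can be compared with the `filter_map.values().filter(|&&v| !v).count()` alternative.

```rust
use std::collections::HashMap;

/// Walks the candidate document IDs and counts how many distinct documents
/// the filter rejects. `filter_map` caches the filter result per ID, and the
/// `contains_key` check makes sure a document seen twice is counted once.
pub fn count_filtered(
    candidates: &[u64],
    keep: impl Fn(u64) -> bool,
) -> (HashMap<u64, bool>, usize) {
    let mut filter_map: HashMap<u64, bool> = HashMap::new();
    let mut filtered_count = 0;
    for &id in candidates {
        if !filter_map.contains_key(&id) {
            let accepted = keep(id);
            filter_map.insert(id, accepted);
            if !accepted {
                filtered_count += 1;
            }
        }
    }
    (filter_map, filtered_count)
}
```

Because only the candidates actually visited are counted, the result stays an approximation whenever the surrounding sort stops early, as noted above.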

Please let me know if this is the kind of fix we are looking for, and I can implement it in the placeholder search also.

Co-authored-by: Balaji Sivaraman <balaji@balajisivaraman.com>
2020-11-26 09:53:13 +00:00
Jean SIMARD
85d0a914ac [fix] Remove some clippy warnings 2020-11-23 23:24:40 +01:00