Commit Graph

715 Commits

Author SHA1 Message Date
59f58c15f7 Implement attribute criterion
* Implement WordLevelIterator
* Implement QueryLevelIterator
* Implement set algorithm based on iterators

Not tested + Some TODO to fix
2021-04-27 14:39:52 +02:00
361193099f Reduce the amount of branches when query tree flattened 2021-04-27 14:39:52 +02:00
7ff4a2a708 Display the number of entries in the infos crate 2021-04-27 14:39:52 +02:00
1aad66bdaa Compute stats about the word prefix level positions database in the infos crate 2021-04-27 14:39:52 +02:00
e65bad16cc Compute the words prefixes at the end of an update 2021-04-27 14:39:52 +02:00
ab92c814c3 Fix attributes score 2021-04-27 14:35:43 +02:00
0ad9499b93 Fix an indexing bug in the words level positions 2021-04-27 14:35:43 +02:00
7aa5753ed2 Make the attribute positions range bounds to be fixed 2021-04-27 14:35:43 +02:00
658f316511 Introduce the Initial Criterion 2021-04-27 14:35:43 +02:00
89ee2cf576 Introduce the TreeLevel struct 2021-04-27 14:25:35 +02:00
bd1a371c62 Compute the WordsLevelPositions only once 2021-04-27 14:25:34 +02:00
8bd4f5d93e Compute the biggest values of the words_level_positions_docids 2021-04-27 14:25:34 +02:00
f713828406 Implement the clear and delete documents for the word-level-positions database 2021-04-27 14:25:34 +02:00
3069bf4f4a Fix and improve the words-level-positions computation 2021-04-27 14:25:34 +02:00
6b1b42b928 Introduce an infos wordsLevelPositionsDocids subcommand 2021-04-27 14:25:34 +02:00
e8cc7f9cee Expose a route in the http-ui to update the WordsLevelPositions 2021-04-27 14:25:34 +02:00
3a25137ee4 Expose and use the WordsLevelPositions update 2021-04-27 14:25:34 +02:00
c765f277a3 Introduce the WordsLevelPositions update 2021-04-27 14:25:34 +02:00
9242f2f1d4 Store the first word positions levels 2021-04-27 14:25:34 +02:00
b0a417f342 Introduce the word_level_position_docids Index database 2021-04-27 14:25:34 +02:00
75e7b1e3da Implement test Context methods 2021-04-27 14:25:34 +02:00
4ff67ec2ee Implement attribute criterion for small amounts of candidates 2021-04-27 14:25:34 +02:00
0f4c0beffd Introduce the Attribute criterion 2021-04-27 14:25:34 +02:00
3bcc1c0560 Merge pull request #164 from meilisearch/clippy-fixes
Make clippy happy
2021-04-21 13:32:29 +02:00
f8dee1b402 [makes clippy happy] search/criteria/proximity.rs 2021-04-21 12:36:45 +02:00
7fa3a1d23e makes clippy happy http-ui 2021-04-21 12:36:45 +02:00
28a8df2f0a Merge pull request #160 from shekhirin/query-words-limit
Support query words limit
2021-04-21 11:14:09 +02:00
6fa00c61d2 feat(search): support words_limit 2021-04-20 12:22:04 +03:00
726fcf015a Merge pull request #146 from meilisearch/facet-float-integer-becomes-number
Facet float-integer becomes facet number
2021-04-20 10:31:47 +02:00
c9b2d3ae1a Warn instead of returning an error when a conversion fails 2021-04-20 10:23:31 +02:00
2aeef09316 Remove debug logs while iterating through the facet levels 2021-04-20 10:23:31 +02:00
51767725b2 Simplify integer and float functions trait bounds 2021-04-20 10:23:31 +02:00
efbfa81fa7 Merge the Float and Integer enum variant into the Number one 2021-04-20 10:23:30 +02:00
f5ec14c54c Merge pull request #163 from meilisearch/next-release-v0.1.1
Update version for the next release (v0.1.1)
2021-04-19 15:52:13 +02:00
127d3d028e Update version for the next release (v0.1.1) 2021-04-19 14:48:13 +02:00
1095874e7e Merge pull request #158 from shekhirin/synonyms
Support synonyms
2021-04-18 11:00:13 +02:00
33860bc3b7 test(update, settings): set & reset synonyms
fixes after review

more fixes after review
2021-04-18 11:24:17 +03:00
e39aabbfe6 feat(search, update): synonyms 2021-04-18 11:24:17 +03:00
995d1a07d4 Merge pull request #162 from michaelchiche/patch-1 2021-04-17 09:47:08 +02:00
f6b06d6e5d typo: wrong command in example 2021-04-16 20:08:43 +02:00
19b6620a92 Merge pull request #125 from meilisearch/distinct
Implement distinct attribute
2021-04-15 16:33:49 +02:00
9c4660d3d6 add tests 2021-04-15 16:25:56 +02:00
75464a1baa review fixes 2021-04-15 16:25:56 +02:00
2f73fa55ae add documentation 2021-04-15 16:25:55 +02:00
45c45e11dd implement distinct attribute
distinct can return error

facet distinct on numbers

return distinct error

review fixes

make get_facet_value more generic

fixes
2021-04-15 16:25:55 +02:00
6e126c96a9 Merge pull request #159 from meilisearch/upd-tokenizer-v0.2.1
Update Tokenizer version to v0.2.1
2021-04-14 19:02:36 +02:00
2c5c79d68e Update Tokenizer version to v0.2.1 2021-04-14 18:54:04 +02:00
c2df51aa95 Merge pull request #156 from meilisearch/stop-words
Stop words
2021-04-14 17:33:06 +02:00
dcb00b2e54 test a new implementation of the stop_words 2021-04-12 18:35:33 +02:00
da036dcc3e Revert "Integrate the stop_words in the querytree"
This reverts commit 12fb509d84.
We revert this commit because it causes bug #150.
The initial algorithm we implemented for the stop_words was:

1. remove the stop_words from the dataset
2. keep the stop_words in the query to see if we can generate new words by
   integrating typos or if the word was a prefix
=> This was causing the bug since, in the case of “The hobbit”, we were
   **always** looking for something starting with “t he” or “th e”
   instead of ignoring the word completely.

For now we are going to fix the bug by completely ignoring the
stop_words in the query.
This could cause another problem where someone mistypes a normal word and
ends up typing a stop_word.

For example, imagine someone searching for the song “Won't he do it”.
If that person misplaces one space and writes “Won' the do it”, then we
will lose a part of the request.

One fix would be to update our query tree to something like this:

---------------------
OR
  OR
    TOLERANT hobbit # the first option is to ignore the stop_word
    AND
      CONSECUTIVE   # the second option is to do as we are doing
        EXACT t     # currently
        EXACT he
      TOLERANT hobbit
---------------------

This would drastically increase the size of our query tree for requests
with a lot of stop_words. For example, think of “The Lord Of The Rings”.

For now, however, we decided to ignore this problem: we consider that
doing so does not reduce the relevancy of the search too much, while it
improves performance.
2021-04-12 18:35:33 +02:00
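
The interim fix described in the revert message above (completely ignoring the stop_words in the query) can be sketched as follows. This is a minimal Rust illustration under stated assumptions, not the code from the commit: the function name `strip_stop_words`, the whitespace tokenization, and the lowercase comparison are all hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of the repository): drops every
/// stop word from a query before any query tree would be built.
/// Assumes whitespace tokenization and lowercase stop words.
fn strip_stop_words<'a>(query: &'a str, stop_words: &HashSet<&str>) -> Vec<&'a str> {
    query
        .split_whitespace()
        // Compare case-insensitively so that "The" matches "the".
        .filter(|word| !stop_words.contains(word.to_lowercase().as_str()))
        .collect()
}
```

With a stop word set containing only "the", the query “The hobbit” is reduced to the single word “hobbit”. The mistyped query “Won' the do it” is reduced to “Won'”, “do”, “it”, which illustrates the trade-off the message mentions: the mistyped word is dropped and part of the request is lost.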