Commit Graph

913 Commits

Author SHA1 Message Date
Alexey Shekhirin
2658c5c545 feat(index): update fields distribution in clear & delete operations
fixes after review

bump the version of the tokenizer

implement a first version of the stop_words

The front must provide a BTreeSet containing the stop words
The stop_words are set at None if an empty Set is provided
add the stop-words in the http-ui interface

Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests

Integrate the stop_words in the querytree

remove the stop_words from the querytree except if it was a prefix or a typo

more fixes after review
2021-04-01 19:12:35 +03:00
Alexey Shekhirin
27c7ab6e00 feat(index): store fields distribution in index 2021-04-01 18:35:19 +03:00
tamo
12fb509d84 Integrate the stop_words in the querytree
remove the stop_words from the querytree except if it was a prefix or a typo
2021-04-01 13:57:55 +02:00
tamo
a2f46029c7 implement a first version of the stop_words
The front must provide a BTreeSet containing the stop words
The stop_words are set at None if an empty Set is provided
add the stop-words in the http-ui interface

Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests
2021-04-01 13:57:55 +02:00
tamo
62a8f1d707 bump the version of the tokenizer 2021-04-01 13:49:22 +02:00
Alexey Shekhirin
9205b640a4 feat(index): introduce fields_ids_distribution 2021-03-31 18:44:47 +03:00
Alexey Shekhirin
2cb32edaa9 fix(criterion): compile asc/desc regex only once
use once_cell instead of lazy_static

reorder imports
2021-03-30 16:07:14 +03:00
Alexey Shekhirin
1e3f05db8f use fixed number of candidates as a threshold 2021-03-30 11:57:10 +03:00
Alexey Shekhirin
a776ec9718 fix division 2021-03-29 19:16:58 +03:00
Alexey Shekhirin
522e79f2e0 feat(search, criteria): introduce a percentage threshold to the asc/desc 2021-03-29 19:08:31 +03:00
tamo
73dcdb27f6 select a specific release of the tokenizer instead of using the latest git commit 2021-03-25 15:00:18 +01:00
mpostma
9c27183876 fix broken offset 2021-03-15 20:23:50 +01:00
mpostma
f0210453a6 add updated at on put primary key 2021-03-15 14:05:48 +01:00
mpostma
615fe095e1 update index updated at on index writes 2021-03-15 14:05:47 +01:00
mpostma
80d0f9c49d methods to update index time metadata 2021-03-15 14:05:47 +01:00
Kerollmops
d48008339e Introduce two new optional_words and authorize_typos Search options 2021-03-10 11:16:30 +01:00
Kerollmops
54b97ed8e1 Update the fetcher comments 2021-03-10 10:56:26 +01:00
Kerollmops
d301859bbd Introduce a special word_derivations function for Proximity 2021-03-10 10:42:53 +01:00
Kerollmops
facfb4b615 Fix the bucket candidates 2021-03-10 10:42:53 +01:00
Kerollmops
42fd7dea78 Remove the useless typo cache 2021-03-10 10:42:53 +01:00
many
62a70c300d Optimize words criterion 2021-03-10 10:42:53 +01:00
Kerollmops
f51eb46c69 Use the RoaringBitmapLenCodec to retrieve the count of documents 2021-03-09 10:25:39 +01:00
Kerollmops
d781a6164a Rewrite some code with idiomatic Rust 2021-03-08 16:27:52 +01:00
Clément Renault
b18ec00a7a Add a logging_timer macro to the criterion next methods 2021-03-08 16:12:06 +01:00
Kerollmops
82a0f678fb Introduce a cache on the docid_word_positions database method 2021-03-08 16:12:03 +01:00
Clément Renault
5fcaedb880 Introduce a WordDerivationsCache struct 2021-03-08 16:00:53 +01:00
many
2606c92ef9 use plane sweep in proximity criterion 2021-03-08 15:58:39 +01:00
many
ae47bb3594 Introduce plane_sweep function in proximity criterion 2021-03-08 15:58:38 +01:00
Kerollmops
636a9df177 Temporarily fix the tinytemplate doc hidden issue 2021-03-08 15:57:45 +01:00
Clément Renault
3c76b3548d Rework the Asc/Desc criteria to be facet iterator based 2021-03-08 13:32:25 +01:00
Clément Renault
a58d2b6137 Print the Asc/Desc criterion field name in the debug prints 2021-03-08 13:32:25 +01:00
mpostma
e3095be85c Remove Debug use in Display impl 2021-03-08 12:09:09 +01:00
mpostma
9e1eb25232 implement display for criterion
Update milli/src/criterion.rs

Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-03-08 11:00:30 +01:00
Clément Renault
e5bb96bc3b Fix the searchable settings test 2021-03-06 12:48:41 +01:00
Kerollmops
9b6b35d9b7 Clean up some comments 2021-03-03 18:19:10 +01:00
Kerollmops
2cc4a467a6 Change the criterion output that cannot fail 2021-03-03 18:18:33 +01:00
Kerollmops
1fc25148da Remove useless where clauses for the criteria 2021-03-03 18:09:19 +01:00
Kerollmops
07784c8990 Tune the words prefixes threshold to compute for 1/1000 instead 2021-03-03 15:51:28 +01:00
Kerollmops
f376c6a728 Make sure we retrieve the docid word positions 2021-03-03 15:45:03 +01:00
Kerollmops
5c5e51095c Fix the Asc/Desc criteria to always return the QueryTree when available 2021-03-03 15:45:03 +01:00
many
cdaa96df63 optimize proximity criterion 2021-03-03 15:45:03 +01:00
many
246286f0eb take hard separator into account 2021-03-03 15:45:03 +01:00
Kerollmops
6bf6b40495 Remove unused files 2021-03-03 15:45:03 +01:00
Kerollmops
f118d7e067 build criteria from settings 2021-03-03 15:45:03 +01:00
Kerollmops
025835c5b2 Fix the criteria to avoid always returning a placeholder 2021-03-03 15:45:03 +01:00
Kerollmops
36c1f93ceb Do a union of the bucket candidates 2021-03-03 15:45:03 +01:00
many
b0e0c5eba0 remove option of bucket_candidates 2021-03-03 15:45:03 +01:00
Kerollmops
daf126a638 Introduce the final Fetcher criterion 2021-03-03 15:45:03 +01:00
many
7ac09d7b7c remove option of bucket_candidates 2021-03-03 15:45:03 +01:00
Kerollmops
5af63c74e0 Speed-up the MatchingWords highlighting struct 2021-03-03 15:45:03 +01:00