Commit Graph

939 Commits

Author SHA1 Message Date
Kerollmops
f713828406 Implement the clear and delete documents for the word-level-positions database 2021-04-27 14:25:34 +02:00
Kerollmops
3069bf4f4a Fix and improve the words-level-positions computation 2021-04-27 14:25:34 +02:00
Kerollmops
3a25137ee4 Expose and use the WordsLevelPositions update 2021-04-27 14:25:34 +02:00
Kerollmops
c765f277a3 Introduce the WordsLevelPositions update 2021-04-27 14:25:34 +02:00
Kerollmops
9242f2f1d4 Store the first word positions levels 2021-04-27 14:25:34 +02:00
Kerollmops
b0a417f342 Introduce the word_level_position_docids Index database 2021-04-27 14:25:34 +02:00
Kerollmops
c9b2d3ae1a Warn instead of returning an error when a conversion fails 2021-04-20 10:23:31 +02:00
Kerollmops
51767725b2 Simplify integer and float functions trait bounds 2021-04-20 10:23:31 +02:00
Alexey Shekhirin
33860bc3b7 test(update, settings): set & reset synonyms
fixes after review

more fixes after review
2021-04-18 11:24:17 +03:00
Alexey Shekhirin
e39aabbfe6 feat(search, update): synonyms 2021-04-18 11:24:17 +03:00
Marin Postma
45c45e11dd implement distinct attribute
distinct can return error

facet distinct on numbers

return distinct error

review fixes

make get_facet_value more generic

fixes
2021-04-15 16:25:55 +02:00
tamo
dcb00b2e54 test a new implementation of the stop_words 2021-04-12 18:35:33 +02:00
Alexey Shekhirin
84c1dda39d test(http): setting enum serialize/deserialize 2021-04-08 17:03:40 +03:00
Alexey Shekhirin
dc636d190d refactor(http, update): introduce setting enum 2021-04-08 17:03:40 +03:00
Alexey Shekhirin
2658c5c545 feat(index): update fields distribution in clear & delete operations
fixes after review

bump the version of the tokenizer

implement a first version of the stop_words

The front must provide a BTreeSet containing the stop words
The stop_words are set at None if an empty Set is provided
add the stop-words in the http-ui interface

Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests

Integrate the stop_words in the querytree

remove the stop_words from the querytree except if it was a prefix or a typo

more fixes after review
2021-04-01 19:12:35 +03:00
Alexey Shekhirin
27c7ab6e00 feat(index): store fields distribution in index 2021-04-01 18:35:19 +03:00
tamo
a2f46029c7 implement a first version of the stop_words
The front must provide a BTreeSet containing the stop words
The stop_words are set at None if an empty Set is provided
add the stop-words in the http-ui interface

Use maplit in the test
and remove all the useless drop(rtxn) at the end of all tests
2021-04-01 13:57:55 +02:00
Alexey Shekhirin
9205b640a4 feat(index): introduce fields_ids_distribution 2021-03-31 18:44:47 +03:00
mpostma
615fe095e1 update index updated at on index writes 2021-03-15 14:05:47 +01:00
Kerollmops
f51eb46c69 Use the RoaringBitmapLenCodec to retrieve the count of documents 2021-03-09 10:25:39 +01:00
Clément Renault
e5bb96bc3b Fix the searchable settings test 2021-03-06 12:48:41 +01:00
Kerollmops
07784c8990 Tune the words prefixes threshold to compute for 1/1000 instead 2021-03-03 15:51:28 +01:00
Kerollmops
f376c6a728 Make sure we retrieve the docid word positions 2021-03-03 15:45:03 +01:00
many
246286f0eb take hard separator into account 2021-03-03 15:45:03 +01:00
mpostma
e08b6b3ec7 add primary key to fields_id_map when not present 2021-03-01 16:10:16 +01:00
Clément Renault
c318373b88 Expose the WordsPrefixes update on the UpdateBuilder 2021-02-21 12:15:35 +01:00
Kerollmops
a4a48be923 Run the words prefixes update inside of the indexing documents update 2021-02-17 11:22:26 +01:00
Kerollmops
616ed8f73c Clean up the word prefix pair proximities when deleting documents 2021-02-17 11:22:26 +01:00
Clément Renault
ea37fd821d Clean up the words prefixes when deleting documents and words 2021-02-17 11:22:25 +01:00
Clément Renault
62eee9c69e Introduce the sorter_into_lmdb_database helper function 2021-02-17 11:12:39 +01:00
Clément Renault
b5b89990eb Compute and write the word prefix pair proximities database 2021-02-17 11:12:38 +01:00
Kerollmops
9b03b0a1b2 Introduce the word prefix pair proximity docids database 2021-02-17 11:12:38 +01:00
Clément Renault
f365de636f Compute and write the word-prefix-docids database 2021-02-17 11:12:38 +01:00
Clément Renault
ee5a60e1c5 Clear the words prefixes when clearing an index 2021-02-17 10:45:17 +01:00
Clément Renault
b3a21d5a50 Introduce the getters and setters for the words prefixes FST 2021-02-17 10:45:17 +01:00
Clément Renault
89ce4e74fe Do not change the primary key type when we serialize documents 2021-02-15 21:24:36 +01:00
Clément Renault
69acdd437e Deserialize documents ids into JSON Values on deletion 2021-02-15 21:24:36 +01:00
Clément Renault
b3776598d8 Add a test to check deletion of documents with number as primary key 2021-02-15 21:24:35 +01:00
Clément Renault
e8639517da Change the project to become a workspace with milli as a default-member 2021-02-12 16:15:09 +01:00