30 Commits

Author SHA1 Message Date
Clément Renault
48e8778881 Clean up the modules declarations 2019-12-13 14:38:25 +01:00
Clément Renault
4be23efe66 Remove the AttrCount type
Could probably be reintroduced later
2019-12-13 14:38:25 +01:00
Clément Renault
7d67750865 Reintroduce exactness for one word document field 2019-12-13 14:38:25 +01:00
Clément Renault
746e6e170c Make the test pass again 2019-12-13 14:38:24 +01:00
Clément Renault
d93e35cace Introduce ContextMut and Context structs 2019-12-13 14:38:24 +01:00
Clément Renault
d75339a271 Prefer summing the attribute 2019-12-13 14:38:24 +01:00
Clément Renault
86ee0cbd6e Introduce bucket_sort_with_distinct function 2019-12-13 14:38:24 +01:00
Clément Renault
248ccfc0d8 Update the criteria to the new ones 2019-12-13 14:38:24 +01:00
Clément Renault
ea148575cf Remove the raw_query functions 2019-12-13 14:38:23 +01:00
Clément Renault
efc2be0b7b Bump the sdset dependency to 0.3.6 2019-12-13 14:38:23 +01:00
Clément Renault
8d71112dcb Rewrite the phrase query postings lists
This simplified the multiword_rewrite_matches function a little bit.
2019-12-13 14:38:23 +01:00
Clément Renault
dd03a6256a Debug pre filtered number of documents 2019-12-13 14:38:23 +01:00
Clément Renault
9c03bb3428 First probably working phrase query doc filtering 2019-12-13 14:38:23 +01:00
Clément Renault
22b19c0d93 Fix the processed distance algorithm 2019-12-13 14:38:22 +01:00
Clément Renault
0f698d6bd9 Work in progress: Bad Typo detection
I have an issue where "speakers" is split into "speaker" and "s",
when I compute the distances for the Typo criterion,
it takes "s" into account and puts a distance of zero in the bucket 0
(the "speakers" bucket), therefore it reports any document matching "s"
without typos as best results.

I need to make sure to ignore "s" when its associated part "speaker"
doesn't even exist in the document and is not in the place
it should be ("speaker" followed by "s").

It is hard to accept that this will add much computation time to
the Typo criterion, like in the previous algorithm where I computed
the real query/word indexes and removed the invalid ones
before sending the documents to the bucket sort.
2019-12-13 14:38:22 +01:00
Clément Renault
4e91b31b1f Make the Typo and Words work with synonyms 2019-12-13 14:38:22 +01:00
Clément Renault
f87c67fcad Improve the QueryEnhancer by doing a single lookup 2019-12-13 14:38:22 +01:00
Clément Renault
902625601a Work in progress: It seems like we support synonyms, split and concat words 2019-12-13 14:38:22 +01:00
Clément Renault
d17d4dc5ec Add more debug info 2019-12-13 14:38:21 +01:00
Clément Renault
ef6a4db182 Before improving fields AttrCount
Removing the fields_count fetching cut the search time in half; we should look at lazily pulling them from the criteria that need them

ugly-test: Make the fields_count fetching lazy

Just before running the exactness criterion
2019-12-13 14:38:21 +01:00
Clément Renault
11f3d7782d Introduce the AttrCount type 2019-12-13 14:38:21 +01:00
Quentin de Quelen
3a4130f344 Allow indexing files with null or boolean values 2019-12-12 19:25:05 +01:00
Quentin de Quelen
88b3c05155 Stop words; Do not reindex all documents if there are no documents 2019-12-12 15:31:39 +01:00
Quentin de Quelen
a4f26e8e48 Rewrite the synonym endpoint 2019-12-12 12:47:02 +01:00
qdequele
773a51e7d0 Rename 'update_type' to 'type' on EnqueuedUpdateResult 2019-11-29 15:09:48 +01:00
qdequele
7923752513 Serialize updates results to camelCase 2019-11-29 15:05:54 +01:00
qdequele
3a90233a3d Add status failed on UpdateStatus 2019-11-28 18:41:11 +01:00
Clément Renault
1def56ea11 Change the update loop to be more explicit on index clear 2019-11-27 13:43:28 +01:00
Clément Renault
d08b76a323 Separate the update and main databases
We used the heed typed transaction to make it safe (https://github.com/Kerollmops/heed/pull/27).
2019-11-27 11:29:06 +01:00
Clément Renault
7cc096e0a2 Rename MeiliDB into MeiliSearch 2019-11-26 11:12:30 +01:00