Commit Graph

1824 Commits

Author SHA1 Message Date
Loïc Lecrenier
b8a1caad5e Add range search and incremental indexing algorithm 2022-10-26 13:46:14 +02:00
Loïc Lecrenier
63ef0aba18 Start porting facet distribution and sort to new database structure 2022-10-26 13:46:14 +02:00
Loïc Lecrenier
7913d6365c Update Facets indexing to be compatible with new database structure 2022-10-26 13:46:14 +02:00
Loïc Lecrenier
c3f49f766d Prepare refactor of facets database
Prepare refactor of facets database
2022-10-26 13:46:14 +02:00
curquiza
e883bccc76 Update version for the next release (v0.35.0) in Cargo.toml files 2022-10-26 11:43:54 +00:00
bors[bot]
c8f16530d5 Merge #616
616: Introduce an indexation abortion function when indexing documents r=Kerollmops a=Kerollmops



Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-10-26 11:41:18 +00:00
Ewan Higgs
9d27ac8a2e Ignore too many arguments to functions. 2022-10-25 21:22:53 +02:00
Ewan Higgs
42cdc38c7b Allow weird ranges like 1..=0 to pass clippy.
Everything else is just a warning and exit code will be 0.
2022-10-25 21:12:59 +02:00
Ewan Higgs
2ce025a906 Fixes after rebase to fix new issues. 2022-10-25 20:58:31 +02:00
Ewan Higgs
17f7922bfc Remove unneeded lifetimes. 2022-10-25 20:49:04 +02:00
Ewan Higgs
6b2fe94192 Fixes for clippy bringing us down to 18 remaining issues.
This brings us a step closer to enforcing clippy on each build.
2022-10-25 20:49:02 +02:00
Loïc Lecrenier
36bd66281d Add method to create a new Index with specific creation dates 2022-10-25 14:37:56 +02:00
Loïc Lecrenier
9a569d73d1 Minor code style change 2022-10-24 15:30:43 +02:00
Loïc Lecrenier
be302fd250 Remove outdated workaround for duplicate words in phrase search 2022-10-24 15:27:06 +02:00
Loïc Lecrenier
d76d0cb1bf Merge branch 'main' into word-pair-proximity-docids-refactor 2022-10-24 15:23:00 +02:00
curquiza
f3874d58b9 Update version for the next release (v0.34.0) in Cargo.toml files 2022-10-24 10:13:25 +00:00
Loïc Lecrenier
a983129613 Apply suggestions from code review 2022-10-20 09:49:37 +02:00
bors[bot]
f11a4087da Merge #665
665: Fixing piles of clippy errors. r=ManyTheFish a=ehiggs

## Related issue
No issue fixed. Simply cleaning up some code for clippy on the march towards a clean build when #659 is merged.

## What does this PR do?
Most of these are calling clone when the struct supports Copy.

Many are using & and &mut on `self` when the function they are called from already has an immutable or mutable borrow so this isn't needed.

I tried to stay away from actual changes or places where I'd have to name fresh variables.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: Ewan Higgs <ewan.higgs@gmail.com>
2022-10-20 07:19:46 +00:00
Loïc Lecrenier
176ffd23f5 Fix compile error after rebasing wppd-refactor 2022-10-18 10:40:26 +02:00
Loïc Lecrenier
ab2f6f3aa4 Refine some details in word_prefix_pair_proximity indexing code 2022-10-18 10:37:34 +02:00
Loïc Lecrenier
e6e76fbefe Improve performance of resolve_phrase at the cost of some relevancy 2022-10-18 10:37:34 +02:00
Loïc Lecrenier
178d00f93a Cargo fmt 2022-10-18 10:37:34 +02:00
Loïc Lecrenier
830a7c0c7a Use resolve_phrase function for exactness criteria as well 2022-10-18 10:37:34 +02:00
Loïc Lecrenier
18d578dfc4 Adjust some algorithms using DBs of word pair proximities 2022-10-18 10:37:34 +02:00
Loïc Lecrenier
072b576514 Fix proximity value in keys of prefix_word_pair_proximity_docids 2022-10-18 10:37:34 +02:00
Loïc Lecrenier
6c3a5d69e1 Update snapshots 2022-10-18 10:37:34 +02:00
Loïc Lecrenier
a7de4f5b85 Don't add swapped word pairs to the word_pair_proximity_docids db 2022-10-18 10:37:34 +02:00
Loïc Lecrenier
264a04922d Add prefix_word_pair_proximity database
Similar to the word_prefix_pair_proximity one but instead the keys are:
(proximity, prefix, word2)
2022-10-18 10:37:34 +02:00
Loïc Lecrenier
1dbbd8694f Rename StrStrU8Codec to U8StrStrCodec and reorder its fields 2022-10-18 10:37:34 +02:00
Loïc Lecrenier
bdeb47305e Change encoding of word_pair_proximity DB to (proximity, word1, word2)
Same for word_prefix_pair_proximity
2022-10-18 10:37:34 +02:00
Many the fish
81919a35a2 Update milli/src/search/criteria/initial.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-10-17 18:23:20 +02:00
Many the fish
516e838eb4 Update milli/src/search/criteria/initial.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-10-17 18:23:15 +02:00
Clément Renault
fc03e53615 Add a test to check that we can abort an indexation 2022-10-17 17:28:03 +02:00
Kerollmops
6603437cb1 Introduce an indexation abortion function when indexing documents 2022-10-17 17:28:03 +02:00
ManyTheFish
6f55e7844c Add some code comments 2022-10-17 14:41:57 +02:00
ManyTheFish
cf203b7fde Take filter in account when computing the pages candidates 2022-10-17 14:13:44 +02:00
ManyTheFish
d71bc1e69f Compute an exact count when using distinct 2022-10-17 14:13:44 +02:00
ManyTheFish
a396806343 Add settings to force milli to exhaustively compute the total number of hits 2022-10-17 14:13:44 +02:00
Loïc Lecrenier
4c481a8947 Upgrade all dependencies 2022-10-17 13:05:56 +02:00
Ewan Higgs
beb987d3d1 Fixing piles of clippy errors.
Most of these are calling clone when the struct supports Copy.

Many are using & and &mut on `self` when the function they are called
from already has an immutable or mutable borrow so this isn't needed.

I tried to stay away from actual changes or places where I'd have to
name fresh variables.
2022-10-13 22:02:54 +02:00
bors[bot]
f30979d021 Merge #662
662: Enhance word splitting strategy r=ManyTheFish a=akki1306

# Pull Request

## Related issue
Fixes #648 

## What does this PR do?
- [split_best_frequency](55d889522b/milli/src/search/query_tree.rs (L282-L301)) to use frequency of word pairs near together with proximity value of 1 instead of considering the frequency of individual words. Word pairs having max frequency are considered.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

Co-authored-by: Akshay Kulkarni <akshayk.gj@gmail.com>
2022-10-13 08:14:22 +00:00
Akshay Kulkarni
85f3028317 remove underscore and introduce back word_documents_count 2022-10-13 13:21:59 +05:30
Akshay Kulkarni
8195fc6141 revert removal of word_documents_count method 2022-10-13 13:14:27 +05:30
Akshay Kulkarni
32f825d442 move default implementation of word_pair_frequency to TestContext 2022-10-13 12:57:50 +05:30
Akshay Kulkarni
ff8b2d4422 formatting 2022-10-13 12:44:08 +05:30
Akshay Kulkarni
6cb8b46900 use word_pair_frequency and remove word_documents_count 2022-10-13 12:43:11 +05:30
Akshay Kulkarni
8c9245149e format file 2022-10-12 15:27:56 +05:30
Akshay Kulkarni
63e79a9039 update comment 2022-10-12 13:36:48 +05:30
Akshay Kulkarni
7f9680f0a0 Enhance word splitting strategy 2022-10-12 13:18:23 +05:30
Loïc Lecrenier
6fbf5dac68 Simplify documents! macro to reduce compile times 2022-10-12 09:22:05 +02:00