Commit Graph

2205 Commits

Author SHA1 Message Date
bors[bot]
c965200010 Merge #664
664: Fix phrase search containing stop words r=ManyTheFish a=Samyak2

# Pull Request

This a WIP draft PR I wanted to create to let other potential contributors know that I'm working on this issue. I'll be completing this in a few hours from opening this.

## Related issue
Fixes #661 and towards fixing meilisearch/meilisearch#2905

## What does this PR do?
- [x] Change Phrase Operation to use a `Vec<Option<String>>` instead of `Vec<String>` where `None` corresponds to a stop word
- [x] Update all other uses of phrase operation
- [x] Update `resolve_phrase`
- [x] Update `create_primitive_query`?
- [x] Add test

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?


Co-authored-by: Samyak S Sarnayak <samyak201@gmail.com>
Co-authored-by: Samyak Sarnayak <samyak201@gmail.com>
2022-10-29 13:42:52 +00:00
unvalley
d55f0e2e53 Execute cargo fmt 2022-10-28 23:42:23 +09:00
unvalley
d53a80b408 Fix clippy error 2022-10-28 23:41:35 +09:00
Samyak Sarnayak
ecb88143f9 Run cargo fmt 2022-10-28 19:37:02 +05:30
Samyak Sarnayak
03eb5d87c1 Only call plane_sweep on subgroups when 2 or more are present 2022-10-28 19:32:05 +05:30
unvalley
a1d7ed1258 fix clippy error and remove clippy job from ci
Remove clippy job

Fix clippy error type_complexity

Restore ambiguous change
2022-10-28 22:33:50 +09:00
unvalley
f3c0b05ae8 Fix rust fmt 2022-10-28 09:32:31 +09:00
unvalley
f4ec1abb9b Fix all clippy error after conflicts 2022-10-27 23:58:13 +09:00
Samyak S Sarnayak
d35afa0cf5 Change consecutive phrase search grouping logic
Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-10-26 23:10:48 +05:30
Samyak S Sarnayak
752d031010 Update phrase search to use new execute method 2022-10-26 23:07:20 +05:30
unvalley
c7322f704c Fix cargo clippy errors
Dont apply clippy for tests for now

Fix clippy warnings of filter-parser package

parent 8352febd646ec4bcf56a44161e5c4dce0e55111f
author unvalley <38400669+unvalley@users.noreply.github.com> 1666325847 +0900
committer unvalley <kirohi.code@gmail.com> 1666791316 +0900

Update .github/workflows/rust.yml

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>

Allow clippy lint too_many_argments

Allow clippy lint needless_collect

Allow clippy lint too_many_arguments and type_complexity

Fix for clippy warnings comparison_chains

Fix for clippy warnings vec_init_then_push

Allow clippy lint should_implement_trait

Allow clippy lint drop_non_drop

Fix lifetime clipy warnings in filter-paprser

Execute cargo fmt

Fix clippy remaining warnings

Fix clippy remaining warnings again and allow lint on each place
2022-10-27 01:04:23 +09:00
unvalley
811f156031 Execute cargo clippy --fix 2022-10-27 01:00:00 +09:00
Samyak S Sarnayak
488d31ecdf Run cargo fmt 2022-10-26 19:09:45 +05:30
Samyak S Sarnayak
af33d22f25 Consecutive is false when at least 1 stop word is surrounded by words 2022-10-26 19:09:45 +05:30
Samyak S Sarnayak
f1da623af3 Add test for phrase search with stop words and all criteria at once
Moved the actual test into a separate function used by both the existing
test and the new test.
2022-10-26 19:09:44 +05:30
Samyak S Sarnayak
77f1ff019b Simplify stop word checking in create_primitive_query 2022-10-26 19:09:44 +05:30
Samyak S Sarnayak
2aa11afb87 Fix panic when phrase contains only one stop word and nothing else 2022-10-26 19:09:42 +05:30
Samyak S Sarnayak
bb9ce3c5c5 Run cargo fmt 2022-10-26 19:09:03 +05:30
Samyak S Sarnayak
d187b32a28 Fix snapshots to use new phrase type 2022-10-26 19:09:03 +05:30
Samyak S Sarnayak
c8c666c6a6 Use resolve_phrase in exactness and typo criteria 2022-10-26 19:09:01 +05:30
Samyak S Sarnayak
3e190503e6 Search for closest non-stop words in proximity criteria 2022-10-26 19:08:34 +05:30
Samyak S Sarnayak
709ab3c14c Increment position even when it's a stop word in exactness criteria 2022-10-26 19:08:33 +05:30
Samyak S Sarnayak
ef13c6a5b6 Perform filter after enumerate to keep origin indices 2022-10-26 19:08:33 +05:30
Samyak S Sarnayak
6a10b679ca Add test for phrase search with stop words
Originally written by ManyTheFish here:
https://gist.github.com/ManyTheFish/f840e37cb2d2e029ce05396b4d540762

Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-10-26 19:08:32 +05:30
Samyak S Sarnayak
62816dddde [WIP] Fix phrase search containing stop words
Fixes #661 and meilisearch/meilisearch#2905
2022-10-26 19:08:06 +05:30
Loïc Lecrenier
54c0cf93fe Merge remote-tracking branch 'origin/main' into facet-levels-refactor 2022-10-26 15:13:34 +02:00
bors[bot]
365f44c39b Merge #668
668: Fix many Clippy errors part 2 r=ManyTheFish a=ehiggs

This brings us a step closer to enforcing clippy on each build.

# Pull Request

## Related issue
This does not fix any issue outright, but it is a second round of fixes for clippy after https://github.com/meilisearch/milli/pull/665. This should contribute to fixing https://github.com/meilisearch/milli/pull/659.

## What does this PR do?

Satisfies many issues for clippy. The complaints are mostly:

* Passing reference where a variable is already a reference.
* Using clone where a struct already implements `Copy`
* Using `ok_or_else` when it is a closure that returns a value instead of using the closure to call function (hence we use `ok_or`)
* Unambiguous lifetimes don't need names, so we can just use `'_`
* Using `return` when it is not needed as we are on the last expression of a function.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Ewan Higgs <ewan.higgs@gmail.com>
2022-10-26 12:16:24 +00:00
Loïc Lecrenier
631e9910da Depend on released version of fuzzcheck from crates.io 2022-10-26 14:06:59 +02:00
Loïc Lecrenier
2741756248 Merge remote-tracking branch 'origin/main' into facet-levels-refactor 2022-10-26 14:03:23 +02:00
bors[bot]
d3f95e6c69 Merge #671
671: Update version for the next release (v0.35.0) in Cargo.toml files r=Kerollmops a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2022-10-26 11:58:05 +00:00
Loïc Lecrenier
b7f2428961 Fix formatting and warning after rebasing from main 2022-10-26 13:49:33 +02:00
Loïc Lecrenier
3b1f908e5e Revert behaviour of facet distribution to what it was before
Where the docid that is used to get the original facet string value
definitely belongs to the candidates
2022-10-26 13:48:01 +02:00
Loïc Lecrenier
14ca8048a8 Add some documentation on how to run the facet db fuzzer 2022-10-26 13:48:01 +02:00
Loïc Lecrenier
206a3e00e5 cargo fmt 2022-10-26 13:48:01 +02:00
Loïc Lecrenier
f198b20c42 Add facet deletion tests that use both the incremental and bulk methods
+ update deletion snapshots to the new database format
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
e3ba1fc883 Make deletion tests for both soft-deletion and hard-deletion 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
ab5e56fd16 Add document deletion snapshot tests and tests for hard-deletion 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
d885de1600 Add option to avoid soft deletion of documents 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
2295e0e3ce Use real delete function in facet indexing fuzz tests
By deleting multiple docids at once instead of one-by-one
2022-10-26 13:47:46 +02:00
Loïc Lecrenier
acc8caebe6 Add link to GitHub PR to document of update/facet module 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
a034a1e628 Move StrRefCodec and ByteSliceRefCodec to their own files 2022-10-26 13:47:46 +02:00
Loïc Lecrenier
1165ba2171 Make facet deletion incremental 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
0ade699873 Don't crash when failing to decode using StrRef codec 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
d0109627b9 Fix a bug in facet_range_search and add documentation 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
a2270b7432 Change fuzzcheck dependency to point to git repository 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
1ecd3bb822 Fix bug in FieldDocIdFacetCodec 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
51961e1064 Polish some details 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
cb8442a119 Further unify facet databases of f64s and strings 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
3baa34d842 Fix compiler errors/warnings 2022-10-26 13:47:04 +02:00
Loïc Lecrenier
86d9f50b9c Fix bugs in incremental facet indexing with variable parameters
e.g. add one facet value incrementally with a group_size = X and then
add another one with group_size = Y

It is not actually possible to do so with the public API of milli,
but I wanted to make sure the algorithm worked well in those cases
anyway.

The bugs were found by fuzzing the code with fuzzcheck, which I've added
to milli as a conditional dev-dependency. But it can be removed later.
2022-10-26 13:47:04 +02:00