11ea5acff9
Fix
2023-10-30 11:13:10 +01:00
8d77736a67
Fix fid_word_docids
2023-10-30 11:13:10 +01:00
748b333161
Add useful debug assert before key insertion in database
2023-10-30 11:13:10 +01:00
17b647dfe5
Wip
2023-10-30 11:13:08 +01:00
2614e7d9ca
Merge #4174
...
4174: Fix warnings r=dureuill a=irevoire
Fix all the warnings found in the CI: https://github.com/meilisearch/meilisearch/actions/runs/6622576021/job/17988323623
Co-authored-by: Tamo <tamo@meilisearch.com>
2023-10-30 10:12:54 +00:00
e7244aa485
fix warnings
2023-10-30 11:00:46 +01:00
4c6fddb1cb
update charabia
2023-10-26 17:01:10 +02:00
2bae9550c8
Add explanatory comment
2023-10-23 12:06:28 +02:00
32c78ac8b1
add/update tests for search with distinct attribute & pagination with no ranking
2023-10-23 12:06:27 +02:00
5fe7c4545a
compute all candidates correctly when skipping
2023-10-23 12:02:45 +02:00
5e0485d8dd
Merge #4131
...
4131: Reduce proximity range from 7 to 3 r=Kerollmops a=ManyTheFish
## Summary
This PR aims to reduce the impact of the proximity databases on the indexing time and on the database size by reducing the maximum distance between two words to be indexed in the proximity database.
## Stats
### Impact on database size and indexing time

### Impact on search relevancy
<details>
| dataset_name | host_name | Relevancy rate (Precision) | completion_rate 25.00% | completion_rate 50.00% | completion_rate 75.00% | completion_rate 100.00% |
|--------------|------------------|------------------------------------|-----------------|-----------------|-----------------|-----------------|
| FBIS | 1_4_0 | percentile-10 | 0.00% | 0.00% | 0.00% | 0.00% |
| FBIS | 1_4_0 | percentile-25 | 0.00% | 0.00% | 0.00% | 0.00% |
| FBIS | 1_4_0 | percentile-50 | 0.00% | 0.00% | 5.00% | 5.56% |
| FBIS | 1_4_0 | percentile-75 | 0.00% | 12.50% | 35.00% | 45.00% |
| FBIS | 1_4_0 | percentile-90 | 20.00% | 40.00% | | 100.00% |
| FBIS | 1_4_0 | average | 5.78% | 11.16% | 21.90% | 26.29% |
| FBIS | reduce_proximity | percentile-10 | 0.00% | 0.00% | 0.00% | 0.00% |
| FBIS | reduce_proximity | percentile-25 | 0.00% | 0.00% | 0.00% | 0.00% |
| FBIS | reduce_proximity | percentile-50 | 0.00% | 0.00% | 5.00% | 5.56% |
| FBIS | reduce_proximity | percentile-75 | 0.00% | 15.00% | 35.00% | 40.00% |
| FBIS | reduce_proximity | percentile-90 | 20.00% | 40.00% | 85.00% | 100.00% |
| FBIS | reduce_proximity | average | 5.55% | 11.34% | 21.75% | 26.14% |
| FR94 | 1_4_0 | percentile-10 | 0.00% | 0.00% | 0.00% | 0.00% |
| FR94 | 1_4_0 | percentile-25 | 0.00% | 0.00% | 0.00% | 0.00% |
| FR94 | 1_4_0 | percentile-50 | 0.00% | 0.00% | 0.00% | 0.00% |
| FR94 | 1_4_0 | percentile-75 | 0.00% | 5.00% | 15.00% | 42.11% |
| FR94 | 1_4_0 | percentile-90 | 15.00% | 54.55% | 100.00% | 100.00% |
| FR94 | 1_4_0 | average | 5.95% | 12.07% | 18.70% | 25.57% |
| FR94 | reduce_proximity | percentile-10 | 0.00% | 0.00% | 0.00% | 0.00% |
| FR94 | reduce_proximity | percentile-25 | 0.00% | 0.00% | 0.00% | 0.00% |
| FR94 | reduce_proximity | percentile-50 | 0.00% | 0.00% | 0.00% | 0.00% |
| FR94 | reduce_proximity | percentile-75 | 0.00% | 5.00% | 15.00% | 42.11% |
| FR94 | reduce_proximity | percentile-90 | 15.00% | 54.55% | 100.00% | 100.00% |
| FR94 | reduce_proximity | average | 5.79% | 12.00% | 18.70% | 25.53% |
| FT | 1_4_0 | percentile-10 | 0.00% | 0.00% | 0.00% | 0.00% |
| FT | 1_4_0 | percentile-25 | 0.00% | 0.00% | 0.00% | 0.00% |
| FT | 1_4_0 | percentile-50 | 0.00% | 0.00% | 5.00% | 10.00% |
| FT | 1_4_0 | percentile-75 | 0.00% | 15.00% | 30.00% | 40.00% |
| FT | 1_4_0 | percentile-90 | 20.00% | 50.00% | 65.00% | 100.00% |
| FT | 1_4_0 | average | 5.08% | 12.58% | 20.00% | 25.49% |
| FT | reduce_proximity | percentile-10 | 0.00% | 0.00% | 0.00% | 0.00% |
| FT | reduce_proximity | percentile-25 | 0.00% | 0.00% | 0.00% | 0.00% |
| FT | reduce_proximity | percentile-50 | 0.00% | 0.00% | 5.00% | 10.00% |
| FT | reduce_proximity | percentile-75 | 0.00% | 15.00% | 30.00% | 40.00% |
| FT | reduce_proximity | percentile-90 | 10.00% | 45.00% | 60.00% | 100.00% |
| FT | reduce_proximity | average | 5.01% | 12.64% | 20.10% | 25.53% |
| LAT | 1_4_0 | percentile-10 | 0.00% | 0.00% | 0.00% | 0.00% |
| LAT | 1_4_0 | percentile-25 | 0.00% | 0.00% | 0.00% | 0.00% |
| LAT | 1_4_0 | percentile-50 | 0.00% | 0.00% | 5.00% | 5.00% |
| LAT | 1_4_0 | percentile-75 | 5.00% | 15.00% | 30.00% | 30.00% |
| LAT | 1_4_0 | percentile-90 | 15.00% | 45.00% | 60.00% | 80.00% |
| LAT | 1_4_0 | average | 4.80% | 11.80% | 17.88% | 21.62% |
| LAT | reduce_proximity | percentile-10 | 0.00% | 0.00% | 0.00% | 0.00% |
| LAT | reduce_proximity | percentile-25 | 0.00% | 0.00% | 0.00% | 0.00% |
| LAT | reduce_proximity | percentile-50 | 0.00% | 0.00% | 5.00% | 5.00% |
| LAT | reduce_proximity | percentile-75 | 0.00% | 11.11% | 25.00% | 35.00% |
| LAT | reduce_proximity | percentile-90 | 15.00% | 45.00% | 55.00% | 80.00% |
| LAT | reduce_proximity | average | 4.43% | 11.23% | 17.32% | 21.45% |
</details>
### Impact on Search time
| dataset_name | host_name | 25.00% | 50.00% | 75.00% | 100.00% | Average |
|--------------|------------------|------------:|------------:|------------:|------------:|-------------|
| FBIS | 1_4_0 | 3.45 | 7.446666667 | 9.773489933 | 9.620300752 | 7.572614338 |
| FBIS | reduce_proximity | 2.983333333 | 5.316666667 | 6.911073826 | 7.637218045 | 5.712072968 |
| FR94 | 1_4_0 | 2.236666667 | 4.45 | 5.523489933 | 4.560150376 | 4.192576744 |
| FR94 | reduce_proximity | 2.09 | 3.991666667 | 4.981543624 | 4.266917293 | 3.832531896 |
| FT | 1_4_0 | 5.956666667 | 9.656666667 | 13.86912752 | 10.83270677 | 10.0787919 |
| FT | reduce_proximity | 4.51 | 5.981666667 | 7.701342282 | 6.766917293 | 6.23998156 |
| LAT | 1_4_0 | 5.856666667 | 9.233333333 | 12.98322148 | 10.78759398 | 9.715203865 |
| LAT | reduce_proximity | 6.91 | 6.706666667 | 8.463087248 | 8.265037594 | 7.586197877 |
## Technical approach
- Ensure the MAX_DISTANCE constant is used everywhere needed
- Reduce the MAX_DISTANCE from 8 to 4
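
As a rough illustration of what the constant controls, the sketch below caps the distance between two word positions at `MAX_DISTANCE`. The helper names `capped_proximity` and `is_indexed`, the use of a plain absolute difference, and the rule that only pairs strictly below the cap are stored are all assumptions made for this example, not the actual milli code.

```rust
/// Assumed value after this PR (it lowers the constant from 8 to 4).
pub const MAX_DISTANCE: u32 = 4;

/// Hypothetical helper: distance between two word positions,
/// capped at MAX_DISTANCE.
pub fn capped_proximity(lhs: u32, rhs: u32) -> u32 {
    lhs.abs_diff(rhs).min(MAX_DISTANCE)
}

/// Assumption: only pairs strictly closer than MAX_DISTANCE are stored,
/// which would give the "7 to 3" range mentioned in the PR title.
pub fn is_indexed(lhs: u32, rhs: u32) -> bool {
    capped_proximity(lhs, rhs) < MAX_DISTANCE
}
```

Under these assumptions, words at positions 2 and 5 would still be paired, while words at positions 2 and 6 would not.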
## Related
TBD
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-10-18 14:56:08 +00:00
27eec21415
Fix tests
2023-10-18 16:03:22 +02:00
62dfd09dc6
Add more puffin logs to the deletion functions
2023-10-13 13:11:09 +02:00
f343ef5f2f
Merge #4108
...
4108: Fix bug where search with distinct attribute and no ranking, returns offset+limit hits r=curquiza a=vivek-26
# Pull Request
## Related issue
Fixes #4078
## What does this PR do?
This PR -
- Fixes a bug where search with distinct attribute and no ranking returns offset+limit hits.
- Adds unit and integration tests.
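
The intended behaviour can be sketched as follows: deduplicate on the distinct attribute first, then skip `offset` documents and keep at most `limit`. The `Doc` type and the `distinct_page` function are hypothetical names for this example and do not reflect the actual search pipeline.

```rust
use std::collections::HashSet;

/// Hypothetical document: an id plus the value of its distinct attribute.
pub struct Doc {
    pub id: u32,
    pub distinct_value: String,
}

/// Returns the ids of one page of results: duplicates on the distinct
/// attribute are dropped first, then `offset` and `limit` are applied.
pub fn distinct_page(docs: &[Doc], offset: usize, limit: usize) -> Vec<u32> {
    let mut seen = HashSet::new();
    docs.iter()
        // `insert` returns false when the value was already seen.
        .filter(|doc| seen.insert(doc.distinct_value.clone()))
        .skip(offset)
        .take(limit)
        .map(|doc| doc.id)
        .collect()
}
```

With this ordering, a page never holds more than `limit` hits, whatever the offset.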
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
2023-10-12 07:51:29 +00:00
19ba129165
add unit test for distinct search with no ranking
2023-10-11 19:02:27 +05:30
d4da06ff47
fix bug where distinct search with no ranking returns offset+limit hits
2023-10-11 19:02:16 +05:30
c0f2724c2d
get rid of the newly introduced error code in favor of an io::Error
2023-10-10 15:12:23 +02:00
d772073dfa
use a bufreader every time there is a grenad<file>
2023-10-10 15:00:30 +02:00
43989fe2e4
Reduce proximity range from 7 to 3
2023-10-03 12:16:48 +02:00
487d493f49
Merge #4043
...
4043: Bring back hotfixes from v1.3.3 into v1.4.0 r=Kerollmops a=curquiza
Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: curquiza <clementine@meilisearch.com>
2023-09-11 12:27:34 +00:00
abfa7ded25
use a new temp index in the test
2023-09-08 12:32:47 +05:30
f2837aaec2
add another test case
2023-09-08 11:39:54 +05:30
11df155598
fix highlighting bug when searching for a phrase with cropping
2023-09-08 11:39:52 +05:30
256cf33bca
Merge #4039
...
4039: Fix multiple vectors dimensions r=ManyTheFish a=Kerollmops
This PR fixes #4035, making it possible to provide multiple vectors in documents. This is fixed by extracting the vectors from the non-flattened version of the documents.
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-09-07 09:25:58 +00:00
679c0b0f97
Extract the vectors from the non-flattened version of the documents
2023-09-06 12:26:00 +02:00
e02d0064bd
Add a test case scenario
2023-09-06 12:26:00 +02:00
dc3d9c90d9
Merge #3994
...
3994: Fix synonyms with separators r=Kerollmops a=ManyTheFish
# Pull Request
## Related issue
Fixes #3977
## Available prototype
```
$ docker pull getmeili/meilisearch:prototype-fix-synonyms-with-separators-0
```
## What does this PR do?
- add a new test
- filter the empty synonyms after normalization
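
A minimal sketch of the filtering idea, assuming a simplified normalization that splits on non-alphanumeric characters and lowercases each word. The real normalization is done by the tokenizer, so `normalize` here is only a stand-in, and `normalized_synonyms` is a hypothetical name.

```rust
/// Stand-in for the tokenizer normalization (an assumption): split on
/// separators, drop empty pieces, lowercase the remaining words.
fn normalize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|word| !word.is_empty())
        .map(|word| word.to_lowercase())
        .collect()
}

/// Normalizes every synonym and drops those that end up with no words,
/// for example a synonym made only of separators.
pub fn normalized_synonyms(synonyms: &[&str]) -> Vec<Vec<String>> {
    synonyms
        .iter()
        .map(|synonym| normalize(synonym))
        .filter(|words| !words.is_empty())
        .collect()
}
```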
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-09-05 14:42:46 +00:00
66aa6d5871
Ignore tokens with empty normalized value during indexing process
2023-09-05 15:44:14 +02:00
8ac5b765bc
Fix synonyms normalization
2023-09-04 16:12:48 +02:00
085aad0a94
Add a test
2023-09-04 14:39:33 +02:00
ccf3ba3f32
Merge #4019
...
4019: Bringing back changes from `v1.3.2` onto `main` r=irevoire a=Kerollmops
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: irevoire <irevoire@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-08-28 12:14:11 +00:00
8c0ebd1331
Update milli/src/search/new/bucket_sort.rs
...
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-08-23 16:40:39 +02:00
5130e06b41
Temporarily disable an assert in the ranking rules
2023-08-23 16:11:54 +02:00
914b125c5f
Merge #3945
...
3945: Do not leak field information on error r=Kerollmops a=vivek-26
# Pull Request
## Related issue
Fixes #3865
## What does this PR do?
This PR ensures that `InvalidSortableAttribute` and `InvalidFacetSearchFacetName` errors do not leak field information, i.e. fields that are not part of `displayedAttributes` in the settings are hidden from the error message.
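
One way to picture the fix is a helper that keeps only the displayed fields when building the list shown in the error, and reports whether anything was left out. The function `fields_for_error` and its signature are hypothetical and only illustrate the idea; `None` is assumed to mean that all attributes are displayed.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper, not the actual Meilisearch code.
/// Returns the fields that may appear in the error message and a flag
/// telling whether some valid fields were hidden.
pub fn fields_for_error(
    valid_fields: &BTreeSet<String>,
    displayed: Option<&BTreeSet<String>>,
) -> (Vec<String>, bool) {
    match displayed {
        // Assumption: no restriction means every field can be shown.
        None => (valid_fields.iter().cloned().collect(), false),
        Some(displayed) => {
            let visible: Vec<String> =
                valid_fields.intersection(displayed).cloned().collect();
            let hidden = visible.len() < valid_fields.len();
            (visible, hidden)
        }
    }
}
```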
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Vivek Kumar <vivek.26@outlook.com>
2023-08-22 18:55:27 +00:00
717b069907
Bump charabia to 0.8.3
2023-08-22 16:25:00 +02:00
c53841e166
Accept the null JSON value as the value of _vectors
2023-08-14 16:03:55 +02:00
cab27c2ab4
upgrade indexmap = "2.0.0"
2023-08-10 18:09:02 +02:00
624fa9052f
upgrade deserr = "0.6.0"
2023-08-10 18:09:02 +02:00
60c11dbdbd
upgrade rstar = "0.11.0"
2023-08-10 18:09:02 +02:00
dacee40ebc
upgrade memmap2 = "0.7.1"
2023-08-10 18:09:02 +02:00
cc2c19d4c3
upgrade itertools = "0.10.5"
2023-08-10 18:09:02 +02:00
e4e49e63d0
Merge #3993
...
3993: Bringing back changes from v1.3.1 to `main` r=irevoire a=curquiza
Co-authored-by: irevoire <irevoire@users.noreply.github.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-08-10 14:30:02 +00:00
5a7c1bde84
Fix clippy
2023-08-10 11:27:56 +02:00
6b2d671be7
Fix PR comments
2023-08-10 10:44:07 +02:00
43c13faeda
Update milli/src/update/index_documents/extract/extract_docid_word_positions.rs
...
Co-authored-by: Tamo <tamo@meilisearch.com>
2023-08-10 10:05:03 +02:00
44c1900f36
Merge #3986
...
3986: Fix geo bounding box with strings r=ManyTheFish a=irevoire
# Pull Request
When sending a document with one geofield of type string (e.g. `{ "_geo": { "lat": 12, "lng": "13" }}`), the geo bounding box would exclude this document.
This PR fixes this issue by automatically parsing the string value in case we're working on a geofield.
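
The parsing step might look roughly like the sketch below, which accepts a coordinate given either as a number or as a numeric string. `GeoValue` is a hypothetical stand-in for the JSON value found under `_geo.lat` or `_geo.lng`, and `parse_coordinate` is not the function used in the codebase.

```rust
/// Hypothetical stand-in for a JSON value found under `_geo.lat` or `_geo.lng`.
pub enum GeoValue {
    Number(f64),
    Text(String),
}

/// Returns the coordinate as a float, parsing it when it was sent as a string.
/// Non-numeric strings and non-finite values yield `None`.
pub fn parse_coordinate(value: &GeoValue) -> Option<f64> {
    let parsed = match value {
        GeoValue::Number(n) => Some(*n),
        GeoValue::Text(s) => s.trim().parse::<f64>().ok(),
    };
    parsed.filter(|n| n.is_finite())
}
```

With this, `{ "lat": 12, "lng": "13" }` would produce two usable floats instead of one number and one ignored string.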
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/3973
## What does this PR do?
- Automatically parse the facet value if and only if we're working on a geofield.
- Make insta work with snapshots in loops or closures executed multiple times (you may need to update your CLI if it panics after this PR: `cargo install cargo-insta`).
- Add one integration test in milli and in meilisearch to ensure it works forever.
- Add three snapshots for the dump that mysteriously disappeared (I don't know how).
Co-authored-by: Tamo <tamo@meilisearch.com>
2023-08-09 07:58:15 +00:00
8dc5acf998
Try fix
2023-08-08 16:52:36 +02:00
35758db9ec
Truncate the normalized long facets used in search for facet value
2023-08-08 16:38:30 +02:00
4988199bb9
ensure the geoboundingbox works with strings and int geofields in milli and meilisearch
2023-08-08 16:29:25 +02:00
9d061cec26
automatically parse the filterable attribute to float if it's a geo field
2023-08-08 16:28:07 +02:00