Commit Graph

1239 Commits

Author SHA1 Message Date
Kerollmops
c10469ddb6 Patch the http-ui crate to support filterable fields 2021-06-02 16:24:58 +02:00
Marin Postma
1e366dae3e remove useless lifetime on Distinct Trait 2021-06-02 16:24:58 +02:00
Kerollmops
187c713de5 Remove the MapDistinct struct as now distinct attributes are faceted 2021-06-02 16:24:57 +02:00
Kerollmops
ff440c1d9d Introduce the faceted fields method to retrieve those that need faceting 2021-06-02 16:24:57 +02:00
Kerollmops
2a3f9b32ff Rename the faceted fields into filterable fields 2021-06-02 16:24:57 +02:00
Irevoire
f346805c0c Update benchmarks/Cargo.toml
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-02 15:47:03 +02:00
Clémentine Urquizar
ef1ac8a0cb Update README 2021-06-02 11:13:22 +02:00
Clémentine Urquizar
edfcdb171c Update benchmarks/scripts/list.sh
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
3c91a9a551 Update following reviews 2021-06-02 11:13:22 +02:00
Tamo
bc4f4ee829 remove s3cmd as a dependency and provide a script to list all the available benchmarks 2021-06-02 11:13:22 +02:00
Clémentine Urquizar
61fe422a88 Update benchmarks/scripts/compare.sh
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
57ed96622b Update benchmarks/scripts/compare.sh
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
b3c0d43890 Update benchmarks/scripts/compare.sh
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-02 11:13:22 +02:00
Clémentine Urquizar
0d0e900158 Add CI for benchmarks 2021-06-02 11:13:22 +02:00
tamo
4536dfccd0 add a way to provide primary_key or autogenerate documents ids 2021-06-02 11:13:20 +02:00
tamo
06c414a753 move the benchmarks to another crate so we can download the datasets automatically without adding overhead to the build of milli 2021-06-02 11:11:50 +02:00
tamo
3c84075d2d uses an env variable to find the datasets 2021-06-02 11:05:07 +02:00
tamo
4969abeaab update the facets for the benchmarks 2021-06-02 11:05:07 +02:00
tamo
e5dfde88fd fix the facets conditions 2021-06-02 11:05:07 +02:00
tamo
7c7fba4e57 remove the time limitation to let criterion do what it wants 2021-06-02 11:05:07 +02:00
tamo
5d5d115608 reformat all the files 2021-06-02 11:05:07 +02:00
tamo
7086009f93 improve the base search 2021-06-02 11:05:07 +02:00
tamo
d0b44c380f add benchmarks on a wiki dataset 2021-06-02 11:05:07 +02:00
tamo
beae843766 add a missing space 2021-06-02 11:05:07 +02:00
tamo
5132a106a1 refactor everything related to the songs dataset into a songs benchmark file 2021-06-02 11:05:07 +02:00
tamo
136efd6b53 fix the benches 2021-06-02 11:05:07 +02:00
tamo
4b78ef31b6 add the configuration of the searchable fields and displayed fields and a default configuration for the songs 2021-06-02 11:05:07 +02:00
tamo
ea0c6d8c40 add a bunch of queries and start the introduction of the filters and the new dataset 2021-06-02 11:05:07 +02:00
tamo
3def42abd8 merge all the criterion only benchmarks in one file 2021-06-02 11:05:07 +02:00
tamo
a2bff68c1a remove the optional words for the typo criterion 2021-06-02 11:05:07 +02:00
tamo
aee49bb3cd add the proximity criterion 2021-06-02 11:05:07 +02:00
tamo
49e4cc3daf add the words criterion to the bench 2021-06-02 11:05:07 +02:00
tamo
15cce89a45 update the README with instructions to download the dataset 2021-06-02 11:05:07 +02:00
tamo
e425f70ef9 let criterion decide how many iterations it wants to run in 10s 2021-06-02 11:05:07 +02:00
tamo
4fdbfd6048 push a first version of the benchmark for the typo 2021-06-02 11:05:07 +02:00
bors[bot]
270da98c46 Merge #202
202: Add field id word count docids database r=Kerollmops a=LegendreM

This PR introduces a new database, `field_id_word_count_docids`, that maps the number of words in an attribute to a list of document ids. This relation is limited to attributes that contain fewer than 11 words.
This database is used by the exactness criterion to know if a document has an attribute that contains exactly the query without any additional word.
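
As a rough illustration of the mapping described above, here is a minimal in-memory sketch. It is not the code from the PR: the type aliases, the function names, the `BTreeMap` standing in for the real database, and the whitespace tokenization are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-ins: an in-memory map replaces the real database.
type FieldId = u16;
type DocumentId = u32;
type WordCountDocids = BTreeMap<(FieldId, u8), BTreeSet<DocumentId>>;

// Only attributes with fewer than 11 words are recorded.
const MAX_COUNTED_WORDS: usize = 10;

/// Records the word count of one attribute of one document.
fn index_attribute(db: &mut WordCountDocids, field_id: FieldId, docid: DocumentId, text: &str) {
    // Assumption: whitespace splitting stands in for the real tokenizer.
    let word_count = text.split_whitespace().count();
    if word_count <= MAX_COUNTED_WORDS {
        db.entry((field_id, word_count as u8)).or_default().insert(docid);
    }
}

/// Returns the documents whose attribute `field_id` contains exactly
/// `query_len` words, i.e. candidates for an exact match of the query.
fn docids_with_word_count(
    db: &WordCountDocids,
    field_id: FieldId,
    query_len: usize,
) -> BTreeSet<DocumentId> {
    if query_len > MAX_COUNTED_WORDS {
        // Longer attributes are never stored, so nothing can match.
        return BTreeSet::new();
    }
    db.get(&(field_id, query_len as u8)).cloned().unwrap_or_default()
}
```

With such a map, a criterion that has a two-word query only needs to look up the key `(field_id, 2)` to find documents whose attribute has no additional word.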

Fix #165 
Fix #196
Related to [specifications:#36](https://github.com/meilisearch/specifications/pull/36)

Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-06-01 16:09:48 +00:00
many
e857ca4d7d Fix PR comments 2021-06-01 18:06:46 +02:00
Many
ab2cf69e8d Update milli/src/update/delete_documents.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-01 17:04:10 +02:00
Many
8e6d1ff0dc Update milli/src/update/index_documents/store.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-01 17:04:02 +02:00
bors[bot]
168fe0aa28 Merge #206
206: Fix http-ui r=Kerollmops a=irevoire

I just noticed that `http-ui` was not compiling on `main`.
I'm not sure this is the best fix, but it works 👀

Co-authored-by: Tamo <irevoire@hotmail.fr>
2021-06-01 14:31:32 +00:00
Tamo
608c5bad24 fix http-ui 2021-06-01 16:24:46 +02:00
bors[bot]
7d36d664a7 Merge #203
203: Make the MatchingWords return the number of matching bytes r=Kerollmops a=LegendreM

Make the MatchingWords return the number of matching bytes using a custom Levenshtein algorithm.
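
To make the idea of "number of matching bytes" concrete, here is a small sketch built on a textbook Levenshtein distance. It is not the custom algorithm from this PR: the function names, the `max_typos` and `is_prefix` parameters, and the prefix-selection rule are assumptions chosen for illustration.

```rust
/// Textbook Levenshtein distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed part of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b_chars.iter().enumerate() {
            let cost = if ca == *cb { 0 } else { 1 };
            let best = (prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1);
            cur.push(best);
        }
        prev = cur;
    }
    prev[b_chars.len()]
}

/// Returns how many bytes of `word` can be considered matched by `query`,
/// or `None` when the word is more than `max_typos` edits away.
/// Hypothetical rule: for prefix queries, keep the shortest prefix of
/// `word` that has the smallest distance to `query`.
fn matching_bytes(query: &str, word: &str, max_typos: usize, is_prefix: bool) -> Option<usize> {
    if is_prefix {
        word.char_indices()
            .map(|(start, c)| start + c.len_utf8()) // byte end of each prefix
            .map(|end| (levenshtein(query, &word[..end]), end))
            .filter(|(distance, _)| *distance <= max_typos)
            .min_by_key(|(distance, _)| *distance)
            .map(|(_, end)| end)
    } else if levenshtein(query, word) <= max_typos {
        Some(word.len())
    } else {
        None
    }
}
```

Returning a byte length rather than a boolean lets a caller highlight only the matched part of a word, which matters for prefix queries and for multi-byte characters.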

Fix #138

Co-authored-by: many <maxime@meilisearch.com>
2021-06-01 12:00:33 +00:00
many
225ae6fd25 Resolve PR comments 2021-06-01 11:53:09 +02:00
bors[bot]
2f9f6a1f21 Merge #169
169: Optimize roaring codec r=Kerollmops a=MarinPostma

Optimize the `BoRoaringBitmapCodec` by preventing it from emitting useless errors that caused allocations. On my flamegraph, the byte_decode function went from 4.13% to 1.70% (of transplant graph).

This may not be the greatest optimization ever, but hey, this was low-hanging fruit.
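
For readers unfamiliar with this kind of change, the sketch below shows one way to decode a byte slice into `u32` values without ever constructing an error value. It is not the codec from this PR: the function name, the native-endian layout, and the `Vec<u32>` output (instead of a roaring bitmap) are assumptions made to keep the example dependency-free.

```rust
use std::mem::size_of;

/// Decodes a slice of native-endian `u32` values.
/// Returns `None` on malformed input instead of building an error object,
/// so the failure path does not allocate.
fn decode_u32s(bytes: &[u8]) -> Option<Vec<u32>> {
    // Check the length up front rather than reading until a failure occurs.
    if bytes.len() % size_of::<u32>() != 0 {
        return None;
    }
    let mut values = Vec::with_capacity(bytes.len() / size_of::<u32>());
    for chunk in bytes.chunks_exact(size_of::<u32>()) {
        // Each chunk is exactly 4 bytes long, so indexing cannot panic.
        values.push(u32::from_ne_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
    }
    Some(values)
}
```

Iterating over exact-size chunks means the end of input is reached by the loop finishing, not by a read failing.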

before:
![image](https://user-images.githubusercontent.com/28804882/116241125-17018880-a754-11eb-9f9d-a67418d100e1.png)
after:
![image](https://user-images.githubusercontent.com/28804882/116241167-21bc1d80-a754-11eb-9afc-d9d72727477c.png)

Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-06-01 06:30:25 +00:00
Marin Postma
984dc7c1ed rewrite roaring codec without byteorder. 2021-05-31 22:15:39 +02:00
Marin Postma
1373637da1 optimize roaring codec 2021-05-31 22:15:35 +02:00
many
1df68d342a Make the MatchingWords return the number of matching bytes 2021-05-31 18:22:29 +02:00
many
b8e6db0feb Add database in infos crate 2021-05-31 16:29:27 +02:00
many
c701f8bf36 Use field id word count database in exactness criterion 2021-05-31 16:27:28 +02:00
many
4ddf008be2 add field id word count database 2021-05-31 16:27:28 +02:00