4969abeaab
update the facets for the benchmarks
2021-06-02 11:05:07 +02:00
e5dfde88fd
fix the facets conditions
2021-06-02 11:05:07 +02:00
7c7fba4e57
remove the time limitation to let criterion do what it wants
2021-06-02 11:05:07 +02:00
5d5d115608
reformat all the files
2021-06-02 11:05:07 +02:00
7086009f93
improve the base search
2021-06-02 11:05:07 +02:00
d0b44c380f
add benchmarks on a wiki dataset
2021-06-02 11:05:07 +02:00
beae843766
add a missing space
2021-06-02 11:05:07 +02:00
5132a106a1
refactor everything related to the songs dataset into a songs benchmark file
2021-06-02 11:05:07 +02:00
136efd6b53
fix the benches
2021-06-02 11:05:07 +02:00
4b78ef31b6
add the configuration of the searchable fields and displayed fields and a default configuration for the songs
2021-06-02 11:05:07 +02:00
ea0c6d8c40
add a bunch of queries and start the introduction of the filters and the new dataset
2021-06-02 11:05:07 +02:00
3def42abd8
merge all the criterion only benchmarks in one file
2021-06-02 11:05:07 +02:00
a2bff68c1a
remove the optional words for the typo criterion
2021-06-02 11:05:07 +02:00
aee49bb3cd
add the proximity criterion
2021-06-02 11:05:07 +02:00
49e4cc3daf
add the words criterion to the bench
2021-06-02 11:05:07 +02:00
15cce89a45
update the README with instructions to download the dataset
2021-06-02 11:05:07 +02:00
e425f70ef9
let criterion decide how many iterations it wants to do in 10s
2021-06-02 11:05:07 +02:00
4fdbfd6048
push a first version of the benchmark for the typo
2021-06-02 11:05:07 +02:00
2d7785ae0c
remove the dump_batch_size option from the CLI
2021-06-01 20:42:06 +02:00
d0552e765e
forbid deserialization of Setting<Checked>
2021-06-01 20:41:45 +02:00
270da98c46
Merge #202
...
202: Add field id word count docids database r=Kerollmops a=LegendreM
This PR introduces a new database, `field_id_word_count_docids`, that maps the number of words in an attribute to a list of document ids. This relation is limited to attributes that contain fewer than 11 words.
This database is used by the exactness criterion to know if a document has an attribute that contains exactly the query without any additional word.
Fix #165
Fix #196
Related to [specifications:#36](https://github.com/meilisearch/specifications/pull/36)
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-06-01 16:09:48 +00:00
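The relation described in #202 above can be sketched as a small in-memory map. This is only an illustration under stated assumptions, not milli's implementation: the `FieldId` and `DocumentId` aliases, the `FieldIdWordCountDocids` struct, its method names, and the whitespace-based word counting are all hypothetical; only the database name and the "fewer than 11 words" limit come from the PR description.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical aliases for illustration; the real types may differ.
type FieldId = u8;
type DocumentId = u32;

/// Attributes with more words than this are not recorded
/// ("less than 11 words" in the PR description).
const MAX_WORD_COUNT: usize = 10;

/// In-memory stand-in for the `field_id_word_count_docids` database:
/// (field id, word count) -> set of document ids.
#[derive(Default)]
struct FieldIdWordCountDocids {
    map: BTreeMap<(FieldId, u8), BTreeSet<DocumentId>>,
}

impl FieldIdWordCountDocids {
    /// Record one attribute of one document, skipping attributes that are too long.
    /// Assumption: words are counted by splitting on whitespace.
    fn index_attribute(&mut self, field_id: FieldId, docid: DocumentId, text: &str) {
        let count = text.split_whitespace().count();
        if count > MAX_WORD_COUNT {
            return;
        }
        self.map
            .entry((field_id, count as u8))
            .or_default()
            .insert(docid);
    }

    /// Documents that have at least one attribute made of exactly `word_count` words.
    fn docids_with_word_count(&self, word_count: usize) -> BTreeSet<DocumentId> {
        let mut docids = BTreeSet::new();
        for ((_field_id, count), ids) in &self.map {
            if *count as usize == word_count {
                docids.extend(ids.iter().copied());
            }
        }
        docids
    }
}
```

For a query of N words, an exactness-style check could intersect `docids_with_word_count(N)` with the documents that match every query word, which leaves the documents where some attribute may contain exactly the query and nothing more.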
e857ca4d7d
Fix PR comments
2021-06-01 18:06:46 +02:00
ab2cf69e8d
Update milli/src/update/delete_documents.rs
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-01 17:04:10 +02:00
8e6d1ff0dc
Update milli/src/update/index_documents/store.rs
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-01 17:04:02 +02:00
168fe0aa28
Merge #206
...
206: Fix http-ui r=Kerollmops a=irevoire
I just noticed that `http-ui` was not compiling on `main`.
I'm not sure this is the best fix, but it works 👀
Co-authored-by: Tamo <irevoire@hotmail.fr>
2021-06-01 14:31:32 +00:00
608c5bad24
fix http-ui
2021-06-01 16:24:46 +02:00
7d36d664a7
Merge #203
...
203: Make the MatchingWords return the number of matching bytes r=Kerollmops a=LegendreM
Make the MatchingWords return the number of matching bytes using a custom Levenshtein algorithm.
Fix #138
Co-authored-by: many <maxime@meilisearch.com>
2021-06-01 12:00:33 +00:00
225ae6fd25
Resolve PR comments
2021-06-01 11:53:09 +02:00
3a7c1f2469
Merge #191
...
191: dumps v2 r=irevoire a=MarinPostma
Co-authored-by: Marin Postma <postma.marin@protonmail.com>
Co-authored-by: marin <postma.marin@protonmail.com>
2021-06-01 09:46:31 +00:00
df6ba0e824
Apply suggestions from code review
...
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-06-01 11:18:37 +02:00
2f9f6a1f21
Merge #169
...
169: Optimize roaring codec r=Kerollmops a=MarinPostma
Optimize the `BoRoaringBitmapCodec` by preventing it from emitting useless errors that caused allocations. On my flamegraph, the `byte_decode` function went from 4.13% to 1.70% (of the transplant graph).
This may not be the greatest optimization ever, but hey, this was a low hanging fruit.
before: (image not preserved)
after: (image not preserved)
Co-authored-by: Marin Postma <postma.marin@protonmail.com>
2021-06-01 06:30:25 +00:00
984dc7c1ed
rewrite roaring codec without byteorder.
2021-05-31 22:15:39 +02:00
1373637da1
optimize roaring codec
2021-05-31 22:15:35 +02:00
6609f9e3be
review edits
2021-05-31 18:41:37 +02:00
1df68d342a
Make the MatchingWords return the number of matching bytes
2021-05-31 18:22:29 +02:00
b8e6db0feb
Add database in infos crate
2021-05-31 16:29:27 +02:00
c701f8bf36
Use field id word count database in exactness criterion
2021-05-31 16:27:28 +02:00
4ddf008be2
add field id word count database
2021-05-31 16:27:28 +02:00
1c4f0b2ccf
clippy, fmt & tests
2021-05-31 16:03:39 +02:00
10fc870684
improve dump info reports
2021-05-31 15:49:04 +02:00
2f5e61bacb
Merge #184
...
184: Transfer numbers and strings facets into the appropriate facet databases r=Kerollmops a=Kerollmops
This pull request is related to https://github.com/meilisearch/milli/issues/152 and changes the layout of the facet values: numbers and strings are now stored in dedicated databases, and the user no longer needs to define the types of the fields. No conversion between the two types is done anymore: numbers (floats and integers, converted to f64) go to the facet float database and strings go to the facet string database.
There is one related issue that I found regarding CSVs: the values in a CSV are always considered to be strings. [meilisearch/specifications#28](d916b57d74/text/0028-indexing-csv.md) fixes this issue by allowing the user to define the field types using `:` in the "CSV Formatting Rules" section.
All previous tests on facets have been modified to pass again, and I have also done hand-driven tests with the 115m songs dataset. Everything seems to be good!
Fixes #192.
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-05-31 13:32:58 +00:00
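The routing rule described in #184 above can be sketched as follows. This is a simplified illustration under stated assumptions, not milli's code: the `FacetValue` enum, the `FacetDatabases` struct, and the plain vectors standing in for the two databases are hypothetical. Only the rule itself (integers and floats are stored as f64 in the number database, strings go to the string database, with no conversion between the two) comes from the PR description.

```rust
// Hypothetical aliases for illustration; the real types may differ.
type FieldId = u8;
type DocumentId = u32;

/// Hypothetical facet value as it could appear in a document.
#[derive(Debug, Clone, PartialEq)]
enum FacetValue {
    Integer(i64),
    Float(f64),
    String(String),
}

/// In-memory stand-ins for the two dedicated facet databases.
#[derive(Default)]
struct FacetDatabases {
    numbers: Vec<(FieldId, f64, DocumentId)>,
    strings: Vec<(FieldId, String, DocumentId)>,
}

impl FacetDatabases {
    /// Route a value to the database matching its own type.
    /// The user declares no field type, and strings are never parsed as numbers.
    fn insert(&mut self, field_id: FieldId, docid: DocumentId, value: FacetValue) {
        match value {
            // Integers and floats are both stored as f64.
            FacetValue::Integer(i) => self.numbers.push((field_id, i as f64, docid)),
            FacetValue::Float(f) => self.numbers.push((field_id, f, docid)),
            // Strings stay strings, even when they look numeric.
            FacetValue::String(s) => self.strings.push((field_id, s, docid)),
        }
    }
}
```

With this rule, the string `"1999"` and the integer `1999` end up in different databases, which matches the CSV caveat in the description: values read from a CSV are all strings unless the field type is declared.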
1c0a5cd136
Resolve code modification suggestions
2021-05-31 15:22:50 +02:00
dffbaca63b
bump sentry version
2021-05-31 13:59:31 +02:00
b3c8f0e1f6
fix empty index error
2021-05-31 10:58:51 +02:00
bc5a5e37ea
fix dump v1
2021-05-31 10:42:31 +02:00
33c6c4f0ee
add timestamps to dump info
2021-05-30 15:55:17 +02:00
39c16c0fe4
fix dump import
2021-05-30 12:35:17 +02:00
1cb64caae4
dump content is now only uuid
2021-05-29 00:08:17 +02:00
b258f4f394
fix dump import
2021-05-27 14:30:20 +02:00
c47369839b
dump meta
2021-05-27 10:51:19 +02:00