Commit Graph

2190 Commits

Author SHA1 Message Date
Kerollmops
28c004aa2c Prefer using constant for the database names 2021-06-15 11:13:04 +02:00
Kerollmops
312c2d1d8e Use the Error enum everywhere in the project 2021-06-14 16:58:38 +02:00
Kerollmops
ca78cb5aca Introduce more variants to the error module enums 2021-06-14 16:58:38 +02:00
Kerollmops
456541e921 Implement the Display trait on the Error type 2021-06-14 16:48:51 +02:00
Kerollmops
44c353fafd Introduce some way to construct an Error 2021-06-14 16:48:51 +02:00
Kerollmops
23fcf7920e Introduce a basic version of the InternalError struct 2021-06-14 16:48:51 +02:00
Kerollmops
d2b1ecc885 Remove a lot of serialization unreachable errors 2021-06-14 16:48:51 +02:00
Kerollmops
65b1d09d55 Move the obkv merging functions into the merge_function module 2021-06-14 16:48:51 +02:00
Kerollmops
ab727e428b Remove the docid_word_positions_merge method that must never be called 2021-06-14 16:48:51 +02:00
Kerollmops
93a8633f18 Remove the documents_merge method that must never be called 2021-06-14 16:48:51 +02:00
Kerollmops
cfc7314bd1 Prefer using an explicit merge function name 2021-06-14 16:48:50 +02:00
Kerollmops
93978ec38a Serializing a RoaringBitmap into a Vec cannot fail 2021-06-14 16:48:50 +02:00
Kerollmops
ff9414a6ba Use the out of the compute_primary_key_pair function 2021-06-14 16:48:50 +02:00
Many
f4cab080a6 Update milli/src/search/query_tree.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-10 11:30:51 +02:00
Many
36715f571c Update milli/src/search/criteria/proximity.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-10 11:30:33 +02:00
many
e923a3ed6a Replace Consecutive by Phrase in query tree
Replace Consecutive by Phrase in query tree in order to remove theorical bugs,
due of the Consecutive enum type.
2021-06-10 11:16:16 +02:00
bors[bot]
afb4133bd2 Merge #212 #222 #223
212: Introduce integration test on criteria r=Kerollmops a=ManyTheFish

- add pre-ranked dataset
- test each criterion 1 by 1
- test all criteria in several order

222: Move the `UpdateStore` into the http-ui crate r=Kerollmops a=Kerollmops

We no more need to have the `UpdateStore` inside of the mill crate as this is the job of the caller to stack the updates and sequentially give them to milli.

223: Update dataset links r=Kerollmops a=curquiza



Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-06-09 08:47:19 +00:00
bors[bot]
6faa87302c Merge #220
220: Make hard separators split phrase query r=Kerollmops a=ManyTheFish

hard separators will now split a phrase query as two sequential phrases (double-quoted strings):

the query `"Radioactive (Imagine Dragons)"` would be considered equivalent to `"Radioactive" "Imagine Dragons"` which as the little disadvantage of not keeping the order of the two (or more) separate phrases.

Fix #208

Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-06-09 08:22:58 +00:00
Kerollmops
0bf4f3f48a Modify a test to check that criteria additions change the fields ids map 2021-06-08 18:14:34 +02:00
Kerollmops
82df524e09 Make sure that we register the field when setting criteria 2021-06-08 18:14:33 +02:00
Kerollmops
103dddba2f Move the UpdateStore into the http-ui crate 2021-06-08 17:59:51 +02:00
Many
faf148d297 Update milli/src/search/query_tree.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-08 17:52:37 +02:00
Kerollmops
133ab98260 Use the index primary key when deleting documents 2021-06-08 17:33:29 +02:00
many
b489d699ce Make hard separators split phrase query
hard separators will now split a phrase query as double double-quotes

Fix #208
2021-06-08 17:29:38 +02:00
bors[bot]
39ed133f9f Merge #193
193: Fix primary key behavior r=Kerollmops a=MarinPostma

this pr:
- Adds early returns on empty document additions, avoiding error messages to be returned when adding no documents and no primary key was set.
- Changes the primary key inference logic to match that of legacy meilisearch.

close #194 

Co-authored-by: Marin Postma <postma.marin@protonmail.com>
Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-06-03 10:24:21 +00:00
marin postma
57898d8a90 fix silent deserialize error 2021-06-03 10:42:55 +02:00
many
26a9974667 Make asc/desc criterion return resting documents
Fix #161.2
2021-06-02 17:41:48 +02:00
Kerollmops
3c304c89d4 Make sure that we generate the faceted database when required 2021-06-02 16:24:58 +02:00
Kerollmops
b0c0490e85 Make sure that we can add a Asc/Desc field without it being filterable 2021-06-02 16:24:58 +02:00
Kerollmops
3b1cd4c4b4 Rename the FacetCondition into FilterCondition 2021-06-02 16:24:58 +02:00
Kerollmops
c2afdbb1fb Move and comment some internal facet_condition helper functions 2021-06-02 16:24:58 +02:00
Kerollmops
6476827d3a Fix the indexer to be sure that distinct and Asc/Desc are also faceted 2021-06-02 16:24:58 +02:00
Marin Postma
1e366dae3e remove useless lifetime on Distinct Trait 2021-06-02 16:24:58 +02:00
Kerollmops
187c713de5 Remove the MapDistinct struct as now distinct attributes are faceted 2021-06-02 16:24:57 +02:00
Kerollmops
ff440c1d9d Introduce the faceted fields method to retrieve those that needs faceting 2021-06-02 16:24:57 +02:00
Kerollmops
2a3f9b32ff Rename the faceted fields into filterable fields 2021-06-02 16:24:57 +02:00
bors[bot]
270da98c46 Merge #202
202: Add field id word count docids database r=Kerollmops a=LegendreM

This PR introduces a new database, `field_id_word_count_docids`, that maps the number of words in an attribute with a list of document ids. This relation is limited to attributes that contain less than 11 words.
This database is used by the exactness criterion to know if a document has an attribute that contains exactly the query without any additional word.

Fix #165 
Fix #196
Related to [specifications:#36](https://github.com/meilisearch/specifications/pull/36)

Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-06-01 16:09:48 +00:00
many
e857ca4d7d Fix PR comments 2021-06-01 18:06:46 +02:00
Many
ab2cf69e8d Update milli/src/update/delete_documents.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-01 17:04:10 +02:00
Many
8e6d1ff0dc Update milli/src/update/index_documents/store.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-06-01 17:04:02 +02:00
bors[bot]
7d36d664a7 Merge #203
203: Make the MatchingWords return the number of matching bytes r=Kerollmops a=LegendreM

Make the MatchingWords return the number of matching bytes using a custom Levenshtein algorithm.

Fix #138

Co-authored-by: many <maxime@meilisearch.com>
2021-06-01 12:00:33 +00:00
many
225ae6fd25 Resolve PR comments 2021-06-01 11:53:09 +02:00
Marin Postma
984dc7c1ed rewrite roaring codec without byteorder. 2021-05-31 22:15:39 +02:00
Marin Postma
1373637da1 optimize roaring codec 2021-05-31 22:15:35 +02:00
many
1df68d342a Make the MatchingWords return the number of matching bytes 2021-05-31 18:22:29 +02:00
many
c701f8bf36 Use field id word count database in exactness criterion 2021-05-31 16:27:28 +02:00
many
4ddf008be2 add field id word count database 2021-05-31 16:27:28 +02:00
bors[bot]
2f5e61bacb Merge #184
184: Transfer numbers and strings facets into the appropriate facet databases r=Kerollmops a=Kerollmops

This pull request is related to https://github.com/meilisearch/milli/issues/152 and changes the layout of the facets values, numbers and strings are now in dedicated databases and the user no more needs to define the type of the fields. No more conversion between the two types is done, numbers (floats and integers converted to f64) go to the facet float database and strings go to the strings facet database.

There is one related issue that I found regarding CSVs, the values in a CSV are always considered to be strings, [meilisearch/specifications#28](d916b57d74/text/0028-indexing-csv.md) fixes this issue by allowing the user to define the fields types using `:` in the "CSV Formatting Rules" section.

All previous tests on facets have been modified to pass again and I have also done hand-driven tests with the 115m songs dataset. Everything seems to be good!

Fixes #192.

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-05-31 13:32:58 +00:00
Kerollmops
1c0a5cd136 Resolve code modification suggestions 2021-05-31 15:22:50 +02:00
many
a5e98cf46d Fix plane sweep algorithm 2021-05-25 18:21:55 +02:00