Commit Graph

533 Commits

Author SHA1 Message Date
ac1df9d9d7 fix typo and remove pest 2021-10-12 13:30:40 +08:00
50ad750ec1 enhance error handling 2021-10-12 13:30:40 +08:00
8748df2ca4 draft without error handling 2021-10-12 13:30:40 +08:00
07fb6d64e5 Merge #386
386: fix obkv document r=curquiza a=MarinPostma

When serializing a document, the serializer resolved the field_id of the current field and immediately added it to the obkv document under construction. The issue with that is that obkv expects the fields to be inserted in order, and when a document with out of order fields was added, obkv failed to insert the field.

The current fix first resolves each field_id, and adds all the fields to a temporary `BTreeMap`, until `end` is called on the map serializer, where all the fields are added to the obkv at once, and in order.


Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-11 13:45:04 +00:00
dd56e82dba Update version for the next release (v0.17.2) 2021-10-11 15:20:35 +02:00
99889a0ed0 add obkv document serialization test 2021-10-11 15:13:17 +02:00
799f3d43c8 fix serialization to obkv format 2021-10-11 15:04:47 +02:00
b65aa7b5ac Apply suggestions from code review
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-10-07 17:51:52 +02:00
11dfe38761 Update the check on the latitude and longitude
Latitude are not supposed to go beyound 90 degrees or below -90.
The same goes for longitude with 180 or -180.

This was badly implemented in the filters, and was not implemented for the AscDesc rules.
2021-10-07 16:10:43 +02:00
085bc6440c Apply PR comments 2021-10-06 11:12:26 +02:00
1bd15d849b Reduce candidates threshold 2021-10-05 18:52:14 +02:00
ea4bd29d14 Apply PR comments 2021-10-05 17:35:07 +02:00
3296bb243c Simplify word level position DB into a word position DB 2021-10-05 12:15:02 +02:00
75d341d928 Re-implement set based algorithm for attribute criterion 2021-10-05 12:14:50 +02:00
05d8a33a28 Update version for the next release (v0.17.1) 2021-10-02 16:21:31 +02:00
d9eba9d145 improve and test the sort error message 2021-09-30 14:38:27 +02:00
0ee67bb7d1 improve the reserved keyword error message for the filters 2021-09-30 14:38:27 +02:00
22551d0941 Merge #379
379: Revert "Change chunk size to 4MiB to fit more the end user usage" r=curquiza a=ManyTheFish

Reverts meilisearch/milli#370

Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-09-29 13:20:53 +00:00
26b5dad042 Revert "Change chunk size to 4MiB to fit more the end user usage" 2021-09-29 15:08:39 +02:00
2e49230ca2 Update milli/src/search/criteria/attribute.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-29 14:49:45 +02:00
7ad0214089 Update milli/src/search/criteria/attribute.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-29 14:49:41 +02:00
1df5b8712b Hotfix meilisearch#1707 2021-09-29 14:41:56 +02:00
68c758a533 Merge #376
376: Stop casting integer docids to string r=Kerollmops a=irevoire

When a docid is an integer, we stop casting it to a string, and thus we don't add `"` around it.

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-29 08:32:48 +00:00
0e8665bf18 Update version for the next release (v0.17.0) 2021-09-28 19:38:12 +02:00
f65153ad64 stop casting integer docids to string 2021-09-28 18:35:54 +02:00
785c1372f2 Change "settings" to "setting"
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2021-09-28 20:11:32 +05:30
3580b2d803 Fixes #365 2021-09-28 19:30:23 +05:30
3a12f5887e Merge #373
373: Improve error message for bad sort syntax with geosearch r=Kerollmops a=irevoire

`@Kerollmops` This should be the last PR for the geosearch and error handling, sorry for doing it in so many steps 😬 

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-28 12:39:32 +00:00
a80dcfd4a3 improve error message for bad sort syntax with geosearch 2021-09-28 14:32:24 +02:00
b2a332599e Merge #372
372: Fix Meilisearch 1714 r=Kerollmops a=ManyTheFish

The bug comes from the typo tolerance, to know how many typos are accepted we were counting bytes instead of characters in a word.
On Chinese Script characters, we were allowing  2 typos on 3 characters words.
We are now counting the number of char instead of counting bytes to assign the typo tolerance.

Related to [Meilisearch#1714](https://github.com/meilisearch/MeiliSearch/issues/1714)

Co-authored-by: many <maxime@meilisearch.com>
2021-09-28 11:59:45 +00:00
8046ae4bd5 Count the number of char instead of counting bytes to assign the typo tolerance 2021-09-28 12:10:43 +02:00
1988416295 Add failing test related to Meilisearch#1714 2021-09-28 12:05:11 +02:00
c7cb816ae1 simplify the error handling of the sort syntax for meilisearch 2021-09-27 19:07:22 +02:00
b188063869 Change chunk size to 4MiB to fit more the end user usage 2021-09-27 14:26:21 +02:00
551df0cb77 Add test checking the bug reported in meilisearch issue 1716 2021-09-23 15:55:39 +02:00
87dd441a3a Merge #367
367: Update version for the next release (v0.16.0) r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-22 15:20:20 +00:00
1eacab2169 Update version for the next release (v0.15.1) 2021-09-22 17:18:54 +02:00
218f0a6661 Apply suggestions from code review
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-22 17:00:27 +02:00
47ee93b0bd return an error when _geoPoint is used but _geo is not sortable 2021-09-22 16:37:41 +02:00
1e5e3d57e2 auto convert AscDescError into CriterionError 2021-09-22 16:37:41 +02:00
023446ecf3 create a smaller and easier to maintain CriterionError type 2021-09-22 16:37:41 +02:00
86e272856a create an asc_desc error type that is never supposed to be returned to the end user 2021-09-22 16:37:41 +02:00
257e621d40 create an asc_desc module 2021-09-22 16:37:41 +02:00
113a061bee fix the error handling on the criterion side 2021-09-22 15:09:07 +02:00
78b0bce9a1 fix the returned error when asc desc fails to be parsed 2021-09-22 11:37:05 +02:00
f8ecbc28e2 Update version for the next release (v0.15.0) 2021-09-21 18:09:14 +02:00
aa6c5df0bc Implement documents format
document reader transform

remove update format

support document sequences

fix document transform

clean transform

improve error handling

add documents! macro

fix transform bug

fix tests

remove csv dependency

Add comments on the transform process

replace search cli

fmt

review edits

fix http ui

fix clippy warnings

Revert "fix clippy warnings"

This reverts commit a1ce3cd96e603633dbf43e9e0b12b2453c9c5620.

fix review comments

remove smallvec in transform loop

review edits
2021-09-21 16:58:33 +02:00
94764e5c7c Merge #360
360: Update version for the next release (v0.14.0) r=Kerollmops a=curquiza

Release containing the geosearch, cf #322 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-21 08:43:27 +00:00
31c8de1cca Merge #322
322: Geosearch r=ManyTheFish a=irevoire

This PR introduces [basic geo-search functionalities](https://github.com/meilisearch/specifications/pull/59), it makes the engine able to index, filter and, sort by geo-point. We decided to use [the rstar library](https://docs.rs/rstar) and to save the points in [an RTree](https://docs.rs/rstar/0.9.1/rstar/struct.RTree.html) that we de/serialize in the index database [by using serde](https://serde.rs/) with [bincode](https://docs.rs/bincode). This is not an efficient way to query this tree as it will consume a lot of CPU and memory when a search is made, but at least it is an easy first way to do so.

### What we will have to do on the indexing part:
 - [x] Index the `_geo` fields from the documents.
   - [x] Create a new module with an extractor in the `extract` module that takes the `obkv_documents` and retrieves the latitude and longitude coordinates, outputting them in a `grenad::Reader` for further process.
   - [x] Call the extractor in the `extract::extract_documents_data` function and send the result to the `TypedChunk` module.
   - [x] Get the `grenad::Reader` in the `typed_chunk::write_typed_chunk_into_index` function and store all the points in the `rtree`
- [x] Delete the documents from the `RTree` when deleting documents from the database. All this can be done in the `delete_documents.rs` file by getting the data structure and removing the points from it, inserting it back after the modification.
- [x] Clearing the `RTree` entirely when we clear the documents from the database, everything happens in the `clear_documents.rs` file.
- [x] save a Roaring bitmap of all documents containing the `_geo` field

### What we will have to do on the query part:
- [x] Filter the documents at a certain distance around a point, this is done by [collecting the documents from the searched point](https://docs.rs/rstar/0.9.1/rstar/struct.RTree.html#method.nearest_neighbor_iter) while they are in range.
  - [x] We must introduce new `geoLowerThan` and `geoGreaterThan` variants to the `Operator` filter enum.
  - [x] Implement the `negative` method on both variants where the `geoGreaterThan` variant is implemented by executing the `geoLowerThan` and removing the results found from the whole list of geo faceted documents.
  - [x] Add the `_geoRadius` function in the pest parser.
- [x] Introduce a `_geo` ascending ranking function that takes a point in parameter, ~~this function must keep the iterator on the `RTree` and make it peekable~~ This was not possible for now, we had to collect the whole iterator. Only the documents that are part of the candidates must be sent too!
  - [x] This ascending ranking rule will only be active if the search is set up with the `_geoPoint` parameter that indicates the center point of the ascending ranking rule.

-----------

- On Meilisearch part: We must introduce a new concept, returning the documents with a new `_geoDistance` field when it passed by the `_geo` ranking rule, this has never been done before. We could maybe just do it afterward when the documents have been retrieved from the database, computing the distance from the `_geoPoint` and all of the documents to be returned.

Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-20 19:04:57 +00:00
0d104a0fce Update milli/src/criterion.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-20 18:13:17 +02:00