Commit Graph

1409 Commits

Author SHA1 Message Date
bors[bot]
31c8de1cca Merge #322
322: Geosearch r=ManyTheFish a=irevoire

This PR introduces [basic geo-search functionalities](https://github.com/meilisearch/specifications/pull/59), it makes the engine able to index, filter and, sort by geo-point. We decided to use [the rstar library](https://docs.rs/rstar) and to save the points in [an RTree](https://docs.rs/rstar/0.9.1/rstar/struct.RTree.html) that we de/serialize in the index database [by using serde](https://serde.rs/) with [bincode](https://docs.rs/bincode). This is not an efficient way to query this tree as it will consume a lot of CPU and memory when a search is made, but at least it is an easy first way to do so.

### What we will have to do on the indexing part:
 - [x] Index the `_geo` fields from the documents.
   - [x] Create a new module with an extractor in the `extract` module that takes the `obkv_documents` and retrieves the latitude and longitude coordinates, outputting them in a `grenad::Reader` for further process.
   - [x] Call the extractor in the `extract::extract_documents_data` function and send the result to the `TypedChunk` module.
   - [x] Get the `grenad::Reader` in the `typed_chunk::write_typed_chunk_into_index` function and store all the points in the `rtree`
- [x] Delete the documents from the `RTree` when deleting documents from the database. All this can be done in the `delete_documents.rs` file by getting the data structure and removing the points from it, inserting it back after the modification.
- [x] Clearing the `RTree` entirely when we clear the documents from the database, everything happens in the `clear_documents.rs` file.
- [x] save a Roaring bitmap of all documents containing the `_geo` field

### What we will have to do on the query part:
- [x] Filter the documents at a certain distance around a point, this is done by [collecting the documents from the searched point](https://docs.rs/rstar/0.9.1/rstar/struct.RTree.html#method.nearest_neighbor_iter) while they are in range.
  - [x] We must introduce new `geoLowerThan` and `geoGreaterThan` variants to the `Operator` filter enum.
  - [x] Implement the `negative` method on both variants where the `geoGreaterThan` variant is implemented by executing the `geoLowerThan` and removing the results found from the whole list of geo faceted documents.
  - [x] Add the `_geoRadius` function in the pest parser.
- [x] Introduce a `_geo` ascending ranking function that takes a point in parameter, ~~this function must keep the iterator on the `RTree` and make it peekable~~ This was not possible for now, we had to collect the whole iterator. Only the documents that are part of the candidates must be sent too!
  - [x] This ascending ranking rule will only be active if the search is set up with the `_geoPoint` parameter that indicates the center point of the ascending ranking rule.

-----------

- On Meilisearch part: We must introduce a new concept, returning the documents with a new `_geoDistance` field when it passed by the `_geo` ranking rule, this has never been done before. We could maybe just do it afterward when the documents have been retrieved from the database, computing the distance from the `_geoPoint` and all of the documents to be returned.

Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-20 19:04:57 +00:00
Irevoire
0d104a0fce Update milli/src/criterion.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-20 18:13:17 +02:00
Clémentine Urquizar
3f1453f470 Update version for the next release (v0.14.0) 2021-09-20 18:12:23 +02:00
Tamo
f4b8e5675d move the reserved keyword logic for the criterion and sort + add test 2021-09-20 17:21:02 +02:00
Irevoire
3b7a2cdbce fix typo
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-20 16:10:39 +02:00
bors[bot]
203aa727a7 Merge #359
359: Improve the benchmark comparison script r=irevoire a=irevoire

This modification allow us to compare more than 2 benchmarks or to only print the results of one benchmark



Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-20 12:39:59 +00:00
Tamo
eaba772f21 update the README to better match the new critcmp usage
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-20 10:59:55 +02:00
Irevoire
9a920d1f93 Fix datasets links in the readme
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-20 10:44:37 +02:00
Tamo
5e683ba472 add benchmarks for the geosearch 2021-09-20 10:44:37 +02:00
Irevoire
f6c6b026bb improve the comparison script 2021-09-16 11:25:51 +02:00
Tamo
c695a1ffd2 add the possibility to sort by descending order on geoPoint 2021-09-15 11:49:58 +02:00
Tamo
91ce4d1721 Stop iterating through the whole list of points
We stop when there is no possible candidates left
2021-09-15 11:49:58 +02:00
bors[bot]
3b1885859d Merge #356
356: Update the README r=curquiza a=Kerollmops

This PR updates a little bit the README and more specifically the indexing times, fixes #352.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-09-14 10:13:05 +00:00
Kerollmops
2741aa8589 Update the indexing timings in the README 2021-09-14 11:42:59 +02:00
Kerollmops
a43f99c600 Inform the users that documents must have an id in there documents 2021-09-13 14:01:02 +02:00
bors[bot]
90d64d257f Merge #354
354: Update version for the next release (v0.13.1) r=ManyTheFish a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-13 09:30:07 +00:00
Clémentine Urquizar
f167f7b412 Update version for the next release (v0.13.1) 2021-09-10 09:48:17 +02:00
bors[bot]
4af31ec9a6 Merge #353
353: Add lacking parameter to word level position builder r=Kerollmops a=ManyTheFish



Co-authored-by: many <maxime@meilisearch.com>
2021-09-09 16:36:33 +00:00
Tamo
cfc62a1c15 use geoutils instead of haversine 2021-09-09 18:11:38 +02:00
many
26deeb45a3 Add lacking parameter to word level position builder 2021-09-09 17:49:04 +02:00
Tamo
3fc145c254 if we have no rtree we return all other provided documents 2021-09-09 17:44:09 +02:00
Irevoire
a84f3a8b31 Apply suggestions from code review
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-09 15:09:35 +02:00
Tamo
c81ff22c5b delete the invalid criterion name error in favor of invalid ranking rule name 2021-09-08 19:17:00 +02:00
Tamo
bad8ea47d5 edit the two lasts TODO comments 2021-09-08 18:24:09 +02:00
Tamo
b15c77ebc4 return an error in case a user try to sort with :desc 2021-09-08 18:24:09 +02:00
Tamo
4b618b95e4 rebase on main 2021-09-08 18:24:09 +02:00
Tamo
2988d3c76d tests the geo filters 2021-09-08 18:24:09 +02:00
Tamo
e5ef0cad9a use meters in the filters 2021-09-08 18:24:09 +02:00
Tamo
4f69b190bc remove the distance from the search, the computation of the distance will be made on meilisearch side 2021-09-08 18:24:09 +02:00
Tamo
7ae2a7341c introduce the reserved keywords in the filters 2021-09-08 18:24:09 +02:00
Tamo
6d5762a6c8 handle the case where you forgot entirely the parenthesis 2021-09-08 18:24:09 +02:00
Tamo
ebf82ac28c improve the error messages and add tests for the filters 2021-09-08 18:24:09 +02:00
Tamo
bd4c248292 improve the error handling in general and introduce the concept of reserved keywords 2021-09-08 18:24:09 +02:00
Tamo
e8c093c1d0 fix the error handling in the filters 2021-09-08 18:24:09 +02:00
Tamo
f0b74637dc fix all the tests 2021-09-08 18:24:09 +02:00
Tamo
b1bf7d4f40 reformat 2021-09-08 18:24:09 +02:00
Tamo
aca707413c remove the memory leak 2021-09-08 18:24:09 +02:00
Tamo
a8a1f5bd55 move the geosearch criteria out of asc_desc.rs 2021-09-08 18:24:09 +02:00
Tamo
dc84ecc40b fix a bug 2021-09-08 18:24:09 +02:00
Tamo
7483614b75 [HTTP-UI] add the sorters 2021-09-08 18:24:09 +02:00
Tamo
4820ac71a6 allow spaces in a geoRadius 2021-09-08 18:24:09 +02:00
Tamo
13c78e5aa2 Implement the _geoPoint in the sortable 2021-09-08 18:24:09 +02:00
Tamo
5bb175fc90 only index _geo if it's set as sortable OR filterable
and only allow the filters if geo was set to filterable
2021-09-08 17:51:08 +02:00
Tamo
f73273d71c only call the extractor if needed 2021-09-08 17:51:08 +02:00
cvermand
4fd0116a0d Stringify objects on dashboard to avoid [Object object] 2021-09-08 17:51:08 +02:00
Irevoire
ea2f2ecf96 create a new database containing all the documents that were geo-faceted 2021-09-08 17:51:08 +02:00
Irevoire
4b459768a0 create the _geoRadius filter 2021-09-08 17:51:07 +02:00
Irevoire
6d70978edc update the facet filter grammar 2021-09-08 17:51:07 +02:00
Irevoire
216a8aa3b2 add a tests for the indexation of the geosearch 2021-09-08 17:51:07 +02:00
Irevoire
a21c854790 handle errors 2021-09-08 17:51:07 +02:00