Commit Graph

120 Commits

Author SHA1 Message Date
bors[bot]
31c8de1cca Merge #322
322: Geosearch r=ManyTheFish a=irevoire

This PR introduces [basic geo-search functionalities](https://github.com/meilisearch/specifications/pull/59) that make the engine able to index, filter, and sort by geo-point. We decided to use [the rstar library](https://docs.rs/rstar) and to save the points in [an RTree](https://docs.rs/rstar/0.9.1/rstar/struct.RTree.html) that we de/serialize in the index database [by using serde](https://serde.rs/) with [bincode](https://docs.rs/bincode). This is not an efficient way to query the tree, as it consumes a lot of CPU and memory whenever a search is made, but it is an easy first way to do so.

### What we will have to do on the indexing part:
 - [x] Index the `_geo` fields from the documents.
   - [x] Create a new module with an extractor in the `extract` module that takes the `obkv_documents` and retrieves the latitude and longitude coordinates, outputting them in a `grenad::Reader` for further processing.
   - [x] Call the extractor in the `extract::extract_documents_data` function and send the result to the `TypedChunk` module.
   - [x] Get the `grenad::Reader` in the `typed_chunk::write_typed_chunk_into_index` function and store all the points in the `rtree`.
- [x] Delete the documents from the `RTree` when deleting documents from the database. All this can be done in the `delete_documents.rs` file by getting the data structure, removing the points from it, and inserting it back after the modification.
- [x] Clear the `RTree` entirely when we clear the documents from the database. Everything happens in the `clear_documents.rs` file.
- [x] Save a Roaring bitmap of all documents containing the `_geo` field.

### What we will have to do on the query part:
- [x] Filter the documents within a certain distance of a point. This is done by [collecting the documents from the searched point](https://docs.rs/rstar/0.9.1/rstar/struct.RTree.html#method.nearest_neighbor_iter) while they are in range.
  - [x] We must introduce new `geoLowerThan` and `geoGreaterThan` variants to the `Operator` filter enum.
  - [x] Implement the `negative` method on both variants where the `geoGreaterThan` variant is implemented by executing the `geoLowerThan` and removing the results found from the whole list of geo faceted documents.
  - [x] Add the `_geoRadius` function in the pest parser.
- [x] Introduce a `_geo` ascending ranking function that takes a point as a parameter. ~~This function must keep the iterator on the `RTree` and make it peekable.~~ This was not possible for now; we had to collect the whole iterator. Only the documents that are part of the candidates must be sent too!
  - [x] This ascending ranking rule will only be active if the search is set up with the `_geoPoint` parameter that indicates the center point of the ascending ranking rule.
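
The relationship between the two filter variants described above can be sketched as follows. This is a minimal illustration, not the PR's code: a linear scan stands in for the `RTree` nearest-neighbor iteration, a `BTreeSet<u32>` stands in for the Roaring bitmap, the distance function is passed in by the caller, and the names `Point`, `geo_lower_than`, and `geo_greater_than` are hypothetical. Whether the boundary itself counts as "lower than" is also an assumption.

```rust
use std::collections::BTreeSet;

/// A geo point as [latitude, longitude]; hypothetical stand-in type.
type Point = [f64; 2];

/// The `geoLowerThan` case: ids of the documents strictly closer than `radius`.
/// A linear scan replaces the RTree nearest-neighbor iteration in this sketch.
fn geo_lower_than<F>(
    points: &[(u32, Point)],
    center: Point,
    radius: f64,
    distance: F,
) -> BTreeSet<u32>
where
    F: Fn(Point, Point) -> f64,
{
    points
        .iter()
        .filter(|(_, point)| distance(center, *point) < radius)
        .map(|(id, _)| *id)
        .collect()
}

/// The `geoGreaterThan` case: run `geo_lower_than`, then remove its results
/// from the whole set of geo-faceted documents.
fn geo_greater_than<F>(
    geo_faceted: &BTreeSet<u32>,
    points: &[(u32, Point)],
    center: Point,
    radius: f64,
    distance: F,
) -> BTreeSet<u32>
where
    F: Fn(Point, Point) -> f64,
{
    let inside = geo_lower_than(points, center, radius, distance);
    geo_faceted.difference(&inside).copied().collect()
}
```

Expressing one variant as the complement of the other means only the in-range query needs to touch the spatial index; the opposite case is a set difference.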

-----------

- On the Meilisearch side: we must introduce a new concept, returning the documents with a new `_geoDistance` field when they have passed through the `_geo` ranking rule. This has never been done before. We could maybe just do it afterward, once the documents have been retrieved from the database, by computing the distance between the `_geoPoint` and each of the documents to be returned.
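
Computing the distance after retrieval could look roughly like the sketch below. The text above does not say which distance formula is used; the haversine formula on a spherical Earth is an assumption here, and `Hit`, `haversine_m`, and `attach_geo_distance` are hypothetical names.

```rust
/// Mean Earth radius in meters (assumption: spherical Earth model).
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in meters between two [lat, lng] points in degrees,
/// using the haversine formula (an assumed choice, not confirmed by the PR).
fn haversine_m(a: [f64; 2], b: [f64; 2]) -> f64 {
    let (lat1, lat2) = (a[0].to_radians(), b[0].to_radians());
    let dlat = lat2 - lat1;
    let dlng = (b[1] - a[1]).to_radians();
    let h = (dlat / 2.0).sin().powi(2)
        + lat1.cos() * lat2.cos() * (dlng / 2.0).sin().powi(2);
    // Clamp to 1.0 so rounding errors cannot push asin out of its domain.
    2.0 * EARTH_RADIUS_M * h.sqrt().min(1.0).asin()
}

/// Hypothetical shape of a retrieved document.
struct Hit {
    id: u32,
    geo: [f64; 2],
    geo_distance: Option<f64>,
}

/// Fills in the distance for every hit once the documents have been retrieved.
fn attach_geo_distance(hits: &mut [Hit], geo_point: [f64; 2]) {
    for hit in hits.iter_mut() {
        hit.geo_distance = Some(haversine_m(geo_point, hit.geo));
    }
}
```

Doing this as a post-processing step keeps the ranking rule itself unchanged, at the cost of computing each distance a second time.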

Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-20 19:04:57 +00:00
Clémentine Urquizar
f167f7b412 Update version for the next release (v0.13.1) 2021-09-10 09:48:17 +02:00
Irevoire
a84f3a8b31 Apply suggestions from code review
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-09 15:09:35 +02:00
Tamo
7483614b75 [HTTP-UI] add the sorters 2021-09-08 18:24:09 +02:00
cvermand
4fd0116a0d Stringify objects on dashboard to avoid [Object object] 2021-09-08 17:51:08 +02:00
Kerollmops
68856e5e2f Disable the default snappy compression for the http-ui crate 2021-09-08 14:17:32 +02:00
Clémentine Urquizar
eb7b9d9dbf Update version for the next release (v0.13.0) 2021-09-08 10:59:30 +02:00
bors[bot]
5cbe879325 Merge #308
308: Implement a better parallel indexer r=Kerollmops a=ManyTheFish

Rewrite the indexer:
- enhance memory consumption control
- optimize parallelism using rayon and crossbeam channel
- factorize the different parts and make new DB implementation easier
- optimize and fix prefix databases


Co-authored-by: many <maxime@meilisearch.com>
2021-09-02 15:03:52 +00:00
Clémentine Urquizar
285849e3a6 Update version for the next release (v0.12.0) 2021-09-02 10:08:41 +02:00
many
1d314328f0 Plug new indexer 2021-09-01 16:48:36 +02:00
Tamo
d106eb5b90 add the sortable attributes to http-ui and fix the tests 2021-08-30 16:25:10 +02:00
Kerollmops
af65485ba7 Reexport the grenad CompressionType from milli 2021-08-24 18:15:31 +02:00
Kerollmops
2f20257070 Update milli to the v0.11.0 2021-08-24 18:10:11 +02:00
Clément Renault
89d0758713 Revert "Revert "Sort at query time"" 2021-08-24 11:55:16 +02:00
Clémentine Urquizar
88f6c18665 Update version for the next release (v0.10.2) 2021-08-23 11:33:30 +02:00
Clémentine Urquizar
922f9fd4d5 Revert "Sort at query time" 2021-08-20 18:09:17 +02:00
bors[bot]
41fc0dcb62 Merge #309
309: Sort at query time r=Kerollmops a=Kerollmops

This PR:
 - Makes the `Asc/Desc` criteria work with strings too. It first returns documents ordered by numbers, then by strings, and finally the documents that can't be ordered. Note that it is lexicographically ordered and not ordered by character, which means that it doesn't know about wide and short characters, e.g. `a`, `丹`, `▲`.
 - Changes the syntax for the `Asc/Desc` criterion by now using a colon to separate the name and the order, e.g. `title:asc`, `price:desc`.
 - Adds the `Sort` criterion at the third position in the ranking rules by default.
 - Adds the `sort_criteria` method to the `Search` builder struct to let the users define the `Asc/Desc` sortable attributes they want to use at query time. Note that we need to check that the fields are registered in the sortable attributes before performing the search.
 - Introduces a new `InvalidSortableAttribute` user error that is raised when the sort criteria declared at query time are not part of the sortable attributes.
 - `@ManyTheFish` introduced integration tests for the dynamic Sort criterion.
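
The colon syntax and the sortable-attributes check described above can be sketched as follows. This is an illustration under assumptions, not milli's actual code: the `AscDesc` and `SortError` types, the `BadSyntax` variant, and the functions `parse_asc_desc` and `check_sort_criteria` are hypothetical; only the `name:order` syntax and the `InvalidSortableAttribute` error name come from the description.

```rust
use std::collections::HashSet;

/// Hypothetical representation of one parsed sort criterion.
#[derive(Debug, PartialEq)]
enum AscDesc {
    Asc(String),
    Desc(String),
}

impl AscDesc {
    /// Name of the field this criterion sorts on.
    fn field(&self) -> &str {
        match self {
            AscDesc::Asc(field) | AscDesc::Desc(field) => field,
        }
    }
}

/// Hypothetical error type; only `InvalidSortableAttribute` is named in the PR.
#[derive(Debug, PartialEq)]
enum SortError {
    BadSyntax(String),
    InvalidSortableAttribute(String),
}

/// Parses `title:asc` or `price:desc`, splitting on the last colon.
fn parse_asc_desc(text: &str) -> Result<AscDesc, SortError> {
    match text.rsplit_once(':') {
        Some((field, "asc")) => Ok(AscDesc::Asc(field.to_string())),
        Some((field, "desc")) => Ok(AscDesc::Desc(field.to_string())),
        _ => Err(SortError::BadSyntax(text.to_string())),
    }
}

/// Rejects any criterion whose field is not registered as sortable.
fn check_sort_criteria(
    criteria: &[&str],
    sortable: &HashSet<String>,
) -> Result<Vec<AscDesc>, SortError> {
    criteria
        .iter()
        .map(|text| {
            let parsed = parse_asc_desc(text)?;
            if sortable.contains(parsed.field()) {
                Ok(parsed)
            } else {
                Err(SortError::InvalidSortableAttribute(parsed.field().to_string()))
            }
        })
        .collect()
}
```

Validating before the search runs lets the caller get a user-facing error instead of a silently ignored criterion.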

Fixes #305.

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: many <maxime@meilisearch.com>
2021-08-18 16:55:32 +00:00
bors[bot]
198c416bd8 Merge #312
312: Update milli version to v0.10.1 r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-08-18 12:08:04 +00:00
Clémentine Urquizar
6cb9c3b81f Update milli version to v0.10.1 2021-08-18 13:46:27 +02:00
Clémentine Urquizar
42cf847a63 Update tokenizer version to v0.2.5 2021-08-18 13:37:41 +02:00
Kerollmops
5b88df508e Use the new Asc/Desc syntax everywhere 2021-08-17 14:15:22 +02:00
Clémentine Urquizar
fcc520e49a Update version for the next release (v0.10.0) 2021-08-16 12:00:28 +02:00
Clémentine Urquizar
7f26c75610 Update milli to v0.9.0 2021-08-04 16:04:55 +02:00
Kerollmops
341c244965 Bump milli to v0.8.1 2021-07-29 15:56:36 +02:00
Clémentine Urquizar
6a141694da Update version for the next release (v0.8.0) 2021-07-27 16:38:42 +02:00
bors[bot]
cc54c41e30 Merge #283
283: Use the AlwaysFreePages flag when opening an index r=irevoire a=Kerollmops

We introduced a new flag in our fork of LMDB. This `AlwaysFreePages` flag forces LMDB to always free the single pages it uses before writing to the disk instead of keeping them in a linked list.

Declaring this flag reduces the memory footprint (leak) we observe after indexing a lot of documents.

Fixes #279.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2021-07-05 16:59:16 +00:00
Irevoire
4562b278a8 remove a warning and add a log
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-07-05 17:46:02 +02:00
Tamo
a57e522a67 introduce a die route to let the program exit by itself 2021-07-05 17:38:10 +02:00
Kerollmops
91c5d0c042 Use the AlwaysFreePages flag when opening an index 2021-07-05 16:36:13 +02:00
Kerollmops
a6b4069172 Bump to v0.7.2 2021-07-05 10:54:53 +02:00
Clémentine Urquizar
b489515f4d Update milli version to v0.7.1 2021-06-30 13:52:46 +02:00
Clément Renault
80c6aaf1fd Bump milli to 0.7.0 2021-06-28 18:31:56 +02:00
Clément Renault
bdc5599b73 Bump heed to use the git repo with v0.12.0 2021-06-28 18:26:20 +02:00
Kerollmops
98285b4b18 Bump milli to 0.6.0 2021-06-23 17:30:26 +02:00
Clémentine Urquizar
9885fb4159 Update version for the next release (v0.5.1) 2021-06-23 14:05:20 +02:00
bors[bot]
634201244c Merge #250 #251
250: Add the limit field to http-ui r=Kerollmops a=irevoire



251: Fix the limit r=Kerollmops a=irevoire

There was no check on the limit, so if a user specified a very large number, this line could cause a panic.
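
The description does not say which line panicked. One common way an unchecked limit can panic in Rust is by using it directly as an allocation size, since `Vec::with_capacity` panics on capacity overflow. The sketch below assumes that scenario and shows a guard; `take_limited` is a hypothetical function, not the actual fix.

```rust
/// Collects at most `limit` ids without trusting `limit` as an allocation size.
/// Assumed scenario: `Vec::with_capacity(limit)` with a huge `limit` would panic.
fn take_limited(candidates: &[u32], limit: usize) -> Vec<u32> {
    // Never reserve more than what can actually be returned.
    let capped = limit.min(candidates.len());
    let mut out = Vec::with_capacity(capped);
    out.extend(candidates.iter().take(capped).copied());
    out
}
```

With this guard, a request such as `take_limited(&ids, usize::MAX)` simply returns every candidate.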

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-06-22 13:00:52 +00:00
Tamo
81643e6d70 add the limit field to http-ui 2021-06-22 14:47:23 +02:00
Tamo
77eb37934f add jemalloc to http-ui and the benchmarks 2021-06-22 14:17:56 +02:00
Clémentine Urquizar
320670f8fe Update version for the next release (v0.5.0) 2021-06-21 15:59:17 +02:00
Clémentine Urquizar
35fcc351a0 Update version for the next release (v0.4.2) 2021-06-20 17:37:24 +02:00
Kerollmops
ccd6f13793 Update version to the next release (0.4.1) 2021-06-17 15:01:20 +02:00
Tamo
9716fb3b36 format the whole project 2021-06-16 18:33:33 +02:00
Clémentine Urquizar
f5ff3e8e19 Update version for the next release (v0.4.0) 2021-06-16 14:01:05 +02:00
Kerollmops
78fe4259a9 Fix the http-ui crate 2021-06-14 18:06:23 +02:00
Clémentine Urquizar
7d5395c12b Update Tokenizer version to v0.2.3 2021-06-10 17:00:04 +02:00
Clémentine Urquizar
dc64e139b9 Update version for the next release (v0.3.1) 2021-06-09 14:39:21 +02:00
Kerollmops
103dddba2f Move the UpdateStore into the http-ui crate 2021-06-08 17:59:51 +02:00
Clémentine Urquizar
3b2b3aeea9 Update Cargo.toml for next release v0.3.0 2021-06-03 12:24:27 +02:00
Kerollmops
3b1cd4c4b4 Rename the FacetCondition into FilterCondition 2021-06-02 16:24:58 +02:00
Kerollmops
c10469ddb6 Patch the http-ui crate to support filterable fields 2021-06-02 16:24:58 +02:00