Commit Graph

1260 Commits

Author SHA1 Message Date
d9eba9d145 improve and test the sort error message 2021-09-30 14:38:27 +02:00
0ee67bb7d1 improve the reserved keyword error message for the filters 2021-09-30 14:38:27 +02:00
22551d0941 Merge #379
379: Revert "Change chunk size to 4MiB to fit more the end user usage" r=curquiza a=ManyTheFish

Reverts meilisearch/milli#370

Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-09-29 13:20:53 +00:00
26b5dad042 Revert "Change chunk size to 4MiB to fit more the end user usage" 2021-09-29 15:08:39 +02:00
6a057a3bd0 Merge #378
378: Hotfix meilisearch#1707 r=Kerollmops a=ManyTheFish

This PR contains an ugly quick fix of [meilisearch#1707](https://github.com/meilisearch/MeiliSearch/issues/1707).

- remove the reversed comparison on rank, which improves relevancy and performance.
- iterate over level 0 only, which improves performance.

A better fix is in development.

Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-09-29 12:57:31 +00:00
2e49230ca2 Update milli/src/search/criteria/attribute.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-29 14:49:45 +02:00
7ad0214089 Update milli/src/search/criteria/attribute.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-29 14:49:41 +02:00
1df5b8712b Hotfix meilisearch#1707 2021-09-29 14:41:56 +02:00
bfedbc1b6d Merge #374
374: Enhance CSV document parsing r=Kerollmops a=ManyTheFish

Benchmarks on `search_songs` were crashing because of the CSV parsing.

Co-authored-by: many <maxime@meilisearch.com>
2021-09-29 08:55:54 +00:00
68c758a533 Merge #376
376: Stop casting integer docids to string r=Kerollmops a=irevoire

When a docid is an integer, we stop casting it to a string, and thus we don't add `"` around it.

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-29 08:32:48 +00:00
d2427f18e5 Enhance CSV document parsing 2021-09-29 10:25:33 +02:00
00f94b1ffd Merge #377
377: Update version for the next release (v0.17.0) r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-28 20:43:33 +00:00
0e8665bf18 Update version for the next release (v0.17.0) 2021-09-28 19:38:12 +02:00
f65153ad64 stop casting integer docids to string 2021-09-28 18:35:54 +02:00
adddf3f179 Merge #375
375: Fixes #365 r=Kerollmops a=vishnugt



Co-authored-by: Vishnu Ganesan <vganesan@microsoft.com>
Co-authored-by: Vishnu Gt <vishnugt@hotmail.com>
2021-09-28 14:42:48 +00:00
785c1372f2 Change "settings" to "setting"
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2021-09-28 20:11:32 +05:30
3580b2d803 Fixes #365 2021-09-28 19:30:23 +05:30
3a12f5887e Merge #373
373: Improve error message for bad sort syntax with geosearch r=Kerollmops a=irevoire

`@Kerollmops` This should be the last PR for the geosearch and error handling, sorry for doing it in so many steps 😬 

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-28 12:39:32 +00:00
a80dcfd4a3 improve error message for bad sort syntax with geosearch 2021-09-28 14:32:24 +02:00
b2a332599e Merge #372
372: Fix Meilisearch 1714 r=Kerollmops a=ManyTheFish

The bug comes from the typo tolerance: to determine how many typos are accepted, we were counting bytes instead of characters in a word.
With Chinese script characters, we were allowing 2 typos on 3-character words.
We now count the number of `char`s instead of bytes to assign the typo tolerance.
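
A minimal sketch of the byte-versus-character difference described above. The helper names and the length thresholds (0 typos below 5 units, 1 typo from 5 to 8, 2 typos from 9) are assumptions chosen for illustration and are not taken from the milli source.

```rust
/// Hypothetical helper: typos tolerated for a word measured as `len` units.
/// The thresholds are assumed for this example only.
fn allowed_typos(len: usize) -> u8 {
    match len {
        0..=4 => 0,
        5..=8 => 1,
        _ => 2,
    }
}

/// Old behavior as described: measure the word in UTF-8 bytes.
fn typos_by_bytes(word: &str) -> u8 {
    allowed_typos(word.len())
}

/// New behavior as described: measure the word in `char`s.
fn typos_by_chars(word: &str) -> u8 {
    allowed_typos(word.chars().count())
}

fn main() {
    // Three characters, each encoded on 3 bytes in UTF-8.
    let word = "搜索器";
    println!("bytes = {}, chars = {}", word.len(), word.chars().count()); // bytes = 9, chars = 3
    println!("typos when counting bytes = {}", typos_by_bytes(word)); // 2
    println!("typos when counting chars = {}", typos_by_chars(word)); // 0
}
```

With these assumed thresholds, a 3-character word that occupies 9 bytes is granted 2 typos when measured in bytes and none when measured in characters.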

Related to [Meilisearch#1714](https://github.com/meilisearch/MeiliSearch/issues/1714)

Co-authored-by: many <maxime@meilisearch.com>
2021-09-28 11:59:45 +00:00
8046ae4bd5 Count the number of char instead of counting bytes to assign the typo tolerance 2021-09-28 12:10:43 +02:00
1988416295 Add failing test related to Meilisearch#1714 2021-09-28 12:05:11 +02:00
3b479948c6 Merge #371
371: Provide a sort error handler r=Kerollmops a=irevoire

This PR simplifies the error handling of asc-desc rules for Meilisearch or any other wrapper by providing, directly in milli, a new error type called `SortError` that can be generated from an `AscDescError` and automatically converted to a `UserError`.

Now, wherever you are in the code, as a user or in milli, you can parse an `AscDesc` syntax and, depending on the context, convert the resulting error into either a `SortError` or a `CriterionError` in one line, with improved error messages.
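
The conversion chain described above can be sketched with `From` implementations. The types below are simplified, hypothetical stand-ins that only reuse the names from this description; their variants, the `parse_asc_desc` and `parse_sort` helpers, and the `_geo` reserved-name check are assumptions, not the actual milli definitions.

```rust
// Simplified, hypothetical stand-ins for the error types named above.
#[derive(Debug, Clone, PartialEq)]
enum AscDescError {
    InvalidSyntax { name: String },
    ReservedKeyword { name: String },
}

#[derive(Debug, Clone, PartialEq)]
enum SortError {
    InvalidName { name: String },
    ReservedName { name: String },
}

#[derive(Debug, Clone, PartialEq)]
enum UserError {
    SortError(SortError),
}

impl From<AscDescError> for SortError {
    fn from(error: AscDescError) -> Self {
        match error {
            AscDescError::InvalidSyntax { name } => SortError::InvalidName { name },
            AscDescError::ReservedKeyword { name } => SortError::ReservedName { name },
        }
    }
}

impl From<SortError> for UserError {
    fn from(error: SortError) -> Self {
        UserError::SortError(error)
    }
}

fn invalid(text: &str) -> AscDescError {
    AscDescError::InvalidSyntax { name: text.to_string() }
}

/// Hypothetical parser for `field:asc` / `field:desc`; returns (field, is_ascending).
fn parse_asc_desc(text: &str) -> Result<(String, bool), AscDescError> {
    let (name, order) = text.rsplit_once(':').ok_or_else(|| invalid(text))?;
    if name == "_geo" {
        // Assumed reserved name, for illustration only.
        return Err(AscDescError::ReservedKeyword { name: name.to_string() });
    }
    match order {
        "asc" => Ok((name.to_string(), true)),
        "desc" => Ok((name.to_string(), false)),
        _ => Err(invalid(text)),
    }
}

/// The "one line" conversion: `map_err` builds a `SortError`,
/// and `?` lifts it into a `UserError` through `From`.
fn parse_sort(text: &str) -> Result<(String, bool), UserError> {
    Ok(parse_asc_desc(text).map_err(SortError::from)?)
}

fn main() {
    println!("{:?}", parse_sort("price:asc"));
    println!("{:?}", parse_sort("price"));
    println!("{:?}", parse_sort("_geo:desc"));
}
```

A `CriterionError` could be wired in the same way with its own `From<AscDescError>` implementation, so the caller picks the target error type from context.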

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-28 09:28:32 +00:00
cc732fe95e update http-ui to use the sort-error 2021-09-28 11:15:24 +02:00
c7cb816ae1 simplify the error handling of the sort syntax for meilisearch 2021-09-27 19:07:22 +02:00
4c09f6838f Merge #370
370: Change chunk size to 4MiB to fit more the end user usage r=ManyTheFish a=ManyTheFish

We ran several indexing tests using datasets of different sizes (5 datasets from 9MiB to 100MiB) on several types of VMs (`XS: 1GiB RAM, 1 VCPU`, `S: 2GiB RAM, 2 VCPU`, `M: 4GiB RAM, 3 VCPU`, `L: 8GiB RAM, 4 VCPU`).
The results of these tests show that the `4MiB` chunk size seems to be the best among the chunk sizes tested (`2MiB`, `4MiB`, `8MiB`, `16MiB`, `32MiB`, `64MiB`, `128MiB`).

Below is the average time per chunk size:

![Capture d’écran 2021-09-27 à 14 27 50](https://user-images.githubusercontent.com/6482087/134909368-ef0bc45e-68d5-49d1-aaf9-91113b7c410f.png)

<details>
<summary>Detailed data</summary>
<br>

![Capture d’écran 2021-09-27 à 14 39 48](https://user-images.githubusercontent.com/6482087/134909952-a36b1457-bbbd-4a6c-bbe5-519e4b926b5a.png)
</br>
</details> 


Co-authored-by: many <maxime@meilisearch.com>
2021-09-27 12:57:52 +00:00
b188063869 Change chunk size to 4MiB to fit more the end user usage 2021-09-27 14:26:21 +02:00
0f8320bdc2 Merge #369
369: Add test checking the bug reported in meilisearch issue 1716 r=Kerollmops a=ManyTheFish

The bug is not present in the newer milli version.

Related to [Meilisearch#1716](https://github.com/meilisearch/MeiliSearch/issues/1716)

Co-authored-by: many <maxime@meilisearch.com>
2021-09-23 14:27:34 +00:00
551df0cb77 Add test checking the bug reported in meilisearch issue 1716 2021-09-23 15:55:39 +02:00
87dd441a3a Merge #367
367: Update version for the next release (v0.16.0) r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-22 15:20:20 +00:00
1eacab2169 Update version for the next release (v0.15.1) 2021-09-22 17:18:54 +02:00
b806097141 Merge #366
366: Geosearch error handling r=Kerollmops a=irevoire

Rewrite most of the geosearch error handling and add another batch of tests on the criterion parsing.

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-09-22 15:08:11 +00:00
218f0a6661 Apply suggestions from code review
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-22 17:00:27 +02:00
47ee93b0bd return an error when _geoPoint is used but _geo is not sortable 2021-09-22 16:37:41 +02:00
1e5e3d57e2 auto convert AscDescError into CriterionError 2021-09-22 16:37:41 +02:00
023446ecf3 create a smaller and easier to maintain CriterionError type 2021-09-22 16:37:41 +02:00
86e272856a create an asc_desc error type that is never supposed to be returned to the end user 2021-09-22 16:37:41 +02:00
257e621d40 create an asc_desc module 2021-09-22 16:37:41 +02:00
113a061bee fix the error handling on the criterion side 2021-09-22 15:09:07 +02:00
ad3befaaf5 Merge #364
364: Fix all the benchmarks  r=Kerollmops a=irevoire

#324 broke all benchmarks.
I fixed everything and noticed that `cargo check --all` was insufficient to check the bench in multiple workspaces, so I also updated the CI to use `cargo check --workspace --all-targets`.

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-22 12:40:34 +00:00
176160d32f fix all benchmarks and add the compile time checking of the benchmarks in the ci 2021-09-22 12:10:21 +02:00
16790ee620 Merge #363
363: Fix the returned `AscDesc` error r=Kerollmops a=irevoire

In my previous PR on the geosearch, I erased the change I had introduced in the PR before it: the new error type returned when we fail to parse the `AscDesc` type.

Sorry about that; here is the fix.

Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-22 09:53:35 +00:00
78b0bce9a1 fix the returned error when asc desc fails to be parsed 2021-09-22 11:37:05 +02:00
2837cab5da Merge #362
362: Remove the `Cargo.lock` again r=Kerollmops a=irevoire



Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-22 09:33:09 +00:00
2e99fa8251 remove the cargo.lock again 2021-09-22 11:30:33 +02:00
fe9f380993 Merge #361
361: Update version for the next release (v0.15.0) r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-21 16:19:16 +00:00
f8ecbc28e2 Update version for the next release (v0.15.0) 2021-09-21 18:09:14 +02:00
700318dc62 Merge #357
357: Add benchmarks for the geosearch r=Kerollmops a=irevoire

closes #336

Should I merge this PR in #322 and then we merge everything in `main` or should we wait for #322 to be merged and then merge this one in `main` later?

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-09-21 16:08:06 +00:00
9d9010e45f Merge #324
324: Implement documents API r=Kerollmops a=MarinPostma

This PR implements the intermediary document representation for milli. The JSON, JSONL and CSV formats are replaced with this format, pushing the serialization duty to the client side.

The `documents` module contains the interface to the new document format:

- The `DocumentsBuilder` allows the creation of a writer-backed document addition, where documents are added either one by one or as arrays of depth 1. This is made possible by the fact that the serializer used by the `add_documents` methods only accepts `[Object]` and `Object`. The related serialization logic is located in the `serde.rs` file.
- The `DocumentsReader` allows iterating over the documents created by a `DocumentsBuilder`. A call to `next_document_with_index` returns the next obkv reader in the document addition, along with a reference to the index used to map the field ids in the obkv reader to the field names.

All references to JSON, JSONL or CSV in the tests have been replaced with the `documents!` macro, which works exactly like the `serde_json::json` macro and is a convenient way to create a `DocumentsReader`.

Rewrote the search CLI into the `cli` crate, which now also allows index manipulation. It only offers basic functionality for now, but is meant to be easier to extend than the http ui.


blocked by #308

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-09-21 15:40:03 +00:00
aa6c5df0bc Implement documents format
document reader transform

remove update format

support document sequences

fix document transform

clean transform

improve error handling

add documents! macro

fix transform bug

fix tests

remove csv dependency

Add comments on the transform process

replace search cli

fmt

review edits

fix http ui

fix clippy warnings

Revert "fix clippy warnings"

This reverts commit a1ce3cd96e603633dbf43e9e0b12b2453c9c5620.

fix review comments

remove smallvec in transform loop

review edits
2021-09-21 16:58:33 +02:00