Commit Graph

7195 Commits

Author SHA1 Message Date
a8defb585b Merge #742
742: Add a "Criterion implementation strategy" parameter to Search r=irevoire a=loiclec

Add a parameter to search requests which determines the implementation strategy of the criteria. This can be either `set-based`, `iterative`, or `dynamic` (ie choosing between set-based or iterative at search time). See https://github.com/meilisearch/milli/issues/755 for more context about this change.


Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2022-12-21 12:18:49 +00:00
339a4b0789 Make clippy happy 2022-12-21 12:49:34 +01:00
904fd2f6d1 Add a search strategy option to the cli 2022-12-21 12:48:53 +01:00
229405aeb9 Choose implementation strategy of criterion at runtime 2022-12-21 09:29:39 +01:00
2780e365e2 test update and ndjson serde use from_slice 2022-12-21 14:31:45 +08:00
bf2a401a05 serde ndjson fix 2022-12-21 11:27:15 +08:00
9925309492 Merge #3263
3263: Handle most io error instead of tagging everything as an internal r=dureuill a=irevoire

Fix https://github.com/meilisearch/meilisearch/issues/2255
Fix https://github.com/meilisearch/meilisearch/issues/2785
Close https://github.com/meilisearch/milli/pull/580

- [x] Find a way to catch the `io::Error` contained in `serde_json::Error`: We can't: https://docs.rs/serde_json/latest/serde_json/struct.Error.html
- [x] Check the `grenad::Error` as well => the `grenad::Error::Io` error are correctly converted to a `milli::Error::Io` error 
- [x] Ensure the error code mean the same thing under windows

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-12-20 17:15:53 +00:00
9e0cce5ca4 Update dump/src/error.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2022-12-20 18:08:51 +01:00
336ea57384 Update dump/src/error.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2022-12-20 18:08:44 +01:00
c637bfba37 convert all the document format error due to io to io::Error 2022-12-20 17:49:38 +01:00
3040172562 update the error message as well 2022-12-20 17:31:13 +01:00
249e051cd4 Merge #750
750: Fix hard-deletion of an external id that was soft-deleted and then reimported - main r=irevoire a=loiclec

# Pull Request

## Related issue
Fixes (when merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/3021

## What does this PR do?
There was a bug happening when:

1. Documents were added
2. Some of these documents were replaced using soft-deletion
3. A deletion of another non-replaced document takes place and triggers a hard-deletion
4. Documents with the same identifiers as the replaced documents are added again

Then, search results would return duplicate documents. No crash would happen at any time (this is the reason it wasn't caught by the previous fuzz test. I have updated the new one such that it also checks the result of a placeholder search request, which then finds the bug immediately).

The cause of the bug is: 

1. When a hard-deletion is triggered, we try to retrieve the external document id associated with each soft-deleted document id. 
2. Then, we take this list of external document ids and remove each of them from the `ExternalDocumentsIds` structure. 
3. However, this is not correct in case an existing (non-deleted) document shares the external id of a soft-deleted document. 
   
## Implementation of the fix
1. Before we process a permanent deletion, we update the list of soft-deleted document ids.
2. Then, the permanent deletion's job is to remove the soft-deleted documents from all data structures. Therefore, to update `ExternalDocumentsIds`, we can simply call the `delete_soft_deleted_documents_ids_from_fsts` method, which is faster and simpler.

## Correctness
A unit test was added to reproduce the bug. The new fuzz test, when adjusted to check the correctness of a placeholder search, could also instantly reproduce the bug, but now does not find any other problem.

Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2022-12-20 16:13:20 +00:00
52aa34d984 remove an unused error handling file 2022-12-20 16:32:51 +01:00
fc0e7382fe Fix hard-deletion of an external id that was soft-deleted 2022-12-20 15:33:31 +01:00
2c86d42a44 Merge #3264
3264: Remove macos-latest and windows-latest usages r=curquiza a=curquiza

Related to https://github.com/meilisearch/meilisearch/issues/3109#issuecomment-1359151297

Remove the `macos-latest` and `windows-latest` to replace them with the specific version: this will avoid "surprises" in the future when GitHub changes the `latest` version.
This way, it will also allow us to let the documentation team know about the changes, since we will control the macOS/Windows version we support

Co-authored-by: curquiza <clementine@meilisearch.com>
2022-12-20 10:53:37 +00:00
8ce3a34ffa Remove macos-latest and windows-latest usages 2022-12-20 11:10:09 +01:00
259c04eb28 Merge #3261
3261: Use ubuntu-18.04 container instead of GitHub hosted actions r=curquiza a=curquiza

Related to (but does not fix totally) https://github.com/meilisearch/meilisearch/issues/3109 and https://github.com/meilisearch/product/discussions/547#discussioncomment-4109143

## For reviewers, what's the PR changes:
- Use ubuntu-latest where compiling with ubuntu-18.04 is not needed (`update-version-cargo-toml`, `fmt`, `clippy` jobs)
- Where ubuntu-18.04 is required
  - Use `ubuntu-latest` as runner
  - Use `ubuntu:18.04` as Docker container
  - Install the required dependencies (curl and cc)
  - Use `actions-rs/toolchain@v1` instead of `hecrj/setup-rust-action@master`. It's more stable and followed alternative. Plus it was easy to make it work with our container contrary to the old one. Change applied in all our CIs to be more consistent
- Remove some useless space to increase readability.

Co-authored-by: curquiza <clementine@meilisearch.com>
2022-12-20 09:28:09 +00:00
d8fb506c92 handle most io error instead of tagging everything as an internal 2022-12-19 20:50:40 +01:00
aa03e02fdc Apply Rustfmt 2022-12-19 19:24:56 +01:00
7ef23addb6 Add comment to bring more context 2022-12-19 18:46:27 +01:00
97fb64e40e Merge #747
747: Soft-deletion computation no longer depends on the mapsize r=irevoire a=dureuill

# Pull Request

## Related issue

Related to https://github.com/meilisearch/meilisearch/issues/3231: After removing `--max-index-size`, the `mapsize` will always be unrelated to the actual max size the user wants for their DB, so it doesn't make sense to use these values any longer.

This implements solution 2.3 from https://github.com/meilisearch/meilisearch/issues/3231#issuecomment-1348628824

## What does this PR do?

### User-visible

- Soft-deleted are no longer deleted when there is less than 10% of the mapsize available or when they take more than 10% of the mapsize
- Instead, they are deleted when they are more soft deleted than regular documents, or when they take more than 1GiB disk space (estimated).

### Implementation standpoint

1. Adds a `DeletionStrategy` struct to replace the boolean `disable_soft_deletion` that we had up until now. This enum allows us to specify that we want "always hard", "always soft", or to use the dynamic soft-deletion strategy (default).
2. Uses the current strategy when deleting documents, with the new heuristics being used in the `DeletionStrategy::Dynamic` variant.
3. Updates the tests to use the appropriate DeletionStrategy whenever needed (one of `AlwaysHard` or `AlwaysSoft` depending on the test)

Note to reviewers: this PR is optimized for a commit-by-commit review.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2022-12-19 17:46:18 +00:00
b3fce7c366 Remove useless continue-on-error 2022-12-19 18:39:35 +01:00
5099a40484 Use ubuntu-18.04 container in publish CIs 2022-12-19 18:35:33 +01:00
69edbf9f6d Update milli/src/update/delete_documents.rs 2022-12-19 18:23:50 +01:00
8957251eed Merge #751
751: Update version for the next release (v0.38.0) in Cargo.toml files r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2022-12-19 17:02:39 +00:00
c72535531b Update version for the next release (v0.38.0) in Cargo.toml files 2022-12-19 16:35:38 +00:00
19ee9a828f Merge #3262
3262: Clippy fixes after updating Rust to v1.66 r=curquiza a=dureuill

Ran `cargo clippy --fix`

Fixes the CI.


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2022-12-19 14:05:59 +00:00
869d331680 Clippy fixes after updating Rust to v1.66 2022-12-19 14:17:12 +01:00
913eff5b2f Use ubuntu-18.04 container in rust tests 2022-12-19 10:46:29 +01:00
916c23e7be Tests: rename snapshots 2022-12-19 10:07:17 +01:00
ad9937c755 Fix tests after adding DeletionStrategy 2022-12-19 10:07:17 +01:00
171c942282 Soft-deletion computation no longer takes into account the mapsize
Implemented solution 2.3 from https://github.com/meilisearch/meilisearch/issues/3231#issuecomment-1348628824
2022-12-19 10:07:17 +01:00
e2ae3b24aa Hard or soft delete according to the deletion strategy 2022-12-19 10:00:13 +01:00
fc7618d49b Add DeletionStrategy 2022-12-19 09:49:58 +01:00
b4a73f2d74 Remove redundant date-setting 2022-12-16 08:32:44 +01:00
4e175ae882 Replace Index::new_with_creation_dates(...) with Index::new(...) 2022-12-16 08:20:13 +01:00
5a0a0468df Combine created and added into date 2022-12-16 08:11:12 +01:00
7f88c4ff2f Fix #1714 test 2022-12-15 18:22:28 +01:00
96d4242b93 Update charabia 2022-12-15 18:22:22 +01:00
60ebf0ea0b Add a specific test on finite pagination placeolder search with distinct attributes 2022-12-15 17:28:20 +01:00
867279f2a4 Merge #3249
3249: Bring back changes from release-v0.30.3 to main r=curquiza a=curquiza

⚠️ ⚠️ I had to fix git conflicts, ensure I did not lose anything ⚠️ ⚠️ 

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2022-12-15 14:13:30 +00:00
5114686394 Merge #743
743: Fix finite pagination with placeholder search r=Kerollmops a=ManyTheFish

this bug is reproducible on real datasets and is hard to isolate in a simple test.

related to: https://github.com/meilisearch/meilisearch/issues/3200

poke `@curquiza` 

Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-12-15 09:31:47 +00:00
3322018c06 Fix placeholder search 2022-12-14 20:09:47 +01:00
ce84a59873 Re-apply some changes from #3132 2022-12-14 20:02:39 +01:00
d66bb3a53f rename the two new functions 2022-12-14 17:27:43 +01:00
6c0b8edab5 Fix typos
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2022-12-14 17:27:37 +01:00
fbbc6eaeca Fix the import of dumps and snapshot.
Some flags were badly applied + the database wrongly deleted when they shouldn't
2022-12-14 17:27:28 +01:00
60c3bac108 Bump milli to v0.37.3 2022-12-14 17:25:40 +01:00
9491fe0704 Merge #3247
3247: Re-add push in docker CI r=curquiza a=curquiza

I made a mistake here https://github.com/meilisearch/meilisearch/pull/3229, `push` is not `true` by default, see https://github.com/docker/build-push-action#customizing

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-12-14 13:15:41 +00:00
240c73d292 Re-add push 2022-12-14 14:05:25 +01:00