Commit Graph

204 Commits

Author SHA1 Message Date
98a785b0d7 Merge #5080
Some checks failed
Indexing bench (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of indexing (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Waiting to run
Run the indexing fuzzer / Setup the action (push) Successful in 1h5m43s
Test suite / Tests on ${{ matrix.os }} (macos-13) (push) Waiting to run
Publish binaries to GitHub release / Publish binary for ${{ matrix.os }} (meilisearch, meilisearch-macos-amd64, macos-13) (push) Waiting to run
Publish binaries to GitHub release / Publish binary for macOS silicon (meilisearch-macos-apple-silicon, aarch64-apple-darwin) (push) Waiting to run
Look for flaky tests / flaky (push) Failing after 21s
Test suite / Tests on ubuntu-20.04 (push) Failing after 10s
Test suite / Tests on ${{ matrix.os }} (windows-2022) (push) Failing after 22s
Test suite / Tests almost all features (push) Failing after 7s
Test suite / Test disabled tokenization (push) Failing after 7s
Test suite / Run tests in debug (push) Failing after 9s
Test suite / Run Rustfmt (push) Successful in 1m24s
Test suite / Run Clippy (push) Successful in 6m14s
Publish binaries to GitHub release / Check the version validity (push) Successful in 7s
Publish binaries to GitHub release / Publish binary for Linux (push) Failing after 9s
Publish binaries to GitHub release / Publish binary for ${{ matrix.os }} (meilisearch.exe, meilisearch-windows-amd64.exe, windows-2022) (push) Failing after 19s
Publish binaries to GitHub release / Publish binary for aarch64 (meilisearch-linux-aarch64, aarch64-unknown-linux-gnu) (push) Failing after 9s
5080: Fix getting a single batch through the GET route r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes a bug where getting a single batch does not work

Related to #5070 


fix by `@Kerollmops` 

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-21 17:08:46 +00:00
ba7500998e Fix getting a single batch through the GET route 2024-11-21 17:59:31 +01:00
19e6f675b3 Merge #4900
4900: Indexer edition 2024 r=Kerollmops a=dureuill

This PR is implementing the indexer edition 2024, largely inspired by [the ideas from this blog post](https://blog.kerollmops.com/meilisearch-is-too-slow).

Fixes https://github.com/meilisearch/meilisearch/issues/4985

## Features
- Stream-first approach to reading documents.
- Minimum disk write operations.
- RAM usage-first approach to avoid modifying common bitmaps on disk but in memory.
- Reduced LMDB fragmentation by writing entries only once...
- ...computing the final version of the entries in parallel...
- ...and storing them in write-optimized data structures before sending them to the BTree (LMDB).
- Indexing in multiple transactions to improve large dataset support (dumps).


Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-21 16:19:10 +00:00
36962b943b First batch of PR comment 2024-11-21 16:38:11 +01:00
1e694ae432 improve the count of the number of tasks in a batch 2024-11-20 17:48:26 +01:00
bda2b41d11 update snaps after merge 2024-11-20 17:08:30 +01:00
6e6acfcf1b Merge branch 'main' into indexer-edition-2024 2024-11-20 16:59:58 +01:00
4d616f8794 Parse every attributes and filter before tokenization 2024-11-20 15:15:25 +01:00
35bbe1c2a2 Add failing test on settings changes 2024-11-20 14:48:12 +01:00
a7ac590e9e implements the reverse query parameter for the batches 2024-11-20 13:29:52 +01:00
32d0e50a75 Fix all the benchmark compilation errors 2024-11-20 13:16:32 +01:00
aba8a0e9e0 Fix some tests but not all of them 2024-11-20 13:16:31 +01:00
7e379b3d14 remove useless prints 2024-11-20 12:27:12 +01:00
56eacd221f update the tests after the rebase 2024-11-20 10:54:38 +01:00
b906e3ed70 improve the way we access the mutex 2024-11-20 10:51:06 +01:00
4abcd9c04e add some stats on the batches 2024-11-20 10:51:06 +01:00
5d10c2312b remove unused file 2024-11-20 10:51:06 +01:00
f1d38581e5 add the front end tests on the batches routes 2024-11-20 10:51:06 +01:00
d489f5635f add the mapping between the task and batches 2024-11-20 10:49:23 +01:00
a1251c3c83 Implements the get all batches route with filters working 2024-11-20 10:42:55 +01:00
6062914654 add the batch_id to the tasks 2024-11-20 10:42:54 +01:00
057fcb3993 Add indices field to _matchesPosition to specify where in an array a match comes from (#5005)
Some checks are pending
Indexing bench (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of indexing (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Waiting to run
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Waiting to run
Run the indexing fuzzer / Setup the action (push) Successful in 1h4m31s
* Remove unreachable code

* Add `indices` field to `MatchBounds`

For matches inside arrays, this field holds the indices of the array
elements that matched. For example, searching for `cat` inside
`{ "a": ["dog", "cat", "fox"] }` would return `indices: [1]`. For nested
arrays, this contains multiple indices, starting with the one for the
top-most array. For matches in fields without arrays, `indices` is not
serialized (does not exist) to save space.
2024-11-20 01:00:43 +01:00
8924d486db Add a test reproducing the bug 2024-11-18 16:08:55 +01:00
e0c3f3d560 Fix #4984 2024-11-18 16:08:53 +01:00
e9d17136b2 Add deadline of 3 seconds to embedding requests made in the context of hybrid search 2024-11-18 12:15:11 +01:00
a05e448cf8 Add test 2024-11-18 12:15:11 +01:00
9150c8f052 Accept changes to vector format 2024-11-18 11:04:57 +01:00
72ba353498 reproduce sdk fail 2024-11-18 10:03:23 +01:00
0dd321afc7 reproduce #4984 2024-11-14 10:02:51 +01:00
8e5b1a3ec1 Compute the field distribution and convert _geo into an f64s 2024-11-13 17:44:05 +01:00
b17896d899 Finialize the GeoExtractor 2024-11-13 17:43:02 +01:00
a01bc7b454 Fix error_document_field_limit_reached_in_one_document test 2024-11-13 10:34:54 +01:00
82dcaba6ca Fix test: somehow on main vectors where displayed even though retrieveVectors: false 2024-11-12 23:58:25 +01:00
cb1d6613dd Adjust snapshots 2024-11-12 23:26:30 +01:00
1d13e804f7 Adjust test snapshots 2024-11-12 22:52:41 +01:00
68bbf674c9 Make REST mock thread independent 2024-11-12 16:31:31 +01:00
e32677999f Adapt some snapshots 2024-11-08 00:06:33 +01:00
2eb1801e85 reverse the order of the task queue 2024-11-07 19:19:44 +01:00
a5d7ae23bd Merge #5044
5044: Adds new metrics to prometheus r=irevoire a=PedroTurik

not 100% confident in this solution, especially because i couldn't make the "Search Queue searches waiting" metric give me any value other than 0 with my local testing 😆. But i believe it solves the Issue.

# Pull Request

## Related issue
Fixes #4998 

## What does this PR do?
### Adds new metrics to prometheus;
- SearchQueue size, 
- SearchQueue searches running, 
- and Search Queue searches waiting.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: Pedro Turik Firmino <pedroturik@gmail.com>
2024-11-07 17:05:43 +00:00
03886d0012 Applies optimizations to formatted integration tests (#5043) 2024-11-07 15:58:55 +01:00
b427b9e88f Merge #5025
5025: test: improve performance of get_documents.rs r=irevoire a=PedroTurik

# Pull Request

## Related issue
Fixes one item from #4840 

## What does this PR do?
- Applies the changes recommended on the issue for `meilisearch/tests/documents/get_documents.rs`

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Pedro Turik Firmino <pedroturik@gmail.com>
2024-11-07 09:46:34 +00:00
8b95f5ccc6 Adds new metrics to prometheus: SearchQueue size, SearchQueue searches running, and Search Queue searches waiting. 2024-11-06 15:37:16 -03:00
10feeb88f2 Merge branch 'main' into indexer-edition-2024 2024-11-06 15:19:18 +01:00
da59a043ba Fixes formatting issues 2024-11-06 09:55:48 -03:00
da4d47b5d0 Fixes formatting issues 2024-11-06 09:54:20 -03:00
ede086bc30 Merge #5034
5034: Upgrade from v1 10 to v1 11 r=irevoire a=irevoire

# Pull Request

## Related issue
Parts of https://github.com/meilisearch/meilisearch/issues/4978

## What does this PR do?
- Move the code around the offline upgrade to its own module with a file per version
- Fix the upgrade from v1.9 to v1.10 because I couldn’t make it work anymore. It now uses a specified format instead of relying on cargo to get the right set of feature
- ☝️ must be checked against docker
- Provide an update path from v1.10 to v1.11. Most of the code is boilerplate in meilitool, the real code is located here: 053807bf38/src/lib.rs (L161-L269)


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-11-05 14:49:56 +00:00
1b49b60486 Merge #5026
5026: test: improve performance of update_documents.rs  r=dureuill a=PedroTurik

# Pull Request

## Related issue
Fixes one item from #4840 

## What does this PR do?
- Applies the changes recommended on the issue for `meilisearch/tests/documents/update_documents.rs`

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Pedro Turik Firmino <pedroturik@gmail.com>
2024-11-05 08:37:44 +00:00
d0b1ba20cb Improves usage of shared indexes 2024-11-04 17:26:50 -03:00
106cc7fe3a fmt 2024-11-04 17:51:40 +01:00
cf6ad1ae5e Merge branch 'main' into tmp-release-v1.11.0 2024-11-04 16:14:44 +01:00