Commit Graph

11202 Commits

Author SHA1 Message Date
Kerollmops
ca1ad51564 Put the Ollama tests under a feature 2025-02-06 17:27:47 +01:00
Louis Dureuil
70aac71c63 exclude network time from processingMs 2025-02-06 17:18:36 +01:00
Kerollmops
a1d1e7c82a Setup dedicated CI to run the Ollama tests 2025-02-06 17:12:17 +01:00
Louis Dureuil
56438bdea4 Introduce an Ollama integration test 2025-02-06 17:12:17 +01:00
meili-bors[bot]
a562d6abc1 Merge #5322
5322: Make sure arroy is using the rayon thread-pool r=dureuill a=Kerollmops

This PR fixes #5249 by ensuring arroy uses the rayon thread pool.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-06 15:28:47 +00:00
michascant
33b67b82e1 fixed rustfmt errors 2025-02-06 09:57:39 -05:00
meili-bors[bot]
b7fdd9516c Merge #4970
4970: Create a new export documents meilitool subcommand r=dureuill a=Kerollmops

This subcommand can be useful for extracting documents from an existing database.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-06 14:48:27 +00:00
Kerollmops
5f2a1a4fd1 Skip the documents before fetching them 2025-02-06 15:40:22 +01:00
Kerollmops
2b0e17ede0 Make sure arroy is using the rayon thread-pool 2025-02-06 15:28:10 +01:00
Kerollmops
37092adc71 Show a bit of progress 2025-02-06 10:37:05 +01:00
Kerollmops
86fcad788e Introduce a parameter to skip the first documents 2025-02-06 10:32:50 +01:00
Kerollmops
2ea5c57871 Create a new export documents meilitool subcommand based on v1.12 2025-02-06 10:32:39 +01:00
michascant
7b4f2aa593 updated code 2025-02-05 22:07:32 -05:00
michascant
1fb96d3edb made changes to ensure its not allowing everything through 2025-02-05 20:37:07 -05:00
Tamo
b63c64395d add a test ensuring the index-scheduler version is set when we cannot write the version file 2025-02-05 18:08:50 +01:00
Tamo
628119e31e fix the dumpless upgrade potential corruption when upgrading from the v1.12 2025-02-05 18:08:50 +01:00
meili-bors[bot]
78867b6852 Merge #5299
Some checks failed
Test suite / Tests almost all features (push) Has been skipped
Test suite / Test disabled tokenization (push) Has been skipped
Test suite / Run tests in debug (push) Failing after 1s
Test suite / Tests on ubuntu-20.04 (push) Failing after 17s
Test suite / Tests on windows-2022 (push) Failing after 25s
Test suite / Run Rustfmt (push) Failing after 1m6s
Test suite / Run Clippy (push) Successful in 8m46s
Test suite / Tests on macos-13 (push) Has been cancelled
5299: Remote federated search r=dureuill a=dureuill

Fixes #4980 

- Usage: https://www.notion.so/meilisearch/API-usage-Remote-search-request-f64fae093abf409e9434c9b9c8fab6f3?pvs=25#1894b06b651f809a9f3dcc6b7189646e

- Changes database format:
  - Adds a new database key: the code is resilient to the case where the key is missing
  - Adds a new experimental feature: the code for experimental features is resilient to this case

Changes:

- Add experimental feature `proxySearch`
- Add network routes
- Dump support for network
- Add proxy search
- Add various tests

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
v1.13.0-rc.1
2025-02-05 16:08:48 +00:00
Louis Dureuil
b21b8e8f30 Remote search tests 2025-02-05 15:03:33 +01:00
Louis Dureuil
4a9e5ae215 mv multi.rs -> multi/mod.rs 2025-02-05 15:03:33 +01:00
Louis Dureuil
6e1865b75b network integration tests 2025-02-05 15:03:32 +01:00
Louis Dureuil
64409a1de7 Test server: clear_api_key 2025-02-05 15:03:32 +01:00
Louis Dureuil
1b81cab782 Add more analytics 2025-02-05 15:03:32 +01:00
Louis Dureuil
88190b5602 Fix tests 2025-02-05 15:03:32 +01:00
Louis Dureuil
0b27aa5138 Multi search reads header to know if it is being proxied 2025-02-05 15:03:32 +01:00
Louis Dureuil
35160788d7 Proxy search requests 2025-02-05 15:03:32 +01:00
Louis Dureuil
c3e5c3ba36 Allow rebuilding a SearchQueryWithIndex from its components 2025-02-05 15:03:16 +01:00
Louis Dureuil
04ac0af54b Add WeightedScoreValues to be able to compare remote scores 2025-02-05 15:03:16 +01:00
Louis Dureuil
9996533364 Make search types serialize and deserialize so that reading from a proxy is possible 2025-02-05 15:03:16 +01:00
Louis Dureuil
3f6b334fc5 Route network 2025-02-05 15:03:16 +01:00
Louis Dureuil
b30e5a7a35 Add new permissions 2025-02-05 15:03:16 +01:00
Louis Dureuil
6d79cb23ba New error codes 2025-02-05 15:03:16 +01:00
Louis Dureuil
e34afca6d7 Support network in dumps 2025-02-05 15:03:16 +01:00
Louis Dureuil
4918b9ffb6 Network stored in DB 2025-02-05 15:03:15 +01:00
Louis Dureuil
73474e7af0 Network types 2025-02-05 15:03:15 +01:00
Louis Dureuil
7ae6dda03f Add new experimental feature 2025-02-05 15:01:04 +01:00
meili-bors[bot]
00e764b0d3 Merge #5314
5314: Activate used database size r=irevoire a=ManyTheFish

# Pull Request

make the `/stats` route return the `usedDatabaseSize` corresponding to the size used to store the "real" data in the database and not the disk size used by LMDB


Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-02-05 12:51:57 +00:00
ManyTheFish
4abf0db0b4 Activate used database size 2025-02-05 13:45:47 +01:00
meili-bors[bot]
acc885fd0a Merge #5312
5312: Send the OSS analytics once per day instead of once per hour r=ManyTheFish a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5311

## What does this PR do?
- If the instance is OSS => we send the analytics once every day
- If the instance is on the meilisearch cloud => we send the analytics every hour


Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-05 11:15:34 +00:00
Tamo
61e8cfd4bc Send the OSS analytics once per day instead of once per hour 2025-02-04 15:39:00 +01:00
meili-bors[bot]
796acd1aee Merge #5288
Some checks failed
Test suite / Tests almost all features (push) Has been skipped
Test suite / Test disabled tokenization (push) Has been skipped
Test suite / Tests on ubuntu-20.04 (push) Failing after 13s
Test suite / Run tests in debug (push) Failing after 13s
Test suite / Run Clippy (push) Failing after 19s
Test suite / Tests on windows-2022 (push) Failing after 48s
Test suite / Run Rustfmt (push) Successful in 1m28s
Test suite / Tests on macos-13 (push) Has been cancelled
5288: Improve AI logging r=dureuill a=Kerollmops

This PR fixes #5285 and brings the changes from #5233 to simplify debugging indexation and search performance issues related to AI. The following texts can be found in the logs to debug and understand performance issues:

 - `embed_one: search` represents the time we spent waiting for the embedding generation, i.e., OpenAI, local HuggingFace, Ollama.
 - `filtered_universe: search::universe` the time spent filtering the documents.
 - ~`next_bucket: search::vector_sort` is the time spent finding the nearest neighbors (ANNs) in the vector store (arroy), locally~ was being triggered too many times.
 - `indexing::vectors` is the time arroy spends indexing the new vectors for a batch.
 - `documents::extract vectors` and `documents::merge vectors` to see the time spent generating and writing the embeddings.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-04 10:20:45 +00:00
Kerollmops
cc8df5e11f Move back the search-side logging to tracing 2025-02-04 11:16:17 +01:00
meili-bors[bot]
ede74ccc42 Merge #5306
Some checks failed
Test suite / Tests on ubuntu-20.04 (push) Failing after 2s
Test suite / Tests almost all features (push) Has been skipped
Test suite / Test disabled tokenization (push) Has been skipped
Test suite / Run tests in debug (push) Failing after 2s
Test suite / Tests on windows-2022 (push) Failing after 24s
Test suite / Run Rustfmt (push) Successful in 1m33s
Test suite / Run Clippy (push) Successful in 6m20s
Test suite / Tests on macos-13 (push) Has been cancelled
5306: Fix internal error when passing `documentTemplateMaxBytes` to a source that doesn't support it r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #5305 

## What does this PR do?
- add `DOCUMENT_TEMPLATE_MAX_BYTES` to `allowed_sources_for_field` and `allowed_fields_for_source` to prevent a panic


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-04 08:46:13 +00:00
meili-bors[bot]
e93a5719ef Merge #5293
Some checks failed
Test suite / Tests on ubuntu-20.04 (push) Failing after 1s
Test suite / Tests almost all features (push) Has been skipped
Test suite / Test disabled tokenization (push) Has been skipped
Test suite / Run tests in debug (push) Failing after 14s
Test suite / Tests on windows-2022 (push) Failing after 24s
Test suite / Run Clippy (push) Failing after 31s
Test suite / Run Rustfmt (push) Successful in 1m45s
Run the indexing fuzzer / Setup the action (push) Successful in 1h5m3s
Indexing bench (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of indexing (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Has been cancelled
Test suite / Tests on macos-13 (push) Has been cancelled
5293: Support merging update and replacement operations r=irevoire a=Kerollmops

This PR fixes #5286 by modifying the auto-batcher and how we merge documents when preparing them for the new indexer.

## To do
- [x] Make sure we can auto-batch different operation types.
- [x] Make sure the indexer correctly understands and mixes the different kinds.
- [x] Create a test to see if it mixes the documents correctly.
- [x] Modify the auto-batcher tests for the new behavior.


Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-02-03 11:28:41 +00:00
Tamo
d34f0b606c Update crates/milli/src/update/new/document_change.rs 2025-02-03 12:08:52 +01:00
meili-bors[bot]
6425451bbc Merge #5303
5303: Bring back changes from v1.12.8 into v1.13.0 r=Kerollmops a=Kerollmops

Fixes #5087 and other problems that you can find in the original PR #5294.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-03 10:49:26 +00:00
Kerollmops
acc400face Support merging update and replacement operations 2025-02-03 11:47:17 +01:00
meili-bors[bot]
fe46855462 Merge #5235
5235: Introduce a compaction subcommand in meilitool r=dureuill a=Kerollmops

This PR proposes a change to the meilitool helper, introducing the `compact-index` subcommand to reduce the size of the indexes.

While working on this tool, I discovered that the current heed `Env::copy_to_file` API is not very temp file friendly and [could be improved](https://github.com/meilisearch/heed/issues/306).

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2025-02-03 10:11:01 +00:00
Kerollmops
8e7d2d25f2 Only open indexes, do not create them 2025-02-03 10:50:38 +01:00
Louis Dureuil
a436534515 Fix test 2025-02-03 10:36:34 +01:00
Kerollmops
aa2327591e Add more mixing updates and replacements tests 2025-02-03 10:34:07 +01:00