Commit Graph

1415 Commits

Author SHA1 Message Date
Kerollmops
5f2a1a4fd1 Skip the documents before fetching them 2025-02-06 15:40:22 +01:00
Kerollmops
2b0e17ede0 Make sure arroy is using the rayon thread-pool 2025-02-06 15:28:10 +01:00
Kerollmops
37092adc71 Show a bit of progress 2025-02-06 10:37:05 +01:00
Kerollmops
86fcad788e Introduce a parameter to skip the first documents 2025-02-06 10:32:50 +01:00
Kerollmops
2ea5c57871 Create a new export documents meilitool subcommand based on v1.12 2025-02-06 10:32:39 +01:00
michascant
7b4f2aa593 updated code 2025-02-05 22:07:32 -05:00
michascant
1fb96d3edb made changes to ensure it's not allowing everything through 2025-02-05 20:37:07 -05:00
Tamo
b63c64395d add a test ensuring the index-scheduler version is set when we cannot write the version file 2025-02-05 18:08:50 +01:00
Tamo
628119e31e fix the dumpless upgrade potential corruption when upgrading from the v1.12 2025-02-05 18:08:50 +01:00
Louis Dureuil
b21b8e8f30 Remote search tests 2025-02-05 15:03:33 +01:00
Louis Dureuil
4a9e5ae215 mv multi.rs -> multi/mod.rs 2025-02-05 15:03:33 +01:00
Louis Dureuil
6e1865b75b network integration tests 2025-02-05 15:03:32 +01:00
Louis Dureuil
64409a1de7 Test server: clear_api_key 2025-02-05 15:03:32 +01:00
Louis Dureuil
1b81cab782 Add more analytics 2025-02-05 15:03:32 +01:00
Louis Dureuil
88190b5602 Fix tests 2025-02-05 15:03:32 +01:00
Louis Dureuil
0b27aa5138 Multi search reads header to know if it is being proxied 2025-02-05 15:03:32 +01:00
Louis Dureuil
35160788d7 Proxy search requests 2025-02-05 15:03:32 +01:00
Louis Dureuil
c3e5c3ba36 Allow rebuilding a SearchQueryWithIndex from its components 2025-02-05 15:03:16 +01:00
Louis Dureuil
04ac0af54b Add WeightedScoreValues to be able to compare remote scores 2025-02-05 15:03:16 +01:00
Louis Dureuil
9996533364 Make search types serialize and deserialize so that reading from a proxy is possible 2025-02-05 15:03:16 +01:00
Louis Dureuil
3f6b334fc5 Route network 2025-02-05 15:03:16 +01:00
Louis Dureuil
b30e5a7a35 Add new permissions 2025-02-05 15:03:16 +01:00
Louis Dureuil
6d79cb23ba New error codes 2025-02-05 15:03:16 +01:00
Louis Dureuil
e34afca6d7 Support network in dumps 2025-02-05 15:03:16 +01:00
Louis Dureuil
4918b9ffb6 Network stored in DB 2025-02-05 15:03:15 +01:00
Louis Dureuil
73474e7af0 Network types 2025-02-05 15:03:15 +01:00
Louis Dureuil
7ae6dda03f Add new experimental feature 2025-02-05 15:01:04 +01:00
meili-bors[bot]
00e764b0d3 Merge #5314
5314: Activate used database size r=irevoire a=ManyTheFish

# Pull Request

Make the `/stats` route return the `usedDatabaseSize`, which corresponds to the size used to store the "real" data in the database, not the disk size used by LMDB.


Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-02-05 12:51:57 +00:00
ManyTheFish
4abf0db0b4 Activate used database size 2025-02-05 13:45:47 +01:00
Tamo
61e8cfd4bc Send the OSS analytics once per day instead of once per hour 2025-02-04 15:39:00 +01:00
meili-bors[bot]
796acd1aee Merge #5288
5288: Improve AI logging r=dureuill a=Kerollmops

This PR fixes #5285 and brings the changes from #5233 to simplify debugging indexation and search performance issues related to AI. The following texts can be found in the logs to debug and understand performance issues:

 - `embed_one: search` represents the time we spent waiting for the embedding generation, i.e., OpenAI, local HuggingFace, Ollama.
 - `filtered_universe: search::universe` the time spent filtering the documents.
 - ~`next_bucket: search::vector_sort` is the time spent finding the nearest neighbors (ANNs) in the vector store (arroy), locally~ was being triggered too many times.
 - `indexing::vectors` is the time arroy spends indexing the new vectors for a batch.
 - `documents::extract vectors` and `documents::merge vectors` to see the time spent generating and writing the embeddings.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-04 10:20:45 +00:00
Kerollmops
cc8df5e11f Move back the search-side logging to tracing 2025-02-04 11:16:17 +01:00
meili-bors[bot]
ede74ccc42 Merge #5306
5306: Fix internal error when passing `documentTemplateMaxBytes` to a source that doesn't support it r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #5305 

## What does this PR do?
- add `DOCUMENT_TEMPLATE_MAX_BYTES` to `allowed_sources_for_field` and `allowed_fields_for_source` to prevent a panic


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-04 08:46:13 +00:00
Tamo
d34f0b606c Update crates/milli/src/update/new/document_change.rs 2025-02-03 12:08:52 +01:00
meili-bors[bot]
6425451bbc Merge #5303
5303: Bring back changes from v1.12.8 into v1.13.0 r=Kerollmops a=Kerollmops

Fixes #5087 and other problems that you can find in the original PR #5294.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-03 10:49:26 +00:00
Kerollmops
acc400face Support merging update and replacement operations 2025-02-03 11:47:17 +01:00
meili-bors[bot]
fe46855462 Merge #5235
5235: Introduce a compaction subcommand in meilitool r=dureuill a=Kerollmops

This PR proposes a change to the meilitool helper, introducing the `compact-index` subcommand to reduce the size of the indexes.

While working on this tool, I discovered that the current heed `Env::copy_to_file` API is not very temp file friendly and [could be improved](https://github.com/meilisearch/heed/issues/306).

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2025-02-03 10:11:01 +00:00
Kerollmops
8e7d2d25f2 Only open indexes, do not create them 2025-02-03 10:50:38 +01:00
Louis Dureuil
a436534515 Fix test 2025-02-03 10:36:34 +01:00
Kerollmops
aa2327591e Add more mixing updates and replacements tests 2025-02-03 10:34:07 +01:00
Kerollmops
a6f9e0ddf0 Fix auto batching related tests 2025-02-03 10:34:07 +01:00
Kerollmops
60470bb647 Fix the tests to use the new replace/update documents 2025-02-03 10:34:07 +01:00
Kerollmops
294e1ba16d Fix functions calls to use the new mixed system 2025-02-03 10:34:06 +01:00
Kerollmops
8e6893ddbe Make sure we correctly mix different document operations 2025-02-03 10:34:06 +01:00
Kerollmops
d018346f18 Make the auto-batcher batch replacement with updates 2025-02-03 10:34:05 +01:00
Kerollmops
2385842537 Fix the imports 2025-02-03 10:29:09 +01:00
Kerollmops
6a70c0ec92 Add a link to the experimental feature GitHub discussion 2025-02-03 10:24:53 +01:00
Kerollmops
7a9382b115 Better document the rayon limitation condition 2025-02-03 10:24:53 +01:00
Kerollmops
62dabeba5f Do not create too many rayon tasks when processing the settings 2025-02-03 10:24:52 +01:00
Kerollmops
48812229a9 Remove a log that would log too much 2025-02-03 10:24:52 +01:00