Commit Graph

650 Commits

Author SHA1 Message Date
meili-bors[bot]
796acd1aee Merge #5288
5288: Improve AI logging r=dureuill a=Kerollmops

This PR fixes #5285 and brings the changes from #5233 to simplify debugging indexation and search performance issues related to AI. The following texts can be found in the logs to debug and understand performance issues:

 - `embed_one: search` represents the time we spent waiting for the embedding generation, i.e., OpenAI, local HuggingFace, Ollama.
 - `filtered_universe: search::universe` represents the time spent filtering the documents.
 - ~`next_bucket: search::vector_sort` is the time spent finding the nearest neighbors (ANNs) in the vector store (arroy), locally~ (struck out: it was being triggered too many times).
 - `indexing::vectors` is the time arroy spends indexing the new vectors for a batch.
 - `documents::extract vectors` and `documents::merge vectors` show the time spent generating and writing the embeddings.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-04 10:20:45 +00:00
Kerollmops
cc8df5e11f Move back the search-side logging to tracing 2025-02-04 11:16:17 +01:00
meili-bors[bot]
ede74ccc42 Merge #5306
5306: Fix internal error when passing `documentTemplateMaxBytes` to a source that doesn't support it r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #5305 

## What does this PR do?
- add `DOCUMENT_TEMPLATE_MAX_BYTES` to `allowed_sources_for_field` and `allowed_fields_for_source` to prevent a panic


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-04 08:46:13 +00:00
meili-bors[bot]
6425451bbc Merge #5303
5303: Bring back changes from v1.12.8 into v1.13.0 r=Kerollmops a=Kerollmops

Fixes #5087 and other problems that you can find in the original PR #5294.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-03 10:49:26 +00:00
meili-bors[bot]
fe46855462 Merge #5235
5235: Introduce a compaction subcommand in meilitool r=dureuill a=Kerollmops

This PR proposes a change to the meilitool helper, introducing the `compact-index` subcommand to reduce the size of the indexes.

While working on this tool, I discovered that the current heed `Env::copy_to_file` API is not very temp file friendly and [could be improved](https://github.com/meilisearch/heed/issues/306).

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2025-02-03 10:11:01 +00:00
Kerollmops
8e7d2d25f2 Only open indexes, do not create them 2025-02-03 10:50:38 +01:00
Louis Dureuil
a436534515 Fix test 2025-02-03 10:36:34 +01:00
Kerollmops
2385842537 Fix the imports 2025-02-03 10:29:09 +01:00
Kerollmops
6a70c0ec92 Add a link to the experimental feature GitHub discussion 2025-02-03 10:24:53 +01:00
Kerollmops
7a9382b115 Better document the rayon limitation condition 2025-02-03 10:24:53 +01:00
Kerollmops
62dabeba5f Do not create too many rayon tasks when processing the settings 2025-02-03 10:24:52 +01:00
Kerollmops
48812229a9 Remove a log that would log too much 2025-02-03 10:24:52 +01:00
Kerollmops
915cc377fb Refine the env variable and the max readers 2025-02-03 10:24:52 +01:00
Louis Dureuil
96544bfa43 add DOCUMENT_TEMPLATE_MAX_BYTES to allowed_sources_for_field and allowed_fields_for_source 2025-02-03 09:59:17 +01:00
Kerollmops
aaefbfae1f Do not create too many rayon tasks 2025-01-30 16:36:12 +01:00
Kerollmops
97e17f52a1 Add more logs to see calls to the embedders 2025-01-30 16:36:12 +01:00
Kerollmops
62ced0e3f1 Make cargo fmt happy 2025-01-30 11:09:54 +01:00
Clément Renault
71bb24f17e Throw an error when the index is not found
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-01-30 11:07:43 +01:00
Clément Renault
c72f114b33 Fix English in the comments
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-01-30 11:07:09 +01:00
meili-bors[bot]
8ed39f5de0 Merge #5300
5300: Improve unexpected panic message r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5273

## What does this PR do?
- When an unexpected panic happens in the index-scheduler, we catch it and rebuild an error message from the join_error
- Same when the index-scheduler upgrade fails


Co-authored-by: Tamo <tamo@meilisearch.com>
2025-01-30 09:23:17 +00:00
Kerollmops
424c5bde40 Move the embedding computation and extraction log to debug 2025-01-29 16:40:36 +01:00
Tamo
bdd3005d10 Log the progress when a batch fails 2025-01-29 16:36:23 +01:00
Kerollmops
cb1b7513af Log the memory metrics only once 2025-01-29 15:21:52 +01:00
Clément Renault
a9d0f4a002 Improve English comments
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-01-29 15:16:40 +01:00
Kerollmops
db032079d8 Show indexation allocated memory 2025-01-29 14:21:02 +01:00
Clément Renault
a00796c46a Improve the naming in the log message 2025-01-29 14:21:02 +01:00
Kerollmops
6112bd8caa Display the channel congestion 2025-01-29 14:21:02 +01:00
Kerollmops
cec88cfc29 Measure the bbqueue congestion 2025-01-29 14:21:02 +01:00
Tamo
8439aeb7cf improve error message in case of unexpected panic while processing tasks 2025-01-29 11:51:06 +01:00
Tamo
1beda3b9af fix another flaky test 2025-01-28 16:53:50 +01:00
Tamo
8676e94f5c fix the flaky tests 2025-01-28 16:53:50 +01:00
Tamo
ef47a0d820 apply review comment 2025-01-28 16:53:50 +01:00
Tamo
e0f0da57e2 make sure the batches we snapshot all actually contain an enqueued_at 2025-01-28 16:53:50 +01:00
Tamo
485e3127c7 use the remove_n_tasks_datetime_earlier_than function when updating batches 2025-01-28 16:53:50 +01:00
Tamo
58f90b70c7 store the enqueued_at to ease the batch deletion 2025-01-28 16:53:50 +01:00
Tamo
508db9020d update the snapshots 2025-01-28 16:53:50 +01:00
Kerollmops
6ff37c6fc4 Fix the insta snapshots 2025-01-28 16:53:50 +01:00
Kerollmops
f21ae1f5d1 Remove the batch id from the date time databases 2025-01-28 16:53:50 +01:00
Kerollmops
19bc885b07 Fix the milli logo 2025-01-27 14:30:59 +01:00
Kerollmops
47f70e3d79 Debug the first vector sort fill buffer 2025-01-27 14:22:29 +01:00
Kerollmops
0f8eb3b506 Improve the logs of the search with AI 2025-01-27 14:22:22 +01:00
Kerollmops
4a5923a55e log the time arroy took to insert embeddings 2025-01-27 14:22:17 +01:00
manojks1999
528d9d6d8b Removed CouldNotUpgrade from error file 2025-01-26 21:04:57 +05:30
Louis Dureuil
50280bf02b Support offline upgrade up to v1.12.7 2025-01-24 12:25:33 +01:00
Clément Renault
9b579069df Comment the max grant of the bbqueue
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-01-24 12:18:32 +01:00
Louis Dureuil
f5a4a1c8b2 Give more RAM to bbqueue.
- bbqueue buffers used to have (5% * 2%) / num_threads
- they now have 5% / num_threads
2025-01-24 12:18:32 +01:00
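
The arithmetic in the commit above can be worked through as a minimal sketch. The commit message does not say what the percentages apply to, so `budget_bytes` is an assumed stand-in for that memory budget, and both function names are hypothetical rather than taken from the Meilisearch code.

```rust
/// Hypothetical helper: per-thread buffer size under the old formula,
/// (5% * 2%) of the budget divided by the number of threads.
fn old_buffer_size(budget_bytes: u64, num_threads: u64) -> u64 {
    budget_bytes * 5 / 100 * 2 / 100 / num_threads
}

/// Hypothetical helper: per-thread buffer size under the new formula,
/// 5% of the budget divided by the number of threads.
fn new_buffer_size(budget_bytes: u64, num_threads: u64) -> u64 {
    budget_bytes * 5 / 100 / num_threads
}

fn main() {
    // Assumed values, chosen only to make the arithmetic easy to follow.
    let budget_bytes: u64 = 1_000_000_000;
    let num_threads: u64 = 8;

    println!("old: {} bytes", old_buffer_size(budget_bytes, num_threads));
    println!("new: {} bytes", new_buffer_size(budget_bytes, num_threads));
}
```

With these assumed values, the old formula gives 125,000 bytes per buffer and the new one gives 6,250,000 bytes, so dropping the 2% factor makes each buffer 50 times larger.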
Kerollmops
5ab4cdb1f3 Reduce the maximum possible grant we can store in the BBQueue 2025-01-24 12:18:32 +01:00
Louis Dureuil
73d8a4eace Remove db.snapshot 2025-01-23 17:21:42 +01:00
Louis Dureuil
c1e5897076 Do not assume v1.12 when there is no index-scheduler version 2025-01-23 17:16:53 +01:00
Louis Dureuil
718a98fbbf remove : char from filenames 2025-01-23 17:08:35 +01:00