Commit Graph

9844 Commits

Author SHA1 Message Date
Louis Dureuil
d731fa661b ollama and openai use new EmbedderOptions 2024-07-24 14:34:17 +02:00
Louis Dureuil
a1beddd5d9 rest embedder: use json_template 2024-07-24 14:34:17 +02:00
Louis Dureuil
4109182ca4 Add json_template module 2024-07-24 14:34:12 +02:00
Louis Dureuil
1a297c048e Error changes 2024-07-24 14:34:12 +02:00
meili-bors[bot]
ecee0c922f Merge #4822
4822: HuggingFace: Clearer error message when a model is not supported r=Kerollmops a=dureuill

# Pull Request

## Related issue
Context: <https://github.com/meilisearch/meilisearch/discussions/4820>

## What does this PR do?
- Improve error message when a model configuration cannot be loaded and its "architectures" field does not contain "BertModel"

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-23 14:09:47 +00:00
Louis Dureuil
303e601b87 HuggingFace: Clearer error message when a model is not supported 2024-07-23 15:13:22 +02:00
meili-bors[bot]
f6d2c59bca Merge #4817
4817: Update version for the next release (v1.10.0) in Cargo.toml r=curquiza a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
2024-07-22 15:51:20 +00:00
curquiza
50b7093f8e Update version for the next release (v1.10.0) in Cargo.toml 2024-07-22 13:54:38 +00:00
meili-bors[bot]
48bc797dce Merge #4812
4812: Allow `MEILI_NO_VERGEN` env var to skip vergen r=irevoire a=dureuill

- vergen checks the state of the `.git` directory to embed commit information into the `meilisearch` binary and the `cargo xtask bench` invocations.
- This check unfortunately results in too many recompilation of the `meilisearch` binary.
- This PR allows skipping vergen when the `MEILI_NO_VERGEN` variable is present in the environment

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-18 16:16:01 +00:00
Louis Dureuil
c6b33fd407 Allow MEILI_NO_VERGEN env var to skip vergen 2024-07-18 17:28:01 +02:00
meili-bors[bot]
6e9d0de8b7 Merge #4806
4806: Update rustls as much as possible r=Kerollmops a=irevoire

# Pull Request

## Related issue
Part of https://github.com/meilisearch/meilisearch/issues/4753

## What does this PR do?
- Update rustls as much as possible

## What is missing

In rustls-0.22.0 two structures we were using have been removed with no explanation or workaround
<img width="518" alt="image" src="https://github.com/user-attachments/assets/fa112db1-3400-4163-8819-7913f22d6b87">



Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 17:00:01 +00:00
Tamo
1bfb16386c Update rustls as much as possible 2024-07-17 18:21:26 +02:00
meili-bors[bot]
ea73615abf Merge #4804
4804: Implements the experimental contains filter operator r=irevoire a=irevoire

# Pull Request
Related PRD: (private link) https://www.notion.so/meilisearch/Contains-Like-Filter-Operator-0d8ad53c6761466f913432eb1d843f1e
Public usage page: https://meilisearch.notion.site/Contains-filter-operator-usage-3e7421b0aacf45f48ab09abe259a1de6

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/3613

## What does this PR do?
- Extract the contains operator from this PR: https://github.com/meilisearch/meilisearch/pull/3751
- Gate it behind a feature flag
- Add tests


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 15:47:11 +00:00
Tamo
02c61eabfa fix the range reported when the experimental feature has not been set 2024-07-17 16:54:33 +02:00
Tamo
56b60ec7a0 apply review comment 2024-07-17 16:13:40 +02:00
meili-bors[bot]
8f416e8f34 Merge #4805
4805: Log the time to index a batch of task r=Kerollmops a=irevoire

This was proposed by `@qdequele` in a private conversation and I think it’s a nice addition.

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 11:45:39 +00:00
Tamo
cf760cbfb1 Log the time to index a batch of task 2024-07-17 11:56:57 +02:00
Tamo
2af9481804 Implements the experimental contains filter operator« 2024-07-17 11:13:37 +02:00
meili-bors[bot]
7a292b572a Merge #4801
4801: AI quality-of-life improvements r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4802 

## What does this PR do?
This PR implements several quality-of-life improvements described in the [public usage](https://meilisearch.notion.site/v1-10-AI-search-changes-737c9d7d010d4dd685582bf5dab579e2#ece824a1814e47a0a986d786baff1be9)


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-17 09:00:47 +00:00
Louis Dureuil
8d6ac261ae Add tests on various failure modes for embedders 2024-07-16 13:39:02 +02:00
Louis Dureuil
b4c8b01c88 Update existing snapshots 2024-07-16 13:39:01 +02:00
Louis Dureuil
24240934f9 Improve errors when indexing documents with a user provided embedder 2024-07-16 13:39:01 +02:00
Louis Dureuil
f4c94ac57f manual embedders: limit max size of errors to 250 2024-07-16 13:39:01 +02:00
Louis Dureuil
4087a88dbe rest|ollama|openai: increase tries to 10 + randomize retry duration 2024-07-16 13:39:00 +02:00
Louis Dureuil
5adacf2f45 OpenAI: embed only the first MAX_TOKENS tokens 2024-07-16 13:39:00 +02:00
Louis Dureuil
65d0c32aa7 Allow overriding OpenAI's url 2024-07-16 13:39:00 +02:00
Louis Dureuil
82647bcded When retrieveVectors is true, retrieve _vectors.embedder even if there are no vector for that embedder 2024-07-16 13:39:00 +02:00
meili-bors[bot]
1582c7e788 Merge #4769
4769: Federated search r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #4747 

[Usage](https://meilisearch.notion.site/v1-10-federated-search-698dfe36ab6b4668b044f735fb40f0b2)

## What does this PR do?
- multi-search now allows a top-level federation object. When not `null`, the results of multi-search are modified to be a single list of results rather than a list of a list of results
- changed lifetimes around tokenizer et al. to be able to make hits one by one rather than using a vector
- adds `roaring` to Meilisearch itself. As the federated search happens at the Meilisearch level (reuses the search functions declared at the Meilisearch level + merge happens after the hits were created), `RoaringBitmap`s are needed to track the candidates: hits that were seen,  all candidates.
- Refactor `make_hits` to allow for an individual, optimized `make_hit` 
- Score details comparison no longer fail when sorting on different field names or target point (for geo)

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-07-16 08:14:46 +00:00
Louis Dureuil
20094eba06 Apply review comments 2024-07-15 12:43:29 +02:00
Louis Dureuil
c35904d6e8 search::federated::ranking_rules -> search::ranking_rules 2024-07-15 08:43:22 +02:00
Louis Dureuil
2cacc448b6 Rename src/search.rs -> src/search/mod.rs 2024-07-15 08:43:21 +02:00
Louis Dureuil
a61b852695 Add tests 2024-07-15 08:43:21 +02:00
Louis Dureuil
3167411e98 Analytics 2024-07-15 08:43:21 +02:00
Louis Dureuil
83d71662aa Changes to multi_search route 2024-07-15 08:43:21 +02:00
Louis Dureuil
5c323cecc7 search: introduce federated search 2024-07-15 08:43:21 +02:00
meili-bors[bot]
77b9347fff Merge #4783
4783: Update minimal ubuntu version used from 18.04 to 20.04 r=curquiza a=curquiza

Fixes #4782 

Co-authored-by: curquiza <clementine@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-11 16:44:30 +00:00
Tamo
c85dd9f635 install a default stable toolchain before cargo build tries to install cross 2024-07-11 18:43:47 +02:00
curquiza
7da95d62e2 Add DEBIAN_FRONTEND to avoid interaction with tzdata 2024-07-11 18:43:47 +02:00
curquiza
2cda1360ee Remove ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION in CI 2024-07-11 18:43:47 +02:00
curquiza
5f9c05b944 Update minimal ubuntu version used from 18.04 to 20.04 2024-07-11 18:43:47 +02:00
Louis Dureuil
d3a6d2a6fa search: introduce hitmaker 2024-07-11 16:35:59 +02:00
Louis Dureuil
2123d76089 search: introduce "search_from_kind" 2024-07-11 16:35:11 +02:00
Louis Dureuil
edab4e75b0 Make SearchKind cloneable 2024-07-11 16:33:24 +02:00
Louis Dureuil
b9982587d4 Add new errors to meilisearch 2024-07-11 16:31:44 +02:00
Louis Dureuil
e83da00446 Milli changes to match to allow for more flexible lifetimes 2024-07-11 16:29:35 +02:00
Louis Dureuil
7fb3e378ff Do not fail sort comparisons when the field name or target point are different 2024-07-11 16:28:14 +02:00
Louis Dureuil
12a7a45930 Add roaring to meilisearch 2024-07-11 16:27:50 +02:00
meili-bors[bot]
677ed6bbf6 Merge #4787
4787: Add index exists function in index_scheduler which stops opening indexes to only check if they exist. r=Kerollmops a=Karribalu

# Pull Request

## Related issue
Fixes #4784

## What does this PR do?
- Added index_exists function in the index_scheduler.
- Resolved opening indexes to only check if they exist.
- Made changes to existing tests to test this function.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: karribalu <karri.balu123456@gmail.com>
2024-07-11 13:05:20 +00:00
meili-bors[bot]
29b44e5541 Merge #4626
4626: Edit Documents with Rhai r=ManyTheFish a=Kerollmops

This PR introduces a first version of [the _Update Documents with Function_ (internal)](https://www.notion.so/meilisearch/Update-Documents-by-Function-45f87b13e61c4435b73943768a490808). It uses [the Rhai programming language](https://rhai.rs/) to let users express the modifications they want apply.

You can read more about the way to use this functions on [the Usage PRD Page](https://meilisearch.notion.site/Edit-Documents-with-Rhai-0cff8fea7655436592e7c8a6de932062?pvs=25). The [prototype is available](https://github.com/meilisearch/meilisearch/actions/runs/9038384483) through Docker by using the following command:

```
docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-edit-documents-with-rhai-3
```

## TODO
 - [x] Support the `DocumentEdition` task in dumps.
 - [x] Remove the unwraps and panics.
 - [x] Improve error codes for the `function` parameter.
 - [x] [Update Rhai to v1.19.0](https://github.com/rhaiscript/rhai/releases/tag/v1.19.0) 🚀
 - [x] Make it an experimental feature (only restrict the HTTP calls).
 - [x] It must be possible not to send a context.
 - [x] Rebase on main.
 - [x] Check that the script cannot do any io.
 - [x] ~Introduce a `Documents.edit` action or~ require the `Documents.all` action.
 - [x] Change the `editionCode` to the clearer `function` field name in the tasks.
 - [x] Support a user provided context and maybe more (but keep function execution isolated for reproducibility).
 - [x] Support deleting documents when the `doc` is `()` (nil, null).
 - [x] Support canceling document edition.
 - [x] Multithread document edition by using rayon (and [rayon-par-bridge](https://docs.rs/rayon-par-bridge/latest/rayon_par_bridge/)).
 - [x] Limit the number of instruction by function execution.
 - [ ] ~Expose the limit of instructions in the settings.~ Not sure, in fact.
 - [x] Ignore unmodified documents in the tasks count.
 - [x] Make the `filter` field optional (not forced to be `null`).

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-11 09:02:55 +00:00
Clément Renault
6e80364c50 Apply review comments 2024-07-11 11:00:27 +02:00