4512: Revert "Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1"" r=Kerollmops a=irevoire
Reverts meilisearch/meilisearch#4510
This PR was supposed to be merged on `release-v1.7.1` not main 🤦
Co-authored-by: Tamo <irevoire@protonmail.ch>
4510: Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1" r=Kerollmops a=irevoire
In https://github.com/meilisearch/meilisearch/pull/4502 we merged main into release-v1.7.1 instead of a temporary branch thus we now need to revert this merge commit.
This reverts commit bd74cce86a, reversing changes made to d2f77e88bd.
Co-authored-by: Tamo <tamo@meilisearch.com>
4500: Don't display dimensions as 0 when it is not set r=ManyTheFish a=dureuill
Fixes regression in embedders where `dimensions: 0` was displayed when it hadn't be set for the `openAi` source.
Was breaking a PHP SDK integration test: cbaecb8c55/tests/Settings/EmbeddersTest.php (L28)
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
4491: chore: remove repetitive words r=curquiza a=shuangcui
# Pull Request
## Related issue
Fixes #<issue_number>
## What does this PR do?
- ...
## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: shuangcui <fliter@qq.com>
4483: Workflows: Fix reason param when benches are triggered from a comment. r=irevoire a=dureuill
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
4487: Update version for the next release (v1.7.1) in Cargo.toml r=Kerollmops a=meili-bot
⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.
Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>
4475: Allow running benchmarks without sending results to the dashboard r=irevoire a=dureuill
Adds a `--no-dashboard` option to avoid sending results to the dashboard.
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
4456: Add Ollama as an embeddings provider r=dureuill a=jakobklemm
# Pull Request
## Related issue
[Related Discord Thread](https://discord.com/channels/1006923006964154428/1211977150316683305)
## What does this PR do?
- Adds Ollama as a provider of Embeddings besides HuggingFace and OpenAI under the name `ollama`
- Adds the environment variable `MEILI_OLLAMA_URL` to set the embeddings URL of an Ollama instance with a default value of `http://localhost:11434/api/embeddings` if no variable is set
- Changes some of the structs and functions in `openai.rs` to be public so that they can be shared.
- Added more error variants for Ollama specific errors
- It uses the model `nomic-embed-text` as default, but any string value is allowed, however it won't automatically check if the model actually exists or is an embedding model
Tested against Ollama version `v0.1.27` and the `nomic-embed-text` model.
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Co-authored-by: Jakob Klemm <jakob@jeykey.net>
Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>
Instead of the user manually specifying the model dimensions it will now automatically get determined
Just like with hf.rs the word "test" gets embedded to determine the dimensions of the output
Add a dedicated error type for if the model doesn't exist (don't automatically pull it though) and set the fault of that error to be the user
4473: Bring back changes from v1.7.0 to main r=curquiza a=curquiza
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
4463: Add tests when the field limit is reached r=Kerollmops a=irevoire
# Pull Request
## Related issue
Related to https://github.com/meilisearch/meilisearch/discussions/4429#discussioncomment-8689101
This user found out that the error message we’re supposed to return when the maximum number of attributes is reached is _not_ returned in some cases
## What does this PR do?
- This PR adds four tests around the maximum number of attributes:
1. Add a document with u16::MAX + 1 fields - Meilisearch panics
2. Add two documents which together adds up to u16::MAX + 1 fields - Meilisearch returns the expected error
3. Add a document with u16::MAX + 1 **nested fields** - No error message but the document isn’t indexed
4. Add two documents which together add up to u16::MAX + 1 nested fields - Meilisearch doesn’t return any error but doesn’t index the document
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Tamo <tamo@meilisearch.com>
4462: Divide threshold by ten r=dureuill a=ManyTheFish
Change the facet incremental vs bulk indexing threshold to better fit our user needs, it might be changed in the future if we have more insights
Co-authored-by: ManyTheFish <many@meilisearch.com>
4458: Replace logging timer by spans r=Kerollmops a=dureuill
- Remove logging timer dependency.
- Remplace last uses in search by spans
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
4459: Put a bound on OpenAI timeout r=dureuill a=dureuill
# Pull Request
## Related issue
Fixes#4460
## What does this PR do?
- Makes sure that the timeout of the openai embedder is limited to max 1min, rather than the prior 15min+
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
4445: Add subcommand to run benchmarks r=irevoire a=dureuill
# Pull Request
## Related issue
Not user-facing, no issue
## What does this PR do?
- Adds a new `cargo xtask bench` subcommand that can run one or multiple workload files and report the results to a server
- A workload file is a JSON file with a specific schema
- Refactor our use of the `vergen` crate:
- update to the beta `vergen-git2` crate
- VERGEN_GIT_SEMVER_LIGHTWEIGHT => VERGEN_GIT_DESCRIBE
- factor logic in a single `build-info` crate that is used both by meilisearch and xtask (prevents vergen variables from overriding themselves)
- checked that defining the variables by hand when no git repo is available (docker build case) still works.
- Add CI to run `cargo xtask bench`
Co-authored-by: Louis Dureuil <louis@meilisearch.com>