Commit Graph

801 Commits

Author SHA1 Message Date
cde7ce4f44 Add test 2024-03-27 14:02:09 +01:00
087a96d22e fix flaky test 2024-03-27 11:05:37 +01:00
34dfea72cc Merge #4509
4509: Rest embedder r=ManyTheFish a=dureuill

Fixes #4531 

See [Usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42?pvs=25#e6f58c3b742c4effb4ddc625ce12ee16)

### Implementation changes

- Remove tokio, futures, reqwests
- Add a new `milli::vector::rest::Embedder` embedder
- Update OpenAI and Ollama embedders to use the REST embedder internally
- Make Embedder::embed a sync method
- Add the new embedder source as described in the usage


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-27 09:27:46 +00:00
3a1f458139 fix a flaky test 2024-03-26 21:06:55 +01:00
55df9daaa0 adds a comment about the safety of an operation 2024-03-26 19:34:55 +01:00
2e36f069c2 fmt imports 2024-03-26 19:23:55 +01:00
8f5d9f501a update the discussion link 2024-03-26 19:18:32 +01:00
8127c9a115 handle the case of a queue of zero elements 2024-03-26 19:04:39 +01:00
e7704f1fc1 add a test to ensure we effectively returns a retry-after when the search queue is full 2024-03-26 18:08:59 +01:00
34262c7a0d Add analytics for the negative operator 2024-03-26 18:01:27 +01:00
e2a1bbae37 simplify and improve the http error 2024-03-26 17:53:37 +01:00
e433fd53e6 rename the method to get a permit and use it in all search requests 2024-03-26 17:28:03 +01:00
3f23fbb46d create the experimental CLI argument 2024-03-26 16:43:40 +01:00
c41e1274dc push and test the search queue datastructure 2024-03-26 15:56:43 +01:00
9a95ed619d Add tests 2024-03-26 10:36:56 +01:00
f82d056072 Hide secrets in settings and task queue 2024-03-26 10:36:24 +01:00
5ea017b922 Merge #4530
4530: fix: set the histogram bucket boundaries to follow the otel spec r=curquiza a=rohankmr414

# Pull Request

## What does this PR do?
- Fixes the http request duration histogram bucket boundaries to follow the opentelemetry spec, currently the bucket boundaries are too granular and only track latencies below 1s.

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Rohan Kumar <rohankmr414@gmail.com>
2024-03-25 12:23:31 +00:00
dfa5e41ea6 Check validity of the URL setting 2024-03-25 11:23:16 +01:00
f649f58013 embed no longer async 2024-03-25 11:23:03 +01:00
13a84ae557 fix: set the histogram bucket boundaries to follow the otel spec 2024-03-25 11:20:30 +05:30
5833070358 feat: add status code label to prometheus http request counter 2024-03-25 10:49:40 +05:30
d8fe4fe49d return the order in the score details 2024-03-19 15:45:04 +01:00
0ae39644f7 fix the facet search 2024-03-19 15:07:06 +01:00
4369e9e97c add an error code test on the setting 2024-03-19 11:14:28 +01:00
7bd881b9bc adds the degraded searches to the prometheus dashboard 2024-03-19 10:35:47 +01:00
6a0c399c2f rename the search_cutoff parameter to search_cutoff_ms 2024-03-19 10:35:47 +01:00
038c26c118 stop returning the degraded boolean when a search was cutoff 2024-03-19 10:35:47 +01:00
ad9192fbbf reduce the size of an integration test 2024-03-19 10:35:47 +01:00
b8cda6c300 fix the search cutoff and add a test 2024-03-19 10:35:47 +01:00
b72495eb58 fix the settings tests 2024-03-19 10:28:23 +01:00
d1db495119 add a settings for the search cutoff 2024-03-19 10:28:23 +01:00
4a467739cd implements a first version of the cutoff without settings 2024-03-19 10:28:21 +01:00
5c95b5c933 chore: remove repetitive words
Signed-off-by: shuangcui <fliter@qq.com>
2024-03-14 21:28:55 +08:00
abd954755d Merge #4476
4476: Make the `/facet-search` route use the `sortFacetValuesBy` setting r=irevoire a=Kerollmops

This PR fixes #4423 by ensuring that the `/facet-search` route uses the `sortFacetValuesBy` setting.

Note for the documentation team (to be moved in the tracking issue): Using the new `sortFacetValuesBy` setting can slow down the facet-search requests as Meilisearch iterates over the whole list of facet values and computes the count of documents on every entry. That is hardly or even impossible to optimize correctly.

### TODO
 - [x] Create a custom HashMap wrapper for the facet `OrderBy` settings.
         This wrapper will return the `OrderBy` setting of the facet, if not defined will use the default `*` one, and if not there either (strange) will fall back on the lexicographic one.
- [x] Create a `ValuesCollection` wrapper that implements the logic for the lexicographic and count order by.
  - [x] Use it when there is no search query.
  - [x] Use it when there is a search query with and without allowed typos.
  - [x] Do not change the original logic, only use a wrapper.
- [x] Add tests

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-03-13 14:36:14 +00:00
6c9823d7bb Add tests to sortFacetValuesBy count 2024-03-13 11:59:39 +01:00
9f7a4fbfeb Return the facets of a placeholder facet-search sorted by count 2024-03-13 10:09:01 +01:00
5ed7b6a0b2 Merge #4456
4456: Add Ollama as an embeddings provider r=dureuill a=jakobklemm

# Pull Request

## Related issue
[Related Discord Thread](https://discord.com/channels/1006923006964154428/1211977150316683305)

## What does this PR do?
- Adds Ollama as a provider of Embeddings besides HuggingFace and OpenAI under the name `ollama`
- Adds the environment variable `MEILI_OLLAMA_URL` to set the embeddings URL of an Ollama instance with a default value of `http://localhost:11434/api/embeddings` if no variable is set
- Changes some of the structs and functions in `openai.rs` to be public so that they can be shared.
- Added more error variants for Ollama specific errors
- It uses the model `nomic-embed-text` as default, but any string value is allowed, however it won't automatically check if the model actually exists or is an embedding model

Tested against Ollama version `v0.1.27` and the `nomic-embed-text` model.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: Jakob Klemm <jakob@jeykey.net>
Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>
2024-03-13 08:48:47 +00:00
d3a95ea2f6 Introduce a new OrderByMap struct to simplify the sort by usage 2024-03-12 13:56:56 +01:00
69c118ef76 Extract the facet order before extracting the facets values 2024-03-12 10:35:39 +01:00
8ec3e30d2b Merge branch 'main' into tmp-release-v1.7.0 2024-03-11 15:39:51 +01:00
f053c280e1 add tests when the field limit is reached 2024-03-06 18:42:41 +01:00
4d42a7af7c Merge #4445
4445: Add subcommand to run benchmarks r=irevoire a=dureuill

# Pull Request

## Related issue
Not user-facing, no issue

## What does this PR do?
- Adds a new `cargo xtask bench` subcommand that can run one or multiple workload files and report the results to a server
- A workload file is a JSON file with a specific schema
- Refactor our use of the `vergen` crate:
  - update to the beta `vergen-git2` crate
  - VERGEN_GIT_SEMVER_LIGHTWEIGHT => VERGEN_GIT_DESCRIBE
  - factor logic in a single `build-info` crate that is used both by meilisearch and xtask (prevents vergen variables from overriding themselves)
  - checked that defining the variables by hand when no git repo is available (docker build case) still works.
- Add CI to run `cargo xtask bench`

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-05 14:03:57 +00:00
7408db2a46 Meilisearch: fix date formatting 2024-03-05 14:56:48 +01:00
15c38dca78 Output RFC 3339 dates where we can
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-03-05 14:44:48 +01:00
b130917933 add the content type in the webhook + improve the test 2024-03-05 11:22:29 +01:00
c608b3f9b5 Factor vergen stuff to a build-info crate 2024-03-05 10:11:43 +01:00
d3004d8040 Implemented Ollama as an embeddings provider
Initial prototype of Ollama embeddings actually working, error handlign / retries still missing.

Allow model to be any String and require dimensions parameter

Fixed rustfmt formatting issues

There were some formatting issues in the initial PR and this should not make the changes comply with the Rust style guidelines

Because I accidentally didn't follow the style guide for commits in my commit messages I squashed them into one to comply
2024-03-04 15:09:43 +01:00
452a343a2b Fix imports 2024-02-28 18:09:40 +01:00
b87485e80d Merge #4433
4433: Enhance facet incremental r=Kerollmops a=ManyTheFish

# Pull Request

## Related issue
Fixes #4367
Fixes #4409

## What does this PR do?

- Add a test reproducing #4409
- Fix #4409 by removing a document from a level only if it is no more present in all the linked sub-level nodes
- Optimize facet Incremental indexing by creating or deleting a complete level once per field id instead of for each facet value
- Optimize facet Incremental indexing by doing the additions and the deletions in the same process instead of doing them separately


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-02-28 15:28:46 +00:00
716ffc07ee Build the embedders when importing a dump 2024-02-26 22:15:57 +01:00