Commit Graph

2477 Commits

Author SHA1 Message Date
Louis Dureuil
ee8cbea810 Don't optimize reindexing when fields contain dots 2024-03-27 17:04:45 +01:00
Louis Dureuil
572fb3a51d Finer granularity for embedder needs reindex 2024-03-27 12:01:34 +01:00
Louis Dureuil
4ff0255783 remove unused function 2024-03-27 11:51:14 +01:00
Louis Dureuil
a25456120d Expose distribution in settings 2024-03-27 11:51:04 +01:00
Louis Dureuil
168ded3b9d Deserr for distribution 2024-03-27 11:50:33 +01:00
Louis Dureuil
afd1da5642 Add distribution to all embedders 2024-03-27 11:50:22 +01:00
Clément Renault
34262c7a0d Add analytics for the negative operator 2024-03-26 18:01:27 +01:00
Clément Renault
1da9e0f246 Better support space around the negative operator (-) 2024-03-26 17:47:13 +01:00
Clément Renault
e4a3e603b3 Expose a first working version of the negative keyword 2024-03-26 17:47:13 +01:00
Louis Dureuil
817ccc089a also allow api_key 2024-03-25 11:50:00 +01:00
Louis Dureuil
4136630ea5 Use constants instead of raw strings in set_*set() 2024-03-25 11:39:33 +01:00
Louis Dureuil
58972f35cb Allow url parameter for ollama embedder 2024-03-25 11:32:55 +01:00
Louis Dureuil
dfa5e41ea6 Check validity of the URL setting 2024-03-25 11:23:16 +01:00
Louis Dureuil
a1db342f01 Expose REST embedder to the API 2024-03-25 11:23:15 +01:00
Louis Dureuil
f87747f4d3 Remove unwraps 2024-03-25 11:23:04 +01:00
Louis Dureuil
b6b4b6bab7 Remove the tokio and the reqwests 2024-03-25 11:23:03 +01:00
Louis Dureuil
ac52c857e8 Update ollama and openai impls to use the rest embedder internally 2024-03-25 11:23:03 +01:00
Louis Dureuil
8708cbef25 Add RestEmbedder 2024-03-25 11:23:03 +01:00
Louis Dureuil
c3d02f092d OpenAI sync 2024-03-25 11:23:03 +01:00
Louis Dureuil
bc58e8a310 Documentation for the vector module 2024-03-25 11:23:03 +01:00
meili-bors[bot]
ec81c2bf1a Merge #4511
4511: Bump charabia to 0.8.8 r=ManyTheFish a=6543

... and update lock file

this will add the fix (https://github.com/meilisearch/charabia/pull/275) to support markdown formatted codeblocks

Co-authored-by: 6543 <6543@obermui.de>
2024-03-25 09:26:11 +00:00
meili-bors[bot]
fc1c3f4a29 Merge #4466
4466: Implements the search cutoff r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4488

## What does this PR do?
- Adds a cutoff to the bucket sort after 150ms has been spent
- Adds a new setting to customize the default value of 150ms
- When the time is exceeded, we exit early with what we had the time to sort
- If the cutoff has been reached, the search details are updated with a new `Skip` ranking details for the ranking rules that were skipped
- Adds analytics to measure the total number of degraded search requests
- Adds the number of degraded search requests to the Prometheus metrics and Grafana dashboard
- The cutoff **must not** skip the filters; otherwise, we would leak documents to people who don’t have the right to see them


Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-20 13:06:53 +00:00
6543
4628b7b7bd bump charabia to 0.8.8
and update lock file
2024-03-20 13:39:00 +01:00
Tamo
c5322df519 Revert "Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1"" 2024-03-20 10:08:28 +01:00
Tamo
6079141ea6 snapshot the scores side by side with the score details 2024-03-19 18:30:14 +01:00
Tamo
2c3af8e513 query the detailed score detail in the test 2024-03-19 18:09:02 +01:00
Louis Dureuil
098ab594eb A score of 0.0 is now lesser than a sort result
handles the niche case 🐩 in the hybrid search where:
1. a sort ranking rule is the first rule.
2. the keyword search is skipped at the first rule.
3. the semantic search is not skipped at the first rule.

Previously, we would have the skipped search winning, whereas we want the non skipped one winning.
2024-03-19 17:32:32 +01:00
Tamo
567194b925 Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1"
This reverts commit bd74cce86a, reversing
changes made to d2f77e88bd.
2024-03-19 16:56:21 +01:00
Tamo
d8fe4fe49d return the order in the score details 2024-03-19 15:45:04 +01:00
Tamo
7b9e0d2944 forward the degraded parameter to the hybrid search 2024-03-19 15:11:21 +01:00
Tamo
bfec9468d4 Update milli/src/search/mod.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-03-19 14:49:15 +01:00
Clément Renault
bd74cce86a Merge remote-tracking branch 'origin/main' into release-v1.7.1 2024-03-19 13:39:17 +01:00
Tamo
b8cda6c300 fix the search cutoff and add a test 2024-03-19 10:35:47 +01:00
Tamo
d1db495119 add a settings for the search cutoff 2024-03-19 10:28:23 +01:00
Tamo
4a467739cd implements a first version of the cutoff without settings 2024-03-19 10:28:21 +01:00
Louis Dureuil
a302e258bd Don't display dimensions as 0 when it is not set 2024-03-18 16:10:12 +01:00
shuangcui
5c95b5c933 chore: remove repetitive words
Signed-off-by: shuangcui <fliter@qq.com>
2024-03-14 21:28:55 +08:00
meili-bors[bot]
abd954755d Merge #4476
4476: Make the `/facet-search` route use the `sortFacetValuesBy` setting r=irevoire a=Kerollmops

This PR fixes #4423 by ensuring that the `/facet-search` route uses the `sortFacetValuesBy` setting.

Note for the documentation team (to be moved in the tracking issue): Using the new `sortFacetValuesBy` setting can slow down the facet-search requests as Meilisearch iterates over the whole list of facet values and computes the count of documents on every entry. That is hardly or even impossible to optimize correctly.

### TODO
 - [x] Create a custom HashMap wrapper for the facet `OrderBy` settings.
         This wrapper will return the `OrderBy` setting of the facet, if not defined will use the default `*` one, and if not there either (strange) will fall back on the lexicographic one.
- [x] Create a `ValuesCollection` wrapper that implements the logic for the lexicographic and count order by.
  - [x] Use it when there is no search query.
  - [x] Use it when there is a search query with and without allowed typos.
  - [x] Do not change the original logic, only use a wrapper.
- [x] Add tests

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-03-13 14:36:14 +00:00
Clément Renault
f3fc2bd01f Address some issues with preallocations 2024-03-13 15:22:14 +01:00
Clément Renault
e0dac5a22f Simplify the algorithm by using the new facet values collection wrapper 2024-03-13 11:31:34 +01:00
Clément Renault
b918b55c6b Introduce a new facet value collection wrapper to simply the usage 2024-03-13 11:31:34 +01:00
Clément Renault
306b25ad3a Move the searchForFacetValues struct into a dedicated module 2024-03-13 10:24:21 +01:00
Clément Renault
9f7a4fbfeb Return the facets of a placeholder facet-search sorted by count 2024-03-13 10:09:01 +01:00
meili-bors[bot]
5ed7b6a0b2 Merge #4456
4456: Add Ollama as an embeddings provider r=dureuill a=jakobklemm

# Pull Request

## Related issue
[Related Discord Thread](https://discord.com/channels/1006923006964154428/1211977150316683305)

## What does this PR do?
- Adds Ollama as a provider of Embeddings besides HuggingFace and OpenAI under the name `ollama`
- Adds the environment variable `MEILI_OLLAMA_URL` to set the embeddings URL of an Ollama instance with a default value of `http://localhost:11434/api/embeddings` if no variable is set
- Changes some of the structs and functions in `openai.rs` to be public so that they can be shared.
- Added more error variants for Ollama specific errors
- It uses the model `nomic-embed-text` as default, but any string value is allowed, however it won't automatically check if the model actually exists or is an embedding model

Tested against Ollama version `v0.1.27` and the `nomic-embed-text` model.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: Jakob Klemm <jakob@jeykey.net>
Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>
2024-03-13 08:48:47 +00:00
Louis Dureuil
ae67d5eef0 Update milli/src/vector/error.rs
Fix Meilisearch capitalization
2024-03-13 09:45:04 +01:00
Jakob Klemm
88bc9556a9 Add Ollama dimension inference and add clearer errors
Instead of the user manually specifying the model dimensions it will now automatically get determined
Just like with hf.rs the word "test" gets embedded to determine the dimensions of the output
Add a dedicated error type for if the model doesn't exist (don't automatically pull it though) and set the fault of that error to be the user
2024-03-12 19:59:11 +01:00
Clément Renault
ca4876fd10 Do not reindex when modifying unknown faceted field 2024-03-12 16:18:58 +01:00
Clément Renault
d3a95ea2f6 Introduce a new OrderByMap struct to simplify the sort by usage 2024-03-12 13:56:56 +01:00
Clément Renault
69c118ef76 Extract the facet order before extracting the facets values 2024-03-12 10:35:39 +01:00
meili-bors[bot]
ee3076d5ba Merge #4462
4462: Divide threshold by ten r=dureuill a=ManyTheFish

Change the facet incremental vs bulk indexing threshold to better fit our user needs, it might be changed in the future if we have more insights


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-03-06 13:05:38 +00:00