Commit Graph

9171 Commits

Author SHA1 Message Date
Louis Dureuil
9800d5a103 Set rust toolchain to 1.71.1 in dockerfile 2023-12-18 10:59:25 +01:00
meili-bors[bot]
7c4ed07617 Merge #4257
4257: Change proximity precision settings r=dureuill a=ManyTheFish

- [x] Add proximity_precision value into the analytics
- [x] Change the naming of `attributeScale` and `wordScale` into `byAttribute` and `byWord`
- [x] Remove proximityPrecision from the experimental feature

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2023-12-18 09:07:28 +00:00
ManyTheFish
3a99a555a2 Fix experimental features snapshots in tests 2023-12-18 10:05:51 +01:00
Many the fish
9e1b458010 Merge branch 'main' into change-proximity-precision-settings 2023-12-18 09:08:47 +01:00
meili-bors[bot]
2aede03bc2 Merge #4226
4226: Hybrid search r=dureuill a=dureuill

Allows to perform hybrid search requests that combine the results of semantic and keyword search and automatically generate embeddings.

## How to use

See [feature description](https://meilisearch.notion.site/v1-6-Hybrid-Search-Embedders-ea42c82f90cc4bc0be1eeb917c1118c8)

## Changes

- work is based on #4213 
- milli::new search now takes an input universe directly, rather than computing it from a filter. This adds flexibility to require results on a subset of documents
- vector search is now a regular ranking rule (akin to sort and geosort) and reports its score as a ScoreDetail
- separate keyword search and vector search functions, vector search now respects (geo)sort ranking rules
- add automatic embedding
- add hybrid search

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-12-14 16:24:56 +00:00
ManyTheFish
e741bc1c62 Add proximity_precision value into the analytics 2023-12-14 16:48:06 +01:00
ManyTheFish
6425996e36 Change the naming of attributeScale and wordScale into byAttribute and byWord 2023-12-14 16:31:00 +01:00
Louis Dureuil
eb5cb91da2 Switch default from hf to openai 2023-12-14 16:19:46 +01:00
Louis Dureuil
87bba98bd8 Various changes
- fixed seed for arroy
- check vector dimensions as soon as it is provided to search
- don't embed whitespace
2023-12-14 16:08:42 +01:00
Louis Dureuil
217105b7da hybrid search uses semantic ratio, error handling 2023-12-14 16:08:42 +01:00
ManyTheFish
1b7c164a55 Pass the semantic ratio to milli 2023-12-14 16:08:42 +01:00
ManyTheFish
f3f3944469 Fix error checking 2023-12-14 16:08:42 +01:00
ManyTheFish
93dcbf598d Deserialize semantic ratio 2023-12-14 16:08:42 +01:00
ManyTheFish
ac68f33194 Add simple test 2023-12-14 16:08:42 +01:00
ManyTheFish
9991152bbe Add TODOs 2023-12-14 16:08:42 +01:00
Louis Dureuil
a4536b1381 Small adjustments to respect the spec 2023-12-14 16:08:42 +01:00
Louis Dureuil
5b51cb04af Remove some settings 2023-12-14 16:08:42 +01:00
Louis Dureuil
3c1a14f1cd Add settings routes 2023-12-14 16:08:42 +01:00
Louis Dureuil
b8e4709dfa Remove prompt strategy and fallback 2023-12-14 16:08:41 +01:00
Louis Dureuil
806e5b6899 Tests pass 2023-12-14 16:08:41 +01:00
Louis Dureuil
61bd2fb7a9 Update arroy 2023-12-14 16:08:41 +01:00
Louis Dureuil
e0cc775dc4 Various changes
- DistributionShift in Search object (to be set from model in embed?)
- Fix issue where embedder index wasn't computed at search time
- Accept as default embedder either the "default" one, or the only embedder when there is only one
2023-12-14 16:08:41 +01:00
Louis Dureuil
12940d79a9 WIP
- manual embedder
- multi embedders OK
- clippy + tests OK
2023-12-14 16:08:41 +01:00
Louis Dureuil
922a640188 WIP multi embedders
fixed template bugs
2023-12-14 16:08:41 +01:00
Louis Dureuil
abbe131084 Cosmetic change 2023-12-14 16:08:41 +01:00
Louis Dureuil
d4715e0c4d Fix same vector sort bug 2023-12-14 16:08:41 +01:00
Louis Dureuil
11e2a2c1aa Fix geosort bug 2023-12-14 16:08:41 +01:00
Louis Dureuil
65e49b7092 Remove stuff, add distribution shift (WIP) 2023-12-14 16:08:38 +01:00
Louis Dureuil
e56f160032 Actually pass embedders on reindex 2023-12-14 16:07:49 +01:00
Louis Dureuil
687d92f217 prompt bifluor+ 2023-12-14 16:07:49 +01:00
Louis Dureuil
fb539f61fe WIP 2023-12-14 16:07:49 +01:00
Louis Dureuil
cb4ebe163e WIP 2023-12-14 16:07:49 +01:00
Louis Dureuil
dde3a04679 WIP arroy integration 2023-12-14 16:07:49 +01:00
Louis Dureuil
13c2c6c16b Small commit to add hybrid search and autoembedding 2023-12-14 16:07:48 +01:00
Louis Dureuil
21bcf32109 Add candle and hg_hub, updating a lot of deps in the process 2023-12-14 16:07:48 +01:00
ManyTheFish
35e1981488 Remove proximityPrecision form the experimental feature 2023-12-14 15:52:42 +01:00
meili-bors[bot]
e0f712b9d3 Merge #4254
4254: Bring back v1.5.1 changes into main r=ManyTheFish a=Kerollmops

This pull request brings back changes from the _release-v1.5.1_ branch into _main_.

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: curquiza <curquiza@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-12-14 09:41:57 +00:00
Clément Renault
56571f762a Merge remote-tracking branch 'origin/main' into tmp-release-v1.5.1 2023-12-13 11:57:01 +01:00
Clément Renault
005800634d Merge pull request #4249 from meilisearch/flag-limit-batch-size
Introduce parameters to limit the number of batched tasks
2023-12-13 10:32:14 +01:00
Clément Renault
976af4fa8f Add the default commented experimental batched tasks limit parameter to the config file 2023-12-12 10:59:00 +01:00
Clément Renault
99fec27788 Make the --max-number-of-batched-tasks argument experimental prototype-limit-batched-tasks-1 2023-12-12 10:55:39 +01:00
meili-bors[bot]
afa8f273a8 Merge #4250
4250: Update version for the next release (v1.5.1) in Cargo.toml r=dureuill a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging.

Co-authored-by: curquiza <curquiza@users.noreply.github.com>
v1.5.1
2023-12-12 08:26:06 +00:00
curquiza
4b644f6bc0 Update version for the next release (v1.5.1) in Cargo.toml 2023-12-11 17:15:11 +00:00
Clément Renault
7e259cb0d2 Expose the --max-number-of-batched-tasks argument prototype-limit-batched-tasks-0 2023-12-11 16:08:39 +01:00
meili-bors[bot]
0fbc1511d7 Merge #4225
4225: [EXP] Let the user customize the proximity precision r=dureuill a=ManyTheFish

# Pull Request
This PR introduces a new setting `proximityPrecision` allowing the user to trade indexing time with search precision on proximity-based features:
- proximity ranking rules
- multi-word synonyms
- phrase search
- split-words

I put the API PRD below:
https://www.notion.so/meilisearch/3988b345b5b248948a4a0dc5932a18ce?v=45d79150adb84b0aa27826ff6da2e029&p=aa69c2bab2c3402bab9340ae4def4577&pm=s

## Related issue
Fixes #4187

Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-12-06 17:21:43 +00:00
ManyTheFish
c9860c7913 Small test fixes prototype-proximity-precision-0 2023-12-06 15:49:05 +01:00
ManyTheFish
03ffabe889 Add a new dump test 2023-12-06 15:49:05 +01:00
ManyTheFish
1f4fc9c229 Make the feature experimental 2023-12-06 15:49:05 +01:00
ManyTheFish
8cc3c54117 Add proximityPrecision setting in settings route 2023-12-06 15:49:05 +01:00
ManyTheFish
467b49153d Implement proximityPrecision setting on milli side 2023-12-06 15:49:02 +01:00