Commit Graph

2086 Commits

Author SHA1 Message Date
3bac22fd87 We do not do intersections with the universe when it is related to cache 2024-07-10 16:49:36 +02:00
ce61cb7fe6 Simplify and speedup an intersection pass 2024-07-10 16:49:36 +02:00
1693d1a311 Simplify the check to decide to stop a loop 2024-07-10 16:49:36 +02:00
febea735ca Remove the unused universe parameter from resolve_negative_phrases 2024-07-10 16:49:36 +02:00
93ba051094 Remove the invalid get_phrases_docids universe parameter 2024-07-10 16:49:35 +02:00
cd7a20fa32 Make it work by avoid storing invalid stuff in the cache 2024-07-10 16:49:35 +02:00
41f51adbec Do less useless intersections 2024-07-10 16:49:35 +02:00
0ca1a4e805 Always do the intersections with the universe 2024-07-10 16:49:34 +02:00
50a7393c55 Modify the compute_query_term_subset_docids function to accept the universe 2024-07-10 16:49:34 +02:00
2099b4f0dd Merge #4786
4786: Update dependencies r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes #4753

## What does this PR do?
- Update all dependencies except rustls
- [x] Release charabia
- [x] Update charabia
- [x] Double check that the docker build works after updating charabia



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-10 13:23:54 +00:00
4d5005b01a make clippy happy 2024-07-10 10:06:59 +02:00
0a40a98bb6 Make milli use edition 2021 (#4770)
* Make milli use edition 2021

* Add lifetime annotations to milli.

* Run cargo fmt
2024-07-09 17:25:39 +02:00
cd46ebd6b5 remove insta deprecating 2024-07-08 18:38:05 +02:00
128e6c7502 Search: spans with a finer granularity 2024-07-02 16:13:53 +02:00
015d90a962 merge main 2024-07-01 11:50:36 +02:00
e53de15b8e Fix behavior of limit and offset for hybrid search when keyword results are returned early
The test is fixed
2024-06-27 14:25:33 +02:00
ce08dc509b add more tests and improve the location of the error 2024-06-27 11:51:45 +02:00
1daaed163a Make _vectors.:embedding.regenerate mandatory + tests + error messages 2024-06-27 11:04:58 +02:00
7e3c306c54 Merge #4725
4725: Store primary key as String when Number exceeds i64 range r=irevoire a=JWSong

# Pull Request

## Related issue
Fixes #4696 

## What does this PR do?
- When a Number value exceeding the range of i64 is received as a primary key, it will be stored as a String.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: JWSong <thdwjddn123@gmail.com>
2024-06-26 07:06:04 +00:00
dcdc83946f accept large number as string 2024-06-25 21:41:47 +09:00
3c4c46377b Merge #4665
4665: Add missing Korean support r=ManyTheFish a=junhochoi

Some configuration is missing `korean` features and add a test case in `milli/src/search/mod.rs`.

# Pull Request

## Related issue

#3443 #3882 

## What does this PR do?
- Improvement on enabling Korean support

Inspired by the work (#3882) I tried to enable Korean features but have found some missing configurations.
This PR is add those missing configs (mostly Cargo.toml) and added one test case.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Junho Choi <jh.choi@catenoid.net>
2024-06-25 11:51:21 +00:00
d75e0098c7 Fixes for Rust v1.79 2024-06-25 11:16:06 +02:00
2e0ff56f3f Add missing Korean support
Some configuration is missing `korean` features and
add a test case in `milli/src/search/mod.rs`.
2024-06-25 12:45:21 +09:00
1693332cab Update arroy and always build the tree that need to be built 2024-06-24 10:14:03 +02:00
ddd564665b Merge #4713
4713: Speed up facet distribution r=ManyTheFish a=Kerollmops

This PR is akin to #4682, but this time, the same logic is applied to the facets. Bitmaps are not decoded, and we do an intersection on the bytes with the search candidates instead of materializing the RoaringBitmap to destroy it just after the operation.

A prospect raised some slow requests when performing facet searches, and I found out that the disk optimization intersection wasn't performed on the facets.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-06-24 05:23:46 +00:00
9736e16a88 Make clippy happy 2024-06-20 13:02:44 +02:00
6fa4da8ae7 Improve facet distribution speed in count mode 2024-06-20 12:58:51 +02:00
19d7cdc20d Improve facet distribution speed in lexico mode 2024-06-20 12:57:08 +02:00
a04041c8f2 Only spawn the pool once 2024-06-19 16:25:33 +02:00
e580d6b98f Merge #4693
4693: Introduce distinct attributes at search time r=irevoire a=Kerollmops

This PR fixes #4611.

### To Do
- [x] Remove the `distinguishableAttributes` settings (not even a commit about that).
- [x] Use the `filterableAttributes` to be able to use the `distinct` parameter at search.
- [x] Work on the errors and make tests.

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-18 07:45:03 +00:00
43875e6758 fix bug around nested fields 2024-06-17 15:59:30 +02:00
e9bf4c43a4 Merge #4649
4649: Don't store the vectors in the documents database r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4607

## What does this PR do?
- Ensure that anything falling under `_vectors` is NOT searchable, filterable or sortable
- [x] per embedder, add a roaring bitmap of documents that provide "userProvided" embeddings
- [x] in the indexing process in extract_vector_points, set the bit corresponding to the document depending on the "userProvided" subfield in the _vectors field.
- [x] in the document DB in typed chunks, when writing the _vectors field, remove all keys corresponding to an embedder

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-17 12:32:03 +00:00
0a8f50695e Fixes for Rust v1.79 2024-06-13 17:47:44 +02:00
e35ef31738 Small changes following review 2024-06-13 14:20:48 +02:00
3bc8f81abc user_provided => regenerate 2024-06-12 18:12:20 +02:00
a89eea233b Fix vectors injection 2024-06-12 17:10:19 +02:00
f5cf01e7d1 Rework extraction to use EmbedderAction 2024-06-12 14:50:55 +02:00
d1dd7e5d09 In transform for removed embedders, write back their user provided vectors in documents, and clear the writers 2024-06-12 14:50:55 +02:00
d18c1f77d7 Update embedder configs with a finer granularity
- no longer clear vector DB between any two embedder changes
2024-06-12 14:50:55 +02:00
d0b05ae691 Add EmbedderAction to settings 2024-06-12 14:50:54 +02:00
e9bf4eb100 Reformulate ParsedVectorsDiff in terms of VectorState 2024-06-12 14:11:44 +02:00
b368105272 Add EmbedderConfigs::into_inner 2024-06-12 14:11:44 +02:00
e0eff08095 Merge #4685
4685: Fix ci tests r=dureuill a=ManyTheFish

# Pull Request
Make the all following CI succeed:
https://github.com/meilisearch/meilisearch/actions/runs/9477183091

## Related issue
Fixes #4629

## What does this PR do?
- Change the test behavior for `swedish-recomposition` feature flag
- Remove the `-v` parameter from grep

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2024-06-12 07:58:33 +00:00
39f60abd7d Add and modify distinct tests 2024-06-11 17:53:53 -04:00
1991bd03da Distinct at search erases the distinct in the settings 2024-06-11 17:02:39 -04:00
ee39309aae Improve errors and introduce a new InvalidSearchDistinct error code 2024-06-11 16:03:39 -04:00
0d31be1494 Make the distinct work at search 2024-06-11 11:39:35 -04:00
7cef2299cf Fix behavior when removing a document 2024-06-11 09:45:08 +02:00
57d066595b fix Tests almost all features 2024-06-06 17:24:50 +02:00
75b2e02cd2 Log more stuff around filtering 2024-06-06 11:00:07 -04:00