c3cdc407ec
Avoid unnecessary clone()
2024-08-08 14:57:02 +02:00
2f10273d14
Group by normalized values, and make sure you don't remove a value while at least one remaining value still normalizes to it
2024-08-08 14:02:53 +02:00
d4ea7cc2a9
fix clippy 👉 👈
2024-07-25 12:10:32 +02:00
2413592bbf
Display docid when there are documents without manual embeddings for a manual embedder
2024-07-25 12:10:32 +02:00
04fa44e7eb
Implement localized attributes settings
2024-07-25 10:51:27 +02:00
cc02920f2b
Update charabia
2024-07-25 10:51:27 +02:00
24240934f9
Improve errors when indexing documents with a user provided embedder
2024-07-16 13:39:01 +02:00
0a40a98bb6
Make milli use edition 2021 (#4770)
...
* Make milli use edition 2021
* Add lifetime annotations to milli.
* Run cargo fmt
2024-07-09 17:25:39 +02:00
ddd564665b
Merge #4713
...
4713: Speed up facet distribution r=ManyTheFish a=Kerollmops
This PR is akin to #4682, but this time the same logic is applied to the facets. Bitmaps are not decoded: we intersect the serialized bytes with the search candidates instead of materializing a RoaringBitmap only to drop it right after the operation.
A prospect reported slow requests when performing facet searches, and I found that the disk-optimized intersection wasn't being performed on the facets.
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-06-24 05:23:46 +00:00
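As a rough illustration of the idea described in #4713 above (intersecting against the serialized bytes rather than decoding the whole bitmap first), here is a minimal, std-only sketch. It does not use the real RoaringBitmap serialization format: the on-disk layout (sorted ids, 4 little-endian bytes each) and the function names `serialize` and `intersection_with_serialized` are assumptions made up for this example, not milli's or roaring's actual API.

```rust
use std::collections::BTreeSet;

/// Hypothetical on-disk format for this sketch only: sorted, deduplicated
/// document ids, each stored as 4 little-endian bytes.
fn serialize(ids: &BTreeSet<u32>) -> Vec<u8> {
    ids.iter().flat_map(|id| id.to_le_bytes()).collect()
}

/// Intersects `candidates` with a serialized id list by reading one id at a
/// time from the bytes, so the stored list is never decoded into its own set.
/// Only the ids that survive the intersection are allocated.
fn intersection_with_serialized(candidates: &BTreeSet<u32>, bytes: &[u8]) -> BTreeSet<u32> {
    bytes
        .chunks_exact(4)
        .map(|chunk| u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
        .filter(|id| candidates.contains(id))
        .collect()
}
```

The point of the sketch is only the shape of the operation: the stored side stays as bytes, and the output is bounded by the size of the intersection rather than by the size of the stored list.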
9736e16a88
Make clippy happy
2024-06-20 13:02:44 +02:00
a04041c8f2
Only spawn the pool once
2024-06-19 16:25:33 +02:00
e35ef31738
Small changes following review
2024-06-13 14:20:48 +02:00
3bc8f81abc
user_provided => regenerate
2024-06-12 18:12:20 +02:00
f5cf01e7d1
Rework extraction to use EmbedderAction
2024-06-12 14:50:55 +02:00
7cef2299cf
Fix behavior when removing a document
2024-06-11 09:45:08 +02:00
2cdcb703d9
fix the deletion of vectors and add a test
2024-06-06 11:39:29 +02:00
b7349910d9
implements more review comments
2024-06-06 11:39:29 +02:00
5d50850e12
always push the user defined vectors in arroy
2024-06-06 11:39:29 +02:00
a73ccc78a6
forward the embedding config to the extractors
2024-06-06 11:39:28 +02:00
84e498299b
Remove the vectors from the documents database
2024-06-06 11:36:11 +02:00
b833be46b9
Avoid running proximity when only the exact attributes change
2024-06-05 17:30:07 +02:00
261e92d7e6
Skip iterating over documents when the faceted field list doesn't change
2024-06-05 17:30:07 +02:00
5cd08979b1
iterate over the faceted fields instead of over the whole document
2024-06-05 17:30:07 +02:00
e1fbfde6c4
Merge branch 'main' into merge-release-v1.8.1-in-main
2024-05-29 11:31:03 +02:00
dc949ab46a
Remove puffin usage
2024-05-27 15:59:14 +02:00
8a941c0241
Smaller review changes
2024-05-22 14:44:42 +02:00
f762307838
Fix clippy
2024-05-21 13:44:20 +02:00
3e94a90722
Fixes
2024-05-21 13:39:46 +02:00
fc7e817221
Index geo points based on the settings differences
2024-05-20 12:27:26 +02:00
52d9cb6e5a
Refactor vector indexing
...
- use the parsed_vectors module
- only parse `_vectors` once per document, instead of once per embedder per document
2024-05-20 10:36:17 +02:00
c22460045c
Stops returning an option in the internal searchable fields
2024-05-14 17:00:02 +02:00
4d5971f343
Merge #4621
...
4621: Bring back changes from v1.8.0 into main r=curquiza a=curquiza
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-05-06 13:46:39 +00:00
ebca29f3de
Merge #4597
...
4597: Fix embeddings settings update r=ManyTheFish a=ManyTheFish
# Pull Request
- add some conditions reducing the work done when changing the settings
- add some benchmarks on embedders
## Related issue
Fixes #4585
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-04-25 16:37:28 +00:00
d4aeff92d0
Introduce the ThreadPoolNoAbort wrapper
2024-04-24 16:40:12 +02:00
a1aa999026
Add conditions reducing work
2024-04-22 14:18:35 +02:00
df29ba709a
Clean up some Arcs
2024-04-17 12:33:25 +02:00
3acfab2eb7
Fix PR comments
2024-04-17 10:55:51 +02:00
87a93ba47d
fix clippy
2024-04-16 14:39:30 +02:00
eaf113ef34
Fix word pair proximity error when nothing has to be extracted
2024-04-16 14:39:30 +02:00
e5ae337aae
Come back to sorters in extract_word_docids
...
using buffers and merging the keys manually is less efficient
2024-04-16 14:39:30 +02:00
a489b406b4
fix test
2024-04-16 14:39:06 +02:00
02c3d6b265
finish work
2024-04-16 14:39:06 +02:00
b5e4a55af6
refactor faceted and searchable pipeline
2024-04-16 14:39:06 +02:00
cf864a1c2e
chore: fix some typos in comments
...
Signed-off-by: yudrywet <yudeyao@yeah.net>
2024-04-14 20:11:34 +08:00
f87747f4d3
Remove unwraps
2024-03-25 11:23:04 +01:00
ac52c857e8
Update ollama and openai impls to use the rest embedder internally
2024-03-25 11:23:03 +01:00
b11df7ec34
Meilisearch: fix some wrong spans
2024-03-05 10:11:43 +01:00
3beda8833d
Fix and add logs
2024-02-14 11:46:30 +01:00
e5e811e2c9
Update milli/src/update/index_documents/extract/mod.rs
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:21 +01:00
be1b054b05
Compute chunk size based on the input data size and the number of indexing threads
2024-02-08 17:28:37 +01:00