2f10273d14
Group by normalized values, make sure you don't remove a value while at least one remaining value still normalizes to it
2024-08-08 14:02:53 +02:00
e3ef0ae19e
also intersect the universe for searchOnAttributes
2024-08-06 14:06:56 +02:00
57f7af77c7
Merge #4846
...
4846: Add OpenAI tests r=dureuill a=dureuill
# Pull Request
## Related issue
Part of fixing #4757
## What does this PR do?
- OpenAI embedder: don't pass apiKey when it is empty (slightly improves error messages)
- rest embedder and rest-based embedders: specialize the authorization denied error message depending on the configuration source
- fix existing tests
- Adds assets containing prerecorded texts to embed and the embeddings obtained from OpenAI
- Adds an asset containing a tokenized long document and the embedding obtained from OpenAI for this token
- Uses the wiremock crate to mock the OpenAI API: parse the openai request, lookup the response in assets, craft an openai response
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-08-05 10:49:28 +00:00
e64d0e0ca8
use insert instead of push for bitmaps
2024-08-01 18:32:45 +02:00
9ef710cad4
Use wrapper that forces the desired date format
2024-07-31 17:12:19 +02:00
5aa6cb3600
Specialize authorized error message depending on config source
2024-07-31 15:03:44 +02:00
9b7764575b
openai: don't pass apiKey when it is empty
2024-07-31 15:03:44 +02:00
0e68718027
Add detailed spans
2024-07-31 13:05:47 +02:00
7c3fc8c655
Split settings and document facet string extractions
2024-07-31 10:57:46 +02:00
8acd3f50bb
skip normalization when the locales and values are the same
2024-07-31 09:53:00 +02:00
d262b1df32
craft an API over the Shared Server and Shared index to avoid hard-to-debug mistakes
2024-07-30 14:24:57 +02:00
d4ea7cc2a9
fix clippy
2024-07-25 12:10:32 +02:00
2413592bbf
Display docid when there are documents without manual embeddings for a manual embedder
2024-07-25 12:10:32 +02:00
553440632e
Introduce Setting::some_or_not_set
2024-07-25 12:01:52 +02:00
7a347966da
Allow explicit `dimensions` for ollama
2024-07-25 12:01:51 +02:00
4654d51e05
Add custom headers for REST embedder
2024-07-25 12:01:51 +02:00
a918561ac1
Fix PR comments
2024-07-25 10:52:56 +02:00
70d71581ee
fix clippy
2024-07-25 10:52:56 +02:00
04fa44e7eb
Implement localized attributes settings
2024-07-25 10:51:27 +02:00
90c0a6db7d
Implement localized search
2024-07-25 10:51:27 +02:00
cc02920f2b
Update charabia
2024-07-25 10:51:27 +02:00
988552e178
add tests on the rest embedder
2024-07-24 14:34:17 +02:00
0d8199f3b7
Change parameters in milli settings
2024-07-24 14:34:17 +02:00
4b74803dae
Change parameters in vector settings
2024-07-24 14:34:17 +02:00
d731fa661b
ollama and openai use new EmbedderOptions
2024-07-24 14:34:17 +02:00
a1beddd5d9
rest embedder: use json_template
2024-07-24 14:34:17 +02:00
4109182ca4
Add json_template module
2024-07-24 14:34:12 +02:00
1a297c048e
Error changes
2024-07-24 14:34:12 +02:00
303e601b87
HuggingFace: Clearer error message when a model is not supported
2024-07-23 15:13:22 +02:00
ea73615abf
Merge #4804
...
4804: Implements the experimental contains filter operator r=irevoire a=irevoire
# Pull Request
Related PRD: (private link) https://www.notion.so/meilisearch/Contains-Like-Filter-Operator-0d8ad53c6761466f913432eb1d843f1e
Public usage page: https://meilisearch.notion.site/Contains-filter-operator-usage-3e7421b0aacf45f48ab09abe259a1de6
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/3613
## What does this PR do?
- Extract the contains operator from this PR: https://github.com/meilisearch/meilisearch/pull/3751
- Gate it behind a feature flag
- Add tests
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 15:47:11 +00:00
02c61eabfa
fix the range reported when the experimental feature has not been set
2024-07-17 16:54:33 +02:00
2af9481804
Implements the experimental contains filter operator
2024-07-17 11:13:37 +02:00
24240934f9
Improve errors when indexing documents with a user provided embedder
2024-07-16 13:39:01 +02:00
f4c94ac57f
manual embedders: limit max size of errors to 250
2024-07-16 13:39:01 +02:00
4087a88dbe
rest|ollama|openai: increase tries to 10 + randomize retry duration
2024-07-16 13:39:00 +02:00
5adacf2f45
OpenAI: embed only the first MAX_TOKENS tokens
2024-07-16 13:39:00 +02:00
65d0c32aa7
Allow overriding OpenAI's url
2024-07-16 13:39:00 +02:00
82647bcded
When `retrieveVectors` is true, retrieve `_vectors.embedder` even if there are no vectors for that embedder
2024-07-16 13:39:00 +02:00
e83da00446
Milli changes to allow for more flexible lifetimes
2024-07-11 16:29:35 +02:00
7fb3e378ff
Do not fail sort comparisons when the field name or target point are different
2024-07-11 16:28:14 +02:00
29b44e5541
Merge #4626
...
4626: Edit Documents with Rhai r=ManyTheFish a=Kerollmops
This PR introduces a first version of [the _Update Documents with Function_ (internal)](https://www.notion.so/meilisearch/Update-Documents-by-Function-45f87b13e61c4435b73943768a490808). It uses [the Rhai programming language](https://rhai.rs/) to let users express the modifications they want to apply.
You can read more about how to use this function on [the Usage PRD Page](https://meilisearch.notion.site/Edit-Documents-with-Rhai-0cff8fea7655436592e7c8a6de932062?pvs=25). The [prototype is available](https://github.com/meilisearch/meilisearch/actions/runs/9038384483) through Docker by using the following command:
```
docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-edit-documents-with-rhai-3
```
## TODO
- [x] Support the `DocumentEdition` task in dumps.
- [x] Remove the unwraps and panics.
- [x] Improve error codes for the `function` parameter.
- [x] [Update Rhai to v1.19.0](https://github.com/rhaiscript/rhai/releases/tag/v1.19.0)
- [x] Make it an experimental feature (only restrict the HTTP calls).
- [x] It must be possible not to send a context.
- [x] Rebase on main.
- [x] Check that the script cannot do any io.
- [x] ~Introduce a `Documents.edit` action or~ require the `Documents.all` action.
- [x] Change the `editionCode` to the clearer `function` field name in the tasks.
- [x] Support a user provided context and maybe more (but keep function execution isolated for reproducibility).
- [x] Support deleting documents when the `doc` is `()` (nil, null).
- [x] Support canceling document edition.
- [x] Multithread document edition by using rayon (and [rayon-par-bridge](https://docs.rs/rayon-par-bridge/latest/rayon_par_bridge/)).
- [x] Limit the number of instructions per function execution.
- [ ] ~Expose the limit of instructions in the settings.~ Not sure, in fact.
- [x] Ignore unmodified documents in the tasks count.
- [x] Make the `filter` field optional (not forced to be `null`).
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-11 09:02:55 +00:00
6e80364c50
Apply review comments
2024-07-11 11:00:27 +02:00
3bac22fd87
We do not do intersections with the universe when it is related to cache
2024-07-10 16:49:36 +02:00
ce61cb7fe6
Simplify and speedup an intersection pass
2024-07-10 16:49:36 +02:00
1693d1a311
Simplify the check to decide to stop a loop
2024-07-10 16:49:36 +02:00
febea735ca
Remove the unused universe parameter from resolve_negative_phrases
2024-07-10 16:49:36 +02:00
93ba051094
Remove the invalid get_phrases_docids universe parameter
2024-07-10 16:49:35 +02:00
cd7a20fa32
Make it work by avoiding storing invalid stuff in the cache
2024-07-10 16:49:35 +02:00
41f51adbec
Do less useless intersections
2024-07-10 16:49:35 +02:00
0ca1a4e805
Always do the intersections with the universe
2024-07-10 16:49:34 +02:00