24240934f9
Improve errors when indexing documents with a user provided embedder
2024-07-16 13:39:01 +02:00
f4c94ac57f
manual embedders: limit max size of errors to 250
2024-07-16 13:39:01 +02:00
4087a88dbe
rest|ollama|openai: increase tries to 10 + randomize retry duration
2024-07-16 13:39:00 +02:00
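A rough sketch of what "10 tries with a randomized retry duration" can look like. This is not the code from the commit: the 100 ms base delay, the exponential growth, the `XorShift` helper, and the names `retry_delay` and `with_retries` are all assumptions made for illustration. Only the count of 10 tries comes from the commit title.

```rust
use std::time::Duration;

/// Number of tries, taken from the commit title.
const MAX_TRIES: u32 = 10;

/// Tiny xorshift generator so the sketch needs no external crate (hypothetical helper).
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Delay before the retry that follows failed attempt `attempt` (0-based).
/// Assumed policy: exponential base (100 ms, 200 ms, ...) plus a random jitter of up to one base.
fn retry_delay(attempt: u32, rng: &mut XorShift) -> Duration {
    let base_ms = 100u64.saturating_mul(1u64 << attempt.min(10));
    let jitter_ms = rng.next() % (base_ms + 1);
    Duration::from_millis(base_ms + jitter_ms)
}

/// Calls `op` until it succeeds or `MAX_TRIES` attempts have been made.
/// `sleep` is injected so callers (and tests) decide how to wait.
fn with_retries<T, E>(
    mut op: impl FnMut(u32) -> Result<T, E>,
    rng: &mut XorShift,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Last allowed try failed: give the error back to the caller.
            Err(err) if attempt + 1 >= MAX_TRIES => return Err(err),
            Err(_) => {
                sleep(retry_delay(attempt, rng));
                attempt += 1;
            }
        }
    }
}

fn main() {
    let mut rng = XorShift(0x9E37_79B9_7F4A_7C15);
    let mut calls = 0;
    let result: Result<&str, &str> = with_retries(
        |attempt| {
            calls += 1;
            if attempt < 3 { Err("unavailable") } else { Ok("embedding") }
        },
        &mut rng,
        |delay| println!("would sleep {delay:?}"),
    );
    println!("{result:?} after {calls} calls");
}
```

Randomizing the delay spreads out retries from many concurrent requests so they do not all hit the embedding service at the same instant.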
5adacf2f45
OpenAI: embed only the first MAX_TOKENS tokens
2024-07-16 13:39:00 +02:00
65d0c32aa7
Allow overriding OpenAI's url
2024-07-16 13:39:00 +02:00
82647bcded
When retrieveVectors is true, retrieve _vectors.embedder even if there are no vectors for that embedder
2024-07-16 13:39:00 +02:00
e83da00446
Milli changes to match to allow for more flexible lifetimes
2024-07-11 16:29:35 +02:00
7fb3e378ff
Do not fail sort comparisons when the field name or target point are different
2024-07-11 16:28:14 +02:00
29b44e5541
Merge #4626
...
4626: Edit Documents with Rhai r=ManyTheFish a=Kerollmops
This PR introduces a first version of [the _Update Documents with Function_ (internal)](https://www.notion.so/meilisearch/Update-Documents-by-Function-45f87b13e61c4435b73943768a490808). It uses [the Rhai programming language](https://rhai.rs/) to let users express the modifications they want to apply.
You can read more about how to use this function on [the Usage PRD Page](https://meilisearch.notion.site/Edit-Documents-with-Rhai-0cff8fea7655436592e7c8a6de932062?pvs=25). The [prototype is available](https://github.com/meilisearch/meilisearch/actions/runs/9038384483) through Docker by using the following command:
```
docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-edit-documents-with-rhai-3
```
## TODO
- [x] Support the `DocumentEdition` task in dumps.
- [x] Remove the unwraps and panics.
- [x] Improve error codes for the `function` parameter.
- [x] [Update Rhai to v1.19.0](https://github.com/rhaiscript/rhai/releases/tag/v1.19.0)
- [x] Make it an experimental feature (only restrict the HTTP calls).
- [x] It must be possible not to send a context.
- [x] Rebase on main.
- [x] Check that the script cannot do any io.
- [x] ~Introduce a `Documents.edit` action or~ require the `Documents.all` action.
- [x] Change the `editionCode` to the clearer `function` field name in the tasks.
- [x] Support a user provided context and maybe more (but keep function execution isolated for reproducibility).
- [x] Support deleting documents when the `doc` is `()` (nil, null).
- [x] Support canceling document edition.
- [x] Multithread document edition by using rayon (and [rayon-par-bridge](https://docs.rs/rayon-par-bridge/latest/rayon_par_bridge/)).
- [x] Limit the number of instructions per function execution.
- [ ] ~Expose the limit of instructions in the settings.~ Not sure, in fact.
- [x] Ignore unmodified documents in the tasks count.
- [x] Make the `filter` field optional (not forced to be `null`).
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-11 09:02:55 +00:00
6e80364c50
Apply review comments
2024-07-11 11:00:27 +02:00
3bac22fd87
We do not do intersections with the universe when it is related to cache
2024-07-10 16:49:36 +02:00
ce61cb7fe6
Simplify and speedup an intersection pass
2024-07-10 16:49:36 +02:00
1693d1a311
Simplify the check to decide to stop a loop
2024-07-10 16:49:36 +02:00
febea735ca
Remove the unused universe parameter from resolve_negative_phrases
2024-07-10 16:49:36 +02:00
93ba051094
Remove the invalid get_phrases_docids universe parameter
2024-07-10 16:49:35 +02:00
cd7a20fa32
Make it work by avoiding storing invalid stuff in the cache
2024-07-10 16:49:35 +02:00
41f51adbec
Do less useless intersections
2024-07-10 16:49:35 +02:00
0ca1a4e805
Always do the intersections with the universe
2024-07-10 16:49:34 +02:00
50a7393c55
Modify the compute_query_term_subset_docids function to accept the universe
2024-07-10 16:49:34 +02:00
837274f853
Restrict even more the Rhai engine
2024-07-10 16:30:18 +02:00
aace587dd1
Create errors for the internal processing ones
2024-07-10 16:29:18 +02:00
f35d6710f3
Update rhai to v1.19.0
2024-07-10 16:29:17 +02:00
81ec0abad1
Use the new rayon-par-bridge library
2024-07-10 16:29:04 +02:00
b67d385cf0
Parallelize the edition functions
2024-07-10 16:28:54 +02:00
dfecb25814
Disable the time package
2024-07-10 16:28:37 +02:00
2eae2015d7
Support aborting documents edition by function
2024-07-10 16:28:15 +02:00
33fa17bf12
Support deleting documents with functions
2024-07-10 16:28:15 +02:00
400e6b93ce
Support user-provided context for documents edition
2024-07-10 16:28:15 +02:00
f4add93043
Limit the number of script operations
2024-07-10 16:28:14 +02:00
2fae96ac14
Show the actual number of edited documents
2024-07-10 16:28:14 +02:00
45af18ae9c
Check the Rhai syntax before accepting the script
2024-07-10 16:28:13 +02:00
2d97164d9f
It works perfectly with some Rhai
2024-07-10 16:28:13 +02:00
efc156a4a4
Executing Lua works correctly
2024-07-10 16:27:36 +02:00
2099b4f0dd
Merge #4786
...
4786: Update dependencies r=Kerollmops a=irevoire
# Pull Request
## Related issue
Fixes #4753
## What does this PR do?
- Update all dependencies except rustls
- [x] Release charabia
- [x] Update charabia
- [x] Double check that the docker build works after updating charabia
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-10 13:23:54 +00:00
9d6885793e
Upgrade dependencies
2024-07-10 13:46:24 +02:00
5f4530ce57
Remove more unused dependencies
2024-07-10 13:36:34 +02:00
4d5005b01a
make clippy happy
2024-07-10 10:06:59 +02:00
952e742321
update charabia
2024-07-09 23:41:29 +02:00
0a40a98bb6
Make milli use edition 2021 (#4770)
...
* Make milli use edition 2021
* Add lifetime annotations to milli.
* Run cargo fmt
2024-07-09 17:25:39 +02:00
cd46ebd6b5
remove insta deprecating
2024-07-08 18:38:05 +02:00
6afa578688
update most incompatible dependencies
2024-07-08 18:31:15 +02:00
300bdfc2a7
update most dependencies
2024-07-08 18:09:12 +02:00
128e6c7502
Search: spans with a finer granularity
2024-07-02 16:13:53 +02:00
015d90a962
merge main
2024-07-01 11:50:36 +02:00
e53de15b8e
Fix behavior of limit and offset for hybrid search when keyword results are returned early
...
The test is fixed
2024-06-27 14:25:33 +02:00
ce08dc509b
add more tests and improve the location of the error
2024-06-27 11:51:45 +02:00
1daaed163a
Make _vectors.:embedding.regenerate mandatory + tests + error messages
2024-06-27 11:04:58 +02:00
7e3c306c54
Merge #4725
...
4725: Store primary key as String when Number exceeds i64 range r=irevoire a=JWSong
# Pull Request
## Related issue
Fixes #4696
## What does this PR do?
- When a Number value exceeding the range of i64 is received as a primary key, it will be stored as a String.
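
The rule above can be sketched as follows. This is an illustration only, not the code from the PR: the `StoredPrimaryKey` type and the `store_numeric_primary_key` function are hypothetical names, and the sketch assumes the number arrives as a plain integer token in text form.

```rust
/// How a numeric primary key ends up being stored (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum StoredPrimaryKey {
    /// The value fits in an i64 and is kept as a number.
    Integer(i64),
    /// The value is an integer outside the i64 range and is kept as text.
    Text(String),
}

/// Classifies an integer token: returns `None` if `raw` is not an optionally
/// negative run of ASCII digits, `Integer` if it fits in an i64, `Text` otherwise.
fn store_numeric_primary_key(raw: &str) -> Option<StoredPrimaryKey> {
    let raw = raw.trim();
    let digits = raw.strip_prefix('-').unwrap_or(raw);
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    match raw.parse::<i64>() {
        Ok(n) => Some(StoredPrimaryKey::Integer(n)),
        // Only digits remain at this point, so a parse failure means overflow.
        Err(_) => Some(StoredPrimaryKey::Text(raw.to_string())),
    }
}

fn main() {
    // i64::MAX is 9223372036854775807; one past it falls back to text.
    for raw in ["42", "9223372036854775807", "9223372036854775808"] {
        println!("{raw} -> {:?}", store_numeric_primary_key(raw));
    }
}
```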
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: JWSong <thdwjddn123@gmail.com>
2024-06-26 07:06:04 +00:00
dcdc83946f
accept large number as string
2024-06-25 21:41:47 +09:00
3c4c46377b
Merge #4665
...
4665: Add missing Korean support r=ManyTheFish a=junhochoi
Some configurations are missing the `korean` feature. This PR adds them, along with a test case in `milli/src/search/mod.rs`.
# Pull Request
## Related issue
#3443 #3882
## What does this PR do?
- Improvement on enabling Korean support
Inspired by the work in #3882, I tried to enable Korean features but found some missing configurations.
This PR adds those missing configs (mostly in Cargo.toml) and one test case.
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Junho Choi <jh.choi@catenoid.net>
2024-06-25 11:51:21 +00:00