Commit Graph

775 Commits

Author SHA1 Message Date
d3004d8040 Implemented Ollama as an embeddings provider
Initial prototype of Ollama embeddings actually working, error handlign / retries still missing.

Allow model to be any String and require dimensions parameter

Fixed rustfmt formatting issues

There were some formatting issues in the initial PR and this should not make the changes comply with the Rust style guidelines

Because I accidentally didn't follow the style guide for commits in my commit messages I squashed them into one to comply
2024-03-04 15:09:43 +01:00
023c2d755f Merge #4391
4391: Tracing r=dureuill a=irevoire

# Pull Request

- [ ] Hide the parameters of the process batch
- [x] Make actix-web trace every call on every route
- [x] Remove all `env_logger`/`logs` dependencies
- [x] Be able to enable or disable the memory measurement using the `/logs` route parameters

See the following product discussion: https://github.com/orgs/meilisearch/discussions/721

Supersedes https://github.com/meilisearch/meilisearch/pull/4338

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4317

## What does this PR do?

Update the format of the logs from:
```
[2024-02-06T14:54:11Z INFO  actix_server::builder] starting 10 workers
```

to

```
2024-02-06T13:58:14.710803Z  INFO actix_server::builder: 200: starting 10 workers
```

First, run meilisearch with the route enabled via the feature flag:
- `cargo run --experimental-enable-logs-route`
- Or at runtime by sending the following payload:
```
curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json'  \
--data-binary '{
    "logsRoute": true
  }'
```

Then gather data from meilisearch by calling for example:
```
curl \
	-X POST http://localhost:7700/logs \
	-H 'Content-Type: application/json' \
	--data-binary '{
	    "mode": "fmt",
            "target": "milli=trace"
    }'
```

Once your operation is over, tell meilisearch to stop the route:
```
curl \
	-X DELETE http://localhost:7700/logs
```

----

In the case you’re profiling code, you will be interested by the next command that converts the output of the route to a format that the firefox profiler can understand.

```bash
cargo run --release --bin trace-to-firefox -- 2024-01-17_17:07:55-indexing-trace.json
```

Then go to https://profiler.firefox.com and load it.
Note that we can also share the profiles using the https://share.firefox.dev website.


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-08 14:16:56 +00:00
407ad753ed rust fmt 2024-02-08 15:11:42 +01:00
bf43a3f60a fix typo 2024-02-08 15:04:06 +01:00
1502382316 use debug instead of debug_span 2024-02-08 15:04:06 +01:00
08af0e690c Structures a bunch of logs 2024-02-08 15:04:06 +01:00
db722d201a Write entries into database downgraded to trace level 2024-02-08 15:04:05 +01:00
e773dfa9ba get rids of log in milli and add logs for the bucket sort 2024-02-08 15:04:05 +01:00
5d7061682e Add tracing to milli 2024-02-08 15:03:31 +01:00
72ebac1fbb Merge #4388
4388: Cap the maximum memory of the grenad sorters r=curquiza a=Kerollmops

This PR clamps the memory usage of the grenad sorters to a reasonable maximum. Grenad sorters are opened on multiple threads at a time. This can result in higher memory usage than expected, even though it shouldn't consume more than the memory available.

Fixes #4152.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-08 13:19:28 +00:00
88d03c56ab Don't accept dimensions of 0 (ever) or dimensions greater than the default dimensions of the model 2024-02-07 11:52:09 +01:00
517f5332d6 Allow actually passing dimensions for OpenAI source
-> make sure the settings change is rejected or the settings task fails when the specified model doesn't support
overriding `dimensions` and the passed `dimensions` differs from the model's default dimensions.
2024-02-07 11:51:44 +01:00
053306c0e7 Try with 500MiB 2024-02-07 11:24:43 +01:00
9eeb75d501 Clamp the max memory of the grenad sorters to a reasonable maximum 2024-02-06 10:47:04 +01:00
fbf5f2a392 Don't use a runtime in extract_embedder, use it only for OpenAI 2024-02-01 10:33:27 +01:00
9f8f3105d5 make clippy happy 2024-02-01 10:33:27 +01:00
318843aacd add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there is malformed documents in the DB 2024-02-01 10:33:27 +01:00
c1bf33a112 Revert "Remove panic on the geosearch" 2024-01-25 18:51:19 +01:00
0887186ecf make clippy happy 2024-01-17 16:07:10 +01:00
7d190d8078 add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there is malformed documents in the DB 2024-01-17 15:51:52 +01:00
01e2c3d6bb Bump arroy to v0.2.0 2024-01-16 16:45:55 +01:00
9f9ad4cc05 Fix Clippy warnings 2024-01-16 15:27:24 +01:00
3ee7682fa7 Fix some integer comparisons 2024-01-16 15:22:23 +01:00
54ae6951eb fix warning 2024-01-02 15:19:30 +01:00
6ff81de401 Fix tests 2023-12-20 17:16:46 +01:00
9123370e90 Validate fused settings in settings task after fusing with existing setting 2023-12-20 17:16:46 +01:00
e249e4db7b Change Setting::apply function signature 2023-12-20 17:15:24 +01:00
9e1b458010 Merge branch 'main' into change-proximity-precision-settings 2023-12-18 09:08:47 +01:00
6425996e36 Change the naming of attributeScale and wordScale into byAttribute and byWord 2023-12-14 16:31:00 +01:00
87bba98bd8 Various changes
- fixed seed for arroy
- check vector dimensions as soon as it is provided to search
- don't embed whitespace
2023-12-14 16:08:42 +01:00
b8e4709dfa Remove prompt strategy and fallback 2023-12-14 16:08:41 +01:00
806e5b6899 Tests pass 2023-12-14 16:08:41 +01:00
e0cc775dc4 Various changes
- DistributionShift in Search object (to be set from model in embed?)
- Fix issue where embedder index wasn't computed at search time
- Accept as default embedder either the "default" one, or the only embedder when there is only one
2023-12-14 16:08:41 +01:00
12940d79a9 WIP
- manual embedder
- multi embedders OK
- clippy + tests OK
2023-12-14 16:08:41 +01:00
922a640188 WIP multi embedders
fixed template bugs
2023-12-14 16:08:41 +01:00
65e49b7092 Remove stuff, add distribution shift (WIP) 2023-12-14 16:08:38 +01:00
e56f160032 Actually pass embedders on reindex 2023-12-14 16:07:49 +01:00
687d92f217 prompt bifluor+ 2023-12-14 16:07:49 +01:00
fb539f61fe WIP 2023-12-14 16:07:49 +01:00
cb4ebe163e WIP 2023-12-14 16:07:49 +01:00
dde3a04679 WIP arroy integration 2023-12-14 16:07:49 +01:00
13c2c6c16b Small commit to add hybrid search and autoembedding 2023-12-14 16:07:48 +01:00
467b49153d Implement proximityPrecision setting on milli side 2023-12-06 15:49:02 +01:00
bddc168d83 List TODOs 2023-12-06 14:59:23 +01:00
d32eb11329 Move to the v0.20.0-alpha.9 of heed 2023-11-27 11:52:22 +01:00
0dbf1a16ff Make clippy happy 2023-11-23 14:11:38 +01:00
462b4c0080 Fix the tests 2023-11-23 12:07:35 +01:00
0d4482625a Make the changes to use heed v0.20-alpha.6 2023-11-23 11:43:58 +01:00
d3575fb028 Make into_del_add_obkv parameters more human readable 2023-11-20 16:10:39 +01:00
39cbb499c2 Small fixes 2023-11-20 10:20:39 +01:00