Commit Graph

2215 Commits

Author SHA1 Message Date
ManyTheFish
a493a50825 Fix clippy 2024-02-22 14:53:33 +01:00
ManyTheFish
9d1f489a37 Fix facet incremental indexing 2024-02-21 18:42:16 +01:00
meili-bors[bot]
d34692e30b Merge #4365
4365: Update charabia r=dureuill a=ManyTheFish

Update Charabia to v0.8.7:

- Add Vietnamese Normalization (Ð and Đ into d)

Fixes #4357

Charabia versions:
- https://github.com/meilisearch/charabia/releases/tag/v0.8.6
- https://github.com/meilisearch/charabia/releases/tag/v0.8.7

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-02-14 16:57:25 +00:00
ManyTheFish
78e04520fc Update charabia version 2024-02-14 15:16:16 +01:00
ManyTheFish
03bb6372af Change is_batchable_with by mergeable_with 2024-02-14 11:50:22 +01:00
ManyTheFish
3beda8833d Fix and add logs 2024-02-14 11:46:30 +01:00
ManyTheFish
55e942cd45 buggy 2024-02-13 15:26:30 +01:00
ManyTheFish
48026aa75c fix PR comments 2024-02-13 15:19:01 +01:00
Many the fish
e5e811e2c9 Update milli/src/update/index_documents/extract/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:21 +01:00
Many the fish
55de96f74e Update milli/src/update/facet/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:10 +01:00
ManyTheFish
39c83cb3d9 fix clippy 2024-02-12 09:12:54 +01:00
Louis Dureuil
7efb1cae11 yield in loop when the channel is not disconnected 2024-02-12 09:12:54 +01:00
Louis Dureuil
7877788510 fix logs 2024-02-12 09:12:54 +01:00
ManyTheFish
be1b054b05 Compute chunk size based on the input data size and the number of indexing threads 2024-02-08 17:28:37 +01:00
meili-bors[bot]
023c2d755f Merge #4391
4391: Tracing r=dureuill a=irevoire

# Pull Request

- [ ] Hide the parameters of the process batch
- [x] Make actix-web trace every call on every route
- [x] Remove all `env_logger`/`logs` dependencies
- [x] Be able to enable or disable the memory measurement using the `/logs` route parameters

See the following product discussion: https://github.com/orgs/meilisearch/discussions/721

Supersedes https://github.com/meilisearch/meilisearch/pull/4338

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4317

## What does this PR do?

Update the format of the logs from:
```
[2024-02-06T14:54:11Z INFO  actix_server::builder] starting 10 workers
```

to

```
2024-02-06T13:58:14.710803Z  INFO actix_server::builder: 200: starting 10 workers
```
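
For tooling that consumes the new format, a minimal sketch of splitting such a line into its leading fields could look like the following. The function name `split_log_line` is hypothetical, and the assumption that the timestamp and the level are the first two whitespace-separated tokens is based only on the sample line above, not on a documented format.

```rust
/// Splits a log line into `(timestamp, level, rest)`.
/// Assumption: the first two whitespace-separated tokens are the timestamp
/// and the level, as in the sample line shown in this description.
fn split_log_line(line: &str) -> Option<(&str, &str, &str)> {
    // The first token is the timestamp.
    let (timestamp, rest) = line.trim_start().split_once(char::is_whitespace)?;
    // The level may be padded with extra spaces, so trim before splitting again.
    let (level, message) = rest.trim_start().split_once(char::is_whitespace)?;
    Some((timestamp, level, message.trim_start()))
}
```

Calling `split_log_line` on the new-format sample line yields the timestamp, `INFO`, and the remainder starting at `actix_server::builder:`.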

First, run meilisearch with the route enabled via the feature flag:
- `cargo run -- --experimental-enable-logs-route`
- Or at runtime by sending the following payload:
```
curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "logsRoute": true
  }'
```

Then gather data from meilisearch by calling for example:
```
curl \
  -X POST http://localhost:7700/logs \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "mode": "fmt",
    "target": "milli=trace"
  }'
```

Once your operation is over, tell meilisearch to stop the route:
```
curl \
	-X DELETE http://localhost:7700/logs
```

----

If you are profiling code, the following command converts the output of the route into a format that the Firefox Profiler can understand.

```bash
cargo run --release --bin trace-to-firefox -- 2024-01-17_17:07:55-indexing-trace.json
```

Then go to https://profiler.firefox.com and load it.
Profiles can also be shared through the https://share.firefox.dev website.


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-08 14:16:56 +00:00
Louis Dureuil
407ad753ed rust fmt 2024-02-08 15:11:42 +01:00
Tamo
bf43a3f60a fix typo 2024-02-08 15:04:06 +01:00
Tamo
1502382316 use debug instead of debug_span 2024-02-08 15:04:06 +01:00
Tamo
08af0e690c Structures a bunch of logs 2024-02-08 15:04:06 +01:00
Louis Dureuil
db722d201a Write entries into database downgraded to trace level 2024-02-08 15:04:05 +01:00
Tamo
e773dfa9ba get rid of log in milli and add logs for the bucket sort 2024-02-08 15:04:05 +01:00
Louis Dureuil
5d7061682e Add tracing to milli 2024-02-08 15:03:31 +01:00
meili-bors[bot]
72ebac1fbb Merge #4388
4388: Cap the maximum memory of the grenad sorters r=curquiza a=Kerollmops

This PR clamps the memory usage of the grenad sorters to a reasonable maximum. Grenad sorters are opened on multiple threads at a time. This can result in higher memory usage than expected, even though it shouldn't consume more than the memory available.

Fixes #4152.
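
As a rough illustration of the clamping idea (not the actual milli code), the sketch below shares a global memory budget between threads and caps each sorter's share. The function name `sorter_memory_budget` and the constant `MAX_SORTER_MEMORY` are hypothetical; the 500 MiB value is borrowed from the "Try with 500MiB" commit title and may not match the constant that was finally merged.

```rust
/// Hypothetical per-sorter cap in bytes (assumption, not necessarily milli's value).
const MAX_SORTER_MEMORY: u64 = 500 * 1024 * 1024;

/// Returns the memory budget for one sorter, or `None` when no global limit is set.
fn sorter_memory_budget(max_memory: Option<u64>, threads: u64) -> Option<u64> {
    // Guard against a zero thread count to avoid dividing by zero.
    let threads = threads.max(1);
    // Each thread opens its own sorter, so share the budget, then clamp it.
    max_memory.map(|total| (total / threads).min(MAX_SORTER_MEMORY))
}
```

With an 8 GiB budget and 4 threads, each sorter would naively get 2 GiB; the clamp brings that down to the cap, while small budgets pass through unchanged.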

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-08 13:19:28 +00:00
Louis Dureuil
a1caac9bfb Correct distribution shifts for new models 2024-02-07 15:09:16 +01:00
Louis Dureuil
88d03c56ab Don't accept dimensions of 0 (ever) or dimensions greater than the default dimensions of the model 2024-02-07 11:52:09 +01:00
Louis Dureuil
32ee05ccef Fix default dimensions for models 2024-02-07 11:52:09 +01:00
Louis Dureuil
74c180267e pass dimensions only when defined 2024-02-07 11:52:08 +01:00
Louis Dureuil
517f5332d6 Allow actually passing dimensions for OpenAI source
-> make sure the settings change is rejected or the settings task fails when the specified model doesn't support
overriding `dimensions` and the passed `dimensions` differs from the model's default dimensions.
2024-02-07 11:51:44 +01:00
Louis Dureuil
9ac5750096 Retrieve the overridden dimensions from the configuration when fetching settings 2024-02-07 11:51:44 +01:00
Louis Dureuil
7ae4013478 Make sure the overridden dimensions are always used when embedding 2024-02-07 11:51:44 +01:00
Gosti
fb705116a6 feat: add new models and ability to override dimensions 2024-02-07 11:51:42 +01:00
Clément Renault
053306c0e7 Try with 500MiB 2024-02-07 11:24:43 +01:00
Clément Renault
9eeb75d501 Clamp the max memory of the grenad sorters to a reasonable maximum 2024-02-06 10:47:04 +01:00
Louis Dureuil
fbf5f2a392 Don't use a runtime in extract_embedder, use it only for OpenAI 2024-02-01 10:33:27 +01:00
Louis Dureuil
1555870088 Truncate HuggingFace vectors that are too long 2024-02-01 10:33:27 +01:00
Tamo
9f8f3105d5 make clippy happy 2024-02-01 10:33:27 +01:00
Tamo
318843aacd add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there are malformed documents in the DB 2024-02-01 10:33:27 +01:00
Louis Dureuil
dff2707471 Use MatchingWords from keyword search instead of the one from vector search 2024-02-01 10:33:27 +01:00
Tamo
c1bf33a112 Revert "Remove panic on the geosearch" 2024-01-25 18:51:19 +01:00
Louis Dureuil
f692021bfc Implement PR comments 2024-01-22 10:25:56 +01:00
Louis Dureuil
84f49d76cd Add cuda feature 2024-01-22 10:25:16 +01:00
Tamo
0887186ecf make clippy happy 2024-01-17 16:07:10 +01:00
Tamo
7d190d8078 add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there are malformed documents in the DB 2024-01-17 15:51:52 +01:00
Clément Renault
01e2c3d6bb Bump arroy to v0.2.0 2024-01-16 16:45:55 +01:00
Clément Renault
9f9ad4cc05 Fix Clippy warnings 2024-01-16 15:27:24 +01:00
Clément Renault
3ee7682fa7 Fix some integer comparisons 2024-01-16 15:22:23 +01:00
Clément Renault
7f125bfb12 Update incompatible dependencies 2024-01-16 15:15:54 +01:00
Clément Renault
5869ca7716 Upgrade all compatible dependencies 2024-01-16 15:05:03 +01:00
meili-bors[bot]
e93d36d5b9 Merge #4313
4313: Fix document formatting performances r=Kerollmops a=ManyTheFish

Reduce the formatted option list to the attributes that should be formatted,
instead of all the attributes to display.
The time to compute the `format` list scales with the number of fields to format;
combined with `map_leaf_values`, which iterates over all the nested fields, this gives a quadratic complexity of
`d*f`, where `d` is the total number of fields to display and `f` is the total number of fields to format.
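
The reduction described above can be sketched as follows. This is an illustrative approximation, not the code from the PR: the `FormatOptions` struct, its fields, and the `retain_formatted` helper are hypothetical names introduced here to show how dropping entries that need no formatting shrinks `f` before the per-field traversal runs.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-attribute formatting options (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq)]
struct FormatOptions {
    highlight: bool,
    crop: Option<usize>,
}

impl FormatOptions {
    /// An attribute needs formatting only if it is highlighted or cropped.
    fn should_format(&self) -> bool {
        self.highlight || self.crop.is_some()
    }
}

/// Keeps only the attributes that actually need formatting, so later
/// per-field lookups iterate over a shorter list.
fn retain_formatted(
    options: BTreeMap<String, FormatOptions>,
) -> BTreeMap<String, FormatOptions> {
    options
        .into_iter()
        .filter(|(_, opts)| opts.should_format())
        .collect()
}
```

When most displayed attributes are neither highlighted nor cropped, the retained map is much smaller than the displayed set, which is what lowers the `d*f` cost.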

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-01-11 14:19:44 +00:00
ManyTheFish
5f5a486895 Reduce formatting time 2024-01-11 11:36:41 +01:00