Compare commits

...

850 Commits

Author SHA1 Message Date
Tamo
b7f32c5acd Merge pull request #5817 from meilisearch/fix-dumpless-upgrade
fix the dumpless upgrade again
2025-08-11 15:29:54 +00:00
Tamo
8f04529ba2 Merge pull request #5816 from meilisearch/webhook-telemetry
Update webhook telemetry events
2025-08-11 15:13:41 +00:00
Tamo
54b85b8644 fix the dumpless upgrade again 2025-08-11 16:37:09 +02:00
Mubelotix
562c620fec Update webhook telemetry events 2025-08-11 16:21:14 +02:00
Clément Renault
33bc86d71a Merge pull request #5810 from meilisearch/curquiza-patch-1
Add category to release draft
2025-08-11 11:57:51 +00:00
curquiza
b265c92852 Thank contributors better 2025-08-11 12:17:10 +02:00
Clémentine
759beed560 Add category in release draft 2025-08-07 18:15:29 +02:00
Tamo
2035f342f0 Merge pull request #5807 from meilisearch/patch-chat-settings
Turn chat settings to `PATCH`
2025-08-06 14:17:30 +00:00
Tamo
27fed758c2 Merge pull request #5806 from meilisearch/update-to-v1-17-0
Update version to v1.17.0
2025-08-06 13:33:51 +00:00
Mubelotix
3ead985caf Fix issue #5772 2025-08-06 15:02:25 +02:00
Mubelotix
e302e9edd3 Add test for task 2025-08-06 15:02:15 +02:00
Tamo
1fdf820931 Update version to v1.17.0 2025-08-06 12:12:52 +02:00
Tamo
b4f2eeac0a Merge pull request #5803 from meilisearch/curquiza-patch-1
Minor docs update about release.md
2025-08-06 09:14:57 +00:00
Tamo
7e3f2ab0c6 Merge pull request #5785 from meilisearch/webhook-api
Webhook api
2025-08-05 18:10:00 +00:00
Tamo
899be9c3ff make sure we NEVER ever write the cli defined webhook to the database or dumps 2025-08-05 18:55:32 +02:00
Clément Renault
444231e812 Merge pull request #5804 from meilisearch/curquiza-patch-2
Minor fix in PR template
2025-08-05 15:11:23 +00:00
Mubelotix
1ff6da63e8 Make errors singular 2025-08-05 16:58:25 +02:00
Mubelotix
b5158e1e83 Fix cli webhook getting stored in dumps 2025-08-05 16:58:25 +02:00
Tamo
3f1e172c6f fix race condition: take the rtxn before entering the thread so we're sure we won't try to retrieve deleted tasks 2025-08-05 16:47:35 +02:00
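A minimal sketch of the snapshot-before-spawn pattern described in the commit above, with hypothetical types standing in for the LMDB read transaction (rtxn) and the task store — not the actual Meilisearch code:

```
use std::thread;

// Hypothetical stand-ins for the task store and its read transaction (rtxn).
struct Db { task_ids: Vec<u32> }

#[derive(Clone)]
struct Snapshot { task_ids: Vec<u32> }

impl Db {
    fn read_snapshot(&self) -> Snapshot {
        Snapshot { task_ids: self.task_ids.clone() }
    }
}

fn send_webhook(task_id: u32) {
    println!("notifying webhook for task {task_id}");
}

fn notify_in_background(db: &Db) -> thread::JoinHandle<()> {
    // Take the read view *before* spawning the thread: a concurrent task
    // deletion can no longer run between thread start and the first read.
    let snapshot = db.read_snapshot();
    thread::spawn(move || {
        for id in snapshot.task_ids {
            send_webhook(id);
        }
    })
}

fn main() {
    let db = Db { task_ids: vec![1, 2, 3] };
    notify_in_background(&db).join().unwrap();
}
```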
Tamo
2b5b41790e update the dump so it doesn't contain the null-uuid webhook 2025-08-05 16:21:14 +02:00
Mubelotix
55cd3203fe Merge pull request #5783 from meilisearch/starts-with-optim
Optimize the starts_with filter
2025-08-05 14:10:10 +00:00
Clémentine
45bb13bf43 Minor fix in PR template 2025-08-05 15:42:56 +02:00
Clémentine
095cba8fba Minor docs update about release.md 2025-08-05 15:29:42 +02:00
Mubelotix
3a9b08960a Add test 2025-08-05 13:49:28 +02:00
Mubelotix
c4e7bf2e60 Stabilize STARTS WITH filter 2025-08-05 12:14:25 +02:00
Mubelotix
4f6a48c327 Stop storing the cli webhook in the db 2025-08-05 11:44:53 +02:00
Tamo
4c61a227ca fmt after my suggestion 2025-08-05 11:29:54 +02:00
Tamo
3d2c204f2d Update crates/milli/src/search/facet/filter.rs 2025-08-05 11:26:10 +02:00
Mubelotix
8b27dec25c Test that the cli webhook receives data 2025-08-05 11:19:21 +02:00
Mubelotix
a9c924b433 Turn url back into a setting 2025-08-05 11:16:34 +02:00
Mubelotix
6cb2296644 Update tests 2025-08-05 11:10:48 +02:00
Mubelotix
b2d157a74a Remove dbg
Co-Authored-By: Thomas Campistron <irevoire@hotmail.fr>
2025-08-05 10:49:21 +02:00
Mubelotix
386cf83285 Improve webhook settings 2025-08-05 10:48:39 +02:00
Mubelotix
8ef1a50086 Add hint
Co-Authored-By: Thomas Campistron <irevoire@hotmail.fr>
2025-08-05 10:42:39 +02:00
Mubelotix
84651ffd7d Remove hardcoded buffer size
Co-Authored-By: Thomas Campistron <irevoire@hotmail.fr>
2025-08-05 10:41:28 +02:00
Mubelotix
43c20bb3ed Add missing actions in from_repr
Co-Authored-By: Thomas Campistron <irevoire@hotmail.fr>
2025-08-05 10:39:52 +02:00
Mubelotix
d340013d8b Change error name 2025-08-05 10:35:12 +02:00
Mubelotix
8a44d9faef Merge branch 'main' into webhook-api 2025-08-05 10:32:36 +02:00
Mubelotix
afb367c7f4 Update old comment 2025-08-05 10:29:39 +02:00
Mubelotix
84bcf9785f Merge branch 'main' into starts-with-optim 2025-08-05 10:27:45 +02:00
Mubelotix
fc814b7537 Apply review suggestion 2025-08-05 10:25:14 +02:00
Clémentine
0865d8af6c Merge pull request #5766 from meilisearch/release-process-change
Release process change
2025-08-05 07:07:46 +00:00
Clément Renault
cac884401f Merge pull request #5800 from meilisearch/tmp-release-v1.16.0
Bring back changes to main
2025-08-04 16:34:24 +00:00
Mubelotix
7251cccd03 Make notify_webhooks execute in its own thread 2025-08-04 17:13:05 +02:00
Mubelotix
ddfcacbb62 Add nice error message for users trying to set uuid or isEditable 2025-08-04 16:53:41 +02:00
Mubelotix
3b26d64a5d Edit reserved webhook message 2025-08-04 16:39:34 +02:00
Mubelotix
3b0f576d56 Improve invalid uuid error message 2025-08-04 16:38:00 +02:00
Clément Renault
454f8b36f4 Make clippy happy 2025-08-04 16:36:46 +02:00
Mubelotix
1754745c42 Add URL and header validity checks 2025-08-04 16:26:20 +02:00
Clément Renault
6f30dfa41c Merge remote-tracking branch 'origin/main' into tmp-release-v1.16.0 2025-08-04 16:06:51 +02:00
Tamo
33350248c8 Merge pull request #5773 from meilisearch/snapshotception
Fix snapshotCreation task being included in snapshot
2025-08-04 13:53:43 +00:00
Mubelotix
69c59d3de3 Update security in utoipa 2025-08-04 15:43:37 +02:00
Mubelotix
8dfebbb3e7 Fix tests 2025-08-04 15:37:12 +02:00
Mubelotix
737ad3ec19 Add new api key actions 2025-08-04 15:00:45 +02:00
Mubelotix
4ec4710811 Improve logs 2025-08-04 15:00:26 +02:00
Mubelotix
c5caac95dd Format 2025-08-04 14:51:23 +02:00
Mubelotix
7acbb1e140 Remove PATCH /webhooks 2025-08-04 14:49:27 +02:00
Clémentine
a5e5afd123 Merge pull request #5794 from meilisearch/dependabot/github_actions/sigstore/cosign-installer-3.9.2
Bump sigstore/cosign-installer from 3.8.2 to 3.9.2
2025-08-04 12:39:40 +00:00
Clémentine
c70e9abf70 Merge pull request #5795 from meilisearch/dependabot/github_actions/svenstaro/upload-release-action-2.11.2
Bump svenstaro/upload-release-action from 2.11.1 to 2.11.2
2025-08-04 12:10:03 +00:00
curquiza
f8d70249a7 Update process with Ruleset branch addition 2025-08-04 13:59:11 +02:00
Clémentine
a2c96d40d3 Merge pull request #5798 from meilisearch/update-minidashboard-v0.2.22
Update mini-dashboard v0.2.22
2025-08-04 10:40:12 +00:00
Clément Renault
05dd8e0d62 update mini-dashboard to v0.2.22 2025-08-04 11:14:10 +02:00
Clémentine
4182e631d6 Potential fix for code scanning alert no. 63: Workflow does not contain permissions
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-08-04 09:59:54 +02:00
dependabot[bot]
ddea0b1570 Bump svenstaro/upload-release-action from 2.11.1 to 2.11.2
Bumps [svenstaro/upload-release-action](https://github.com/svenstaro/upload-release-action) from 2.11.1 to 2.11.2.
- [Release notes](https://github.com/svenstaro/upload-release-action/releases)
- [Changelog](https://github.com/svenstaro/upload-release-action/blob/master/CHANGELOG.md)
- [Commits](https://github.com/svenstaro/upload-release-action/compare/2.11.1...2.11.2)

---
updated-dependencies:
- dependency-name: svenstaro/upload-release-action
  dependency-version: 2.11.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-01 17:05:02 +00:00
dependabot[bot]
beb532e2a7 Bump sigstore/cosign-installer from 3.8.2 to 3.9.2
Bumps [sigstore/cosign-installer](https://github.com/sigstore/cosign-installer) from 3.8.2 to 3.9.2.
- [Release notes](https://github.com/sigstore/cosign-installer/releases)
- [Commits](3454372f43...d58896d6a1)

---
updated-dependencies:
- dependency-name: sigstore/cosign-installer
  dependency-version: 3.9.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-01 17:04:58 +00:00
Mubelotix
e3a6d63b52 Add utoipa types 2025-08-01 08:42:27 +02:00
Clément Renault
6f8c414a75 Merge pull request #5791 from meilisearch/update-minidashboard-v0.2.21
update mini-dashboard to v0.2.21
2025-07-31 16:13:35 +00:00
ManyTheFish
2ec80a1ae2 update mini-dashboard to v0.2.21 2025-07-31 17:14:38 +02:00
Mubelotix
ed147f80ac Add test and fix bug 2025-07-31 16:45:30 +02:00
Clément Renault
c37ed05f49 Merge pull request #5790 from meilisearch/adapt-go-ci
Adapt Go CI to recent change in the Go repo
2025-07-31 14:20:33 +00:00
curquiza
c1a5a545b6 Adapt Go CI to recent change in the Go repo 2025-07-31 15:23:45 +02:00
Mubelotix
35537e0b0b Add single_receives_data test 2025-07-31 14:12:09 +02:00
Mubelotix
ee80fc87c9 Add test for patch endpoint 2025-07-31 13:00:43 +02:00
Clémentine
bb43bf122e Update .github/pull_request_template.md
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-07-31 12:55:19 +02:00
Mubelotix
34590297c1 Add patch webhook endpoint 2025-07-31 12:53:57 +02:00
Mubelotix
9e43f7b419 Update tests 2025-07-31 12:44:35 +02:00
Mubelotix
94733a4a18 Add delete endpoint 2025-07-31 12:38:14 +02:00
Mubelotix
ad68245186 Update tests 2025-07-31 12:33:34 +02:00
Mubelotix
29fb4d5e2a Add post webhook route 2025-07-31 12:27:12 +02:00
Mubelotix
ca27bcaac7 Update tests 2025-07-31 11:34:47 +02:00
Mubelotix
53397e28fc Replace name by uuid 2025-07-31 11:19:46 +02:00
Mubelotix
7c2c17129f Add get webhook route 2025-07-31 10:59:06 +02:00
Mubelotix
446fce6c16 Extract logic from route 2025-07-31 10:01:25 +02:00
Clément Renault
a99538cd5f Merge pull request #5787 from meilisearch/chat-update-metrics-name
feat(chat): update prometheus metrics name
2025-07-31 07:58:07 +00:00
Mubelotix
f67043801b Add a test for concurrent cli and dump 2025-07-31 09:35:16 +02:00
nicolasvienot
941da56ee3 fix linter 2025-07-31 06:49:53 +02:00
nicolasvienot
41262b008b feat(chat): update metrics name 2025-07-30 17:55:02 +02:00
Mubelotix
fc4c5d2718 Add dump test 2025-07-30 16:16:12 +02:00
Mubelotix
a75b327b37 Add test for webhooks over limits 2025-07-30 15:59:19 +02:00
Mubelotix
c70ae91d34 Add test for reserved webhooks 2025-07-30 15:52:24 +02:00
Mubelotix
e88480c7c4 Fix reserved name check 2025-07-30 15:44:51 +02:00
Mubelotix
b565ec1497 Test cli behavior 2025-07-30 15:44:42 +02:00
Mubelotix
3e77c1d8c8 Add reserved webhook 2025-07-30 15:23:06 +02:00
Mubelotix
dc7af47371 Add new errors 2025-07-30 15:18:43 +02:00
Mubelotix
064d9d5ff8 Add dump support 2025-07-30 15:06:37 +02:00
Mubelotix
93f8b31eec Fix tests 2025-07-30 12:52:01 +02:00
Mubelotix
466e1a7aac Support legacy cli arguments 2025-07-30 12:25:59 +02:00
Mubelotix
cc37eb870f Initial implementation 2025-07-30 12:01:40 +02:00
Mubelotix
5567653c96 Fix network documentation 2025-07-29 16:47:28 +02:00
Mubelotix
5e867f7ce0 Add webhooks api key action 2025-07-29 16:47:20 +02:00
Mubelotix
48a5f4db2d Improve comment 2025-07-28 16:42:33 +02:00
Mubelotix
224892e692 Enable new algorithm every time 2025-07-28 16:28:06 +02:00
Mubelotix
691a9ae4b1 Format 2025-07-28 16:24:11 +02:00
Mubelotix
e8a818f53d Optimize the filter 2025-07-28 16:24:04 +02:00
Mubelotix
478f374b9d Add benchmark 2025-07-28 16:23:26 +02:00
Louis Dureuil
d4c88f28f3 Merge pull request #5780 from meilisearch/fix-api-keys
Fix api key action inconsistencies
2025-07-28 10:33:48 +00:00
Mubelotix
d90c76d3cc Update tests 2025-07-28 11:35:15 +02:00
Mubelotix
f6bc6854f8 Fix key action inconsistencies 2025-07-28 11:10:55 +02:00
Mubelotix
1f18f0ba77 Update little tiny comments 2025-07-23 14:33:58 +02:00
Mubelotix
44b24652d2 Change strategy to remove task instead of marking it succeeded 2025-07-23 14:30:25 +02:00
Mubelotix
5dcf79233e Remove useless parameter
Co-Authored-By: Tamo <tamo@meilisearch.com>
2025-07-23 11:30:39 +02:00
Clément Renault
42001a25ff Merge pull request #5770 from meilisearch/fix/update-index-chat-settings
fix: index chat settings `searchParameters` incorrectly set with `limit: 20` when sending empty object
2025-07-22 13:26:01 +00:00
Mubelotix
846d27354b Format 2025-07-22 15:18:21 +02:00
Mubelotix
c1aa4120ac Update test 2025-07-22 15:18:13 +02:00
Mubelotix
6394efc4c2 Turn dirty fix into beautiful fix 2025-07-22 15:17:26 +02:00
Mubelotix
9716834380 Initial fix 2025-07-22 14:31:42 +02:00
Mubelotix
2f2e42e72d Add test for issue #4653 2025-07-22 12:33:18 +02:00
Louis Dureuil
080d5f94dd Merge pull request #5763 from meilisearch/embedding-fixes
Regenerate all fragments when coming from a user provided vector
2025-07-22 08:35:07 +00:00
nicolasvienot
ba0f50e5ef fix: update default deserialization for ChatSearchParams limit field 2025-07-22 10:24:18 +02:00
Louis Dureuil
ce6230aa85 Merge pull request #5762 from meilisearch/new-document-indexer-for-dumps
Use the edition 2024 documents indexer in the dumps
2025-07-21 14:53:43 +00:00
Louis Dureuil
6dc241f9de Fix tests 2025-07-21 15:11:24 +02:00
Louis Dureuil
01d1ef65c4 Update search and docs usages 2025-07-21 15:11:24 +02:00
Louis Dureuil
3246667590 when exporting vectors, force regenerate to false when the embedder has fragments 2025-07-21 15:11:24 +02:00
Louis Dureuil
109395c199 Index::embeddings specifies if the embedder has fragments 2025-07-21 15:11:24 +02:00
Louis Dureuil
a0b71a8785 EmbedderOptions::has_fragments() 2025-07-21 15:11:24 +02:00
Louis Dureuil
00a5c86f13 Remove accidentally added db snap 2025-07-21 15:11:24 +02:00
Louis Dureuil
366c37a686 Fix new indexer 2025-07-21 15:11:23 +02:00
Louis Dureuil
afc164a271 Fix in old indexer 2025-07-21 15:11:23 +02:00
Kerollmops
bdc2d1e64d Move the edition 2024 dump parameter to the right place 2025-07-21 14:50:05 +02:00
curquiza
f3b60a1dab Minor update on doc 2025-07-20 22:20:08 +02:00
curquiza
cd0523c3f1 Remove run of SDK test on PR because cannot work 2025-07-20 22:13:07 +02:00
curquiza
7f318ee964 Adapt issue template 2025-07-20 22:11:30 +02:00
curquiza
dc1656da8e Adapt automation 2025-07-20 22:11:14 +02:00
curquiza
dc0bd9f25d Add release drafter 2025-07-20 22:10:35 +02:00
curquiza
52d8007b12 Add pull request template 2025-07-20 22:10:17 +02:00
curquiza
4f8382b159 Remove useless automation 2025-07-20 22:07:59 +02:00
curquiza
c2c82be556 Update documentation 2025-07-20 22:07:23 +02:00
Tamo
0312fb22b8 Merge pull request #5761 from meilisearch/fix-chat-settings-dumpless-upgrade
Fix chat settings dumpless upgrade
2025-07-17 15:57:39 +00:00
Clément Renault
b85657de1e Update memmap2 version everywhere 2025-07-17 17:30:44 +02:00
Clément Renault
626be0ef28 Small typo fix
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-07-17 17:27:00 +02:00
Clément Renault
1b476b8a35 Add documentation to the new documents_file dump reader method
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-07-17 17:26:41 +02:00
Clément Renault
a1b42c10e2 Make clippy happy 2025-07-17 17:21:03 +02:00
Clément Renault
d67db6e3c2 Use the edition 2024 documents indexer in the dumps 2025-07-17 17:12:51 +02:00
Clément Renault
760ccffdbd Expose the documents files from the dumps 2025-07-17 17:12:51 +02:00
Clément Renault
338806283b Do not track meilisearch databases 2025-07-17 17:12:51 +02:00
Clément Renault
fe15e11c9d Introduce a new CLI and env var to use the old document indexer when
importing dumps
2025-07-17 17:12:51 +02:00
Clément Renault
f1d92bfead Make sure the new filter chat setting is set to its default value if
missing
2025-07-17 15:36:21 +02:00
Clément Renault
a005a062da Add security if chat settings parameters are missing 2025-07-17 15:27:53 +02:00
Clément Renault
fd8b2451d7 Merge pull request #5754 from kametsun/fix/incorrect-stats-doc-count
Fix incorrect document count in stats after clearing all documents
2025-07-17 06:48:51 +00:00
Louis Dureuil
058f9ffda5 Merge pull request #5734 from meilisearch/request-fragments-test
Tests for multimodal
2025-07-16 11:04:00 +00:00
Louis Dureuil
5d363205a5 Merge pull request #5716 from meilisearch/document-sorting
Allow sorting on the /documents route
2025-07-16 10:26:50 +00:00
Mubelotix
a683faa882 Apply review suggestions 2025-07-16 11:03:24 +02:00
Tamo
421a23ee3d Merge pull request #3265 from LeSuisse/sign-container-image-cosign
Sign container image using Cosign in keyless mode
2025-07-16 08:54:57 +00:00
Thomas Gerbet
191ea340ed Sign container image using Cosign in keyless mode
Cosign keyless mode makes it possible to sign the container image using the
OIDC Identity Tokens provided by GitHub Actions [0][1].
The signature is published to the registry storing the image and to the
public Rekor transparency log instance [2].

Cosign keyless mode has already been adopted by some major projects like
Kubernetes [3].

The image signature can be manually verified using:
```
$ cosign verify \
	--certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
	--certificate-identity-regexp='^https://github.com/meilisearch/meilisearch/.github/workflows/publish-docker-images.yaml' \
	<image_name>
```

See #2179.
Note that a similar approach can be used to sign the release binaries.

[0] https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect
[1] https://docs.sigstore.dev/cosign/signing/signing_with_containers/
[2] https://docs.sigstore.dev/rekor/overview
[3] https://kubernetes.io/docs/tasks/administer-cluster/verify-signed-artifacts/#verifying-image-signatures
2025-07-16 10:04:18 +02:00
Louis Dureuil
8887cbdcd5 Merge pull request #5725 from meilisearch/fix-threshold-overcounting-bug
Fix Total Hits being wrong when rankingScoreThreshold is used
2025-07-16 07:15:24 +00:00
Tamo
8d22972d84 Merge pull request #5626 from martin-g/faster-batches-it-tests
tests: Faster batches:: IT tests
2025-07-16 07:01:16 +00:00
Many the fish
634865ff53 Merge pull request #5710 from meilisearch/chat-route-support-filters
Introduce filters in the chat completions
2025-07-15 16:10:49 +00:00
Mubelotix
36fccf8525 Merge remote-tracking branch 'origin/release-v1.16.0' into fix-threshold-overcounting-bug 2025-07-15 18:01:29 +02:00
Mubelotix
d6bd60d569 Apply review suggestions
Co-Authored-By: Louis Dureuil <louis.dureuil@xinra.net>
2025-07-15 18:00:37 +02:00
Mubelotix
48ad959fc1 Merge remote-tracking branch 'origin/release-v1.16.0' into document-sorting 2025-07-15 17:41:46 +02:00
Mubelotix
1bc30cb4c8 Restore old benchmark names 2025-07-15 17:34:04 +02:00
Mubelotix
77138a42d6 Apply review suggestions
Add preconditions

Fix underflow

Remove unwrap

Turn methods to associated functions

Apply review suggestions
2025-07-15 17:31:11 +02:00
Kerollmops
0791506124 Fix some proposals 2025-07-15 17:10:45 +02:00
Kerollmops
2a015ac3b8 Implement basic few shot prompting to improve the query capabilities 2025-07-15 14:50:10 +02:00
Martin Grigorov
8772b5af87 Merge branch 'main' into faster-batches-it-tests 2025-07-15 15:21:32 +03:00
Clément Renault
6f248b78a9 Merge pull request #5751 from meilisearch/fix-searchable-attributes-order
Fix: Preserve order of searchable attributes when modified
2025-07-15 10:38:11 +00:00
Many the fish
d694e312ff Update crates/milli/src/update/settings.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2025-07-15 11:54:59 +02:00
Clément Renault
d76dcc8998 Make clippy happy 2025-07-15 11:49:48 +02:00
Clément Renault
e654f66223 Support filtering 2025-07-15 11:49:47 +02:00
Clément Renault
34f2ab7093 WIP report search errors to the LLM 2025-07-15 11:49:46 +02:00
Clément Renault
1a9dbd364e Fix some issues 2025-07-15 11:49:46 +02:00
Clément Renault
662c5d9871 Introduce filters in the chat completions 2025-07-15 11:49:45 +02:00
Tamo
df2e7cde53 Merge pull request #5703 from martin-g/all-use-server-wait-task
tests: Use Server::wait_task() instead of Index::wait_task()
2025-07-15 09:18:12 +00:00
Clément Renault
02b2ae6142 Merge pull request #5756 from meilisearch/fix-integration-test
Fix Rails CI
2025-07-15 07:38:06 +00:00
curquiza
f813eb7ca4 Fix 2025-07-13 12:35:54 +02:00
curquiza
d072edaa49 Fix Rails CI 2025-07-13 12:26:56 +02:00
kametsun
5cd61b50f9 Fix formatting 2025-07-12 18:19:26 +09:00
kametsun
9a9be76757 add: assert that the statistics are correctly updated 2025-07-12 11:15:44 +09:00
kametsun
cfa6ba6c3b Fix stats showing wrong document count after clear all
Update database stats after clearing documents to ensure
/stats endpoint returns correct numberOfDocuments: 0 instead
of stale count.
2025-07-12 11:15:44 +09:00
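The shape of that fix, as a minimal sketch assuming a cached stats struct — not the actual milli types:

```
// Minimal sketch, assuming a cached stats struct; not the actual milli types.
struct IndexStats { number_of_documents: u64 }

struct Index { documents: Vec<String>, stats: IndexStats }

impl Index {
    fn clear_documents(&mut self) {
        self.documents.clear();
        // The fix: refresh the cached stats in the same operation, so that
        // `GET /stats` reports `numberOfDocuments: 0` instead of a stale count.
        self.stats.number_of_documents = self.documents.len() as u64;
    }
}

fn main() {
    let mut index = Index {
        documents: vec!["a".into(), "b".into()],
        stats: IndexStats { number_of_documents: 2 },
    };
    index.clear_documents();
    assert_eq!(index.stats.number_of_documents, 0);
}
```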
Clément Renault
f4f333dbf6 Merge pull request #5753 from meilisearch/export-fixes
Various fixes on the export route
2025-07-11 19:15:42 +00:00
Mubelotix
1ade76ba10 Remove sneaky debug 2025-07-11 12:27:04 +02:00
Mubelotix
ae26658913 Use the most appropriate unit in payload_too_large error 2025-07-11 12:27:03 +02:00
Mubelotix
aa09edb3fb Fix errors being silently dropped 2025-07-11 12:27:03 +02:00
Mubelotix
3f42f1a036 Get rid of bearer 2025-07-11 12:27:03 +02:00
Mubelotix
9bdfdd395b Fix document step overflowing 2025-07-11 12:27:03 +02:00
Mubelotix
78d0625a91 Decrease default payload size for exports 2025-07-11 12:27:03 +02:00
Martin Tzvetanov Grigorov
e3daa907c5 Update redactions
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-11 11:14:39 +03:00
Martin Tzvetanov Grigorov
a39223822a More tests fixes
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-11 11:11:46 +03:00
Martin Grigorov
1eb6cd38ce Merge branch 'main' into faster-batches-it-tests 2025-07-11 10:49:22 +03:00
Martin Tzvetanov Grigorov
eb6ad3ef9c Fix batch id detection
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-11 10:24:25 +03:00
Martin Tzvetanov Grigorov
3bef4f4413 Use Server::wait_task() instead of Index::wait_task()
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-11 10:16:25 +03:00
Martin Tzvetanov Grigorov
9f89881b0d More tests fixes
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-11 10:11:58 +03:00
ManyTheFish
3f655ea20e compare user defined searchable fields instead of internal searchable fields 2025-07-10 18:24:23 +02:00
ManyTheFish
50bc1d55f3 Add test reproducing the bug 2025-07-10 18:23:46 +02:00
Martin Tzvetanov Grigorov
126aefc207 Fix more tests
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-10 16:47:04 +03:00
Martin Tzvetanov Grigorov
e7a60555d6 Formatting
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-10 14:35:40 +03:00
Martin Tzvetanov Grigorov
ae912c4c3f Pass the Server as an extra parameter when the Index needs to wait for a task
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-10 14:28:57 +03:00
Martin Tzvetanov Grigorov
13ea29e511 Fix some search+replace issues. Make Server::wait_task() available for Index:: methods
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-10 14:03:16 +03:00
Martin Tzvetanov Grigorov
5342df26fe tests: Use Server::wait_task() instead of Index::wait_task()
The code is mostly duplicated. Server::wait_task() has better handling for errors and more retries.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-07-10 14:03:15 +03:00
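An illustrative sketch of why one shared wait helper with retries beats duplicated per-index copies; the real `Server::wait_task()` lives in the test harness and its signature is not shown here:

```
use std::thread::sleep;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum TaskStatus { Processing, Succeeded, Failed }

// Illustrative poller: a single shared helper means retries and error
// handling are written once instead of being duplicated on every wrapper.
fn wait_task(
    mut poll: impl FnMut() -> TaskStatus,
    max_retries: u32,
) -> Result<TaskStatus, String> {
    for _ in 0..max_retries {
        match poll() {
            TaskStatus::Succeeded => return Ok(TaskStatus::Succeeded),
            TaskStatus::Failed => return Err("task failed".into()),
            TaskStatus::Processing => sleep(Duration::from_millis(50)),
        }
    }
    Err("task did not finish in time".into())
}

fn main() {
    let mut ticks = 0;
    let status = wait_task(
        || { ticks += 1; if ticks < 3 { TaskStatus::Processing } else { TaskStatus::Succeeded } },
        100,
    );
    assert_eq!(status, Ok(TaskStatus::Succeeded));
}
```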
Tamo
61bc95e8d6 Merge pull request #5740 from meilisearch/ignore-flaky-test-2
Ignore yet another flaky test
2025-07-09 13:25:45 +00:00
Mubelotix
0a4f2ef891 Leak mock servers 2025-07-08 15:27:35 +02:00
Tamo
faa1f7c5b7 Merge pull request #5693 from Mubelotix/default-key
Add a Read-Only Admin API Key by default
2025-07-08 12:38:29 +00:00
Mubelotix
3cc5d86598 Format 2025-07-08 13:57:17 +02:00
Mubelotix
1ae47bec77 Improve composite test 2025-07-08 13:57:07 +02:00
Mubelotix
2f1be0ff86 Ignore faulty test (see #5746) 2025-07-08 13:55:07 +02:00
Mubelotix
9cee432255 Fix broken tests 2025-07-08 13:36:26 +02:00
Mubelotix
ff8d48d2f1 Merge branch 'main' into default-key 2025-07-08 12:21:46 +02:00
Mubelotix
a56c036994 Update crates/meilisearch-types/src/keys.rs
Co-authored-by: gui machiavelli <hey@guimachiavelli.com>
2025-07-08 12:18:52 +02:00
Louis Dureuil
074744b8a6 Ignore yet-another flaky test 2025-07-08 10:54:39 +02:00
Tamo
511c48f520 Merge pull request #5737 from meilisearch/request-fragments-dumpless-upgrade
Fix the dumpless upgrade from v1.15 to v1.16 for request fragments
2025-07-08 08:49:38 +00:00
Louis Dureuil
4623691d1f Don't make the type-that-shall-not-be-written serializable
Following tamo's advice

Co-Authored-By: Tamo <tamo@meilisearch.com>
2025-07-08 10:04:33 +02:00
Mubelotix
3261aadcf2 Add composite test 2025-07-07 16:50:39 +02:00
Mubelotix
073e9f2967 Disable similarity check on composite embedders using fragments 2025-07-07 16:46:16 +02:00
Louis Dureuil
5f8f48ec95 Add new snapshot checking for regenerativeness 2025-07-07 16:43:05 +02:00
Louis Dureuil
ed2fe365a0 Fix existing snaps 2025-07-07 16:42:50 +02:00
Louis Dureuil
f7c8a77f89 Update v1.12.0 DB to contain vectors 2025-07-07 16:01:50 +02:00
Clément Renault
a8030850ee Merge pull request #5733 from meilisearch/improve-export-analytics
Improve the analytics of the `/export` route
2025-07-07 12:26:11 +00:00
Mubelotix
132065afda Minor improvements 2025-07-07 13:10:16 +02:00
Mubelotix
51c298662b Merge branch 'main' into request-fragments-test 2025-07-07 13:00:21 +02:00
Mubelotix
70a860a0f0 Merge branch 'main' into fix-threshold-overcounting-bug 2025-07-07 12:26:37 +02:00
Louis Dureuil
a3254d7d7d Implement dumpless upgrade from v1.15 to v1.16 2025-07-07 11:57:08 +02:00
Louis Dureuil
73c9c1ebdc Add compile-time checks for dumpless upgrade 2025-07-07 11:34:18 +02:00
Clément Renault
4c7a6e5c1b Do not leak private URLs 2025-07-07 11:07:58 +02:00
Tamo
ef4c87accf Merge pull request #5732 from meilisearch/chat-route-support-metrics
Add chat-related metrics on the prometheus route
2025-07-07 08:33:31 +00:00
Clément Renault
ced7ea4a5c Merge pull request #5731 from meilisearch/chat-route-support-dumps
Export and import chat completions workspace settings in dumps
2025-07-07 08:31:41 +00:00
Mubelotix
fa3990daf9 Format 2025-07-04 13:33:49 +02:00
Mubelotix
c5993196b3 Add test 2025-07-04 13:32:55 +02:00
Mubelotix
16234e1313 Add fragment swapping test 2025-07-04 13:25:42 +02:00
Mubelotix
be9f4f96df Add experimental feature test 2025-07-04 13:15:15 +02:00
Mubelotix
b274106ad3 Add test 2025-07-04 13:05:52 +02:00
Mubelotix
48527761e7 Add test 2025-07-04 12:01:15 +02:00
Mubelotix
6792d048b8 Test both fragments and document template 2025-07-04 11:47:38 +02:00
Kerollmops
07bfed99e6 Expose the host in the analytics 2025-07-04 11:08:02 +02:00
Mubelotix
8dfded2993 Update tests 2025-07-04 10:49:03 +02:00
Mubelotix
3714f16696 Fix bug 2025-07-04 10:40:50 +02:00
Mubelotix
d0cd3cacec Add a way to reproduce the bug 2025-07-03 18:18:04 +02:00
Louis Dureuil
fef089c7b6 Merge pull request #5596 from meilisearch/request-fragments
Request fragments
2025-07-03 15:01:44 +00:00
Clément Renault
d47e1e15de Merge pull request #5730 from meilisearch/update-version-v1.16.0
Update version for the next release (v1.16.0) in Cargo.toml
2025-07-03 14:45:43 +00:00
Mubelotix
caccb51814 Add a complex value test 2025-07-03 16:10:23 +02:00
Clément Renault
a76a3e8f11 Change the metric name for the search to use a label 2025-07-03 16:01:31 +02:00
ManyTheFish
32dede35c7 Update snapshots 2025-07-03 15:59:14 +02:00
Clément Renault
6397ef12a0 Use three metrics for the three different tokens 2025-07-03 15:56:56 +02:00
Mubelotix
cf9b311f71 Format 2025-07-03 15:53:09 +02:00
Mubelotix
7423243be0 Add test with multiple embedders 2025-07-03 15:52:18 +02:00
Clément Renault
b5e41f0e46 Fix the Mistral incompatibility with the usage of OpenAI 2025-07-03 15:21:40 +02:00
Mubelotix
5690700601 Add fragment addition test 2025-07-03 15:19:31 +02:00
Mubelotix
2faad504c6 Add test 2025-07-03 15:12:47 +02:00
Mubelotix
2bcd69750f Add fragment modification test 2025-07-03 15:08:27 +02:00
Clément Renault
9f0d33ec99 Expose the number of tokens on the chat completions routes 2025-07-03 15:05:15 +02:00
Mubelotix
de24e75be8 Update test 2025-07-03 15:00:11 +02:00
Louis Dureuil
a3af9fe057 new extractor bugfixes:
- fix old_has_fragments
- new_is_user_provided is always false when generating fragments,
  even if no fragment ever matches
2025-07-03 14:44:34 +02:00
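The second bullet encodes a simple invariant; a sketch of it with an illustrative helper name, not the real extractor API:

```
// Illustrative invariant: embeddings produced via fragments are machine
// generated, so they are never "user provided" — even when no fragment
// matched and nothing was extracted for the document.
fn new_is_user_provided(uses_fragments: bool, was_user_provided: bool) -> bool {
    if uses_fragments { false } else { was_user_provided }
}

fn main() {
    assert!(!new_is_user_provided(true, true));
    assert!(new_is_user_provided(false, true));
}
```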
Louis Dureuil
90683d0e4e add snapshot of get settings 2025-07-03 14:43:06 +02:00
Mubelotix
5c79273748 Add TODOs 2025-07-03 14:42:49 +02:00
Louis Dureuil
90e6b6416f new extractor bugfixes:
- fix old_has_fragments
- new_is_user_provided is always false when generating fragments,
  even if no fragment ever matches
2025-07-03 14:35:02 +02:00
Clément Renault
2b75072b09 Expose the number of internal chat searches on the /metrics route 2025-07-03 14:04:27 +02:00
Clément Renault
6e6fd077d4 Ignore nonexistent chat completions settings folder 2025-07-03 13:37:38 +02:00
Mubelotix
b45eea0d3e Add test for fragment deletion 2025-07-03 13:26:44 +02:00
Clément Renault
a051ab3d9a Support importing chat completions settings 2025-07-03 12:04:40 +02:00
Mubelotix
0b89ef1fd7 Make tests use a shared index 2025-07-03 11:32:49 +02:00
Mubelotix
65ba7b47af Test search fragments 2025-07-03 11:32:49 +02:00
Mubelotix
8af76a65bf Add test_fragment_indexing 2025-07-03 11:32:49 +02:00
Clément Renault
6b94033c97 Correctly export the chat completions settings in dumps 2025-07-03 11:30:24 +02:00
Louis Dureuil
dfe0c8664e Add a version of prompt::Context that has no fields 2025-07-03 11:08:31 +02:00
Louis Dureuil
0ca652de28 Extract vector points: remove the { 2025-07-03 10:52:30 +02:00
Louis Dureuil
87f105747f Add documentation to Extractor trait 2025-07-03 10:41:20 +02:00
Louis Dureuil
735634e998 Send owned metadata and clear inputs in case of error 2025-07-03 10:32:57 +02:00
Louis Dureuil
3740755d9c Compare to RawValue::NULL constant rather than explicit "null" 2025-07-03 10:11:07 +02:00
Kerollmops
bbcabc47bd Update version for the next release (v1.16.0) in Cargo.toml 2025-07-03 08:06:38 +00:00
Louis Dureuil
a06cb1bfd6 Remove Embed::process_embeddings and have it be an inherent function of the type that uses it 2025-07-03 10:02:16 +02:00
Louis Dureuil
549dc985b8 Old dump import indexer: fix the case where going from Generated to Generated 2025-07-03 09:58:41 +02:00
Louis Dureuil
428463e45c Check indexing fragments as well as search fragments 2025-07-02 16:17:22 +02:00
Louis Dureuil
7113fcf63a New error 2025-07-02 16:17:12 +02:00
Louis Dureuil
aa6855cd4f Vector settings: don't assume which kind of request is asked when looking at a settings update without fragments 2025-07-02 16:12:23 +02:00
Louis Dureuil
895db76a51 Fix snaps 2025-07-02 16:10:05 +02:00
Clément Renault
a88146d59e Merge pull request #5728 from meilisearch/bump-minidashboard-v0.2.20
Bump the mini-dashboard to v0.2.20
2025-07-02 11:03:00 +00:00
Kerollmops
91e77abf4f Bump the mini-dashboard to v0.2.20 2025-07-02 12:15:11 +02:00
Mubelotix
f60814b319 Add benchmark 2025-07-02 12:06:00 +02:00
Mubelotix
5a675bcb82 Add benchmarks 2025-07-02 11:50:32 +02:00
Louis Dureuil
82a796aea7 vector settings: fix bug where removed fragments were returned as new 2025-07-02 11:36:50 +02:00
Louis Dureuil
f6287602e9 Improve error message when request contains the wrong type of placeholder 2025-07-02 11:36:50 +02:00
Louis Dureuil
ede456c5b0 New error: rest inconsistent fragments 2025-07-02 11:36:50 +02:00
Louis Dureuil
3f5b5df139 Check consistency of fragments 2025-07-02 11:36:50 +02:00
Louis Dureuil
d72e5f5f69 Hide documentTemplate and documentTemplateMaxBytes when indexing_fragment is defined 2025-07-02 11:29:50 +02:00
Many the fish
aa366d593d Merge pull request #5726 from meilisearch/dependabot/github_actions/Swatinem/rust-cache-2.8.0
Bump Swatinem/rust-cache from 2.7.8 to 2.8.0
2025-07-02 08:09:11 +00:00
Many the fish
205430854d Merge pull request #5727 from meilisearch/dependabot/github_actions/svenstaro/upload-release-action-2.11.1
Bump svenstaro/upload-release-action from 2.7.0 to 2.11.1
2025-07-02 08:05:07 +00:00
Louis Dureuil
be64006211 Fix process export 2025-07-02 09:12:18 +02:00
Louis Dureuil
eda309d562 make sure fragments are ordered 2025-07-02 00:05:13 +02:00
Louis Dureuil
119d618a76 Do not "upgrade" regenerate fragments to regenerate prompt 2025-07-02 00:05:13 +02:00
Louis Dureuil
2b2e6c0b3a Settings changes 2025-07-02 00:05:13 +02:00
Louis Dureuil
e6329e77e1 settings fragment_diffs 2025-07-02 00:05:13 +02:00
Louis Dureuil
b086c51a23 new settings indexer 2025-07-02 00:05:13 +02:00
Louis Dureuil
9ce5598fef parsed vectors: embeddings is None when it is null when read from DB 2025-07-02 00:05:13 +02:00
Louis Dureuil
e30c24b5bf Prompt: relax lifetime constraints 2025-07-02 00:05:13 +02:00
Louis Dureuil
c1a132fa06 multimodal experimental feature 2025-07-02 00:05:13 +02:00
Louis Dureuil
e54fc59248 Fix snaps 2025-07-02 00:05:13 +02:00
Louis Dureuil
11e7c0d75f Fix tests 2025-07-02 00:05:13 +02:00
Louis Dureuil
c593fbe648 Analytics 2025-07-02 00:05:12 +02:00
Louis Dureuil
2b3327ea74 Use media to determine search kind 2025-07-02 00:05:12 +02:00
Louis Dureuil
d14184f4da Add media to search 2025-07-02 00:05:12 +02:00
Louis Dureuil
46bceb91f1 New search errors 2025-07-02 00:05:12 +02:00
Louis Dureuil
cab5e35ff7 Implement in old settings indexer and old dump import indexer 2025-07-02 00:05:12 +02:00
Louis Dureuil
f8232976ed Implement in new document indexer 2025-07-02 00:05:12 +02:00
Louis Dureuil
22d363c05a Clear DB on clear documents 2025-07-02 00:05:12 +02:00
Louis Dureuil
41620d5325 Support indexingFragments and searchFragments in settings 2025-07-02 00:05:12 +02:00
Louis Dureuil
f3d5c74c02 Vector settings to add indexingFragments and searchFragments 2025-07-02 00:05:12 +02:00
Louis Dureuil
d48baece51 New error when too many fragments in settings 2025-07-02 00:05:12 +02:00
Louis Dureuil
c45ede44a8 Add new parameters to openai and rest embedders 2025-07-02 00:05:11 +02:00
Louis Dureuil
4235a82dcf REST embedder supports fragments 2025-07-02 00:05:11 +02:00
Louis Dureuil
e7b9b8f002 Change embedder API 2025-07-02 00:05:11 +02:00
Louis Dureuil
5716ab70f3 EmbeddingConfigs -> RuntimeEmbedders 2025-07-02 00:05:11 +02:00
Louis Dureuil
422a786ffd RuntimeEmbedder and RuntimeFragments 2025-07-02 00:05:11 +02:00
Louis Dureuil
836ae19bec ArroyWrapper changes 2025-07-02 00:05:11 +02:00
Louis Dureuil
0b5bc41b79 Add new vector errors 2025-07-02 00:05:11 +02:00
Louis Dureuil
b45059e8f2 Add vector::session module 2025-07-02 00:05:11 +02:00
Louis Dureuil
c16c60b599 Add vector::extractor module 2025-07-02 00:05:11 +02:00
Louis Dureuil
0114796d2a Index uses the vector::db stuff 2025-07-02 00:05:10 +02:00
Louis Dureuil
17a94c40dc Add vector::db module 2025-07-02 00:05:10 +02:00
Louis Dureuil
76ca44b214 Expand json_template module 2025-07-02 00:05:10 +02:00
Louis Dureuil
d2e4d6dd8a prompt: Publishes some types 2025-07-02 00:04:04 +02:00
dependabot[bot]
879cf85037 Bump svenstaro/upload-release-action from 2.7.0 to 2.11.1
Bumps [svenstaro/upload-release-action](https://github.com/svenstaro/upload-release-action) from 2.7.0 to 2.11.1.
- [Release notes](https://github.com/svenstaro/upload-release-action/releases)
- [Changelog](https://github.com/svenstaro/upload-release-action/blob/master/CHANGELOG.md)
- [Commits](https://github.com/svenstaro/upload-release-action/compare/2.7.0...2.11.1)

---
updated-dependencies:
- dependency-name: svenstaro/upload-release-action
  dependency-version: 2.11.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-01 17:23:13 +00:00
dependabot[bot]
c2d5b20a42 Bump Swatinem/rust-cache from 2.7.8 to 2.8.0
Bumps [Swatinem/rust-cache](https://github.com/swatinem/rust-cache) from 2.7.8 to 2.8.0.
- [Release notes](https://github.com/swatinem/rust-cache/releases)
- [Changelog](https://github.com/Swatinem/rust-cache/blob/master/CHANGELOG.md)
- [Commits](https://github.com/swatinem/rust-cache/compare/v2.7.8...v2.8.0)

---
updated-dependencies:
- dependency-name: Swatinem/rust-cache
  dependency-version: 2.8.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-01 17:23:08 +00:00
Mubelotix
600178c5ab Still limit to max hits 2025-07-01 18:33:09 +02:00
Louis Dureuil
b93ca3945e Merge pull request #5723 from meilisearch/fix-flaky-embedder-test
Fix flaky last_error test
2025-07-01 15:14:28 +00:00
Mubelotix
8fef48f8ca Merge pull request #5670 from meilisearch/export-and-transfer-route
Introduce a new route to export indexes
2025-07-01 14:37:02 +00:00
Mubelotix
dedae94102 Fix #5274 2025-07-01 16:22:25 +02:00
Mubelotix
7ae9a4afee Add a test for issue #5274 2025-07-01 15:42:43 +02:00
Mubelotix
d2776efb11 Fix flaky last_error test 2025-07-01 15:14:56 +02:00
Mubelotix
9211e94c4f Format 2025-07-01 15:03:20 +02:00
Mubelotix
b7bebe9bbb Fix export when index already exists 2025-07-01 15:03:04 +02:00
Mubelotix
37a692f942 Keep IndexUidPattern 2025-07-01 14:47:43 +02:00
Mubelotix
25c19a306b Rename variable
Co-authored-by: Kero <clement@meilisearch.com>
2025-07-01 14:42:44 +02:00
Mubelotix
c078efd730 Remove experimental todo 2025-07-01 14:40:59 +02:00
Mubelotix
9dac91efe0 Fix utoipa response 2025-07-01 14:40:39 +02:00
Mubelotix
074d509d92 Fix expect message 2025-07-01 14:39:52 +02:00
Mubelotix
d439a3cb9d Fix progress names 2025-07-01 14:39:24 +02:00
Mubelotix
e92b6beb20 Revert making check_sort_criteria usable without a search context 2025-07-01 14:26:55 +02:00
Mubelotix
27cc357362 Document code 2025-07-01 14:21:55 +02:00
Mubelotix
73dfeefc7c Remove plural form 2025-07-01 14:08:46 +02:00
Mubelotix
d85480de89 Move sort code out of facet 2025-07-01 14:05:47 +02:00
Mubelotix
9f55708d84 Format 2025-07-01 13:58:56 +02:00
Mubelotix
280c3907be Add test to sort the unsortable 2025-07-01 13:58:37 +02:00
Mubelotix
8419fd9b3b Ditch usage of check_sort_criteria 2025-07-01 13:42:38 +02:00
Mubelotix
283944ea89 Differentiate between document sort error and search sort error 2025-07-01 12:03:50 +02:00
Mubelotix
8aacd6374a Optimize geo sort 2025-07-01 11:50:01 +02:00
Mubelotix
8326f34ad1 Add analytics 2025-07-01 11:35:28 +02:00
Mubelotix
259fc067d3 Count exported documents by index name, not pattern 2025-07-01 11:14:59 +02:00
Many the fish
e8b2bb3ea6 Merge pull request #5709 from meilisearch/analytics-chat-completions
Add analytics to the chat completions
2025-07-01 09:14:47 +00:00
Many the fish
7dfb2071b5 Merge pull request #5683 from meilisearch/fix-recoverable-file-store-error
Make sure to recover from missing update file
2025-07-01 09:08:55 +00:00
Mubelotix
9cfbef478e Add override settings to analytics 2025-07-01 11:04:59 +02:00
Mubelotix
efd5fd96cc Add the overrideSettings parameter 2025-07-01 11:02:42 +02:00
Mubelotix
f4a908669c Add tests 2025-07-01 10:02:15 +02:00
Mubelotix
eb2c2815b6 Fix panic 2025-07-01 10:00:10 +02:00
Louis Dureuil
0ef52941c7 Merge pull request #5687 from meilisearch/settings-indexer-edition-2024
Settings indexer edition 2024
2025-07-01 07:35:21 +00:00
Kerollmops
0d85f8fcee Make sure to recover from missing update file 2025-06-30 19:09:30 +02:00
Clément Renault
f4bb6cbca8 Better behavior when indexes are null 2025-06-30 18:59:16 +02:00
Clément Renault
ad03c86c44 Display an accurate number of uploaded documents 2025-06-30 18:46:47 +02:00
Clément Renault
85037352b9 Fix most of the easy issues 2025-06-30 18:31:32 +02:00
Mubelotix
29e9c74a49 Merge two ifs 2025-06-30 16:17:04 +02:00
ManyTheFish
1b54c866e1 Link experimental feature discussion 2025-06-30 14:47:39 +02:00
ManyTheFish
e414284335 Clippy too many arguments 2025-06-30 14:25:28 +02:00
ManyTheFish
7a204609fe Move document context and identifiers in document.rs 2025-06-30 14:21:46 +02:00
Mubelotix
f6803dd7d1 Simplify iterator chaining in facet sort 2025-06-30 14:05:23 +02:00
Mubelotix
f86f4f619f Implement geo sort on documents 2025-06-30 13:57:30 +02:00
Mubelotix
e35d58b531 Move geosort code out of search 2025-06-30 13:12:00 +02:00
Mubelotix
63827bbee0 Move sorting code out of search 2025-06-30 11:59:59 +02:00
ManyTheFish
6b2b8ed676 Transform experimental_no_edition_2024_for_settings into a config 2025-06-30 11:49:03 +02:00
ManyTheFish
6db5939f84 Re-integrate embedder stats 2025-06-30 09:52:06 +02:00
ManyTheFish
d35b2d8d33 minor fixes 2025-06-30 09:52:06 +02:00
ManyTheFish
0687cf058a Avoid rewriting documents that don't change
Ensure being on a reindex action before getting embedder_category_id

Fix document skip function
2025-06-30 09:52:06 +02:00
Mubelotix
340d9e6edc Optimize facet sort
5 to 10x speedup
2025-06-27 14:40:55 +02:00
Kerollmops
7219299436 Better handle task abortion 2025-06-27 12:33:32 +02:00
Kerollmops
657bbf5d1e Fix more tests 2025-06-27 10:14:26 +02:00
Mubelotix
28adbc0d18 Update tests 2025-06-27 09:47:46 +02:00
Mubelotix
e3fba62e13 Fix typo 2025-06-27 09:40:59 +02:00
Mubelotix
fb9170b8e3 Keep name consistent with others 2025-06-27 09:40:30 +02:00
Mubelotix
c15763f910 Improve key description
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-06-27 09:39:24 +02:00
Clément Renault
7fa1c41190 Fix some api key errors 2025-06-26 18:25:49 +02:00
ManyTheFish
77802dabf6 rename DocumentChangeContext into DocumentContext 2025-06-26 18:14:48 +02:00
ManyTheFish
a685eeafeb weird snapshot update 2025-06-26 18:14:48 +02:00
ManyTheFish
f16e6f7c37 Update snapshots 2025-06-26 18:14:48 +02:00
ManyTheFish
900be0ccad Extract or regenerate vectors related to settings changes 2025-06-26 18:14:48 +02:00
ManyTheFish
51a087b764 Write back user provided vectors from deleted embedders 2025-06-26 18:14:48 +02:00
ManyTheFish
31142b3663 Introduce extractor for setting changes 2025-06-26 18:14:48 +02:00
ManyTheFish
e60b855a54 Delete embedders from arroy 2025-06-26 18:14:48 +02:00
ManyTheFish
510a4b91be Introduce DatabaseDocument type 2025-06-26 18:14:48 +02:00
ManyTheFish
e704f4d1ec Reimplement reindexing shell 2025-06-26 18:14:48 +02:00
ManyTheFish
82fe80b360 Replace the legacy Settings::execute by the new one 2025-06-26 18:14:14 +02:00
Clément Renault
0f1dd3614c Update tasks tests 2025-06-26 18:11:12 +02:00
Louis Dureuil
3aa6c3c750 Merge pull request #5707 from Mubelotix/last_embedder_message
Add last embedder error in batches
2025-06-26 15:21:17 +00:00
Clément Renault
b956918c11 Fix clippy and more utoipa issues 2025-06-26 16:31:38 +02:00
Clément Renault
e3003c1609 Improve OpenAPI schema 2025-06-26 16:05:12 +02:00
Clément Renault
bf13268649 Better compute aggregates 2025-06-26 16:03:13 +02:00
Clément Renault
0bb7866f1e Remove the skip embeddings boolean in the settings 2025-06-26 15:48:21 +02:00
Clément Renault
e6e9a033aa Introduce new analytics to the export route 2025-06-26 15:45:24 +02:00
Kerollmops
63031219c5 Add the payload size to the parameters 2025-06-26 13:57:32 +02:00
Mubelotix
44d6430bae Rename fields 2025-06-26 12:30:08 +02:00
Mubelotix
4d26e9c6f2 Remove my comments 2025-06-26 12:21:34 +02:00
Mubelotix
2ff382c023 Remove useless clone 2025-06-26 12:15:09 +02:00
Mubelotix
0f6dd133b2 Turn to references 2025-06-26 12:15:09 +02:00
Mubelotix
29f6eeff8f Remove lots of Arcs 2025-06-26 12:15:08 +02:00
Mubelotix
ef007d547d Remove panics 2025-06-26 12:15:08 +02:00
Mubelotix
3fc16c627d Comment the delay 2025-06-26 12:15:08 +02:00
Mubelotix
9422b6d654 Update crates/meilisearch/src/lib.rs
Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>
2025-06-26 10:58:27 +02:00
Many the fish
ddba52414a Merge pull request #5702 from Nymuxyzo/fix/5688-reset-typo_tolerance-settings
Fix disableOnNumbers reset
2025-06-26 07:58:47 +00:00
Mubelotix
4534dc2cab Create another deserr error 2025-06-25 16:45:32 +02:00
Mubelotix
b05cb80803 Take sort criteria from the request 2025-06-25 16:41:08 +02:00
Mubelotix
6e0526090a Implement sorting documents 2025-06-25 15:36:12 +02:00
Kerollmops
a743da3061 Gzip-compress the content 2025-06-25 15:27:10 +02:00
Clément Renault
c6216517c7 Parallelize document upload 2025-06-25 15:27:10 +02:00
Clément Renault
2d4f7c635e Make tests happy 2025-06-25 15:27:10 +02:00
Clément Renault
ee812b31c4 Support JSON value as filters 2025-06-25 15:27:09 +02:00
Clément Renault
3329248a84 Support no pattern when exporting 2025-06-25 15:27:09 +02:00
Clément Renault
bc08cd0deb Make clippy happy again 2025-06-25 15:27:09 +02:00
Clément Renault
3e2f468213 Support task cancelation 2025-06-25 15:27:09 +02:00
Clément Renault
7c448bcc00 Make clippy happy 2025-06-25 15:27:09 +02:00
Clément Renault
acb7c0a449 Implement a retry strategy 2025-06-25 15:27:08 +02:00
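A generic sketch of such a retry strategy with exponential backoff; illustrative only, since the commit does not specify the exact policy used for the export uploads:

```
use std::thread::sleep;
use std::time::Duration;

// Illustrative exponential backoff: retry a fallible operation a few
// times, doubling the delay between attempts.
fn with_retries<T, E>(
    mut attempt: impl FnMut() -> Result<T, E>,
    max_attempts: u32,
) -> Result<T, E> {
    let mut delay = Duration::from_millis(100);
    let mut tries = 0;
    loop {
        match attempt() {
            Ok(value) => return Ok(value),
            Err(err) => {
                tries += 1;
                if tries >= max_attempts { return Err(err); }
                sleep(delay);
                delay *= 2; // back off before the next attempt
            }
        }
    }
}

fn main() {
    let mut calls = 0;
    let result: Result<&str, &str> = with_retries(
        || { calls += 1; if calls < 3 { Err("transient") } else { Ok("uploaded") } },
        5,
    );
    assert_eq!(result, Ok("uploaded"));
}
```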
Clément Renault
e8795d2608 Export embeddings 2025-06-25 15:26:47 +02:00
Clément Renault
e023ee4b6b Working first implementation 2025-06-25 15:26:47 +02:00
Clément Renault
e74c3b692a Introduce a new route to export documents and enqueue the export task 2025-06-25 15:26:46 +02:00
Mubelotix
1d3b18f774 Update test to be more reproducible 2025-06-25 14:58:21 +02:00
Clément Renault
00bc86e74b Merge pull request #5705 from meilisearch/fix-max-total-size-limit-env-var
Fix the environment variable name of the experimental limit batched tasks total size feature
2025-06-25 12:49:30 +00:00
Kerollmops
adc9976615 Simplify the analytics chat completions aggregator 2025-06-25 11:50:26 +02:00
Mubelotix
2090e9ea31 Update test 2025-06-25 10:08:25 +02:00
Mubelotix
1c8f1c18f4 Fix constant name and key description 2025-06-25 09:59:34 +02:00
Louis Dureuil
ae8c1461e1 Merge pull request #5708 from meilisearch/unsupport-gemini
Remove Gemini from the LLM-providers list
2025-06-25 06:44:37 +00:00
Nymuxyzo
5f62274f21 Add disableOnNumbers to settings reset 2025-06-24 23:32:50 +02:00
Mubelotix
c4a96b40eb Remove KeysGet from AllGet 2025-06-24 17:40:06 +02:00
Clément Renault
5f50fc9464 Add new analytics to the chat completions route 2025-06-24 17:05:49 +02:00
Clément Renault
89498a2bea Remove Gemini from the LLM-providers list 2025-06-24 15:58:39 +02:00
Clément Renault
211c1b753f Fix the env variable name 2025-06-24 15:27:39 +02:00
Mubelotix
d08e89ea3d Remove options 2025-06-24 15:10:15 +02:00
Mubelotix
695877043a Fix warnings 2025-06-24 14:53:39 +02:00
Mubelotix
bc4d1530ee Fix tests 2025-06-24 14:50:23 +02:00
Mubelotix
d7721fe607 Format 2025-06-24 12:20:22 +02:00
Mubelotix
4a179fb3c0 Improve code quality 2025-06-24 11:38:11 +02:00
Mubelotix
59a1c5d9a7 Make test more reproducible 2025-06-24 11:08:06 +02:00
Mubelotix
2f82d94502 Fix the test and simplify types 2025-06-23 18:55:23 +02:00
Tamo
bd2bd0f33b Merge pull request #5697 from martin-g/documents-use-server-wait-task
tests: Use Server::wait_task() instead of Index::wait_task() in documents::
2025-06-23 16:33:21 +00:00
Tamo
e02733df4a Merge pull request #5698 from martin-g/index-use-server-wait-task
tests: Use Server::wait_task() instead of Index::wait_task() in index::
2025-06-23 16:31:40 +00:00
Tamo
f373ecc96a Merge pull request #5699 from martin-g/settings-use-server-wait-task
tests: Use Server::wait_task() instead of Index::wait_task() in settings::
2025-06-23 16:30:49 +00:00
Tamo
748a327271 Merge pull request #5700 from martin-g/search-use-server-wait-task
tests: Use Server::wait_task() instead of Index::wait_task() in search::
2025-06-23 16:29:53 +00:00
Mubelotix
4925b30196 Move embedder stats out of progress 2025-06-23 15:24:14 +02:00
Louis Dureuil
43c4a229b7 Merge pull request #5692 from diksipav/5684-gemini-chat-completions-fix
Fix Gemini base_url when used with OpenAI clients
2025-06-23 09:03:34 +00:00
Martin Tzvetanov Grigorov
ca112a8b95 tests: Use Server::wait_task() instead of Index::wait_task() in index::
The code is mostly duplicated. Server::wait_task() has better handling for errors and more retries.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-22 14:59:29 +03:00
Martin Tzvetanov Grigorov
855fa555a3 tests: Use Server::wait_task() instead of Index::wait_task() in search::
The code is mostly duplicated. Server::wait_task() has better handling for errors and more retries.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-22 14:54:49 +03:00
Martin Tzvetanov Grigorov
a237c0797a tests: Use Server::wait_task() instead of Index::wait_task() in settings::
The code is mostly duplicated. Server::wait_task() has better handling for errors and more retries.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-22 14:32:45 +03:00
Martin Tzvetanov Grigorov
5c46dc702a tests: Use Server::wait_task() instead of Index::wait_task()
The code is mostly duplicated.
Server::wait_task() has better handling for errors and more retries.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-22 14:22:59 +03:00
Mubelotix
4cadc8113b Add embedder stats in batches 2025-06-20 12:42:22 +02:00
Mubelotix
2d6dc83940 Format the code 2025-06-19 15:55:12 +02:00
Mubelotix
ab768f379f Fix comment 2025-06-19 15:49:34 +02:00
Mubelotix
705e9a9e5e Make the uuids random again to prevent abuse using rainbow tables 2025-06-19 15:45:09 +02:00
Dijana Pavlovic
c17031d3de Fix Gemini base_url when used with OpenAI clients 2025-06-19 15:11:37 +02:00
Mubelotix
67f2a30d7c Fix test 2025-06-19 13:10:08 +02:00
Mubelotix
99732f4084 Fix some tests 2025-06-19 13:04:55 +02:00
Mubelotix
5081d837ea Fix AllGet action being included in All 2025-06-19 12:12:30 +02:00
Mubelotix
9e1cb792f4 Rename Action::AllRead to AllGet 2025-06-19 11:55:25 +02:00
Mubelotix
b6b7ede266 Rename Action *.read to *.get 2025-06-19 11:53:42 +02:00
Mubelotix
f50e586a4f Allow management key to read other keys 2025-06-19 11:52:58 +02:00
Mubelotix
11fedea788 Set static uuids to keys 2025-06-19 11:42:45 +02:00
Mubelotix
032b34c377 Add a default management key 2025-06-19 11:29:32 +02:00
Mubelotix
b421c8e7de Add an AllRead key 2025-06-19 11:29:16 +02:00
Mubelotix
00eb258a53 Fix comment 2025-06-19 11:16:07 +02:00
Tamo
fc6cc80705 Merge pull request #5689 from Mubelotix/main
Remove old dependencies
2025-06-19 08:11:55 +00:00
Mubelotix
138d20b277 Remove old dependencies 2025-06-18 16:46:20 +02:00
Louis Dureuil
7c1a9113f9 Merge pull request #5686 from meilisearch/upgrade-dependencies-again
Upgrade dependencies
2025-06-18 09:22:18 +00:00
Louis Dureuil
07ae297ffd Merge pull request #5681 from martin-g/faster-settings-prefix_search_settings-it-tests
tests: Faster settings::prefix_search_settings IT tests
2025-06-18 09:20:56 +00:00
Clément Renault
4069dbcfca Upgrade incompatible dependencies 2025-06-17 22:23:37 +02:00
Clément Renault
03eb50fbac Upgrade dependencies 2025-06-17 22:03:06 +02:00
Tamo
2616d776f2 Merge pull request #5677 from martin-g/faster-documents-errors-it-tests
tests: Faster document::errors IT tests
2025-06-17 15:53:35 +00:00
Tamo
3004db95af Merge pull request #5680 from martin-g/faster-similar-mod-it-tests
tests: Faster similar::mod IT tests
2025-06-17 15:51:38 +00:00
Tamo
9a729bf31d Merge pull request #5682 from martin-g/faster-documents-update_documents-it-tests
tests: Faster documents::update_documents IT tests
2025-06-17 14:36:09 +00:00
Martin Tzvetanov Grigorov
8bfa6a7f54 tests: Faster documents::update_documents IT tests
Use a shared server + unique index

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-16 23:48:59 +03:00
Martin Tzvetanov Grigorov
056f18bd02 tests: Faster settings::prefix_search_settings IT tests
Use shared server + unique indices

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-16 23:20:11 +03:00
Martin Tzvetanov Grigorov
fe9866aca8 tests: Faster similar::mod IT tests
Use shared server + unique indexes

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-16 22:51:07 +03:00
Martin Tzvetanov Grigorov
60f105a4a3 tests: Faster document::errors IT tests
* Add a call to .failed() for an awaited task
* Use Server::wait_task() instead of Index::wait_task() - it has better
  error checking

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-16 16:25:15 +03:00
Clément Renault
abb399b802 Merge pull request #5674 from meilisearch/release-v1.15.2
Bring back v1.15.2 to main
2025-06-16 11:36:07 +00:00
Tamo
aeaac7270e Merge pull request #5603 from martin-g/faster-search-multi-it-tests
tests: Faster search::multi IT tests
2025-06-16 09:43:24 +00:00
Tamo
f45770a3ce Merge pull request #5672 from martin-g/reuse-bench-data
docs: Recommend using a custom path for the benches' data
2025-06-16 09:35:57 +00:00
Martin Tzvetanov Grigorov
0e10ff1aa3 docs: Recommend using a custom path for the benches' data
This reduces the build time of the `benchmarks` crate from ~220secs to
45secs (according to `cargo build --timings`) on my dev machine

Additionally, I've introduced a parent folder for the Meili-related cache
paths: ~/.cache/meili

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-16 09:21:47 +03:00
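A sketch of the fallback described above; the environment variable name here is illustrative, not the one the benches actually document:

```
use std::env;
use std::path::PathBuf;

// Illustrative: resolve the bench data path from an env var, falling back
// to a shared parent folder for Meili-related caches (~/.cache/meili).
fn bench_data_path() -> PathBuf {
    env::var_os("MEILI_BENCH_DATA_PATH") // hypothetical variable name
        .map(PathBuf::from)
        .unwrap_or_else(|| {
            let home = env::var_os("HOME").map(PathBuf::from).unwrap_or_default();
            home.join(".cache").join("meili").join("bench-data")
        })
}

fn main() {
    println!("{}", bench_data_path().display());
}
```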
Martin Tzvetanov Grigorov
6ee608c2d1 Remove debug leftovers
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-14 15:45:04 +03:00
Martin Tzvetanov Grigorov
95e8a9bef1 Use a unique name for an index in a shared server
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-14 15:10:48 +03:00
Martin Tzvetanov Grigorov
0598320252 Try to debug the problem with the existing "test" index in a shared server
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-14 14:07:57 +03:00
Martin Tzvetanov Grigorov
2269104337 Use unique_index_with_prefix() instead of composing the index names manually with Uuid
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-14 13:35:03 +03:00
Clément Renault
6b4d69996c Merge pull request #5663 from meilisearch/update-version-v1.15.2
Update version for the next release (v1.15.2) in Cargo.toml
2025-06-12 16:41:47 +00:00
Clément Renault
df4e3c2e43 Fix the version everywhere 2025-06-12 16:57:59 +02:00
Clément Renault
e2b549c5ee Merge pull request #5668 from meilisearch/fix-must-regenerate
Various fixes to embedding regeneration
2025-06-12 14:48:38 +00:00
Clément Renault
8390006ebf Merge pull request #5665 from meilisearch/fix-chat-route
Fix chat route missing base URL and Mistral error handling
2025-06-12 14:11:39 +00:00
Louis Dureuil
7200437246 Comment the cases 2025-06-12 15:55:52 +02:00
Louis Dureuil
68e7bfb37f Don't fail if you cannot render previous version 2025-06-12 15:55:33 +02:00
Louis Dureuil
209c4bfc18 Switch the versions of the documents for rendering :/ 2025-06-12 15:47:47 +02:00
Louis Dureuil
396d76046d Regenerate embeddings more often:
- When `regenerate` was previously `false` and became `true`
- When rendering the old version of the docs failed
2025-06-12 15:41:53 +02:00
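Both conditions fit in one predicate; a minimal sketch, assuming a hypothetical helper name:

```
// Hypothetical helper mirroring the two cases listed in the commit message.
fn must_regenerate(was_regenerate: bool, is_regenerate: bool, old_render_failed: bool) -> bool {
    // 1. the flag flipped from false to true
    // 2. the old version of the document could not be rendered
    (!was_regenerate && is_regenerate) || old_render_failed
}

fn main() {
    assert!(must_regenerate(false, true, false));
    assert!(must_regenerate(true, true, true));
    assert!(!must_regenerate(true, true, false));
}
```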
Clément Renault
9ae73e3c05 Better support for Mistral errors 2025-06-12 15:18:37 +02:00
Many the fish
933e319364 Merge pull request #5660 from meilisearch/reproduce-5650
Searchable fields aren't indexed when I add and remove them out of filterableAttributes
2025-06-12 14:46:21 +02:00
Clément Renault
596617dd31 Make sure Mistral base url is well defined 2025-06-12 13:45:05 +02:00
ManyTheFish
f3dd6834c6 Update version for the next release (v1.15.2) in Cargo.toml 2025-06-12 10:51:09 +00:00
Martin Tzvetanov Grigorov
e8774ad079 Extract shared indices for movies and batman documents
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-12 13:46:17 +03:00
ManyTheFish
5d191c479e Skip indexing on settings update when possible.
When removing a field from the filterable settings,
this will trigger a reindexing of the negative version of the document,
which removes the document from the searchable database as well because the field was considered removed.
2025-06-12 12:37:27 +02:00
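The guard implied by that fix, as an illustrative predicate; the names are assumptions, not the real settings-diff API:

```
// Illustrative: only skip the reindex for a field removed from the
// filterable settings when that field is not needed elsewhere (e.g. it is
// still a user-defined searchable field).
fn can_skip_reindex(removed_from_filterable: bool, still_searchable: bool) -> bool {
    removed_from_filterable && !still_searchable
}

fn main() {
    assert!(can_skip_reindex(true, false));
    assert!(!can_skip_reindex(true, true)); // skipping here would drop the field
}
```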
Clément Renault
c3368e6859 Merge pull request #5659 from meilisearch/tmp-release-v1.15.1
Bring back v1.15.0 and v1.15.1 changes
2025-06-12 09:16:56 +00:00
ManyTheFish
40776ed4cd add test reproducing #5650 2025-06-12 11:09:31 +02:00
Clément Renault
9bda9a9a64 Merge remote-tracking branch 'origin/main' into tmp-release-v1.15.1 2025-06-12 10:21:07 +02:00
Many the fish
aefebdeb8b Merge pull request #5617 from workbackai/workback/patch/5594/FB6ED899-E821-4C88-AA79-8BB975E1937A
fix(milli/search): Cyrillic has different typo tolerance due to byte counting bug
2025-06-12 07:39:19 +00:00
Martin Tzvetanov Grigorov
646e44ddf9 Re-use the shared_index_with_score_documents since the settings are the same as the default
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-12 08:59:19 +03:00
Clément Renault
9275ce1503 Merge pull request #5655 from meilisearch/update-version-v1.15.1
Update version for the next release (v1.15.1) in Cargo.toml
2025-06-11 14:54:01 +00:00
Clément Renault
48d2d3a5cd Fix more tests 2025-06-11 14:53:34 +02:00
Louis Dureuil
7ec0c9aa83 Merge pull request #5556 from meilisearch/chat-route
Chat route
2025-06-11 12:09:30 +00:00
Clément Renault
484fdd9ce2 Fix the insta snapshots 2025-06-11 10:59:14 +02:00
Louis Dureuil
7533a11143 Make sure to send the tool response before the error message 2025-06-11 10:49:21 +02:00
Kerollmops
19d077a4b1 Update version for the next release (v1.15.1) in Cargo.toml 2025-06-11 08:35:24 +00:00
Martin Tzvetanov Grigorov
b8845d1015 Sort the imports
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-11 11:29:33 +03:00
Martin Tzvetanov Grigorov
620867d611 Use unique indices for the searches in non-existing indices
By using hardcoded names there is a chance that the index could exist

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-11 11:01:05 +03:00
Clément Renault
77cc3678b5 Make sure template errors are reported to the LLM and front-end without panicking 2025-06-11 09:27:14 +02:00
Martin Tzvetanov Grigorov
a73d3c03e9 Make the dynamic assertion for facetsByIndex JSON key broader
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-11 09:10:10 +03:00
Martin Tzvetanov Grigorov
824f5b12ce Formatting
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-11 08:54:58 +03:00
Martin Tzvetanov Grigorov
bb4baf7fae Remove useless dynamic redactions. They are covered by their .**.xyz counterparts
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-11 08:52:28 +03:00
Martin Tzvetanov Grigorov
0263eb0aec More assertion fixes
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-11 08:42:35 +03:00
Martin Tzvetanov Grigorov
8a916a4e42 More assertion fixes
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-11 07:54:04 +03:00
Kerollmops
506ee40dc5 Improve errors and other stuff 2025-06-10 17:52:35 +02:00
Kerollmops
952fabf8a0 Better document function names 2025-06-10 17:01:00 +02:00
Kerollmops
7ea2e4ec7b Better document why we duplicate structs 2025-06-10 16:51:39 +02:00
Kerollmops
a0a4ac66ec Better document the done streamed event 2025-06-10 16:48:28 +02:00
Kerollmops
b037e416d3 Make an unreachable case, unreachable 2025-06-10 16:43:20 +02:00
Kerollmops
e9d547556d Better error reporting when multi choices is used 2025-06-10 16:41:02 +02:00
Kerollmops
ab0eba2f72 Remove useless double check 2025-06-10 16:31:58 +02:00
Kerollmops
5ceb3c6a10 Report an error when the document template max bytes is zero 2025-06-10 16:27:18 +02:00
Kerollmops
34d572e3e5 Remove useless commented code 2025-06-10 16:17:41 +02:00
Kerollmops
28e6adc435 Remove the SearchQuery Default impl and change the From impl 2025-06-10 16:16:11 +02:00
Martin Tzvetanov Grigorov
6a683975bf More fixes of the tests
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 16:58:48 +03:00
Kerollmops
c60d11fb42 Clean up the prompts 2025-06-10 14:56:13 +02:00
Kerollmops
32207f9f19 Rename the error code about ranking score threshold 2025-06-10 14:07:53 +02:00
Kerollmops
7c1b15fd06 Remove useless liquid dependency for Meilisearch 2025-06-10 14:05:35 +02:00
Kerollmops
4352a924d7 Remove useless filters parameter 2025-06-10 14:05:02 +02:00
Kerollmops
bbe802c656 Remove the write txn method from the index scheduler 2025-06-10 14:03:05 +02:00
Kerollmops
b32e30ad27 Make the chat setting db name a const 2025-06-10 14:02:43 +02:00
Kerollmops
ae115cee78 Make clippy happy 2025-06-10 13:51:04 +02:00
Martin Tzvetanov Grigorov
1824fbd1b5 Introduce Index::unique_index_with_prefix(&str)
It could be used when we want to see the index name in the assertions,
e.g. `movies-[uuid]`

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 14:49:18 +03:00
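A hypothetical sketch of what such a helper boils down to (the real one lives in Meilisearch's test-support code and returns an `Index` handle; the `uuid` crate with the `v4` feature is assumed):

```rust
use uuid::Uuid;

/// Build a readable-but-unique index uid, e.g. `movies-1fd3b9a2-…`.
fn unique_index_uid_with_prefix(prefix: &str) -> String {
    format!("{prefix}-{}", Uuid::new_v4())
}
```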
Martin Tzvetanov Grigorov
34d8a54c4b Fix typos in comments and update assertions
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 14:48:59 +03:00
Martin Tzvetanov Grigorov
9e31d6ceff Add batch_uid to all successful and failed tasks too
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 14:12:48 +03:00
Martin Tzvetanov Grigorov
139ec8c782 Add task.batch_uid() helper method
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 14:12:48 +03:00
Martin Tzvetanov Grigorov
2691999bd3 Add a helper method for getting the latest batch
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 14:12:47 +03:00
Martin Tzvetanov Grigorov
48460678df More assertion fixes
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 14:12:47 +03:00
Martin Tzvetanov Grigorov
cb15e5c67e WIP: More snapshot updates
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 14:12:46 +03:00
Martin Tzvetanov Grigorov
7380808b26 tests: Faster batches:: IT tests
Use shared server + unique indices where possible

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 14:12:46 +03:00
Martin Tzvetanov Grigorov
8fa6e8670a tests: Faster search::multi IT tests
Use shared server + unique indices where possible

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-10 14:10:43 +03:00
Kerollmops
c640856cc1 Improve code comments 2025-06-10 11:13:32 +02:00
Kerollmops
1a1317ab0f Make clippy happy 2025-06-10 11:12:27 +02:00
Kerollmops
9cab754942 Update insta snapshots 2025-06-10 11:11:34 +02:00
Kerollmops
4a0ec15ad2 Make cargo fmt happy 2025-06-10 11:00:14 +02:00
Kerollmops
985b892b7a Add a basic chat setting validation 2025-06-10 10:57:43 +02:00
Kerollmops
605dea4f85 Do not leak the chat "workspace" term 2025-06-10 10:34:30 +02:00
Kerollmops
95d4775d4a Remove the preQuery chat setting 2025-06-10 10:32:58 +02:00
Kerollmops
416fcf47f1 Use the same units 2025-06-10 10:28:06 +02:00
Kerollmops
6433e49882 Remove useless code 2025-06-10 10:27:22 +02:00
Kerollmops
85939ae8ad Add support for missing sources 2025-06-10 10:25:22 +02:00
Kerollmops
e654eddf56 Improve the chat workspace REST endpoints 2025-06-10 10:21:34 +02:00
Tamo
170ad87e44 Merge pull request #5622 from martin-g/faster-search-filters-it-tests
tests: Faster search::filters IT tests
2025-06-10 08:17:52 +00:00
Kerollmops
bc56087a17 Fix the chatCompletions key 2025-06-10 10:08:01 +02:00
Kerollmops
29d82ade56 Rename base_api into base_url 2025-06-10 09:24:07 +02:00
Kerollmops
a7f5d3bb7a Redact the API Key when patching chat workspace settings 2025-06-10 09:21:45 +02:00
Kerollmops
48e8356a16 Mark the non-streaming chat completions route unimplemented 2025-06-10 09:18:36 +02:00
Clément Renault
1fda05c2fd Delete chat.rs 2025-06-09 15:26:13 +02:00
Martin Grigorov
8f96724adf Set max_attempts to 400 for Server::wait_task()
Co-authored-by: Tamo <irevoire@protonmail.ch>
2025-06-09 14:03:49 +03:00
Tamo
01e5b0effa Merge pull request #5611 from martin-g/faster-stats-mod-it-tests
tests: Faster stats::mod IT tests
2025-06-09 11:02:12 +00:00
Martin Tzvetanov Grigorov
2ec9664878 chore: Fix English grammar in SearchQueue's comments
No functional changes!

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-09 12:05:36 +02:00
Clément Renault
7f5a0c0013 Merge pull request #5646 from meilisearch/revert-5635-prompt-for-email 2025-06-09 12:03:11 +02:00
Clément Renault
f5c3dad3ed Revert "Prompt for Email" 2025-06-09 10:47:21 +02:00
Martin Tzvetanov Grigorov
10028515ac Use a unique server for the summarized dump creation test
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:52:05 +03:00
Martin Tzvetanov Grigorov
63ccd19ab1 Use Server::wait_task() instead of Index::wait_task() for tasks IT tests
Revert the debugging helper that dumped the thread stack traces.
Try with 400 max attempts for the task success/failure (200 secs)

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:16:50 +03:00
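A hypothetical shape of such a polling helper (not the actual `Server::wait_task()`): 400 attempts with a 500 ms pause between polls gives the quoted 200-second budget.

```rust
use std::time::Duration;

fn wait_task(
    mut poll_status: impl FnMut() -> String, // e.g. a closure hitting GET /tasks/{uid}
    max_attempts: usize,
) -> Result<String, String> {
    for _ in 0..max_attempts {
        let status = poll_status();
        if status == "succeeded" || status == "failed" {
            return Ok(status);
        }
        std::thread::sleep(Duration::from_millis(500));
    }
    Err(format!("task still not finished after {max_attempts} attempts"))
}
```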
Martin Tzvetanov Grigorov
1b4d344e18 Increase the wait time in the tests
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:13:32 +03:00
Martin Tzvetanov Grigorov
89c0cf9b12 temporary: Dump the threads' stack traces when .wait_task() times out
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:13:32 +03:00
Martin Tzvetanov Grigorov
3770e70581 Optimize the imports
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:13:31 +03:00
Martin Tzvetanov Grigorov
e497008161 Add cattos to the shared_index_with_nested_documents() as a filterable attribute
This allows making some more search::filters IT tests use a shared
server + unique/shared indices

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:13:31 +03:00
Martin Tzvetanov Grigorov
a15ebb283f Remove unused import
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:13:30 +03:00
Martin Tzvetanov Grigorov
3f256a7959 Use the shared index with DOCUMENTS where possible
Remove useless assertion that is covered by the earlier call of
.succeeded()

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:13:30 +03:00
Martin Tzvetanov Grigorov
b41af0d0f6 Formatting
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:13:30 +03:00
Martin Tzvetanov Grigorov
3ebff65ef3 tests: Faster search::filters IT tests
Use shared server + unique indices

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-06 14:13:29 +03:00
Clément Renault
717a026fdd Make sure to use the system prompt 2025-06-06 12:32:40 +02:00
Clément Renault
70670c3be4 Introduce the support of Azure, Gemini, vLLM 2025-06-06 12:08:37 +02:00
Clément Renault
62e2a5a324 Merge pull request #5635 from meilisearch/prompt-for-email
Prompt for Email
2025-06-05 19:18:23 +00:00
Clément Renault
90d96ee415 Make clippy happy 2025-06-05 18:21:55 +02:00
Clément Renault
38b317857d Improve the wording again 2025-06-05 18:19:19 +02:00
Tamo
765e76857f store the email file in the global config directory instead of the local data.ms so it's shared between all instances 2025-06-05 16:01:30 +02:00
Clément Renault
204cf423b2 Fix Docker Image 2025-06-05 15:02:09 +02:00
Clément Renault
e575b5af74 Improve the contact email flag to make it friendly to disable prompt 2025-06-05 14:49:08 +02:00
Clément Renault
4fc24cb691 Improve prompting again 2025-06-05 14:45:05 +02:00
Clément Renault
8bc8484e95 Skip the prompt when the email was once provided 2025-06-05 14:43:09 +02:00
Clément Renault
7b49c30d8c Change the email prompting 2025-06-05 12:02:30 +02:00
Clément Renault
239851046d Send requests to Hubspot 2025-06-05 12:00:23 +02:00
Clément Renault
60796dfb14 Disable it by default in our Docker image 2025-06-05 11:02:30 +02:00
Clément Renault
c7cb72a77a Make sure we skip empty prompted emails 2025-06-05 10:59:06 +02:00
Clément Renault
4d819ea636 Initial working version for a prompt for email 2025-06-05 10:54:46 +02:00
Clément Renault
4dfb89168b Add a test for the chat route 2025-06-04 15:41:33 +02:00
Clément Renault
258e6a115b Fix some other tests 2025-06-04 15:29:55 +02:00
arthurgousset
666680bd87 test(meilisearch/search/locales.rs): updates snapshot
Used `cargo insta test`
Reviewed with `cargo insta review`
2025-06-04 14:18:20 +01:00
arthurgousset
27527849bb test(meilisearch/search/locales.rs): updates snapshot
Used `cargo insta test`
Reviewed with `cargo insta review`
2025-06-04 14:17:10 +01:00
Clément Renault
cf2bc03bed Fix the API key issue by reordering the default keys 2025-06-04 14:50:20 +02:00
Tamo
1d02efeab9 Merge pull request #5615 from martin-g/faster-tasks-mod-it-tests
tests: Faster tasks::mod IT tests
2025-06-04 12:38:39 +00:00
Tamo
53fc98d3b0 Merge pull request #5632 from martin-g/db-change-label
ci: Use `GITHUB_TOKEN` secret for the `db change check` workflow
2025-06-04 12:23:01 +00:00
arthurgousset
263300b3a3 style(milli): linting 2025-06-04 12:19:00 +01:00
arthurgousset
ab3d92d163 chore(parse_query): delete println and move test inside tests module 2025-06-04 12:19:00 +01:00
arthurgousset
ef9fc6c854 fix(parse_query): cyrillic bug 2025-06-04 12:19:00 +01:00
Martin Tzvetanov Grigorov
61b0f50d4d Trigger build
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-04 13:37:42 +03:00
Martin Tzvetanov Grigorov
0557a4dd2f Trigger build
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-04 13:08:13 +03:00
Martin Tzvetanov Grigorov
930d5a09a8 Use unique server + its own index for #stats() test
Using a shared server will make this test fragile

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-04 13:08:13 +03:00
Martin Tzvetanov Grigorov
8b0c4291ae tests: Faster stats::mod IT tests
Use shared server + unique indices

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-04 13:08:13 +03:00
Martin Tzvetanov Grigorov
c9efdf8c88 Render details.dumpUid as [dump_uid] in Value's Display
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-04 13:00:47 +03:00
Many the fish
72736c0ea9 Merge pull request #5627 from meilisearch/skip_remote_test
ignore flaky test
2025-06-04 08:28:24 +00:00
Kerollmops
92d0d36ff6 Fix a bunch of snapshot tests 2025-06-04 10:25:35 +02:00
Kerollmops
352ac759b5 Update dependencies 2025-06-04 09:35:43 +02:00
Clément Renault
28dc7b836b Fix the chat completions feature gate 2025-06-03 17:10:53 +02:00
Clément Renault
c4e1407e77 Fix the chat, chats, and chatsSettings actions 2025-06-03 16:11:54 +02:00
Tamo
49317bbee4 Merge pull request #5625 from martin-g/faster-search-hybrid-it-tests
tests: Faster search::hybrid IT tests
2025-06-03 13:54:38 +00:00
Clément Renault
82313a4444 Cargo fmt 2025-06-03 15:39:26 +02:00
Clément Renault
8fdcdee0cc Do a first clippy pass 2025-06-03 15:39:26 +02:00
Clément Renault
3c218cc3a0 Update the default chat completions prompt
Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
2025-06-03 15:39:26 +02:00
Clément Renault
7d574433b6 Clean up chat completions modules a bit 2025-06-03 15:39:26 +02:00
Clément Renault
201a808fe2 Better report errors happening with the underlying LLM 2025-06-03 15:39:26 +02:00
Kerollmops
f827c2442c Mark tool calls to be implemented later for non-streaming 2025-06-03 15:36:35 +02:00
Kerollmops
87d2e213f3 Update chat keys 2025-06-03 15:36:35 +02:00
Kerollmops
3b931e75d9 Make the chats settings and chat completions route experimental 2025-06-03 15:36:35 +02:00
Clément Renault
ae135d1d46 Implement a first version of a streamed chat API 2025-06-03 15:36:35 +02:00
Clément Renault
0efb72fe66 Introduce the first version of the /chat route that mimics the OpenAI API 2025-06-03 15:36:35 +02:00
ManyTheFish
bed442528f Update charabia v0.9.4 2025-06-03 15:31:28 +02:00
Kerollmops
496685fa26 Implement deserr on ChatCompletions settings structs 2025-06-03 15:31:28 +02:00
Kerollmops
02cbcea3db Better chat completions settings management 2025-06-03 15:31:28 +02:00
Kerollmops
0f7f5fa104 Introduce listing/getting/deleting/updating chat workspace settings 2025-06-03 15:31:28 +02:00
Kerollmops
50fafbbc8b Implement useful conversion strategies and clean up the code 2025-06-03 15:31:28 +02:00
Clément Renault
2821163b95 Clean up the code a bit 2025-06-03 15:31:27 +02:00
Clément Renault
2da64e835e Factorize the code a bit more and support reporting errors 2025-06-03 15:31:27 +02:00
Clément Renault
420c6e1932 Report the sources 2025-06-03 15:31:27 +02:00
Kerollmops
2a067d3327 Fix compilation error in test 2025-06-03 15:31:27 +02:00
Clément Renault
564cad1163 Call specific tools to show progression and results. 2025-06-03 15:31:27 +02:00
Clément Renault
33dfd422db Introduce a lot of search parameters and make Deserr happy 2025-06-03 15:31:27 +02:00
Clément Renault
036a9d5dbc Expose a well defined set of sources 2025-06-03 15:31:26 +02:00
Clément Renault
7b74810b03 Add the index descriptions to the function description 2025-06-03 15:31:26 +02:00
Clément Renault
3e53527bff redact the chat settings API key 2025-06-03 15:31:26 +02:00
Clément Renault
7929872091 Better chat settings management 2025-06-03 15:31:26 +02:00
Clément Renault
afb43d266e Correctly list the chat settings key actions 2025-06-03 15:31:26 +02:00
Clément Renault
05828ff2c7 Always use the frequency matching strategy 2025-06-03 15:31:26 +02:00
Clément Renault
75c3f33478 Correctly support document templates on the chat API 2025-06-03 15:31:25 +02:00
Clément Renault
c6930c8819 Introduce the new index chat settings 2025-06-03 15:31:25 +02:00
Clément Renault
439146289e Make sure erroneous calls are handled and forwarded to the LLM 2025-06-03 15:31:25 +02:00
Clément Renault
6bf214bb14 Catch invalid argument calls to search function 2025-06-03 15:31:25 +02:00
Clément Renault
fcf694026d Support multiple indexes and not only main 2025-06-03 15:31:25 +02:00
Clément Renault
0b675bd530 Limit the number of internal loop calls and change the function name 2025-06-03 15:31:25 +02:00
Clément Renault
7636365a65 Correctly support tenant tokens and filters 2025-06-03 15:31:24 +02:00
Clément Renault
46680585ae Stream errors 2025-06-03 15:31:24 +02:00
Clément Renault
bcec8d8984 Stop the stream when the connection stops and change the events 2025-06-03 15:31:24 +02:00
Clément Renault
56c1bd3afe Generate a new default chat API key 2025-06-03 15:31:24 +02:00
Clément Renault
1a84f00fbf Change the /chat route to /chat/completions to be OpenAI-compatible 2025-06-03 15:31:24 +02:00
Clément Renault
39320a6fce Better stop the stream 2025-06-03 15:31:24 +02:00
Clément Renault
1d2dbcb51f Update the streaming detection to work with Mistral 2025-06-03 15:31:23 +02:00
Clément Renault
341183cd57 Make it compatible with the Mistral API 2025-06-03 15:31:23 +02:00
Clément Renault
b9716ec346 Support base_api in the settings 2025-06-03 15:31:03 +02:00
Clément Renault
564f85280c Make clippy happy 2025-06-03 15:31:03 +02:00
Clément Renault
7fa74b4931 Display pre-query prompt in search tool response 2025-06-03 15:31:03 +02:00
Clément Renault
7d8415448c Commit when putting stuff in LMDB 2025-06-03 15:31:03 +02:00
Clément Renault
c7839b5a84 Remove useless function 2025-06-03 15:31:03 +02:00
Clément Renault
a52b513023 Expose new chat settings routes 2025-06-03 15:31:02 +02:00
Clément Renault
77e03e3f8c Factorise a bit the code 2025-06-03 15:31:02 +02:00
Clément Renault
148816a3da Display the different tool calls we need to do 2025-06-03 15:31:02 +02:00
Clément Renault
511eef87bf Send an event with the content of the tool calling 2025-06-03 15:31:02 +02:00
Clément Renault
aef8448fc6 Streaming supports tool calling 2025-06-03 15:31:02 +02:00
Clément Renault
5fab2aee51 Nearly support tools on the streaming route 2025-06-03 15:31:02 +02:00
Clément Renault
1235523918 Return the right message format 2025-06-03 15:31:01 +02:00
Clément Renault
d4a16f2349 Aggregate tool calls and display the calls to make. 2025-06-03 15:31:01 +02:00
Clément Renault
0f05c0eb6f Implement a first version of a streamed chat API 2025-06-03 15:31:01 +02:00
Clément Renault
2cd85c732a Make it work by retrieving content from the index 2025-06-03 15:30:48 +02:00
Clément Renault
82fa70da83 Support overwritten prompts of the search query 2025-06-03 15:30:48 +02:00
Clément Renault
951be67060 Support querying the index named main 2025-06-03 15:30:48 +02:00
Clément Renault
5400f3941a Introduce the first version of the /chat route that mimics the OpenAI API 2025-06-03 15:30:48 +02:00
Martin Tzvetanov Grigorov
af54c8381e Use ${{ github.repository }} instead of hardcoding the repo/owner
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 15:46:16 +03:00
Martin Tzvetanov Grigorov
693fcd5752 Try with GITHUB_TOKEN
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 15:40:40 +03:00
Martin Tzvetanov Grigorov
733175359a Update the new test case to use the new signature of index_with_documents_user_provided()
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 15:29:45 +03:00
Martin Tzvetanov Grigorov
7c6162f0bf Fix clippy error
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 15:26:21 +03:00
Martin Tzvetanov Grigorov
d6ae39bf0f tests: Faster search::hybrid IT tests
Use shared server + unique indices

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 15:26:21 +03:00
Tamo
e416bbc1de Merge pull request #5623 from martin-g/faster-search-geo-it-tests
tests: Faster search::geo IT tests
2025-06-03 12:25:48 +00:00
Many the fish
5d0d12dfbd Merge pull request #5630 from meilisearch/fix-test_meilisearch_1714
Adapt tests to the Chinese word segmenter changes
2025-06-03 12:20:08 +00:00
Tamo
2cfd363dc6 Merge pull request #5619 from martin-g/faster-documents-delete_documents-it-tests
tests: Faster documents::delete_documents IT tests
2025-06-03 12:06:07 +00:00
Martin Tzvetanov Grigorov
70aa78a2c2 Remove unused import
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 14:04:15 +03:00
Martin Grigorov
96c81762ed Apply suggestions from code review
Do not use redactions for the snapshot assertions

Co-authored-by: Tamo <irevoire@protonmail.ch>
2025-06-03 14:00:38 +03:00
Martin Tzvetanov Grigorov
0b1f634afa Remove useless code
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 13:52:55 +03:00
Martin Tzvetanov Grigorov
d3d5015854 Use the cancelled task uid
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 13:50:04 +03:00
Martin Grigorov
f95f29c492 Use unique server+index for list_tasks_type_filtered() test case
Co-authored-by: Tamo <irevoire@protonmail.ch>
2025-06-03 13:45:46 +03:00
Martin Grigorov
a50b69b868 Use unique server+index for list_tasks_status_filtered() test case
Co-authored-by: Tamo <irevoire@protonmail.ch>
2025-06-03 13:45:17 +03:00
Martin Grigorov
3668f5f021 Use unique server+index for list_tasks() test case
Co-authored-by: Tamo <irevoire@protonmail.ch>
2025-06-03 13:44:38 +03:00
Martin Tzvetanov Grigorov
54fdf379bb Use shared_does_not_exists_index() index for delete_one_document_unexisting_index() test case
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 13:41:13 +03:00
Martin Tzvetanov Grigorov
41b1cd5a73 Extract GEO_DOCUMENTS static variable and shared index with these docs
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 13:08:12 +03:00
Tamo
5c14a25d5a Merge pull request #5624 from martin-g/faster-documents-get_documents-it-tests
tests: Faster documents::get_documents IT tests
2025-06-03 09:37:07 +00:00
Tamo
fda2843135 Merge pull request #5621 from martin-g/faster-similar-errors-it-tests
tests: Faster similar::errors IT tests
2025-06-03 09:27:27 +00:00
Tamo
9347330f3a Merge pull request #5620 from martin-g/faster-search-distinct-it-tests
tests: Faster search::distinct IT tests
2025-06-03 09:24:39 +00:00
Tamo
56c9190dab Merge pull request #5618 from martin-g/faster-vector-binary_quantized-it-tests
tests: Faster vector::binary_quantized IT tests
2025-06-03 09:20:08 +00:00
Tamo
6b986dceaf Merge pull request #5607 from martin-g/faster-settings-get_settings-it-tests
tests: Faster settings::get_settings IT tests
2025-06-03 08:53:17 +00:00
ManyTheFish
cb7bb36080 update charabia v0.9.6 2025-06-03 10:48:41 +02:00
ManyTheFish
161cb736ea Adapt tests to the Chinese word segmenter changes
The new Chinese segmenter splits words into smaller parts.
The word `小化妆包` was previously segmented as `小 / 化妆包` and is now segmented as `小 / 化妆 / 包`,
which changes the test results.
2025-06-03 10:37:29 +02:00
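The change is easy to observe with charabia's `Segment` trait; a small sketch (the exact output depends on the charabia version, as the message above explains):

```rust
use charabia::Segment;

fn main() {
    let segments: Vec<&str> = "小化妆包".segment_str().collect();
    // charabia v0.9.6 yields ["小", "化妆", "包"],
    // where earlier versions yielded ["小", "化妆包"].
    println!("{segments:?}");
}
```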
Many the fish
ea6bb4df1d Merge pull request #5614 from meilisearch/fix-hybrid-distinct
Fix distinct for hybrid search
2025-06-03 07:20:55 +00:00
Martin Tzvetanov Grigorov
a3d2f64725 tests: Faster search::distinct IT tests
Use shared server + unique indices

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-03 08:23:26 +03:00
Many the fish
d5526cffff Merge pull request #5527 from nnethercott/all-cpus-in-import-dump
Use all CPUs during an import dump
2025-06-02 15:24:59 +00:00
Louis Dureuil
5cb75d1f2a ignore flaky test 2025-06-02 17:06:53 +02:00
Martin Tzvetanov Grigorov
921e3c4ffe tests: Faster documents::get_documents IT tests
Use shared server + unique index

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 15:36:08 +03:00
Martin Tzvetanov Grigorov
52591761af tests: Faster search::geo IT tests
Use shared server + unique indices

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 15:32:32 +03:00
Martin Tzvetanov Grigorov
f80182f0a9 tests: Faster similar::errors IT tests
Use shared server + unique indices

Related to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 15:20:17 +03:00
Martin Tzvetanov Grigorov
3b30b6a57a tests: Faster documents::delete_documents IT tests
Use shared server + unique indices
Assert .succeeded()/.failed() for the waited tasks

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 15:04:48 +03:00
Martin Tzvetanov Grigorov
5efc78db55 tests: Faster vector::binary_quantized IT tests
Use shared server + unique indices where possible

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 14:47:18 +03:00
Martin Tzvetanov Grigorov
cffbe3fcb6 Trigger build
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 14:17:19 +03:00
Martin Tzvetanov Grigorov
8d8fcb9846 Revert to unique server + named index for some tests
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 11:44:21 +03:00
Many the fish
20049669c9 Merge pull request #5600 from martin-g/faster-search-facet_search-it-tests
tests: Faster search::facet_search IT tests
2025-06-02 08:39:30 +00:00
Martin Tzvetanov Grigorov
db28d13cb1 Remove useless assertion.
.succeeded() does the same

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 10:59:46 +03:00
Martin Tzvetanov Grigorov
5a7cfc57fd tests: Faster tasks::mod IT tests
Use shared server + unique indices

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 10:56:43 +03:00
Martin Grigorov
790621dc29 Remove useless assert
Co-authored-by: Many the fish <many@meilisearch.com>
2025-06-02 10:55:28 +03:00
Many the fish
1d577ae98b Merge pull request #5610 from martin-g/faster-settings-tokenizer_customization-it-tests
tests: Faster settings::tokenizer_customization IT tests
2025-06-02 07:09:41 +00:00
Many the fish
88e9a55d44 Merge pull request #5609 from martin-g/faster-settings-proximity_settings-it-tests
tests: Faster settings::proximity_settings IT tests
2025-06-02 07:09:06 +00:00
Many the fish
dbe551cf99 Merge pull request #5606 from martin-g/faster-settings-distinct-it-tests
tests: Faster settings::distinct IT tests
2025-06-02 07:07:23 +00:00
Many the fish
a299fbd33b Merge pull request #5605 from martin-g/faster-search-restricted_searchable-it-tests
tests: Faster search::restricted_searchable IT tests
2025-06-02 07:06:50 +00:00
Many the fish
193119acb9 Merge pull request #5604 from martin-g/search-pagination-it-tests
tests: search::pagination IT tests
2025-06-02 07:05:52 +00:00
Many the fish
4c71118699 Merge pull request #5602 from martin-g/faster-search-matching_strategy-it-tests
tests: Faster search::matching_strategy IT tests
2025-06-02 07:04:43 +00:00
Many the fish
5fe2943d3c Merge pull request #5601 from martin-g/faster-search-locales-it-tests
tests: Faster search::locales IT tests
2025-06-02 07:02:28 +00:00
Many the fish
86ff502327 Merge pull request #5599 from martin-g/faster-index-search-errors-tests
tests: Faster search::errors IT tests
2025-06-02 06:54:32 +00:00
Martin Tzvetanov Grigorov
6b1a345dce tests: Faster settings::tokenizer_customization IT tests
Use shared server + unique indices

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 08:23:09 +03:00
Martin Tzvetanov Grigorov
b54ece690b tests: Faster settings::proximity_settings IT tests
Use shared server + unique indices

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-06-02 08:20:05 +03:00
Martin Tzvetanov Grigorov
3ea167bade tests: Faster settings::get_settings IT tests
Use shared server + unique indices

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-30 16:33:27 +03:00
Martin Tzvetanov Grigorov
1158d6689f tests: Faster settings::distinct IT tests
Use shared server + unique indices

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-30 15:41:31 +03:00
Martin Tzvetanov Grigorov
d9b0463a0b tests: Faster search::restricted_searchable IT tests
Use shared server + unique indices

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-30 15:37:27 +03:00
Martin Tzvetanov Grigorov
ae9899f179 tests: search::pagination IT tests
Minor cleanup.

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-30 15:26:55 +03:00
Martin Tzvetanov Grigorov
308fd7128e Fix clippy errors
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-29 11:36:56 +03:00
Martin Tzvetanov Grigorov
27e7c00622 Add dynamic redactions for taskUid and enqueuedAt properties
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-29 11:33:10 +03:00
Martin Tzvetanov Grigorov
58207da934 Trigger build
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-29 10:56:33 +03:00
Martin Tzvetanov Grigorov
fb8b832192 Trigger build
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-29 10:54:31 +03:00
Martin Tzvetanov Grigorov
17207b5405 tests: Faster search::matching_strategy IT tests
Use shared server + unique indices for all tests

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-29 09:09:02 +03:00
Martin Tzvetanov Grigorov
bd95503eba tests: Faster search::locales IT tests
Use a shared server + unique indices where possible

Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-29 09:03:23 +03:00
Martin Tzvetanov Grigorov
8b8b0d802c tests: Faster search::facet_search IT tests
Use shared server + unique indices where possible.
Assert .succeeded() for the waited tasks.
Drop usage of dbg!() in the assertions. It caused noise in the logs

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-29 08:53:10 +03:00
Martin Tzvetanov Grigorov
d329e86250 tests: Use shared server + unique indices where possible
Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-29 08:42:10 +03:00
Many the fish
d416b3b390 Merge pull request #5592 from nnethercott/extract-geo-facets-seperately
Decouple geo facet extraction from rest of document
2025-05-28 16:22:10 +00:00
Louis Dureuil
54f5e74744 Support distinct in hybrid search 2025-05-28 17:58:58 +02:00
Louis Dureuil
fd4b192a39 Add distinct_fid function and expose distinct_single_docid 2025-05-28 17:58:58 +02:00
Louis Dureuil
3c13feebf7 Test that distinct is applied for hybrid search 2025-05-28 17:58:58 +02:00
nnethercott
1811168b96 remove duplicated check on geo field changes 2025-05-28 15:45:13 +02:00
Nate Nethercott
b06cc1e0a2 Update crates/milli/src/update/new/extract/faceted/extract_facets.rs
Co-authored-by: Many the fish <many@meilisearch.com>
2025-05-28 15:38:23 +02:00
Nate Nethercott
44f812c36d Update crates/milli/src/update/new/extract/faceted/extract_facets.rs
Co-authored-by: Many the fish <many@meilisearch.com>
2025-05-28 15:38:12 +02:00
Tamo
c8e77b5f25 Merge pull request #5574 from martin-g/faster-add_documents-it-tests
perf: Faster integration tests for add_documents.rs
2025-05-28 13:13:38 +00:00
Tamo
283f516e15 Merge pull request #5579 from martin-g/faster-index-update_index-it-tests
perf: Faster index::update_index IT tests
2025-05-28 13:11:56 +00:00
Martin Tzvetanov Grigorov
b4ca0a8c98 Update the tests related to updating indices
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 15:02:41 +03:00
Martin Tzvetanov Grigorov
b658e38acd Fix formatting
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 15:02:41 +03:00
Martin Tzvetanov Grigorov
f87e46cc16 Ignore the result from #wait_task()
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 15:02:41 +03:00
Martin Grigorov
65354b414a Update crates/meilisearch/tests/index/update_index.rs
Co-authored-by: Tamo <irevoire@protonmail.ch>
2025-05-28 15:02:40 +03:00
Martin Grigorov
025df397c0 Update crates/meilisearch/tests/index/update_index.rs
Co-authored-by: Tamo <irevoire@protonmail.ch>
2025-05-28 15:02:40 +03:00
Martin Grigorov
f77abc9dc8 Update crates/meilisearch/tests/index/update_index.rs
Co-authored-by: Tamo <irevoire@protonmail.ch>
2025-05-28 15:02:40 +03:00
Martin Tzvetanov Grigorov
7e9909ee45 perf: Faster index::update_index IT tests
Use a shared server where possible.
Assert succeeded/failed task waits.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 15:02:40 +03:00
Martin Tzvetanov Grigorov
43ec97fe45 format the code
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 15:01:04 +03:00
Martin Tzvetanov Grigorov
02929e241b Update the status code
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 14:36:13 +03:00
Martin Tzvetanov Grigorov
c13efde042 uuid is a production dependency of meili-snap
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 14:35:50 +03:00
Martin Grigorov
36f0a1492c Apply suggestions from code review
Co-authored-by: Tamo <irevoire@protonmail.ch>
2025-05-28 14:22:04 +03:00
Martin Tzvetanov Grigorov
ce65ad213b Add dynamic redactions for uid, batchUid and taskUid
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 14:22:04 +03:00
Martin Tzvetanov Grigorov
3e0de6cb83 Wait for the batched tasks by their real uid.
Some of them succeed, others fail.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 14:22:04 +03:00
Martin Tzvetanov Grigorov
f3d691667d Use a Regex in insta dynamic redaction to replace Uuids with [uuid]
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 14:22:01 +03:00
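A sketch of the technique, assuming the `regex` crate and insta's `redactions` feature; the `.indexUid` selector is illustrative, not necessarily the path used in the test suite:

```rust
use insta::Settings;
use regex::Regex;

fn uuid_redaction_settings() -> Settings {
    let uuid_re =
        Regex::new(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}").unwrap();
    let mut settings = Settings::clone_current();
    // Replace every UUID found in the selected field with "[uuid]"
    // so snapshots stay stable across test runs.
    settings.add_dynamic_redaction(".indexUid", move |value, _path| {
        // Compute the redacted form first so the borrow of `value` ends
        // before we may hand `value` back unchanged.
        let redacted = value
            .as_str()
            .map(|s| uuid_re.replace_all(s, "[uuid]").into_owned());
        match redacted {
            Some(s) => s.into(),
            None => value,
        }
    });
    settings // apply with `settings.bind(|| assert_json_snapshot!(...))`
}
```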
Martin Tzvetanov Grigorov
ce9c930d10 Fix clippy and fmt
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 14:21:25 +03:00
Martin Tzvetanov Grigorov
fc88b003b4 Use shared server and unique indices for add_documents IT tests
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 14:20:07 +03:00
Martin Tzvetanov Grigorov
cf5d26124a Call .succeeded() or .failed() on the waited task
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 14:18:34 +03:00
Martin Tzvetanov Grigorov
38b1c57fa8 Faster IT tests for add_documents.rs
Use Shared server where possible

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-28 14:18:33 +03:00
Many the fish
25c525b057 Merge pull request #5589 from mcmah309/typo_fix
Typo fix
2025-05-28 11:02:22 +00:00
Tamo
83cd28b60b Merge pull request #5584 from martin-g/faster-index-search-mod-tests
tests: Faster index::search::mod IT tests
2025-05-28 08:40:37 +00:00
Martin Tzvetanov Grigorov
48cad4132a Fix clippy - ignore code variable
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-27 16:44:57 +03:00
Martin Tzvetanov Grigorov
4897ad99d0 Wait for the add_documents task
Format the code

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-27 14:26:29 +03:00
Many the fish
5b67de0367 Merge pull request #5593 from meilisearch/remove-template-checker
Remove TemplateChecker
2025-05-27 09:11:51 +00:00
Martin Tzvetanov Grigorov
46ff78b4ec Update the regex to replace all occurrences of uuids in the redaction
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-27 11:47:02 +03:00
Louis Dureuil
5810fb239f Reference PR in comments 2025-05-27 10:24:04 +02:00
Louis Dureuil
b007ed6be9 Remove TemplateChecker 2025-05-27 10:04:14 +02:00
nnethercott
9ad43b6841 rename has_changed to has_changed_for_facets 2025-05-26 18:37:20 +02:00
nnethercott
c9ec502ed9 refactor for readability 2025-05-26 18:32:59 +02:00
nnethercott
18aed75d3b fix logic 2025-05-26 18:20:55 +02:00
nnethercott
6738a4f6ee feat: update the insta snapshots 2025-05-26 16:36:36 +02:00
Many the fish
a1ff41cabb Merge pull request #5541 from meilisearch/deactivate-numbers-in-typos-enhancements
Minor fixes: Deactivate numbers in typos
2025-05-26 14:36:21 +00:00
Martin Tzvetanov Grigorov
d2948adea3 Migrate more tests to assert with "[uuid]" instead of real Uuid
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-26 14:31:58 +03:00
Martin Tzvetanov Grigorov
f54b57e5be Use a Regex in insta dynamic redaction to replace Uuids with [uuid]
(cherry picked from commit f8b8c6ab71a28052cf9b271ca8aa5d4175f9e8f9)
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-26 14:03:48 +03:00
nnethercott
95821d0bde refactor: update macro 2025-05-26 10:07:13 +02:00
nnethercott
f690fa0686 feat: add macro_rules to factorize 2025-05-26 09:46:14 +02:00
nnethercott
24e94b28c1 feat: uncouple geo extraction from full doc 2025-05-26 09:22:20 +02:00
Martin Tzvetanov Grigorov
34d58f35c8 Print [uuid] instead of the Uuid index name for MeilisearchHttpError::Milli errors
This way the tests' assertions/snapshots for unique indices would be stable

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-25 15:48:55 +03:00
mcmah309
1d5265caf4 Fix typo in method name 2025-05-22 14:25:04 +00:00
Many the fish
97aeb6db4d Merge pull request #5548 from lblack00/attributes-to-search-on-nested-fields
Added support for nested wildcards to attributes_to_search_on
2025-05-22 13:58:23 +00:00
Many the fish
ff64c64abe Merge pull request #5587 from meilisearch/fix-derivations-again
Fix another derivation-related panic in the search
2025-05-22 13:39:04 +00:00
Tamo
ee326a1ecc Merge pull request #5588 from meilisearch/rename-batch-stopped-reason
Rename batch creation complete
2025-05-22 12:39:34 +00:00
Louis Dureuil
c204a7bb12 Update snapshots 2025-05-22 12:39:37 +02:00
Louis Dureuil
cf4798bd2b Change batch stop reason messages to match the new batch_strategy API name 2025-05-22 12:20:17 +02:00
Louis Dureuil
4d761d3444 Rename batch_creation_complete to batch_strategy 2025-05-22 12:19:54 +02:00
Louis Dureuil
c9b78970c9 Remove lambdas from the find_*_derivations
Make sure their number of inserts in the interner is bounded
2025-05-22 11:06:14 +02:00
Tamo
ae3c4e27c4 Merge pull request #5557 from meilisearch/update-charabia-v0.9.4
Update charabia v0.9.5
2025-05-21 10:56:41 +00:00
ManyTheFish
1b718afd11 Update charabia removing a lot of dependencies 2025-05-21 11:52:19 +02:00
ManyTheFish
01ef055f40 Update charabia v0.9.4 2025-05-21 11:52:19 +02:00
Lucas Black
f888f87635 Updated formatting using RustFmt 2025-05-21 02:07:25 -07:00
Many the fish
293a425183 Apply suggestions from code review
Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
2025-05-21 10:49:43 +02:00
ManyTheFish
699ec18de8 Fix warnings 2025-05-21 10:49:43 +02:00
ManyTheFish
73e4206b3c Pass a progress callback to recompute_word_fst_from_word_docids_database
fixes https://github.com/meilisearch/meilisearch/pull/5494#discussion_r2069377991
2025-05-21 10:49:43 +02:00
ManyTheFish
a964251cee Remove useless reset
fixes https://github.com/meilisearch/meilisearch/pull/5494#discussion_r2069373494
2025-05-21 10:49:43 +02:00
Martin Tzvetanov Grigorov
8c8d98eeaa Use shared server and unique indices for all tests where possible
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-21 10:48:20 +03:00
Lucas Black
c5ae43cac6 Updated all additional test cases 2025-05-20 09:03:26 -07:00
Martin Tzvetanov Grigorov
57eecd6197 Remove an empty line
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-20 14:37:45 +03:00
Martin Tzvetanov Grigorov
2fe5c78cb6 tests: Faster index::search::mod IT tests
* Use shared index where possible.
* Call .succeeded/.failed when waiting for a task.
* Use newer format_args syntax
* Do not use fully qualified name for meili_snap:: functions. The
  functions are already imported in scope

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-20 14:26:26 +03:00
Louis Dureuil
8068337b07 Merge pull request #5573 from CodeMan62/fix-5555
Only intern in case of typo when looking for one or two typos
2025-05-20 09:17:35 +00:00
Tamo
8047cfe438 Merge pull request #5580 from martin-g/better-assertions-index-delete_index-it-tests
tests: Assert succeeded/failed for the index::delete_index IT tests
2025-05-20 08:49:24 +00:00
CodeMan62
f26826f115 fix issue 5555 2025-05-20 10:41:32 +02:00
Tamo
5717e5c1af Merge pull request #5578 from martin-g/faster-index-get_index-it-tests
perf: Faster index::get_index IT tests
2025-05-20 08:41:11 +00:00
Martin Tzvetanov Grigorov
bb07038c31 tests: Assert succeeded/failed for the index::delete_index IT tests
Related-to: https://github.com/meilisearch/meilisearch/issues/4840

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-19 16:57:53 +03:00
Martin Tzvetanov Grigorov
d1a088ea0b Format the code
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-19 16:52:43 +03:00
Martin Tzvetanov Grigorov
b68e22c0e6 Revert the improvements for get_and_paginate_indexes()
because they won't work in multi-threaded execution of the tests

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-19 16:36:45 +03:00
Martin Tzvetanov Grigorov
03a36f116e 1. Use a unique Server for no_index_return_empty_list test
... because a Shared one could see indices created by other tests

2. List at least 1000 indices to make sure we get the newly created ones
   in list_multiple_indexes()

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-19 16:20:16 +03:00
Tamo
8a0bf24ed5 Merge pull request #5572 from martin-g/faster-stats-it-tests
perf: Faster IT tests - stats.rs
2025-05-19 12:44:08 +00:00
Martin Tzvetanov Grigorov
e2763471e5 Faster index::get_index IT tests
Use shared server for all tests in get_index.rs

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-19 15:36:25 +03:00
Martin Tzvetanov Grigorov
b2f2c5d69f Remove an assertion of a task uid.
It differs for every run of the IT test suite.

Format the imports

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-19 14:44:08 +03:00
Tamo
e547bfb428 Merge pull request #5577 from meilisearch/comment-out-swarmia
Comment out swarmia deployment for now
2025-05-19 10:13:41 +00:00
Lucas Black
1594c54e23 Provide more information about resulting documents on test case 2025-05-19 02:37:23 -07:00
Louis Dureuil
768cfb6c2d Comment out swarmia deployment for now 2025-05-19 11:34:21 +02:00
Lucas Black
13b607bd68 Removed matches_wildcard_pattern() and integrated match_pattern() into attributes_to_search_on(), updated test cases 2025-05-18 20:24:52 -07:00
Martin Tzvetanov Grigorov
3d130d31c8 Do not hard-code the non-existing index name/uid
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-16 15:49:50 +03:00
Martin Tzvetanov Grigorov
4cda584b0c Fix the build of stats.rs
Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
2025-05-16 15:45:25 +03:00
Santhosh Reddy Vootukuri (SUNNY) (from Dev Box)
248c90bad5 removing .await 2025-05-16 15:29:24 +03:00
Santhosh Reddy Vootukuri (SUNNY) (from Dev Box)
0e9040e605 remove warnings 2025-05-16 15:29:23 +03:00
Santhosh Reddy Vootukuri (SUNNY) (from Dev Box)
3e3c00f44c fix for test failure 2025-05-16 15:29:23 +03:00
Santhosh Reddy Vootukuri (SUNNY) (from Dev Box)
d986a3bbaf Changes to index and expected_response as per feedback 2025-05-16 15:29:22 +03:00
Santhosh Reddy Vootukuri (SUNNY) (from Dev Box)
c2ceb8e41b Improve Integration tests in the file stats.rs 2025-05-16 15:29:18 +03:00
Tamo
a25eb9c136 Merge pull request #5566 from meilisearch/bad-max-total-hits
Forbid 0 in maxTotalHits
2025-05-15 15:01:22 +00:00
Tamo
cc2011a27f Merge pull request #5565 from meilisearch/fix-0-batched-task
Fix 0 batched task
2025-05-15 12:41:48 +00:00
Tamo
604e156c2b add the snapshots 2025-05-15 11:35:31 +02:00
Tamo
1d6777ee68 Forbid 0 in maxTotalHits 2025-05-15 11:32:08 +02:00
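The rule can also be made unrepresentable in the type, as in this illustrative sketch (not Meilisearch's actual validation, which happens during settings deserialization):

```rust
use std::num::NonZeroUsize;

fn parse_max_total_hits(raw: usize) -> Result<NonZeroUsize, String> {
    // 0 would mean "never return any hit", so it is rejected outright.
    NonZeroUsize::new(raw).ok_or_else(|| "`maxTotalHits` must be greater than 0".to_string())
}
```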
Nate Nethercott
79db2e67fb refactor: prefer helper over explicit pool construction
Co-authored-by: Many the fish <many@meilisearch.com>
2025-05-15 11:24:34 +02:00
Tamo
0940f0e4f4 add a test 2025-05-15 11:10:08 +02:00
Louis Dureuil
d40290aaaf Merge pull request #5560 from meilisearch/experimental-no-snapshot-compression
Add an experimental cli flag to disable snapshot compaction
2025-05-15 07:51:06 +00:00
nnethercott
865f24cfef refactor: helper methods for pool and max threads 2025-05-14 23:45:24 +02:00
Louis Dureuil
fd2de7c668 Merge pull request #5564 from meilisearch/dont-intern-without-typo-v15
Port to v1.15: Only intern in case of single-typo when looking for single typos
2025-05-14 16:30:57 +00:00
Many the fish
448564b674 Merge pull request #5563 from meilisearch/fix-swarmia-deploy
Fix swarmia deployment
2025-05-14 16:12:05 +00:00
Louis Dureuil
c5dd8e7d6f Add test 2025-05-14 17:36:09 +02:00
Louis Dureuil
c9b4c1fb81 Only intern in case of single-typo when looking for single typos 2025-05-14 17:36:03 +02:00
curquiza
0f10ec96af Fix swarmia deployment 2025-05-14 17:35:47 +02:00
Tamo
8608d10fa2 Don't process any tasks if the max number of batched tasks is set to 0 2025-05-14 17:09:10 +02:00
Tamo
83e71cd7b9 Add an experimental cli flag to disable snapshot compaction 2025-05-14 15:59:35 +02:00
Lucas Black
3fbe1df770 Updated nested_search_all_details_with_deep_wildcard() to test deeply nested attributes 2025-05-14 00:18:30 -07:00
Lucas Black
150d1db86b Implemented integration tests for restrict_searchable.rs on nested wildcard attributes 2025-05-13 21:44:24 -07:00
Nate Nethercott
806e983aa5 fix: lazy computation in thread default
Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
2025-05-13 14:14:48 +02:00
nnethercott
e96c1d4b0f style: change fmt from empty str to "unlimited" 2025-05-13 12:16:34 +02:00
nnethercott
15cdc6924b refactor: remove runtime cfg!(test) check
Won't work in integration tests, and consequently all threads would be
used. To remedy this we make `max_threads=Some(1)` explicit in
`IndexerConfig::default`.
2025-05-13 09:18:19 +02:00
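A hypothetical, simplified sketch of the idea: integration tests compile the crate as a dependency, where `cfg!(test)` is false, so an explicit default is the only reliable cap (field names simplified from the real IndexerConfig):

```rust
struct IndexerConfig {
    // None means "use all available CPUs"; Some(n) caps the pool size.
    max_threads: Option<usize>,
}

impl Default for IndexerConfig {
    fn default() -> Self {
        // A runtime cfg!(test) check is false in integration tests, which
        // would silently use every thread; Some(1) keeps test runs bounded.
        Self { max_threads: Some(1) }
    }
}
```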
Clément Renault
677e8b122c Merge pull request #5551 from meilisearch/dont-intern-without-typo
Only intern in case of single-typo when looking for single typos
2025-05-12 20:23:39 +00:00
nnethercott
75a7e40a27 Merge branch 'main' into all-cpus-in-import-dump 2025-05-12 21:48:12 +02:00
Louis Dureuil
c8939944c6 Add test 2025-05-12 12:40:55 +02:00
Louis Dureuil
4e6252fb03 Only intern in case of single-typo when looking for single typos 2025-05-12 11:59:21 +02:00
Lucas Black
8bd8e744f3 Attributes to search on supports nested wildcards 2025-05-09 02:42:48 -07:00
nnethercott
53f32a7dd7 refactor: change thread_pool from Option<ThreadPoolNoAbort> to
ThreadPoolNoAbort
2025-05-07 17:00:08 +02:00
nnethercott
47a7ed93d3 feat: Make MaxThreads None by default 2025-05-06 09:11:55 +02:00
Nate Nethercott
2ac826edca Apply suggested changes
Co-authored-by: Clément Renault <renault.cle@gmail.com>

Update crates/meilisearch/src/lib.rs

Co-authored-by: Clément Renault <renault.cle@gmail.com>
2025-05-01 16:12:06 +02:00
nnethercott
89aff2081c Fix clippy warnings 2025-04-30 14:17:32 +02:00
nnethercott
3b773b3416 Revert thread_pool type back to Option in config 2025-04-28 11:56:37 +02:00
nnethercott
648b2876f6 Create temp threadpool with all CPUs in dump 2025-04-27 00:52:10 +02:00
426 changed files with 26241 additions and 10138 deletions


@@ -1,28 +1,26 @@
---
name: New sprint issue
about: ⚠️ Should only be used by the engine team ⚠️
name: New feature issue
about: ⚠️ Should only be used by the internal Meili team ⚠️
title: ''
labels: 'missing usage in PRD, impacts docs'
labels: 'impacts docs, impacts integrations'
assignees: ''
---
Related product team resources: [PRD]() (_internal only_)
Related product discussion:
## Motivation
<!---Copy/paste the information in PRD or briefly detail the product motivation. Ask product team if any hesitation.-->
## Usage
<!---Link to the public part of the PRD, or to the related product discussion for experimental features-->
TBD
## TODO
<!---If necessary, create a list with technical/product steps-->
### Are you modifying a database?
- [ ] If not, add the `no db change` label to your PR, and you're good to merge.
- [ ] If yes, add the `db change` label to your PR. You'll receive a message explaining you what to do.
@@ -54,5 +52,5 @@ Related product discussion:
## Impacted teams
<!---Ping the related teams. Ask for the engine manager if any hesitation-->
<!---@meilisearch/docs-team when there is any API change, e.g. settings addition-->
<!---Ping the related teams. Ask on Slack if any hesitation-->
<!---@meilisearch/docs-team and @meilisearch/integration-team when there is any API change, e.g. settings addition-->

.github/pull_request_template.md (new file, 16 lines)

@@ -0,0 +1,16 @@
## Related issue
Fixes #...

## Requirements

⚠️ Ensure the following requirements before merging ⚠️
- [ ] Automated tests have been added.
- [ ] If some tests cannot be automated, manual rigorous tests should be applied.
- [ ] ⚠️ If there is any change in the DB:
  - [ ] Test that any impacted DB still works as expected after using `--experimental-dumpless-upgrade` on a DB created with the last released Meilisearch
  - [ ] Test that during the upgrade, **search is still available** (artificially make the upgrade longer if needed)
  - [ ] Set the `db change` label.
- [ ] If necessary, the feature has been tested in the Cloud production environment (with [prototypes](./documentation/prototypes.md)) and the Cloud UI is ready.
- [ ] If necessary, the [documentation](https://github.com/meilisearch/documentation) related to the implemented feature in the PR is ready.
- [ ] If necessary, the [integrations](https://github.com/meilisearch/integration-guides) related to the implemented feature in the PR are ready.

.github/release-draft-template.yml (new file, 33 lines)

@@ -0,0 +1,33 @@
name-template: 'v$RESOLVED_VERSION'
tag-template: 'v$RESOLVED_VERSION'
exclude-labels:
  - 'skip changelog'
version-resolver:
  minor:
    labels:
      - 'enhancement'
  default: patch
categories:
  - title: '⚠️ Breaking changes'
    label: 'breaking-change'
  - title: '🚀 Enhancements'
    label: 'enhancement'
  - title: '🐛 Bug Fixes'
    label: 'bug'
  - title: '🔒 Security'
    label: 'security'
  - title: '⚙️ Maintenance/misc'
    label:
      - 'maintenance'
      - 'documentation'
template: |
  $CHANGES

  ❤️ Huge thanks to our contributors: $CONTRIBUTORS.
no-changes-template: 'Changes are coming soon 😎'
sort-direction: 'ascending'
replacers:
  - search: '/(?:and )?@dependabot-preview(?:\[bot\])?,?/g'
    replace: ''
  - search: '/(?:and )?@dependabot(?:\[bot\])?,?/g'
    replace: ''

.github/templates/dependency-issue.md (new file, 22 lines)

@@ -0,0 +1,22 @@
This issue is about updating Meilisearch dependencies:
- [ ] Update Meilisearch dependencies with the help of `cargo +nightly udeps --all-targets` (remove unused dependencies) and `cargo upgrade` (upgrade dependencies versions) - ⚠️ Some repositories may contain subdirectories (like heed, charabia, or deserr). Take care of updating these in the main crate as well. This won't be done automatically by `cargo upgrade`.
  - [ ] [deserr](https://github.com/meilisearch/deserr)
  - [ ] [charabia](https://github.com/meilisearch/charabia/)
  - [ ] [heed](https://github.com/meilisearch/heed/)
  - [ ] [roaring-rs](https://github.com/RoaringBitmap/roaring-rs/)
  - [ ] [obkv](https://github.com/meilisearch/obkv)
  - [ ] [grenad](https://github.com/meilisearch/grenad/)
  - [ ] [arroy](https://github.com/meilisearch/arroy/)
  - [ ] [segment](https://github.com/meilisearch/segment)
  - [ ] [bumparaw-collections](https://github.com/meilisearch/bumparaw-collections)
  - [ ] [bbqueue](https://github.com/meilisearch/bbqueue)
  - [ ] Finally, [Meilisearch](https://github.com/meilisearch/MeiliSearch)
- [ ] If new Rust versions have been released, update the minimal Rust version in use at Meilisearch:
  - [ ] in this [GitHub Action file](https://github.com/meilisearch/meilisearch/blob/main/.github/workflows/test-suite.yml), by changing the `toolchain` field of the `rustfmt` job to the latest available nightly (of the day before or the current day).
  - [ ] in every [GitHub Action files](https://github.com/meilisearch/meilisearch/blob/main/.github/workflows), by changing all the `dtolnay/rust-toolchain@` references to use the latest stable version.
  - [ ] in this [`rust-toolchain.toml`](https://github.com/meilisearch/meilisearch/blob/main/rust-toolchain.toml), by changing the `channel` field to the latest stable version.
  - [ ] in the [Dockerfile](https://github.com/meilisearch/meilisearch/blob/main/Dockerfile), by changing the base image to `rust:<target_rust_version>-alpine<alpine_version>`. Check that the image exists on [Dockerhub](https://hub.docker.com/_/rust/tags?page=1&name=alpine). Also, build and run the image to check everything still works!

⚠️ This issue should be prioritized to avoid any deprecation and vulnerability issues.

The GitHub action dependencies are managed by [Dependabot](https://github.com/meilisearch/meilisearch/blob/main/.github/dependabot.yml), so no need to update them when solving this issue.


@@ -1,100 +0,0 @@
name: PR Milestone Check

on:
  pull_request:
    types: [opened, reopened, edited, synchronize, milestoned, demilestoned]
    branches:
      - "main"
      - "release-v*.*.*"

jobs:
  check-milestone:
    name: Check PR Milestone
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Validate PR milestone
        uses: actions/github-script@v7
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            // Get PR number directly from the event payload
            const prNumber = context.payload.pull_request.number;

            // Get PR details
            const { data: prData } = await github.rest.pulls.get({
              owner: 'meilisearch',
              repo: 'meilisearch',
              pull_number: prNumber
            });

            // Get base branch name
            const baseBranch = prData.base.ref;
            console.log(`Base branch: ${baseBranch}`);

            // Get PR milestone
            const prMilestone = prData.milestone;
            if (!prMilestone) {
              core.setFailed('PR must have a milestone assigned');
              return;
            }
            console.log(`PR milestone: ${prMilestone.title}`);

            // Validate milestone format: vx.y.z
            const milestoneRegex = /^v\d+\.\d+\.\d+$/;
            if (!milestoneRegex.test(prMilestone.title)) {
              core.setFailed(`Milestone "${prMilestone.title}" does not follow the required format vx.y.z`);
              return;
            }

            // For main branch PRs, check if the milestone is the highest one
            if (baseBranch === 'main') {
              // Get all milestones
              const { data: milestones } = await github.rest.issues.listMilestones({
                owner: 'meilisearch',
                repo: 'meilisearch',
                state: 'open',
                sort: 'due_on',
                direction: 'desc'
              });

              // Sort milestones by version number (vx.y.z)
              const sortedMilestones = milestones
                .filter(m => milestoneRegex.test(m.title))
                .sort((a, b) => {
                  const versionA = a.title.substring(1).split('.').map(Number);
                  const versionB = b.title.substring(1).split('.').map(Number);

                  // Compare major version
                  if (versionA[0] !== versionB[0]) return versionB[0] - versionA[0];
                  // Compare minor version
                  if (versionA[1] !== versionB[1]) return versionB[1] - versionA[1];
                  // Compare patch version
                  return versionB[2] - versionA[2];
                });

              if (sortedMilestones.length === 0) {
                core.setFailed('No valid milestones found in the repository. Please create at least one milestone with the format vx.y.z');
                return;
              }

              const highestMilestone = sortedMilestones[0];
              console.log(`Highest milestone: ${highestMilestone.title}`);

              if (prMilestone.title !== highestMilestone.title) {
                core.setFailed(`PRs targeting the main branch must use the highest milestone (${highestMilestone.title}), but this PR uses ${prMilestone.title}`);
                return;
              }
            } else {
              // For release branches, the milestone should match the branch version
              const branchVersion = baseBranch.substring(8); // remove 'release-'
              if (prMilestone.title !== branchVersion) {
                core.setFailed(`PRs targeting release branch "${baseBranch}" must use the matching milestone "${branchVersion}", but this PR uses "${prMilestone.title}"`);
                return;
              }
            }

            console.log('PR milestone validation passed!');


@@ -4,22 +4,22 @@ on:
  pull_request:
    types: [opened, synchronize, reopened, labeled, unlabeled]
env:
  GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
jobs:
  check-labels:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        uses: actions/checkout@v4
      - name: Check db change labels
        id: check_labels
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          URL=/repos/meilisearch/meilisearch/pulls/${{ github.event.pull_request.number }}/labels
          echo ${{ github.event.pull_request.number }}
          echo $URL
          LABELS=$(gh api -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" /repos/meilisearch/meilisearch/issues/${{ github.event.pull_request.number }}/labels -q .[].name)
          LABELS=$(gh api -H "Accept: application/vnd.github+json" -H "X-GitHub-Api-Version: 2022-11-28" /repos/${{ github.repository }}/issues/${{ github.event.pull_request.number }}/labels -q .[].name)
          echo "Labels: $LABELS"
          if [[ ! "$LABELS" =~ "db change" && ! "$LABELS" =~ "no db change" ]]; then
            echo "::error::Pull request must contain either the 'db change' or 'no db change' label."
            exit 1


@@ -15,7 +15,7 @@ jobs:
steps:
- uses: actions/checkout@v3
- name: Download the issue template
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/dependency-issue.md > $ISSUE_TEMPLATE
run: curl -s https://raw.githubusercontent.com/meilisearch/meilisearch/main/.github/templates/dependency-issue.md > $ISSUE_TEMPLATE
- name: Create issue
run: |
gh issue create \


@@ -3,7 +3,7 @@ name: Look for flaky tests
on:
workflow_dispatch:
schedule:
- cron: "0 12 * * FRI" # Every Friday at 12:00PM
- cron: '0 4 * * *' # Every day at 4:00AM
jobs:
flaky:


@@ -1,224 +0,0 @@
name: Milestone's workflow
# /!\ No git flow is handled here
# For each Milestone created (not opened!), and if the release is NOT a patch release (only the patch changed)
# - the roadmap issue is created, see https://github.com/meilisearch/engine-team/blob/main/issue-templates/roadmap-issue.md
# - the changelog issue is created, see https://github.com/meilisearch/engine-team/blob/main/issue-templates/changelog-issue.md
# - update the ruleset to add the current release version to the list of allowed versions and be able to use the merge queue.
# For each Milestone closed
# - the `release_version` label is created
# - this label is applied to all issues/PRs in the Milestone
on:
milestone:
types: [created, closed]
env:
MILESTONE_VERSION: ${{ github.event.milestone.title }}
MILESTONE_URL: ${{ github.event.milestone.html_url }}
MILESTONE_DUE_ON: ${{ github.event.milestone.due_on }}
GH_TOKEN: ${{ secrets.MEILI_BOT_GH_PAT }}
jobs:
# -----------------
# MILESTONE CREATED
# -----------------
get-release-version:
if: github.event.action == 'created'
runs-on: ubuntu-latest
outputs:
is-patch: ${{ steps.check-patch.outputs.is-patch }}
steps:
- uses: actions/checkout@v3
- name: Check if this release is a patch release only
id: check-patch
run: |
echo version: $MILESTONE_VERSION
if [[ $MILESTONE_VERSION =~ ^v[0-9]+\.[0-9]+\.0$ ]]; then
echo 'This is NOT a patch release'
echo "is-patch=false" >> $GITHUB_OUTPUT
elif [[ $MILESTONE_VERSION =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo 'This is a patch release'
echo "is-patch=true" >> $GITHUB_OUTPUT
else
echo "Not a valid format of release, check the Milestone's title."
echo 'Should be vX.Y.Z'
exit 1
fi
create-roadmap-issue:
needs: get-release-version
# Create the roadmap issue if the release is not only a patch release
if: github.event.action == 'created' && needs.get-release-version.outputs.is-patch == 'false'
runs-on: ubuntu-latest
env:
ISSUE_TEMPLATE: issue-template.md
steps:
- uses: actions/checkout@v3
- name: Download the issue template
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/roadmap-issue.md > $ISSUE_TEMPLATE
- name: Replace all empty occurrences in the templates
run: |
# Replace all <<version>> occurrences
sed -i "s/<<version>>/$MILESTONE_VERSION/g" $ISSUE_TEMPLATE
# Replace all <<milestone_id>> occurrences
milestone_id=$(echo $MILESTONE_URL | cut -d '/' -f 7)
sed -i "s/<<milestone_id>>/$milestone_id/g" $ISSUE_TEMPLATE
# Replace release date if exists
if [[ ! -z $MILESTONE_DUE_ON ]]; then
date=$(echo $MILESTONE_DUE_ON | cut -d 'T' -f 1)
sed -i "s/Release date\: 20XX-XX-XX/Release date\: $date/g" $ISSUE_TEMPLATE
fi
- name: Create the issue
run: |
gh issue create \
--title "$MILESTONE_VERSION ROADMAP" \
--label 'epic,impacts docs,impacts integrations,impacts cloud' \
--body-file $ISSUE_TEMPLATE \
--milestone $MILESTONE_VERSION
create-changelog-issue:
needs: get-release-version
# Create the changelog issue if the release is not only a patch release
if: github.event.action == 'created' && needs.get-release-version.outputs.is-patch == 'false'
runs-on: ubuntu-latest
env:
ISSUE_TEMPLATE: issue-template.md
steps:
- uses: actions/checkout@v3
- name: Download the issue template
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/changelog-issue.md > $ISSUE_TEMPLATE
- name: Replace all empty occurrences in the templates
run: |
# Replace all <<version>> occurrences
sed -i "s/<<version>>/$MILESTONE_VERSION/g" $ISSUE_TEMPLATE
# Replace all <<milestone_id>> occurrences
milestone_id=$(echo $MILESTONE_URL | cut -d '/' -f 7)
sed -i "s/<<milestone_id>>/$milestone_id/g" $ISSUE_TEMPLATE
- name: Create the issue
run: |
gh issue create \
--title "Create release changelogs for $MILESTONE_VERSION" \
--label 'impacts docs,documentation' \
--body-file $ISSUE_TEMPLATE \
--milestone $MILESTONE_VERSION \
--assignee curquiza
create-update-version-issue:
needs: get-release-version
# Create the update-version issue even if the release is a patch release
if: github.event.action == 'created'
runs-on: ubuntu-latest
env:
ISSUE_TEMPLATE: issue-template.md
steps:
- uses: actions/checkout@v3
- name: Download the issue template
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/update-version-issue.md > $ISSUE_TEMPLATE
- name: Create the issue
run: |
gh issue create \
--title "Update version in Cargo.toml for $MILESTONE_VERSION" \
--label 'maintenance' \
--body-file $ISSUE_TEMPLATE \
--milestone $MILESTONE_VERSION
create-update-openapi-issue:
needs: get-release-version
# Create the openAPI issue if the release is not only a patch release
if: github.event.action == 'created' && needs.get-release-version.outputs.is-patch == 'false'
runs-on: ubuntu-latest
env:
ISSUE_TEMPLATE: issue-template.md
steps:
- uses: actions/checkout@v3
- name: Download the issue template
run: curl -s https://raw.githubusercontent.com/meilisearch/engine-team/main/issue-templates/update-openapi-issue.md > $ISSUE_TEMPLATE
- name: Create the issue
run: |
gh issue create \
--title "Update Open API file for $MILESTONE_VERSION" \
--label 'maintenance' \
--body-file $ISSUE_TEMPLATE \
--milestone $MILESTONE_VERSION
update-ruleset:
runs-on: ubuntu-latest
if: github.event.action == 'created'
steps:
- uses: actions/checkout@v3
- name: Install jq
run: |
sudo apt-get update
sudo apt-get install -y jq
- name: Update ruleset
env:
# gh api repos/meilisearch/meilisearch/rulesets --jq '.[] | {name: .name, id: .id}'
RULESET_ID: 4253297
BRANCH_NAME: ${{ github.event.inputs.branch_name }}
run: |
echo "RULESET_ID: ${{ env.RULESET_ID }}"
echo "BRANCH_NAME: ${{ env.BRANCH_NAME }}"
# Get current ruleset conditions
CONDITIONS=$(gh api repos/meilisearch/meilisearch/rulesets/${{ env.RULESET_ID }} --jq '{ conditions: .conditions }')
# Update the conditions by appending the milestone version
UPDATED_CONDITIONS=$(echo $CONDITIONS | jq '.conditions.ref_name.include += ["refs/heads/release-'${{ env.MILESTONE_VERSION }}'"]')
# Update the ruleset from stdin (-)
echo $UPDATED_CONDITIONS |
gh api repos/meilisearch/meilisearch/rulesets/${{ env.RULESET_ID }} \
--method PUT \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
--input -
# ----------------
# MILESTONE CLOSED
# ----------------
create-release-label:
if: github.event.action == 'closed'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Create the ${{ env.MILESTONE_VERSION }} label
run: |
label_description="PRs/issues solved in $MILESTONE_VERSION"
if [[ ! -z $MILESTONE_DUE_ON ]]; then
date=$(echo $MILESTONE_DUE_ON | cut -d 'T' -f 1)
label_description="$label_description released on $date"
fi
gh api repos/meilisearch/meilisearch/labels \
--method POST \
-H "Accept: application/vnd.github+json" \
-f name="$MILESTONE_VERSION" \
-f description="$label_description" \
-f color='ff5ba3'
labelize-all-milestone-content:
if: github.event.action == 'closed'
needs: create-release-label
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Add label ${{ env.MILESTONE_VERSION }} to all PRs in the Milestone
run: |
prs=$(gh pr list --search milestone:"$MILESTONE_VERSION" --limit 1000 --state all --json number --template '{{range .}}{{tablerow (printf "%v" .number)}}{{end}}')
for pr in $prs; do
gh pr edit $pr --add-label $MILESTONE_VERSION
done
- name: Add label ${{ env.MILESTONE_VERSION }} to all issues in the Milestone
run: |
issues=$(gh issue list --search milestone:"$MILESTONE_VERSION" --limit 1000 --state all --json number --template '{{range .}}{{tablerow (printf "%v" .number)}}{{end}}')
for issue in $issues; do
gh issue edit $issue --add-label $MILESTONE_VERSION
done


@@ -32,7 +32,7 @@ jobs:
- name: Build deb package
run: cargo deb -p meilisearch -o target/debian/meilisearch.deb
- name: Upload debian pkg to release
uses: svenstaro/upload-release-action@2.7.0
uses: svenstaro/upload-release-action@2.11.2
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/debian/meilisearch.deb


@@ -51,7 +51,7 @@ jobs:
# No need to upload binaries for dry run (cron)
- name: Upload binaries to release
if: github.event_name == 'release'
uses: svenstaro/upload-release-action@2.7.0
uses: svenstaro/upload-release-action@2.11.2
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/release/meilisearch
@@ -81,7 +81,7 @@ jobs:
# No need to upload binaries for dry run (cron)
- name: Upload binaries to release
if: github.event_name == 'release'
uses: svenstaro/upload-release-action@2.7.0
uses: svenstaro/upload-release-action@2.11.2
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/release/${{ matrix.artifact_name }}
@@ -113,7 +113,7 @@ jobs:
- name: Upload the binary to release
# No need to upload binaries for dry run (cron)
if: github.event_name == 'release'
uses: svenstaro/upload-release-action@2.7.0
uses: svenstaro/upload-release-action@2.11.2
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/${{ matrix.target }}/release/meilisearch
@@ -178,7 +178,7 @@ jobs:
- name: Upload the binary to release
# No need to upload binaries for dry run (cron)
if: github.event_name == 'release'
uses: svenstaro/upload-release-action@2.7.0
uses: svenstaro/upload-release-action@2.11.2
with:
repo_token: ${{ secrets.MEILI_BOT_GH_PAT }}
file: target/${{ matrix.target }}/release/meilisearch


@@ -16,6 +16,8 @@ on:
jobs:
docker:
runs-on: docker
permissions:
id-token: write # This is needed to use Cosign in keyless mode
steps:
- uses: actions/checkout@v3
@@ -62,6 +64,9 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Install cosign
uses: sigstore/cosign-installer@d58896d6a1865668819e1d91763c7751a165e159 # tag=v3.9.2
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
@@ -85,6 +90,7 @@ jobs:
- name: Build and push
uses: docker/build-push-action@v6
id: build-and-push
with:
push: true
platforms: linux/amd64,linux/arm64
@@ -94,6 +100,17 @@ jobs:
COMMIT_DATE=${{ steps.build-metadata.outputs.date }}
GIT_TAG=${{ github.ref_name }}
- name: Sign the images with GitHub OIDC Token
env:
DIGEST: ${{ steps.build-and-push.outputs.digest }}
TAGS: ${{ steps.meta.outputs.tags }}
run: |
images=""
for tag in ${TAGS}; do
images+="${tag}@${DIGEST} "
done
cosign sign --yes ${images}
# /!\ Don't touch this without checking with Cloud team
- name: Send CI information to Cloud team
# Do not send if nightly build (i.e. 'schedule' or 'workflow_dispatch' event)
@@ -106,18 +123,20 @@ jobs:
client-payload: '{ "meilisearch_version": "${{ github.ref_name }}", "stable": "${{ steps.check-tag-format.outputs.stable }}" }'
# Send notification to Swarmia to notify of a deployment: https://app.swarmia.com
- name: Send deployment to Swarmia
if: github.event_name == 'push' && success()
run: |
JSON_STRING=$( jq --null-input --compact-output \
--arg version "${{ github.ref_name }}" \
--arg appName "meilisearch" \
--arg environment "production" \
--arg commitSha "${{ github.sha }}" \
--arg repositoryFullName "${{ github.repository }}" \
'{"version": $version, "appName": $appName, "environment": $environment, "commitSha": $commitSha, "repositoryFullName": $repositoryFullName}' )
# - name: 'Setup jq'
# uses: dcarbone/install-jq-action
# - name: Send deployment to Swarmia
# if: github.event_name == 'push' && success()
# run: |
# JSON_STRING=$( jq --null-input --compact-output \
# --arg version "${{ github.ref_name }}" \
# --arg appName "meilisearch" \
# --arg environment "production" \
# --arg commitSha "${{ github.sha }}" \
# --arg repositoryFullName "${{ github.repository }}" \
# '{"version": $version, "appName": $appName, "environment": $environment, "commitSha": $commitSha, "repositoryFullName": $repositoryFullName}' )
curl -H "Authorization: ${{ secrets.SWARMIA_DEPLOYMENTS_AUTHORIZATION }}" \
-H "Content-Type: application/json" \
-d "$JSON_STRING" \
https://hook.swarmia.com/deployments
# curl -H "Authorization: ${{ secrets.SWARMIA_DEPLOYMENTS_AUTHORIZATION }}" \
# -H "Content-Type: application/json" \
# -d "$JSON_STRING" \
# https://hook.swarmia.com/deployments
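For reference, the payload the restored `jq` invocation assembles can be sketched in Rust with `serde_json` (all values here are placeholders for the `${{ ... }}` expressions):

```rust
use serde_json::json;

fn main() {
    // Same shape as the JSON POSTed to https://hook.swarmia.com/deployments.
    let payload = json!({
        "version": "v1.17.0",                            // ${{ github.ref_name }}
        "appName": "meilisearch",
        "environment": "production",
        "commitSha": "0000000000000000000000000000000000000000", // ${{ github.sha }}
        "repositoryFullName": "meilisearch/meilisearch", // ${{ github.repository }}
    });
    println!("{payload}");
}
```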

.github/workflows/release-drafter.yml (new file)

@@ -0,0 +1,20 @@
name: Release Drafter
permissions:
contents: read
pull-requests: write
on:
push:
branches:
- main
jobs:
update_release_draft:
runs-on: ubuntu-latest
steps:
- uses: release-drafter/release-drafter@v6
with:
config-name: release-draft-template.yml
env:
GITHUB_TOKEN: ${{ secrets.RELEASE_DRAFTER_TOKEN }}


@@ -9,7 +9,7 @@ on:
required: false
default: nightly
schedule:
- cron: "0 6 * * MON" # Every Monday at 6:00AM
- cron: '0 6 * * *' # Every day at 6:00am
env:
MEILI_MASTER_KEY: 'masterKey'
@@ -114,7 +114,7 @@ jobs:
dep ensure
fi
- name: Run integration tests
run: go test -v ./...
run: go test --race -v ./integration
meilisearch-java-tests:
needs: define-docker-image
@@ -344,15 +344,23 @@ jobs:
MEILI_NO_ANALYTICS: ${{ env.MEILI_NO_ANALYTICS }}
ports:
- '7700:7700'
env:
RAILS_VERSION: '7.0'
steps:
- uses: actions/checkout@v3
with:
repository: meilisearch/meilisearch-rails
- name: Set up Ruby 3
- name: Install SQLite dependencies
run: sudo apt-get update && sudo apt-get install -y libsqlite3-dev
- name: Set up Ruby
uses: ruby/setup-ruby@v1
with:
ruby-version: 3
bundler-cache: true
- name: Start MongoDB
uses: supercharge/mongodb-github-action@1.12.0
with:
mongodb-version: 8.0
- name: Run tests
run: bundle exec rspec


@@ -3,7 +3,7 @@ name: Test suite
on:
workflow_dispatch:
schedule:
# Everyday at 5:00am
# Every day at 5:00am
- cron: "0 5 * * *"
pull_request:
merge_group:
@@ -29,7 +29,7 @@ jobs:
- name: Setup test with Rust stable
uses: dtolnay/rust-toolchain@1.85
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.7.8
uses: Swatinem/rust-cache@v2.8.0
- name: Run cargo check without any default features
uses: actions-rs/cargo@v1
with:
@@ -51,7 +51,7 @@ jobs:
steps:
- uses: actions/checkout@v3
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.7.8
uses: Swatinem/rust-cache@v2.8.0
- uses: dtolnay/rust-toolchain@1.85
- name: Run cargo check without any default features
uses: actions-rs/cargo@v1
@@ -155,7 +155,7 @@ jobs:
apt-get install build-essential -y
- uses: dtolnay/rust-toolchain@1.85
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.7.8
uses: Swatinem/rust-cache@v2.8.0
- name: Run tests in debug
uses: actions-rs/cargo@v1
with:
@@ -172,7 +172,7 @@ jobs:
profile: minimal
components: clippy
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.7.8
uses: Swatinem/rust-cache@v2.8.0
- name: Run cargo clippy
uses: actions-rs/cargo@v1
with:
@@ -191,7 +191,7 @@ jobs:
override: true
components: rustfmt
- name: Cache dependencies
uses: Swatinem/rust-cache@v2.7.8
uses: Swatinem/rust-cache@v2.8.0
- name: Run cargo fmt
# Since we never ran the `build.rs` script in the benchmark directory we are missing one auto-generated import file.
# Since we want to trigger (and fail) this action as fast as possible, instead of building the benchmark crate

.gitignore

@@ -5,18 +5,27 @@
**/*.json_lines
**/*.rs.bk
/*.mdb
/data.ms
/*.ms
/snapshots
/dumps
/bench
/_xtask_benchmark.ms
/benchmarks
.DS_Store
# Snapshots
## ... large
*.full.snap
## ... unreviewed
*.snap.new
## ... pending
*.pending-snap
# Tmp files
.tmp*
# Database snapshot
crates/meilisearch/db.snapshot
# Fuzzcheck data for the facet indexing fuzz test
crates/milli/fuzz/update::facet::incremental::fuzz::fuzz/


@@ -57,9 +57,17 @@ This command will be triggered for each PR as a requirement for merging it.
You can set the `LINDERA_CACHE` environment variable to speed up your successive builds by up to 2 minutes.
It'll store some built artifacts in the directory of your choice.
We recommend using the standard `$HOME/.cache/lindera` directory:
We recommend using the `$HOME/.cache/meili/lindera` directory:
```sh
export LINDERA_CACHE=$HOME/.cache/lindera
export LINDERA_CACHE=$HOME/.cache/meili/lindera
```
You can set the `MILLI_BENCH_DATASETS_PATH` environment variable to further speed up your builds.
It'll store some big files used for the benchmarks in the directory of your choice.
We recommend using the `$HOME/.cache/meili/benches` directory:
```sh
export MILLI_BENCH_DATASETS_PATH=$HOME/.cache/meili/benches
```
Furthermore, you can improve incremental compilation by setting the `MEILI_NO_VERGEN` environment variable.
@@ -98,7 +106,13 @@ Run `cargo xtask --help` from the root of the repository to find out what is ava
#### Update the openAPI file if the API changed
To update the openAPI file in the code, see [sprint_issue.md](https://github.com/meilisearch/meilisearch/blob/main/.github/ISSUE_TEMPLATE/sprint_issue.md#reminders-when-modifying-the-api).
If you want to update the openAPI file on the [open-api repository](https://github.com/meilisearch/open-api), see [update-openapi-issue.md](https://github.com/meilisearch/engine-team/blob/main/issue-templates/update-openapi-issue.md).
If you want to update the openAPI file on the [open-api repository](https://github.com/meilisearch/open-api):
- Check out and pull the latest RC branch of Meilisearch: `git checkout release-vX.Y.Z; git pull`
- Start Meilisearch with the `swagger` feature flag: `cargo run --features swagger`
- In a browser, open the following URL: http://localhost:7700/scalar
- Click the « Download openAPI file » button
- Open a PR replacing [this file](https://github.com/meilisearch/open-api/blob/main/open-api.json) with the one downloaded
### Logging
@@ -152,25 +166,37 @@ Some notes on GitHub PRs:
The draft PRs are recommended when you want to show that you are working on something and make your work visible.
- The branch related to the PR must be **up-to-date with `main`** before merging. Fortunately, this project uses [GitHub Merge Queues](https://github.blog/news-insights/product-news/github-merge-queue-is-generally-available/) to automatically enforce this requirement without the PR author having to rebase manually.
## Release Process (for internal team only)
Meilisearch tools follow the [Semantic Versioning Convention](https://semver.org/).
### Automation to rebase and Merge the PRs
## Merging PRs
This project uses GitHub Merge Queues to help us manage pull request merging.
### How to Publish a new Release
Before merging a PR, the maintainer should ensure the following requirements are met:
- Automated tests have been added.
- If some tests cannot be automated, manual rigorous tests should be applied.
- ⚠️ If there is a change in the DB: it's mandatory to manually test the `--experimental-dumpless-upgrade` on a DB of the previous Meilisearch minor version (e.g. v1.13 for the v1.14 release).
- If necessary, the feature has been tested in the Cloud production environment (with [prototypes](./documentation/prototypes.md)) and the Cloud UI is ready.
- If necessary, the [documentation](https://github.com/meilisearch/documentation) related to the implemented feature in the PR is ready.
- If necessary, the [integrations](https://github.com/meilisearch/integration-guides) related to the implemented feature in the PR are ready.
The full Meilisearch release process is described in [this guide](https://github.com/meilisearch/engine-team/blob/main/resources/meilisearch-release.md). Please follow it carefully before doing any release.
## Publish Process (for internal team only)
Meilisearch tools follow the [Semantic Versioning Convention](https://semver.org/).
### How to publish a new release
The full Meilisearch release process is described in [this guide](./documentation/release.md).
### How to publish a prototype
Depending on the developed feature, you might need to provide a prototyped version of Meilisearch to make it easier to test by the users.
This happens in two steps:
- [Release the prototype](https://github.com/meilisearch/engine-team/blob/main/resources/prototypes.md#how-to-publish-a-prototype)
- [Communicate about it](https://github.com/meilisearch/engine-team/blob/main/resources/prototypes.md#communication)
- [Release the prototype](./documentation/prototypes.md#how-to-publish-a-prototype)
- [Communicate about it](./documentation/prototypes.md#communication)
### How to implement and publish an experimental feature
Here is our [guidelines and process](./documentation/experimental-features.md) to implement and publish an experimental feature.
### Release assets

Cargo.lock (generated): file diff suppressed because it is too large


@@ -22,7 +22,7 @@ members = [
]
[workspace.package]
version = "1.15.0"
version = "1.17.0"
authors = [
"Quentin de Quelen <quentin@dequelen.me>",
"Clément Renault <clement@meilisearch.com>",


@@ -119,6 +119,6 @@ Meilisearch is, and will always be, open-source! If you want to contribute to th
Meilisearch releases and their associated binaries are available on the project's [releases page](https://github.com/meilisearch/meilisearch/releases).
The binaries are versioned following [SemVer conventions](https://semver.org/). To know more, read our [versioning policy](https://github.com/meilisearch/engine-team/blob/main/resources/versioning-policy.md).
The binaries are versioned following [SemVer conventions](https://semver.org/). To know more, read our [versioning policy](./documentation/versioning-policy.md).
Differently from the binaries, crates in this repository are not currently available on [crates.io](https://crates.io/) and do not follow [SemVer conventions](https://semver.org).


@@ -11,27 +11,27 @@ edition.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.95"
bumpalo = "3.16.0"
anyhow = "1.0.98"
bumpalo = "3.18.1"
csv = "1.3.1"
memmap2 = "0.9.5"
memmap2 = "0.9.7"
milli = { path = "../milli" }
mimalloc = { version = "0.1.43", default-features = false }
serde_json = { version = "1.0.135", features = ["preserve_order"] }
tempfile = "3.15.0"
mimalloc = { version = "0.1.47", default-features = false }
serde_json = { version = "1.0.140", features = ["preserve_order"] }
tempfile = "3.20.0"
[dev-dependencies]
criterion = { version = "0.5.1", features = ["html_reports"] }
criterion = { version = "0.6.0", features = ["html_reports"] }
rand = "0.8.5"
rand_chacha = "0.3.1"
roaring = "0.10.10"
roaring = "0.10.12"
[build-dependencies]
anyhow = "1.0.95"
bytes = "1.9.0"
convert_case = "0.6.0"
flate2 = "1.0.35"
reqwest = { version = "0.12.12", features = ["blocking", "rustls-tls"], default-features = false }
anyhow = "1.0.98"
bytes = "1.10.1"
convert_case = "0.8.0"
flate2 = "1.1.2"
reqwest = { version = "0.12.20", features = ["blocking", "rustls-tls"], default-features = false }
[features]
default = ["milli/all-tokenizations"]
@@ -51,3 +51,11 @@ harness = false
[[bench]]
name = "indexing"
harness = false
[[bench]]
name = "sort"
harness = false
[[bench]]
name = "filter_starts_with"
harness = false


@@ -0,0 +1,66 @@
mod datasets_paths;
mod utils;
use criterion::{criterion_group, criterion_main};
use milli::update::Settings;
use milli::FilterableAttributesRule;
use utils::Conf;
#[cfg(not(windows))]
#[global_allocator]
static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
fn base_conf(builder: &mut Settings) {
let displayed_fields = ["geonameid", "name"].iter().map(|s| s.to_string()).collect();
builder.set_displayed_fields(displayed_fields);
let filterable_fields =
["name"].iter().map(|s| FilterableAttributesRule::Field(s.to_string())).collect();
builder.set_filterable_fields(filterable_fields);
}
#[rustfmt::skip]
const BASE_CONF: Conf = Conf {
dataset: datasets_paths::SMOL_ALL_COUNTRIES,
dataset_format: "jsonl",
queries: &[
"",
],
configure: base_conf,
primary_key: Some("geonameid"),
..Conf::BASE
};
fn filter_starts_with(c: &mut criterion::Criterion) {
#[rustfmt::skip]
let confs = &[
utils::Conf {
group_name: "1 letter",
filter: Some("name STARTS WITH e"),
..BASE_CONF
},
utils::Conf {
group_name: "2 letters",
filter: Some("name STARTS WITH es"),
..BASE_CONF
},
utils::Conf {
group_name: "3 letters",
filter: Some("name STARTS WITH est"),
..BASE_CONF
},
utils::Conf {
group_name: "6 letters",
filter: Some("name STARTS WITH estoni"),
..BASE_CONF
}
];
utils::run_benches(c, confs);
}
criterion_group!(benches, filter_starts_with);
criterion_main!(benches);
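These benchmarks exercise the newly stabilized `STARTS WITH` operator through the same filter API the harness uses (see `Filter::from_str` in the `utils.rs` diff further down). A minimal sketch:

```rust
use milli::Filter;

fn main() {
    // `from_str` returns Result<Option<Filter>>: Err on invalid syntax,
    // Ok(None) for an empty filter, hence the double unwrap in the benches.
    let filter = Filter::from_str("name STARTS WITH est").unwrap().unwrap();
    let _ = filter; // handed to a search via `search.filter(filter)`
}
```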


@@ -11,7 +11,7 @@ use milli::heed::{EnvOpenOptions, RwTxn};
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings};
use milli::vector::EmbeddingConfigs;
use milli::vector::RuntimeEmbedders;
use milli::{FilterableAttributesRule, Index};
use rand::seq::SliceRandom;
use rand_chacha::rand_core::SeedableRng;
@@ -65,7 +65,7 @@ fn setup_settings<'t>(
let sortable_fields = sortable_fields.iter().map(|s| s.to_string()).collect();
builder.set_sortable_fields(sortable_fields);
builder.execute(|_| (), || false).unwrap();
builder.execute(&|| false, &Progress::default(), Default::default()).unwrap();
}
fn setup_index_with_settings(
@@ -166,9 +166,10 @@ fn indexing_songs_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -232,9 +233,10 @@ fn reindexing_songs_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -276,9 +278,10 @@ fn reindexing_songs_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -344,9 +347,10 @@ fn deleting_songs_in_batches_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -420,9 +424,10 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -464,9 +469,10 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -504,9 +510,10 @@ fn indexing_songs_in_three_batches_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -571,9 +578,10 @@ fn indexing_songs_without_faceted_numbers(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -637,9 +645,10 @@ fn indexing_songs_without_faceted_fields(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -703,9 +712,10 @@ fn indexing_wiki(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -768,9 +778,10 @@ fn reindexing_wiki(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -812,9 +823,10 @@ fn reindexing_wiki(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -879,9 +891,10 @@ fn deleting_wiki_in_batches_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -955,9 +968,10 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1000,9 +1014,10 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1041,9 +1056,10 @@ fn indexing_wiki_in_three_batches(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1107,9 +1123,10 @@ fn indexing_movies_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1172,9 +1189,10 @@ fn reindexing_movies_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1216,9 +1234,10 @@ fn reindexing_movies_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1283,9 +1302,10 @@ fn deleting_movies_in_batches_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1331,9 +1351,10 @@ fn delete_documents_from_ids(index: Index, document_ids_to_delete: Vec<RoaringBi
new_fields_ids_map,
Some(primary_key),
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1395,9 +1416,10 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1439,9 +1461,10 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1479,9 +1502,10 @@ fn indexing_movies_in_three_batches(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1568,9 +1592,10 @@ fn indexing_nested_movies_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1658,9 +1683,10 @@ fn deleting_nested_movies_in_batches_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1740,9 +1766,10 @@ fn indexing_nested_movies_without_faceted_fields(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1806,9 +1833,10 @@ fn indexing_geo(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1871,9 +1899,10 @@ fn reindexing_geo(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1915,9 +1944,10 @@ fn reindexing_geo(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -1982,9 +2012,10 @@ fn deleting_geo_in_batches_default(c: &mut Criterion) {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
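Every indexing benchmark in this file receives the same mechanical update, which is why the diff is long but uniform. Condensed, the new call looks as follows; this shows the shape of the call only, not a compilable unit, and the `indexer::index` name is an assumption since the hunks elide it:

```rust
// Was: EmbeddingConfigs::default(), with no progress handle.
indexer::index(
    /* ...unchanged arguments... */
    new_fields_ids_map,
    primary_key,
    &document_changes,
    RuntimeEmbedders::default(), // replaces EmbeddingConfigs::default()
    &|| false,                   // abort callback
    &Progress::default(),        // new: progress reporting
    &Default::default(),         // new trailing argument
)
.unwrap();
```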


@@ -2,7 +2,8 @@ mod datasets_paths;
mod utils;
use criterion::{criterion_group, criterion_main};
use milli::{update::Settings, FilterableAttributesRule};
use milli::update::Settings;
use milli::FilterableAttributesRule;
use utils::Conf;
#[cfg(not(windows))]


@@ -2,7 +2,8 @@ mod datasets_paths;
mod utils;
use criterion::{criterion_group, criterion_main};
use milli::{update::Settings, FilterableAttributesRule};
use milli::update::Settings;
use milli::FilterableAttributesRule;
use utils::Conf;
#[cfg(not(windows))]


@@ -0,0 +1,114 @@
//! This benchmark module is used to compare the performance of sorting documents in /search VS /documents
//!
//! The tests/benchmarks were designed in the context of a query returning only 20 documents.
mod datasets_paths;
mod utils;
use criterion::{criterion_group, criterion_main};
use milli::update::Settings;
use utils::Conf;
#[cfg(not(windows))]
#[global_allocator]
static ALLOC: mimalloc::MiMalloc = mimalloc::MiMalloc;
fn base_conf(builder: &mut Settings) {
let displayed_fields =
["geonameid", "name", "asciiname", "alternatenames", "_geo", "population"]
.iter()
.map(|s| s.to_string())
.collect();
builder.set_displayed_fields(displayed_fields);
let sortable_fields =
["_geo", "name", "population", "elevation", "timezone", "modification-date"]
.iter()
.map(|s| s.to_string())
.collect();
builder.set_sortable_fields(sortable_fields);
}
#[rustfmt::skip]
const BASE_CONF: Conf = Conf {
dataset: datasets_paths::SMOL_ALL_COUNTRIES,
dataset_format: "jsonl",
configure: base_conf,
primary_key: Some("geonameid"),
queries: &[""],
offsets: &[
Some((0, 20)), // The most common query in the real world
Some((0, 500)), // A query that ranges over many documents
Some((980, 20)), // The worst query that could happen in the real world
Some((800_000, 20)) // The worst query
],
get_documents: true,
..Conf::BASE
};
fn bench_sort(c: &mut criterion::Criterion) {
#[rustfmt::skip]
let confs = &[
utils::Conf {
group_name: "without sort",
sort: None,
..BASE_CONF
},
utils::Conf {
group_name: "sort on many different values",
sort: Some(vec!["name:asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many similar values",
sort: Some(vec!["timezone:desc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many similar then different values",
sort: Some(vec!["timezone:desc", "name:asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many different then similar values",
sort: Some(vec!["timezone:desc", "name:asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "geo sort",
sample_size: Some(10),
sort: Some(vec!["_geoPoint(45.4777599, 9.1967508):asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many similar values then geo sort",
sample_size: Some(50),
sort: Some(vec!["timezone:desc", "_geoPoint(45.4777599, 9.1967508):asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many different values then geo sort",
sample_size: Some(50),
sort: Some(vec!["name:desc", "_geoPoint(45.4777599, 9.1967508):asc"]),
..BASE_CONF
},
utils::Conf {
group_name: "sort on many fields",
sort: Some(vec!["population:asc", "name:asc", "elevation:asc", "timezone:asc"]),
..BASE_CONF
},
];
utils::run_benches(c, confs);
}
criterion_group!(benches, bench_sort);
criterion_main!(benches);


@@ -9,11 +9,12 @@ use anyhow::Context;
use bumpalo::Bump;
use criterion::BenchmarkId;
use memmap2::Mmap;
use milli::documents::sort::recursive_sort;
use milli::heed::EnvOpenOptions;
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::{IndexerConfig, Settings};
use milli::vector::EmbeddingConfigs;
use milli::vector::RuntimeEmbedders;
use milli::{Criterion, Filter, Index, Object, TermsMatchingStrategy};
use serde_json::Value;
@@ -35,6 +36,12 @@ pub struct Conf<'a> {
pub configure: fn(&mut Settings),
pub filter: Option<&'a str>,
pub sort: Option<Vec<&'a str>>,
/// pagination windows to bench, as (offset, limit) pairs; `None` benches without pagination
pub offsets: &'a [Option<(usize, usize)>],
/// enable if you want to bench getting documents without querying
pub get_documents: bool,
/// configure the benchmark sample size
pub sample_size: Option<usize>,
/// enable or disable the optional words on the query
pub optional_words: bool,
/// primary key; if there is `None` we'll auto-generate docids for every document
@@ -52,6 +59,9 @@ impl Conf<'_> {
configure: |_| (),
filter: None,
sort: None,
offsets: &[None],
get_documents: false,
sample_size: None,
optional_words: true,
primary_key: None,
};
@@ -90,7 +100,7 @@ pub fn base_setup(conf: &Conf) -> Index {
(conf.configure)(&mut builder);
builder.execute(|_| (), || false).unwrap();
builder.execute(&|| false, &Progress::default(), Default::default()).unwrap();
wtxn.commit().unwrap();
let config = IndexerConfig::default();
@@ -125,9 +135,10 @@ pub fn base_setup(conf: &Conf) -> Index {
new_fields_ids_map,
primary_key,
&document_changes,
EmbeddingConfigs::default(),
RuntimeEmbedders::default(),
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();
@@ -144,25 +155,79 @@ pub fn run_benches(c: &mut criterion::Criterion, confs: &[Conf]) {
let file_name = Path::new(conf.dataset).file_name().and_then(|f| f.to_str()).unwrap();
let name = format!("{}: {}", file_name, conf.group_name);
let mut group = c.benchmark_group(&name);
if let Some(sample_size) = conf.sample_size {
group.sample_size(sample_size);
}
for &query in conf.queries {
group.bench_with_input(BenchmarkId::from_parameter(query), &query, |b, &query| {
b.iter(|| {
let rtxn = index.read_txn().unwrap();
let mut search = index.search(&rtxn);
search.query(query).terms_matching_strategy(TermsMatchingStrategy::default());
if let Some(filter) = conf.filter {
let filter = Filter::from_str(filter).unwrap().unwrap();
search.filter(filter);
}
if let Some(sort) = &conf.sort {
let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
search.sort_criteria(sort);
}
let _ids = search.execute().unwrap();
});
});
for offset in conf.offsets {
let parameter = match offset {
None => query.to_string(),
Some((offset, limit)) => format!("{query}[{offset}:{limit}]"),
};
group.bench_with_input(
BenchmarkId::from_parameter(parameter),
&query,
|b, &query| {
b.iter(|| {
let rtxn = index.read_txn().unwrap();
let mut search = index.search(&rtxn);
search
.query(query)
.terms_matching_strategy(TermsMatchingStrategy::default());
if let Some(filter) = conf.filter {
let filter = Filter::from_str(filter).unwrap().unwrap();
search.filter(filter);
}
if let Some(sort) = &conf.sort {
let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
search.sort_criteria(sort);
}
if let Some((offset, limit)) = offset {
search.offset(*offset).limit(*limit);
}
let _ids = search.execute().unwrap();
});
},
);
}
}
if conf.get_documents {
for offset in conf.offsets {
let parameter = match offset {
None => String::from("get_documents"),
Some((offset, limit)) => format!("get_documents[{offset}:{limit}]"),
};
group.bench_with_input(BenchmarkId::from_parameter(parameter), &(), |b, &()| {
b.iter(|| {
let rtxn = index.read_txn().unwrap();
if let Some(sort) = &conf.sort {
let sort = sort.iter().map(|sort| sort.parse().unwrap()).collect();
let all_docs = index.documents_ids(&rtxn).unwrap();
let facet_sort =
recursive_sort(&index, &rtxn, sort, &all_docs).unwrap();
let iter = facet_sort.iter().unwrap();
if let Some((offset, limit)) = offset {
let _results = iter.skip(*offset).take(*limit).collect::<Vec<_>>();
} else {
let _results = iter.collect::<Vec<_>>();
}
} else {
let all_docs = index.documents_ids(&rtxn).unwrap();
if let Some((offset, limit)) = offset {
let _results =
all_docs.iter().skip(*offset).take(*limit).collect::<Vec<_>>();
} else {
let _results = all_docs.iter().collect::<Vec<_>>();
}
}
});
});
}
}
group.finish();
index.prepare_for_closing().wait();
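The new `get_documents` branch fetches documents in facet-sort order without running a search. Condensed into a helper, as a sketch only: `anyhow` for error plumbing and `AscDesc` as the criterion type are assumptions, since the diff parses criteria with an untyped `sort.parse()`.

```rust
use milli::documents::sort::recursive_sort;
use milli::{AscDesc, Index};

// Page through all documents of `index` in `sort` order, like the
// `get_documents[offset:limit]` benchmark parameters do.
fn sorted_page(
    index: &Index,
    sort: Vec<AscDesc>,
    offset: usize,
    limit: usize,
) -> anyhow::Result<usize> {
    let rtxn = index.read_txn()?;
    let all_docs = index.documents_ids(&rtxn)?;
    let facet_sort = recursive_sort(index, &rtxn, sort, &all_docs)?;
    // Count instead of collecting: the item type is an internal detail here.
    Ok(facet_sort.iter()?.skip(offset).take(limit).count())
}
```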


@@ -67,7 +67,7 @@ fn main() -> anyhow::Result<()> {
writeln!(
&mut manifest_paths_file,
r#"pub const {}: &str = {:?};"#,
dataset.to_case(Case::ScreamingSnake),
dataset.to_case(Case::UpperSnake),
out_file.display(),
)?;


@@ -11,8 +11,8 @@ license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
time = { version = "0.3.37", features = ["parsing"] }
time = { version = "0.3.41", features = ["parsing"] }
[build-dependencies]
anyhow = "1.0.95"
vergen-git2 = "1.0.2"
anyhow = "1.0.98"
vergen-git2 = "1.0.7"


@@ -11,21 +11,21 @@ readme.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.95"
flate2 = "1.0.35"
http = "1.2.0"
anyhow = "1.0.98"
flate2 = "1.1.2"
http = "1.3.1"
meilisearch-types = { path = "../meilisearch-types" }
once_cell = "1.20.2"
once_cell = "1.21.3"
regex = "1.11.1"
roaring = { version = "0.10.10", features = ["serde"] }
serde = { version = "1.0.217", features = ["derive"] }
serde_json = { version = "1.0.135", features = ["preserve_order"] }
tar = "0.4.43"
tempfile = "3.15.0"
thiserror = "2.0.9"
time = { version = "0.3.37", features = ["serde-well-known", "formatting", "parsing", "macros"] }
roaring = { version = "0.10.12", features = ["serde"] }
serde = { version = "1.0.219", features = ["derive"] }
serde_json = { version = "1.0.140", features = ["preserve_order"] }
tar = "0.4.44"
tempfile = "3.20.0"
thiserror = "2.0.12"
time = { version = "0.3.41", features = ["serde-well-known", "formatting", "parsing", "macros"] }
tracing = "0.1.41"
uuid = { version = "1.11.0", features = ["serde", "v4"] }
uuid = { version = "1.17.0", features = ["serde", "v4"] }
[dev-dependencies]
big_s = "1.0.2"


@@ -1,12 +1,17 @@
#![allow(clippy::type_complexity)]
#![allow(clippy::wrong_self_convention)]
use std::collections::BTreeMap;
use meilisearch_types::batches::BatchId;
use meilisearch_types::byte_unit::Byte;
use meilisearch_types::error::ResponseError;
use meilisearch_types::keys::Key;
use meilisearch_types::milli::update::IndexDocumentsMethod;
use meilisearch_types::settings::Unchecked;
use meilisearch_types::tasks::{Details, IndexSwap, KindWithContent, Status, Task, TaskId};
use meilisearch_types::tasks::{
Details, ExportIndexSettings, IndexSwap, KindWithContent, Status, Task, TaskId,
};
use meilisearch_types::InstanceUid;
use roaring::RoaringBitmap;
use serde::{Deserialize, Serialize};
@@ -141,6 +146,12 @@ pub enum KindDump {
instance_uid: Option<InstanceUid>,
},
SnapshotCreation,
Export {
url: String,
api_key: Option<String>,
payload_size: Option<Byte>,
indexes: BTreeMap<String, ExportIndexSettings>,
},
UpgradeDatabase {
from: (u32, u32, u32),
},
@@ -213,6 +224,15 @@ impl From<KindWithContent> for KindDump {
KindDump::DumpCreation { keys, instance_uid }
}
KindWithContent::SnapshotCreation => KindDump::SnapshotCreation,
KindWithContent::Export { url, api_key, payload_size, indexes } => KindDump::Export {
url,
api_key,
payload_size,
indexes: indexes
.into_iter()
.map(|(pattern, settings)| (pattern.to_string(), settings))
.collect(),
},
KindWithContent::UpgradeDatabase { from: version } => {
KindDump::UpgradeDatabase { from: version }
}
@@ -305,6 +325,7 @@ pub(crate) mod test {
localized_attributes: Setting::NotSet,
facet_search: Setting::NotSet,
prefix_search: Setting::NotSet,
chat: Setting::NotSet,
_kind: std::marker::PhantomData,
};
settings.check()
@@ -328,6 +349,7 @@ pub(crate) mod test {
write_channel_congestion: None,
internal_database_sizes: Default::default(),
},
embedder_stats: Default::default(),
enqueued_at: Some(BatchEnqueuedAt {
earliest: datetime!(2022-11-11 0:00 UTC),
oldest: datetime!(2022-11-11 0:00 UTC),
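Back to the new `Export` variant introduced at the top of this file's diff: for illustration, a value of the new kind might be built as below. The field names come from the diff; the shape of `ExportIndexSettings` is not shown here, so the map is left empty.

```rust
// Inside this crate, hypothetical construction of the new task kind
// as it would be serialized into a dump.
let kind = KindDump::Export {
    url: "http://localhost:7700".to_string(),
    api_key: None,
    payload_size: None,
    indexes: std::collections::BTreeMap::new(),
};
```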


@@ -1,3 +1,4 @@
use std::fs::File;
use std::str::FromStr;
use super::v2_to_v3::CompatV2ToV3;
@@ -94,6 +95,10 @@ impl CompatIndexV1ToV2 {
self.from.documents().map(|it| Box::new(it) as Box<dyn Iterator<Item = _>>)
}
pub fn documents_file(&self) -> &File {
self.from.documents_file()
}
pub fn settings(&mut self) -> Result<v2::settings::Settings<v2::settings::Checked>> {
Ok(v2::settings::Settings::<v2::settings::Unchecked>::from(self.from.settings()?).check())
}


@@ -1,3 +1,4 @@
use std::fs::File;
use std::str::FromStr;
use time::OffsetDateTime;
@@ -122,6 +123,13 @@ impl CompatIndexV2ToV3 {
}
}
pub fn documents_file(&self) -> &File {
match self {
CompatIndexV2ToV3::V2(v2) => v2.documents_file(),
CompatIndexV2ToV3::Compat(compat) => compat.documents_file(),
}
}
pub fn settings(&mut self) -> Result<v3::Settings<v3::Checked>> {
let settings = match self {
CompatIndexV2ToV3::V2(from) => from.settings()?,


@@ -1,3 +1,5 @@
use std::fs::File;
use super::v2_to_v3::{CompatIndexV2ToV3, CompatV2ToV3};
use super::v4_to_v5::CompatV4ToV5;
use crate::reader::{v3, v4, UpdateFile};
@@ -252,6 +254,13 @@ impl CompatIndexV3ToV4 {
}
}
pub fn documents_file(&self) -> &File {
match self {
CompatIndexV3ToV4::V3(v3) => v3.documents_file(),
CompatIndexV3ToV4::Compat(compat) => compat.documents_file(),
}
}
pub fn settings(&mut self) -> Result<v4::Settings<v4::Checked>> {
Ok(match self {
CompatIndexV3ToV4::V3(v3) => {


@@ -1,3 +1,5 @@
use std::fs::File;
use super::v3_to_v4::{CompatIndexV3ToV4, CompatV3ToV4};
use super::v5_to_v6::CompatV5ToV6;
use crate::reader::{v4, v5, Document};
@@ -241,6 +243,13 @@ impl CompatIndexV4ToV5 {
}
}
pub fn documents_file(&self) -> &File {
match self {
CompatIndexV4ToV5::V4(v4) => v4.documents_file(),
CompatIndexV4ToV5::Compat(compat) => compat.documents_file(),
}
}
pub fn settings(&mut self) -> Result<v5::Settings<v5::Checked>> {
match self {
CompatIndexV4ToV5::V4(v4) => Ok(v5::Settings::from(v4.settings()?).check()),


@@ -1,3 +1,5 @@
use std::fs::File;
use std::num::NonZeroUsize;
use std::str::FromStr;
use super::v4_to_v5::{CompatIndexV4ToV5, CompatV4ToV5};
@@ -200,6 +202,10 @@ impl CompatV5ToV6 {
pub fn network(&self) -> Result<Option<&v6::Network>> {
Ok(None)
}
pub fn webhooks(&self) -> Option<&v6::Webhooks> {
None
}
}
pub enum CompatIndexV5ToV6 {
@@ -242,6 +248,13 @@ impl CompatIndexV5ToV6 {
}
}
pub fn documents_file(&self) -> &File {
match self {
CompatIndexV5ToV6::V5(v5) => v5.documents_file(),
CompatIndexV5ToV6::Compat(compat) => compat.documents_file(),
}
}
pub fn settings(&mut self) -> Result<v6::Settings<v6::Checked>> {
match self {
CompatIndexV5ToV6::V5(v5) => Ok(v6::Settings::from(v5.settings()?).check()),
@@ -388,7 +401,13 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
},
pagination: match settings.pagination {
v5::Setting::Set(pagination) => v6::Setting::Set(v6::PaginationSettings {
max_total_hits: pagination.max_total_hits.into(),
max_total_hits: match pagination.max_total_hits {
v5::Setting::Set(max_total_hits) => v6::Setting::Set(
max_total_hits.try_into().unwrap_or(NonZeroUsize::new(1).unwrap()),
),
v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet,
},
}),
v5::Setting::Reset => v6::Setting::Reset,
v5::Setting::NotSet => v6::Setting::NotSet,
@@ -398,6 +417,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
search_cutoff_ms: v6::Setting::NotSet,
facet_search: v6::Setting::NotSet,
prefix_search: v6::Setting::NotSet,
chat: v6::Setting::NotSet,
_kind: std::marker::PhantomData,
}
}
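The `max_total_hits` conversion in the previous hunk maps a plain v5 `usize` into the `NonZeroUsize` v6 expects, clamping an invalid stored 0 to 1 instead of failing the import. The standard-library behavior it leans on:

```rust
use std::num::NonZeroUsize;

fn main() {
    // TryFrom<usize> for NonZeroUsize fails exactly on zero, so
    // `unwrap_or(...)` turns a stored 0 into 1 instead of panicking.
    let stored: usize = 0;
    let converted: NonZeroUsize =
        stored.try_into().unwrap_or(NonZeroUsize::new(1).unwrap());
    assert_eq!(converted.get(), 1);
}
```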


@@ -116,6 +116,15 @@ impl DumpReader {
}
}
pub fn chat_completions_settings(
&mut self,
) -> Result<Box<dyn Iterator<Item = Result<(String, v6::ChatCompletionSettings)>> + '_>> {
match self {
DumpReader::Current(current) => current.chat_completions_settings(),
DumpReader::Compat(_compat) => Ok(Box::new(std::iter::empty())),
}
}
pub fn features(&self) -> Result<Option<v6::RuntimeTogglableFeatures>> {
match self {
DumpReader::Current(current) => Ok(current.features()),
@@ -129,6 +138,13 @@ impl DumpReader {
DumpReader::Compat(compat) => compat.network(),
}
}
pub fn webhooks(&self) -> Option<&v6::Webhooks> {
match self {
DumpReader::Current(current) => current.webhooks(),
DumpReader::Compat(compat) => compat.webhooks(),
}
}
}
impl From<V6Reader> for DumpReader {
@@ -183,6 +199,14 @@ impl DumpIndexReader {
}
}
/// A reference to a file in the NDJSON format containing all the documents of the index
pub fn documents_file(&self) -> &File {
match self {
DumpIndexReader::Current(v6) => v6.documents_file(),
DumpIndexReader::Compat(compat) => compat.documents_file(),
}
}
pub fn settings(&mut self) -> Result<v6::Settings<v6::Checked>> {
match self {
DumpIndexReader::Current(v6) => v6.settings(),
@@ -348,6 +372,7 @@ pub(crate) mod test {
assert_eq!(dump.features().unwrap().unwrap(), RuntimeTogglableFeatures::default());
assert_eq!(dump.network().unwrap(), None);
assert_eq!(dump.webhooks(), None);
}
#[test]
@@ -418,6 +443,43 @@ pub(crate) mod test {
insta::assert_snapshot!(network.remotes.get("ms-2").as_ref().unwrap().search_api_key.as_ref().unwrap(), @"foo");
}
#[test]
fn import_dump_v6_webhooks() {
let dump = File::open("tests/assets/v6-with-webhooks.dump").unwrap();
let dump = DumpReader::open(dump).unwrap();
// top level infos
insta::assert_snapshot!(dump.date().unwrap(), @"2025-07-31 9:21:30.479544 +00:00:00");
insta::assert_debug_snapshot!(dump.instance_uid().unwrap(), @r"
Some(
cb887dcc-34b3-48d1-addd-9815ae721a81,
)
");
// webhooks
let webhooks = dump.webhooks().unwrap();
insta::assert_json_snapshot!(webhooks, @r#"
{
"webhooks": {
"627ea538-733d-4545-8d2d-03526eb381ce": {
"url": "https://example.com/authorization-less",
"headers": {}
},
"771b0a28-ef28-4082-b984-536f82958c65": {
"url": "https://example.com/hook",
"headers": {
"authorization": "TOKEN"
}
},
"f3583083-f8a7-4cbf-a5e7-fb3f1e28a7e9": {
"url": "https://third.com",
"headers": {}
}
}
}
"#);
}
#[test]
fn import_dump_v5() {
let dump = File::open("tests/assets/v5.dump").unwrap();


@@ -72,6 +72,10 @@ impl V1IndexReader {
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn documents_file(&self) -> &File {
self.documents.get_ref()
}
pub fn settings(&mut self) -> Result<self::settings::Settings> {
Ok(serde_json::from_reader(&mut self.settings)?)
}


@@ -203,6 +203,10 @@ impl V2IndexReader {
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn documents_file(&self) -> &File {
self.documents.get_ref()
}
pub fn settings(&mut self) -> Result<Settings<Checked>> {
Ok(self.settings.clone())
}


@@ -215,6 +215,10 @@ impl V3IndexReader {
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn documents_file(&self) -> &File {
self.documents.get_ref()
}
pub fn settings(&mut self) -> Result<Settings<Checked>> {
Ok(self.settings.clone())
}


@@ -210,6 +210,10 @@ impl V4IndexReader {
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn documents_file(&self) -> &File {
self.documents.get_ref()
}
pub fn settings(&mut self) -> Result<Settings<Checked>> {
Ok(self.settings.clone())
}


@@ -247,6 +247,10 @@ impl V5IndexReader {
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn documents_file(&self) -> &File {
self.documents.get_ref()
}
pub fn settings(&mut self) -> Result<Settings<Checked>> {
Ok(self.settings.clone())
}


@@ -1,3 +1,4 @@
use std::ffi::OsStr;
use std::fs::{self, File};
use std::io::{BufRead, BufReader, ErrorKind};
use std::path::Path;
@@ -21,8 +22,10 @@ pub type Unchecked = meilisearch_types::settings::Unchecked;
pub type Task = crate::TaskDump;
pub type Batch = meilisearch_types::batches::Batch;
pub type Key = meilisearch_types::keys::Key;
pub type ChatCompletionSettings = meilisearch_types::features::ChatCompletionSettings;
pub type RuntimeTogglableFeatures = meilisearch_types::features::RuntimeTogglableFeatures;
pub type Network = meilisearch_types::features::Network;
pub type Webhooks = meilisearch_types::webhooks::WebhooksDumpView;
// ===== Other types to clarify the code of the compat module
// everything related to the tasks
@@ -57,6 +60,7 @@ pub struct V6Reader {
keys: BufReader<File>,
features: Option<RuntimeTogglableFeatures>,
network: Option<Network>,
webhooks: Option<Webhooks>,
}
impl V6Reader {
@@ -91,8 +95,8 @@ impl V6Reader {
Err(e) => return Err(e.into()),
};
let network_file = match fs::read(dump.path().join("network.json")) {
Ok(network_file) => Some(network_file),
let network = match fs::read(dump.path().join("network.json")) {
Ok(network_file) => Some(serde_json::from_reader(&*network_file)?),
Err(error) => match error.kind() {
// Allow the file to be missing; this will only result in all experimental features disabled.
ErrorKind::NotFound => {
@@ -102,10 +106,16 @@ impl V6Reader {
_ => return Err(error.into()),
},
};
let network = if let Some(network_file) = network_file {
Some(serde_json::from_reader(&*network_file)?)
} else {
None
let webhooks = match fs::read(dump.path().join("webhooks.json")) {
Ok(webhooks_file) => Some(serde_json::from_reader(&*webhooks_file)?),
Err(error) => match error.kind() {
ErrorKind::NotFound => {
debug!("`webhooks.json` not found in dump");
None
}
_ => return Err(error.into()),
},
};
Ok(V6Reader {
@@ -117,6 +127,7 @@ impl V6Reader {
features,
network,
dump,
webhooks,
})
}
@@ -192,6 +203,34 @@ impl V6Reader {
)
}
pub fn chat_completions_settings(
&mut self,
) -> Result<Box<dyn Iterator<Item = Result<(String, ChatCompletionSettings)>> + '_>> {
let entries = match fs::read_dir(self.dump.path().join("chat-completions-settings")) {
Ok(entries) => entries,
Err(e) if e.kind() == ErrorKind::NotFound => return Ok(Box::new(std::iter::empty())),
Err(e) => return Err(e.into()),
};
Ok(Box::new(
entries
.map(|entry| -> Result<Option<_>> {
let entry = entry?;
let file_name = entry.file_name();
let path = Path::new(&file_name);
if entry.file_type()?.is_file() && path.extension() == Some(OsStr::new("json"))
{
let name = path.file_stem().unwrap().to_str().unwrap().to_string();
let file = File::open(entry.path())?;
let settings = serde_json::from_reader(file)?;
Ok(Some((name, settings)))
} else {
Ok(None)
}
})
.filter_map(|entry| entry.transpose()),
))
}
pub fn features(&self) -> Option<RuntimeTogglableFeatures> {
self.features
}
@@ -199,6 +238,10 @@ impl V6Reader {
pub fn network(&self) -> Option<&Network> {
self.network.as_ref()
}
pub fn webhooks(&self) -> Option<&Webhooks> {
self.webhooks.as_ref()
}
}
pub struct UpdateFile {
@@ -254,6 +297,10 @@ impl V6IndexReader {
.map(|line| -> Result<_> { Ok(serde_json::from_str(&line?)?) }))
}
pub fn documents_file(&self) -> &File {
self.documents.get_ref()
}
pub fn settings(&mut self) -> Result<Settings<Checked>> {
let mut settings: Settings<Unchecked> = serde_json::from_reader(&mut self.settings)?;
patch_embedders(&mut settings);


@@ -5,9 +5,10 @@ use std::path::PathBuf;
use flate2::write::GzEncoder;
use flate2::Compression;
use meilisearch_types::batches::Batch;
use meilisearch_types::features::{Network, RuntimeTogglableFeatures};
use meilisearch_types::features::{ChatCompletionSettings, Network, RuntimeTogglableFeatures};
use meilisearch_types::keys::Key;
use meilisearch_types::settings::{Checked, Settings};
use meilisearch_types::webhooks::WebhooksDumpView;
use serde_json::{Map, Value};
use tempfile::TempDir;
use time::OffsetDateTime;
@@ -51,6 +52,10 @@ impl DumpWriter {
KeyWriter::new(self.dir.path().to_path_buf())
}
pub fn create_chat_completions_settings(&self) -> Result<ChatCompletionsSettingsWriter> {
ChatCompletionsSettingsWriter::new(self.dir.path().join("chat-completions-settings"))
}
pub fn create_tasks_queue(&self) -> Result<TaskWriter> {
TaskWriter::new(self.dir.path().join("tasks"))
}
@@ -70,6 +75,13 @@ impl DumpWriter {
Ok(std::fs::write(self.dir.path().join("network.json"), serde_json::to_string(&network)?)?)
}
pub fn create_webhooks(&self, webhooks: WebhooksDumpView) -> Result<()> {
Ok(std::fs::write(
self.dir.path().join("webhooks.json"),
serde_json::to_string(&webhooks)?,
)?)
}
pub fn persist_to(self, mut writer: impl Write) -> Result<()> {
let gz_encoder = GzEncoder::new(&mut writer, Compression::default());
let mut tar_encoder = tar::Builder::new(gz_encoder);
@@ -104,6 +116,24 @@ impl KeyWriter {
}
}
pub struct ChatCompletionsSettingsWriter {
path: PathBuf,
}
impl ChatCompletionsSettingsWriter {
pub(crate) fn new(path: PathBuf) -> Result<Self> {
std::fs::create_dir(&path)?;
Ok(ChatCompletionsSettingsWriter { path })
}
pub fn push_settings(&mut self, name: &str, settings: &ChatCompletionSettings) -> Result<()> {
let mut settings_file = File::create(self.path.join(name).with_extension("json"))?;
serde_json::to_writer(&mut settings_file, &settings)?;
settings_file.flush()?;
Ok(())
}
}
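For context, a sketch of how a per-workspace writer like this is driven; `MySettings` is a stand-in for `ChatCompletionSettings`, which lives in meilisearch-types:

use std::fs::File;
use std::io::Write;
use std::path::PathBuf;

use serde::Serialize;

// Stand-in for ChatCompletionSettings (meilisearch-types).
#[derive(Serialize)]
struct MySettings {
    prompt: String,
}

fn write_workspace_settings(dir: PathBuf) -> std::io::Result<()> {
    // Mirrors ChatCompletionsSettingsWriter::new: one directory for the section...
    std::fs::create_dir(&dir)?;
    let settings = MySettings { prompt: "You are a helpful assistant".into() };
    // ...and one `<workspace>.json` file per chat workspace, as in push_settings.
    let mut file = File::create(dir.join("default").with_extension("json"))?;
    serde_json::to_writer(&mut file, &settings)?;
    file.flush()
}

fn main() -> std::io::Result<()> {
    write_workspace_settings(std::env::temp_dir().join("chat-completions-settings"))
}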
pub struct TaskWriter {
queue: BufWriter<File>,
update_files: PathBuf,

Binary file not shown.

View File

@@ -11,7 +11,7 @@ edition.workspace = true
license.workspace = true
[dependencies]
tempfile = "3.15.0"
thiserror = "2.0.9"
tempfile = "3.20.0"
thiserror = "2.0.12"
tracing = "0.1.41"
uuid = { version = "1.11.0", features = ["serde", "v4"] }
uuid = { version = "1.17.0", features = ["serde", "v4"] }

View File

@@ -14,7 +14,7 @@ license.workspace = true
[dependencies]
nom = "7.1.3"
nom_locate = "4.2.0"
unescaper = "0.1.5"
unescaper = "0.1.6"
[dev-dependencies]
# fixed version due to format breakages in v1.40

View File

@@ -165,9 +165,9 @@ impl<'a> FilterCondition<'a> {
| Condition::Exists
| Condition::LowerThan(_)
| Condition::LowerThanOrEqual(_)
| Condition::Between { .. } => None,
Condition::Contains { keyword, word: _ }
| Condition::StartsWith { keyword, word: _ } => Some(keyword),
| Condition::Between { .. }
| Condition::StartsWith { .. } => None,
Condition::Contains { keyword, word: _ } => Some(keyword),
},
FilterCondition::Not(this) => this.use_contains_operator(),
FilterCondition::Or(seq) | FilterCondition::And(seq) => {

View File

@@ -16,7 +16,7 @@ license.workspace = true
serde_json = "1.0"
[dev-dependencies]
criterion = { version = "0.5.1", features = ["html_reports"] }
criterion = { version = "0.6.0", features = ["html_reports"] }
[[bench]]
name = "benchmarks"

View File

@@ -12,11 +12,11 @@ license.workspace = true
[dependencies]
arbitrary = { version = "1.4.1", features = ["derive"] }
bumpalo = "3.16.0"
clap = { version = "4.5.24", features = ["derive"] }
either = "1.13.0"
bumpalo = "3.18.1"
clap = { version = "4.5.40", features = ["derive"] }
either = "1.15.0"
fastrand = "2.3.0"
milli = { path = "../milli" }
serde = { version = "1.0.217", features = ["derive"] }
serde_json = { version = "1.0.135", features = ["preserve_order"] }
tempfile = "3.15.0"
serde = { version = "1.0.219", features = ["derive"] }
serde_json = { version = "1.0.140", features = ["preserve_order"] }
tempfile = "3.20.0"

View File

@@ -13,7 +13,7 @@ use milli::heed::EnvOpenOptions;
use milli::progress::Progress;
use milli::update::new::indexer;
use milli::update::IndexerConfig;
use milli::vector::EmbeddingConfigs;
use milli::vector::RuntimeEmbedders;
use milli::Index;
use serde_json::Value;
use tempfile::TempDir;
@@ -89,7 +89,7 @@ fn main() {
let mut new_fields_ids_map = db_fields_ids_map.clone();
let indexer_alloc = Bump::new();
let embedders = EmbeddingConfigs::default();
let embedders = RuntimeEmbedders::default();
let mut indexer = indexer::DocumentOperation::new();
let mut operations = Vec::new();
@@ -144,6 +144,7 @@ fn main() {
embedders,
&|| false,
&Progress::default(),
&Default::default(),
)
.unwrap();

View File

@@ -11,31 +11,31 @@ edition.workspace = true
license.workspace = true
[dependencies]
anyhow = "1.0.95"
anyhow = "1.0.98"
bincode = "1.3.3"
byte-unit = "5.1.6"
bumpalo = "3.16.0"
bumpalo = "3.18.1"
bumparaw-collections = "0.1.4"
convert_case = "0.6.0"
convert_case = "0.8.0"
csv = "1.3.1"
derive_builder = "0.20.2"
dump = { path = "../dump" }
enum-iterator = "2.1.0"
file-store = { path = "../file-store" }
flate2 = "1.0.35"
indexmap = "2.7.0"
flate2 = "1.1.2"
indexmap = "2.9.0"
meilisearch-auth = { path = "../meilisearch-auth" }
meilisearch-types = { path = "../meilisearch-types" }
memmap2 = "0.9.5"
memmap2 = "0.9.7"
page_size = "0.6.0"
rayon = "1.10.0"
roaring = { version = "0.10.10", features = ["serde"] }
serde = { version = "1.0.217", features = ["derive"] }
serde_json = { version = "1.0.138", features = ["preserve_order"] }
roaring = { version = "0.10.12", features = ["serde"] }
serde = { version = "1.0.219", features = ["derive"] }
serde_json = { version = "1.0.140", features = ["preserve_order"] }
synchronoise = "1.0.1"
tempfile = "3.15.0"
thiserror = "2.0.9"
time = { version = "0.3.37", features = [
tempfile = "3.20.0"
thiserror = "2.0.12"
time = { version = "0.3.41", features = [
"serde-well-known",
"formatting",
"parsing",
@@ -43,7 +43,8 @@ time = { version = "0.3.37", features = [
] }
tracing = "0.1.41"
ureq = "2.12.1"
uuid = { version = "1.11.0", features = ["serde", "v4"] }
uuid = { version = "1.17.0", features = ["serde", "v4"] }
backoff = "0.4.0"
[dev-dependencies]
big_s = "1.0.2"

View File

@@ -4,6 +4,7 @@ use std::io;
use dump::{KindDump, TaskDump, UpdateFile};
use meilisearch_types::batches::{Batch, BatchId};
use meilisearch_types::heed::RwTxn;
use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::milli;
use meilisearch_types::tasks::{Kind, KindWithContent, Status, Task};
use roaring::RoaringBitmap;
@@ -211,6 +212,23 @@ impl<'a> Dump<'a> {
KindWithContent::DumpCreation { keys, instance_uid }
}
KindDump::SnapshotCreation => KindWithContent::SnapshotCreation,
KindDump::Export { url, api_key, payload_size, indexes } => {
KindWithContent::Export {
url,
api_key,
payload_size,
indexes: indexes
.into_iter()
.map(|(pattern, settings)| {
Ok((
IndexUidPattern::try_from(pattern)
.map_err(|_| Error::CorruptedDump)?,
settings,
))
})
.collect::<Result<_, Error>>()?,
}
}
KindDump::UpgradeDatabase { from } => KindWithContent::UpgradeDatabase { from },
},
};
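The `Export` arm above validates every index pattern and fails the whole dump import on the first invalid one. The same short-circuiting `collect` works on any map of fallible entries; a standalone sketch with simplified stand-ins for `IndexUidPattern` and `Error::CorruptedDump`:

use std::collections::BTreeMap;

#[derive(Debug)]
struct BadPattern; // stand-in for Error::CorruptedDump

// Stand-in for IndexUidPattern::try_from: accept alphanumerics, `-`, `_` and `*`.
fn validate(pattern: &str) -> Result<String, BadPattern> {
    if !pattern.is_empty()
        && pattern.chars().all(|c| c.is_ascii_alphanumeric() || "-_*".contains(c))
    {
        Ok(pattern.to_string())
    } else {
        Err(BadPattern)
    }
}

// The first invalid pattern short-circuits the whole `collect`.
fn convert(indexes: BTreeMap<String, u32>) -> Result<BTreeMap<String, u32>, BadPattern> {
    indexes
        .into_iter()
        .map(|(pattern, settings)| Ok((validate(&pattern)?, settings)))
        .collect()
}

fn main() {
    let mut indexes = BTreeMap::new();
    indexes.insert("movies-*".to_string(), 1);
    assert!(convert(indexes).is_ok());
}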

View File

@@ -151,6 +151,10 @@ pub enum Error {
CorruptedTaskQueue,
#[error(transparent)]
DatabaseUpgrade(Box<Self>),
#[error(transparent)]
Export(Box<Self>),
#[error("Failed to export documents to remote server {code} ({type}): {message} <{link}>")]
FromRemoteWhenExporting { message: String, code: String, r#type: String, link: String },
#[error("Failed to rollback for index `{index}`: {rollback_outcome} ")]
RollbackFailed { index: String, rollback_outcome: RollbackOutcome },
#[error(transparent)]
@@ -212,6 +216,7 @@ impl Error {
| Error::BatchNotFound(_)
| Error::TaskDeletionWithEmptyQuery
| Error::TaskCancelationWithEmptyQuery
| Error::FromRemoteWhenExporting { .. }
| Error::AbortedTask
| Error::Dump(_)
| Error::Heed(_)
@@ -221,6 +226,7 @@ impl Error {
| Error::IoError(_)
| Error::Persist(_)
| Error::FeatureNotEnabled(_)
| Error::Export(_)
| Error::Anyhow(_) => true,
Error::CreateBatch(_)
| Error::CorruptedTaskQueue
@@ -282,6 +288,7 @@ impl ErrorCode for Error {
Error::Dump(e) => e.error_code(),
Error::Milli { error, .. } => error.error_code(),
Error::ProcessBatchPanicked(_) => Code::Internal,
Error::FromRemoteWhenExporting { .. } => Code::Internal,
Error::Heed(e) => e.error_code(),
Error::HeedTransaction(e) => e.error_code(),
Error::FileStore(e) => e.error_code(),
@@ -294,6 +301,7 @@ impl ErrorCode for Error {
Error::CorruptedTaskQueue => Code::Internal,
Error::CorruptedDump => Code::Internal,
Error::DatabaseUpgrade(_) => Code::Internal,
Error::Export(_) => Code::Internal,
Error::RollbackFailed { .. } => Code::Internal,
Error::UnrecoverableError(_) => Code::Internal,
Error::IndexSchedulerVersionMismatch { .. } => Code::Internal,
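The new variants lean on two `thiserror` idioms visible above: `#[error(transparent)]` to forward the boxed inner error's `Display` and `source()`, and a format string that interpolates named fields, including the raw identifier `r#type`. A compilable sketch under those assumptions (the link value is a placeholder):

use thiserror::Error;

#[derive(Debug, Error)]
enum ExportError {
    // Forwards Display and source() to the boxed inner error.
    #[error(transparent)]
    Inner(Box<ExportError>),
    // Named fields are interpolated directly; `{type}` refers to the `r#type` field.
    #[error("Failed to export documents to remote server {code} ({type}): {message} <{link}>")]
    FromRemote { message: String, code: String, r#type: String, link: String },
    #[error(transparent)]
    Io(#[from] std::io::Error),
}

fn main() {
    let err = ExportError::FromRemote {
        message: "index not found".into(),
        code: "index_not_found".into(),
        r#type: "invalid_request".into(),
        link: "https://example.com/errors#index_not_found".into(),
    };
    println!("{err}");
}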

View File

@@ -85,7 +85,7 @@ impl RoFeatures {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action: "Using `CONTAINS` or `STARTS WITH` in a filter",
disabled_action: "Using `CONTAINS` in a filter",
feature: "contains filter",
issue_link: "https://github.com/orgs/meilisearch/discussions/763",
}
@@ -131,6 +131,32 @@ impl RoFeatures {
.into())
}
}
pub fn check_chat_completions(&self, disabled_action: &'static str) -> Result<()> {
if self.runtime.chat_completions {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action,
feature: "chat completions",
issue_link: "https://github.com/orgs/meilisearch/discussions/835",
}
.into())
}
}
pub fn check_multimodal(&self, disabled_action: &'static str) -> Result<()> {
if self.runtime.multimodal {
Ok(())
} else {
Err(FeatureNotEnabledError {
disabled_action,
feature: "multimodal",
issue_link: "https://github.com/orgs/meilisearch/discussions/846",
}
.into())
}
}
}
impl FeatureData {
@@ -156,6 +182,7 @@ impl FeatureData {
..persisted_features
}));
// Once this is stabilized, network should be stored along with webhooks in index-scheduler's persisted database
let network_db = runtime_features_db.remap_data_type::<SerdeJson<Network>>();
let network: Network = network_db.get(wtxn, db_keys::NETWORK)?.unwrap_or_default();

View File

@@ -71,7 +71,7 @@ pub struct IndexMapper {
/// Path to the folder where the LMDB environments of each index are.
base_path: PathBuf,
/// The map size an index is opened with on the first time.
index_base_map_size: usize,
pub(crate) index_base_map_size: usize,
/// The quantity by which the map size of an index is incremented upon reopening, in bytes.
index_growth_amount: usize,
/// Whether we open a meilisearch index with the MDB_WRITEMAP option or not.

View File

@@ -20,20 +20,22 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
let IndexScheduler {
cleanup_enabled: _,
experimental_no_edition_2024_for_dumps: _,
processing_tasks,
env,
version,
queue,
scheduler,
persisted,
index_mapper,
features: _,
webhook_url: _,
webhook_authorization_header: _,
webhooks: _,
test_breakpoint_sdr: _,
planned_failures: _,
run_loop_iteration: _,
embedders: _,
chat_settings: _,
} = scheduler;
let rtxn = env.read_txn().unwrap();
@@ -60,6 +62,13 @@ pub fn snapshot_index_scheduler(scheduler: &IndexScheduler) -> String {
}
snap.push_str("\n----------------------------------------------------------------------\n");
let persisted_db_snapshot = snapshot_persisted_db(&rtxn, persisted);
if !persisted_db_snapshot.is_empty() {
snap.push_str("### Persisted:\n");
snap.push_str(&persisted_db_snapshot);
snap.push_str("----------------------------------------------------------------------\n");
}
snap.push_str("### All Tasks:\n");
snap.push_str(&snapshot_all_tasks(&rtxn, queue.tasks.all_tasks));
snap.push_str("----------------------------------------------------------------------\n");
@@ -198,6 +207,16 @@ pub fn snapshot_date_db(rtxn: &RoTxn, db: Database<BEI128, CboRoaringBitmapCodec
snap
}
pub fn snapshot_persisted_db(rtxn: &RoTxn, db: &Database<Str, Str>) -> String {
let mut snap = String::new();
let iter = db.iter(rtxn).unwrap();
for next in iter {
let (key, value) = next.unwrap();
snap.push_str(&format!("{key}: {value}\n"));
}
snap
}
pub fn snapshot_task(task: &Task) -> String {
let mut snap = String::new();
let Task {
@@ -288,6 +307,9 @@ fn snapshot_details(d: &Details) -> String {
Details::IndexSwap { swaps } => {
format!("{{ swaps: {swaps:?} }}")
}
Details::Export { url, api_key, payload_size, indexes } => {
format!("{{ url: {url:?}, api_key: {api_key:?}, payload_size: {payload_size:?}, indexes: {indexes:?} }}")
}
Details::UpgradeDatabase { from, to } => {
format!("{{ from: {from:?}, to: {to:?} }}")
}
@@ -306,6 +328,7 @@ pub fn snapshot_status(
}
snap
}
pub fn snapshot_kind(rtxn: &RoTxn, db: Database<SerdeBincode<Kind>, RoaringBitmapCodec>) -> String {
let mut snap = String::new();
let iter = db.iter(rtxn).unwrap();
@@ -326,6 +349,7 @@ pub fn snapshot_index_tasks(rtxn: &RoTxn, db: Database<Str, RoaringBitmapCodec>)
}
snap
}
pub fn snapshot_canceled_by(rtxn: &RoTxn, db: Database<BEU32, RoaringBitmapCodec>) -> String {
let mut snap = String::new();
let iter = db.iter(rtxn).unwrap();
@@ -342,6 +366,7 @@ pub fn snapshot_batch(batch: &Batch) -> String {
uid,
details,
stats,
embedder_stats,
started_at,
finished_at,
progress: _,
@@ -365,6 +390,12 @@ pub fn snapshot_batch(batch: &Batch) -> String {
snap.push_str(&format!("uid: {uid}, "));
snap.push_str(&format!("details: {}, ", serde_json::to_string(details).unwrap()));
snap.push_str(&format!("stats: {}, ", serde_json::to_string(&stats).unwrap()));
if !embedder_stats.skip_serializing() {
snap.push_str(&format!(
"embedder stats: {}, ",
serde_json::to_string(&embedder_stats).unwrap()
));
}
snap.push_str(&format!("stop reason: {}, ", serde_json::to_string(&stop_reason).unwrap()));
snap.push('}');
snap

View File

@@ -51,22 +51,30 @@ pub use features::RoFeatures;
use flate2::bufread::GzEncoder;
use flate2::Compression;
use meilisearch_types::batches::Batch;
use meilisearch_types::features::{InstanceTogglableFeatures, Network, RuntimeTogglableFeatures};
use meilisearch_types::features::{
ChatCompletionSettings, InstanceTogglableFeatures, Network, RuntimeTogglableFeatures,
};
use meilisearch_types::heed::byteorder::BE;
use meilisearch_types::heed::types::I128;
use meilisearch_types::heed::{self, Env, RoTxn, WithoutTls};
use meilisearch_types::milli::index::IndexEmbeddingConfig;
use meilisearch_types::heed::types::{DecodeIgnore, SerdeJson, Str, I128};
use meilisearch_types::heed::{self, Database, Env, RoTxn, WithoutTls};
use meilisearch_types::milli::update::IndexerConfig;
use meilisearch_types::milli::vector::{Embedder, EmbedderOptions, EmbeddingConfigs};
use meilisearch_types::milli::vector::json_template::JsonTemplate;
use meilisearch_types::milli::vector::{
Embedder, EmbedderOptions, RuntimeEmbedder, RuntimeEmbedders, RuntimeFragment,
};
use meilisearch_types::milli::{self, Index};
use meilisearch_types::task_view::TaskView;
use meilisearch_types::tasks::{KindWithContent, Task};
use meilisearch_types::webhooks::{Webhook, WebhooksDumpView, WebhooksView};
use milli::vector::db::IndexEmbeddingConfig;
use processing::ProcessingTasks;
pub use queue::Query;
use queue::Queue;
use roaring::RoaringBitmap;
use scheduler::Scheduler;
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;
use uuid::Uuid;
use versioning::Versioning;
use crate::index_mapper::IndexMapper;
@@ -76,6 +84,15 @@ pub(crate) type BEI128 = I128<BE>;
const TASK_SCHEDULER_SIZE_THRESHOLD_PERCENT_INT: u64 = 40;
mod db_name {
pub const CHAT_SETTINGS: &str = "chat-settings";
pub const PERSISTED: &str = "persisted";
}
mod db_keys {
pub const WEBHOOKS: &str = "webhooks";
}
#[derive(Debug)]
pub struct IndexSchedulerOptions {
/// The path to the version file of Meilisearch.
@@ -92,10 +109,10 @@ pub struct IndexSchedulerOptions {
pub snapshots_path: PathBuf,
/// The path to the folder containing the dumps.
pub dumps_path: PathBuf,
/// The URL on which we must send the tasks statuses
pub webhook_url: Option<String>,
/// The value we will send into the Authorization HTTP header on the webhook URL
pub webhook_authorization_header: Option<String>,
/// The webhook url that was set by the CLI.
pub cli_webhook_url: Option<String>,
/// The Authorization header to send to the webhook URL that was set by the CLI.
pub cli_webhook_authorization: Option<String>,
/// The maximum size, in bytes, of the task index.
pub task_db_size: usize,
/// The size, in bytes, with which a meilisearch index is opened the first time of each meilisearch index.
@@ -131,6 +148,8 @@ pub struct IndexSchedulerOptions {
///
/// 0 disables the cache.
pub embedding_cache_cap: usize,
/// Snapshot compaction status.
pub experimental_no_snapshot_compaction: bool,
}
/// Structure which holds meilisearch's indexes and schedules the tasks
@@ -151,16 +170,23 @@ pub struct IndexScheduler {
/// In charge of fetching and setting the status of experimental features.
features: features::FeatureData,
/// Stores the custom chat prompts and other settings of the indexes.
pub(crate) chat_settings: Database<Str, SerdeJson<ChatCompletionSettings>>,
/// Everything related to the processing of the tasks
pub scheduler: scheduler::Scheduler,
/// Whether we should automatically cleanup the task queue or not.
pub(crate) cleanup_enabled: bool,
/// The webhook URL we should send tasks to after processing every batch.
pub(crate) webhook_url: Option<String>,
/// The Authorization header to send to the webhook URL.
pub(crate) webhook_authorization_header: Option<String>,
/// Whether we should use the old document indexer or the new one.
pub(crate) experimental_no_edition_2024_for_dumps: bool,
/// A database to store single-keyed data that is persisted across restarts.
persisted: Database<Str, Str>,
/// Webhook, loaded and stored in the `persisted` database
webhooks: Arc<Webhooks>,
/// A map to retrieve the runtime representation of an embedder depending on its configuration.
///
@@ -199,8 +225,10 @@ impl IndexScheduler {
index_mapper: self.index_mapper.clone(),
cleanup_enabled: self.cleanup_enabled,
webhook_url: self.webhook_url.clone(),
webhook_authorization_header: self.webhook_authorization_header.clone(),
experimental_no_edition_2024_for_dumps: self.experimental_no_edition_2024_for_dumps,
persisted: self.persisted,
webhooks: self.webhooks.clone(),
embedders: self.embedders.clone(),
#[cfg(test)]
test_breakpoint_sdr: self.test_breakpoint_sdr.clone(),
@@ -209,11 +237,17 @@ impl IndexScheduler {
#[cfg(test)]
run_loop_iteration: self.run_loop_iteration.clone(),
features: self.features.clone(),
chat_settings: self.chat_settings,
}
}
pub(crate) const fn nb_db() -> u32 {
Versioning::nb_db() + Queue::nb_db() + IndexMapper::nb_db() + features::FeatureData::nb_db()
Versioning::nb_db()
+ Queue::nb_db()
+ IndexMapper::nb_db()
+ features::FeatureData::nb_db()
+ 1 // chat-prompts
+ 1 // persisted
}
/// Create an index scheduler and start its run loop.
@@ -264,9 +298,18 @@ impl IndexScheduler {
let version = versioning::Versioning::new(&env, from_db_version)?;
let mut wtxn = env.write_txn()?;
let features = features::FeatureData::new(&env, &mut wtxn, options.instance_features)?;
let queue = Queue::new(&env, &mut wtxn, &options)?;
let index_mapper = IndexMapper::new(&env, &mut wtxn, &options, budget)?;
let chat_settings = env.create_database(&mut wtxn, Some(db_name::CHAT_SETTINGS))?;
let persisted = env.create_database(&mut wtxn, Some(db_name::PERSISTED))?;
let webhooks_db = persisted.remap_data_type::<SerdeJson<Webhooks>>();
let mut webhooks = webhooks_db.get(&wtxn, db_keys::WEBHOOKS)?.unwrap_or_default();
webhooks
.with_cli(options.cli_webhook_url.clone(), options.cli_webhook_authorization.clone());
wtxn.commit()?;
// allow unreachable_code to get rids of the warning in the case of a test build.
@@ -279,8 +322,11 @@ impl IndexScheduler {
index_mapper,
env,
cleanup_enabled: options.cleanup_enabled,
webhook_url: options.webhook_url,
webhook_authorization_header: options.webhook_authorization_header,
experimental_no_edition_2024_for_dumps: options
.indexer_config
.experimental_no_edition_2024_for_dumps,
persisted,
webhooks: Arc::new(webhooks),
embedders: Default::default(),
#[cfg(test)]
@@ -290,12 +336,17 @@ impl IndexScheduler {
#[cfg(test)]
run_loop_iteration: Arc::new(RwLock::new(0)),
features,
chat_settings,
};
this.run();
Ok(this)
}
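The boot sequence above loads the persisted webhooks (or a default) and then layers the CLI-supplied webhook on top for this run only, without ever writing it back. A condensed sketch of that load-then-overlay flow, with the heed database replaced by a plain `Option` for illustration:

use std::collections::BTreeMap;

#[derive(Default)]
struct Webhooks {
    // (url, auth) set from the CLI; never persisted, mirroring #[serde(skip)].
    cli: Option<(String, Option<String>)>,
    // Stand-in for the persisted Uuid -> Webhook map.
    runtime: BTreeMap<u128, String>,
}

fn boot(persisted: Option<Webhooks>, cli_url: Option<String>, cli_auth: Option<String>) -> Webhooks {
    // 1. Load what the database had, or start from an empty default.
    let mut webhooks = persisted.unwrap_or_default();
    // 2. Overlay the CLI webhook for this run only; because it is skipped at
    //    serialization time it can never leak into the db or a dump.
    if let Some(url) = cli_url {
        webhooks.cli = Some((url, cli_auth));
    }
    webhooks
}

fn main() {
    let webhooks = boot(None, Some("https://cli.example/hook".into()), None);
    assert!(webhooks.cli.is_some() && webhooks.runtime.is_empty());
}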
fn read_txn(&self) -> Result<RoTxn<WithoutTls>> {
self.env.read_txn().map_err(|e| e.into())
}
/// Return `Ok(())` if the index scheduler is able to access one of its database.
pub fn health(&self) -> Result<()> {
let rtxn = self.env.read_txn()?;
@@ -372,15 +423,16 @@ impl IndexScheduler {
}
}
pub fn read_txn(&self) -> Result<RoTxn<WithoutTls>> {
self.env.read_txn().map_err(|e| e.into())
}
/// Start the run loop for the given index scheduler.
///
/// This function will execute in a different thread and must be called
/// only once per index scheduler.
fn run(&self) {
// If the number of batched tasks is 0, we don't need to run the scheduler at all.
// It will never be able to process any tasks.
if self.scheduler.max_number_of_batched_tasks == 0 {
return;
}
let run = self.private_clone();
std::thread::Builder::new()
.name(String::from("scheduler"))
@@ -488,7 +540,7 @@ impl IndexScheduler {
/// Returns the total number of indexes available for the specified filter.
/// And a `Vec` of the index_uid + its stats
pub fn get_paginated_indexes_stats(
pub fn paginated_indexes_stats(
&self,
filters: &meilisearch_auth::AuthFilter,
from: usize,
@@ -529,6 +581,24 @@ impl IndexScheduler {
ret.map(|ret| (total, ret))
}
/// Returns the total number of chat workspaces available ~~for the specified filter~~.
/// And a `Vec` of the workspace_uids
pub fn paginated_chat_workspace_uids(
&self,
from: usize,
limit: usize,
) -> Result<(usize, Vec<String>)> {
let rtxn = self.read_txn()?;
let total = self.chat_settings.len(&rtxn)?;
let mut iter = self.chat_settings.iter(&rtxn)?.skip(from);
iter.by_ref()
.take(limit)
.map(|ret| ret.map_err(Error::from))
.map(|ret| ret.map(|(uid, _)| uid.to_string()))
.collect::<Result<Vec<_>, Error>>()
.map(|ret| (total as usize, ret))
}
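The `skip(from)`/`take(limit)` pagination used here works on any fallible iterator, with the first `Err` short-circuiting the page. A standalone sketch:

// Skip `from` items, keep `limit`, and let the first `Err` abort the page,
// the same shape as the heed iterator above.
fn paginate<T, E>(
    iter: impl Iterator<Item = Result<T, E>>,
    from: usize,
    limit: usize,
) -> Result<Vec<T>, E> {
    iter.skip(from).take(limit).collect()
}

fn main() {
    let rows: Vec<Result<&str, ()>> = vec![Ok("a"), Ok("b"), Ok("c"), Ok("d")];
    assert_eq!(paginate(rows.into_iter(), 1, 2), Ok(vec!["b", "c"]));
}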
/// The returned structure contains:
/// 1. The name of the property being observed can be `statuses`, `types`, or `indexes`.
/// 2. The name of the specific data related to the property can be `enqueued` for the `statuses`, `settingsUpdate` for the `types`, or the name of the index for the `indexes`, for example.
@@ -553,6 +623,11 @@ impl IndexScheduler {
Ok(nbr_index_processing_tasks > 0)
}
/// Whether the index should use the old document indexer.
pub fn no_edition_2024_for_dumps(&self) -> bool {
self.experimental_no_edition_2024_for_dumps
}
/// Return the tasks matching the query from the user's point of view along
/// with the total number of tasks matching the query, ignoring from and limit.
///
@@ -699,86 +774,92 @@ impl IndexScheduler {
Ok(())
}
/// Once the tasks changes have been committed we must send all the tasks that were updated to our webhook if there is one.
fn notify_webhook(&self, updated: &RoaringBitmap) -> Result<()> {
if let Some(ref url) = self.webhook_url {
struct TaskReader<'a, 'b> {
rtxn: &'a RoTxn<'a>,
index_scheduler: &'a IndexScheduler,
tasks: &'b mut roaring::bitmap::Iter<'b>,
buffer: Vec<u8>,
written: usize,
}
/// Once the task changes have been committed, we must send all the tasks that were updated to our webhooks
fn notify_webhooks(&self, updated: RoaringBitmap) {
struct TaskReader<'a, 'b> {
rtxn: &'a RoTxn<'a>,
index_scheduler: &'a IndexScheduler,
tasks: &'b mut roaring::bitmap::Iter<'b>,
buffer: Vec<u8>,
written: usize,
}
impl Read for TaskReader<'_, '_> {
fn read(&mut self, mut buf: &mut [u8]) -> std::io::Result<usize> {
if self.buffer.is_empty() {
match self.tasks.next() {
None => return Ok(0),
Some(task_id) => {
let task = self
.index_scheduler
.queue
.tasks
.get_task(self.rtxn, task_id)
.map_err(|err| io::Error::new(io::ErrorKind::Other, err))?
.ok_or_else(|| {
io::Error::new(
io::ErrorKind::Other,
Error::CorruptedTaskQueue,
)
})?;
impl Read for TaskReader<'_, '_> {
fn read(&mut self, mut buf: &mut [u8]) -> std::io::Result<usize> {
if self.buffer.is_empty() {
match self.tasks.next() {
None => return Ok(0),
Some(task_id) => {
let task = self
.index_scheduler
.queue
.tasks
.get_task(self.rtxn, task_id)
.map_err(|err| io::Error::new(io::ErrorKind::Other, err))?
.ok_or_else(|| {
io::Error::new(io::ErrorKind::Other, Error::CorruptedTaskQueue)
})?;
serde_json::to_writer(
&mut self.buffer,
&TaskView::from_task(&task),
)?;
self.buffer.push(b'\n');
}
serde_json::to_writer(&mut self.buffer, &TaskView::from_task(&task))?;
self.buffer.push(b'\n');
}
}
let mut to_write = &self.buffer[self.written..];
let wrote = io::copy(&mut to_write, &mut buf)?;
self.written += wrote as usize;
// we wrote everything and must refresh our buffer on the next call
if self.written == self.buffer.len() {
self.written = 0;
self.buffer.clear();
}
Ok(wrote as usize)
}
}
let rtxn = self.env.read_txn()?;
let mut to_write = &self.buffer[self.written..];
let wrote = io::copy(&mut to_write, &mut buf)?;
self.written += wrote as usize;
let task_reader = TaskReader {
rtxn: &rtxn,
index_scheduler: self,
tasks: &mut updated.into_iter(),
buffer: Vec::with_capacity(50), // on average a task is around ~100 bytes
written: 0,
};
// we wrote everything and must refresh our buffer on the next call
if self.written == self.buffer.len() {
self.written = 0;
self.buffer.clear();
}
let reader = GzEncoder::new(BufReader::new(task_reader), Compression::default());
let request = ureq::post(url)
.timeout(Duration::from_secs(30))
.set("Content-Encoding", "gzip")
.set("Content-Type", "application/x-ndjson");
let request = match &self.webhook_authorization_header {
Some(header) => request.set("Authorization", header),
None => request,
};
if let Err(e) = request.send(reader) {
tracing::error!("While sending data to the webhook: {e}");
Ok(wrote as usize)
}
}
Ok(())
let webhooks = self.webhooks.get_all();
if webhooks.is_empty() {
return;
}
let this = self.private_clone();
// We must take the RoTxn before entering the thread::spawn, otherwise another batch may be
// processed before we have had time to take our txn.
let rtxn = match self.env.clone().static_read_txn() {
Ok(rtxn) => rtxn,
Err(e) => {
tracing::error!("Couldn't get an rtxn to notify the webhook: {e}");
return;
}
};
std::thread::spawn(move || {
for (uuid, Webhook { url, headers }) in webhooks.iter() {
let task_reader = TaskReader {
rtxn: &rtxn,
index_scheduler: &this,
tasks: &mut updated.iter(),
buffer: Vec::with_capacity(page_size::get()),
written: 0,
};
let reader = GzEncoder::new(BufReader::new(task_reader), Compression::default());
let mut request = ureq::post(url)
.timeout(Duration::from_secs(30))
.set("Content-Encoding", "gzip")
.set("Content-Type", "application/x-ndjson");
for (header_name, header_value) in headers.iter() {
request = request.set(header_name, header_value);
}
if let Err(e) = request.send(reader) {
tracing::error!("While sending data to the webhook {uuid}: {e}");
}
}
});
}
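Each webhook receives the updated tasks as a gzip-compressed NDJSON stream, produced lazily by the `TaskReader` so the payload never has to sit in memory at once. A minimal sketch of the transport side, assuming `ureq` 2.x and `flate2` as pinned in the Cargo.toml hunks above (the URL is a placeholder):

use std::io::{BufReader, Cursor, Read};
use std::time::Duration;

use flate2::bufread::GzEncoder;
use flate2::Compression;

fn post_ndjson(url: &str, body: impl Read) -> Result<(), ureq::Error> {
    // GzEncoder wraps any BufRead and compresses on the fly as ureq pulls bytes.
    let reader = GzEncoder::new(BufReader::new(body), Compression::default());
    ureq::post(url)
        .timeout(Duration::from_secs(30))
        .set("Content-Encoding", "gzip")
        .set("Content-Type", "application/x-ndjson")
        .send(reader)?;
    Ok(())
}

fn main() {
    let tasks = Cursor::new("{\"uid\":0,\"status\":\"succeeded\"}\n");
    // Placeholder URL: any endpoint accepting gzipped NDJSON will do.
    let _ = post_ndjson("http://localhost:9999/webhook", tasks);
}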
pub fn index_stats(&self, index_uid: &str) -> Result<IndexStats> {
@@ -809,33 +890,69 @@ impl IndexScheduler {
self.features.network()
}
pub fn update_runtime_webhooks(&self, runtime: RuntimeWebhooks) -> Result<()> {
let webhooks = Webhooks::from_runtime(runtime);
let mut wtxn = self.env.write_txn()?;
let webhooks_db = self.persisted.remap_data_type::<SerdeJson<Webhooks>>();
webhooks_db.put(&mut wtxn, db_keys::WEBHOOKS, &webhooks)?;
wtxn.commit()?;
self.webhooks.update_runtime(webhooks.into_runtime());
Ok(())
}
pub fn webhooks_dump_view(&self) -> WebhooksDumpView {
// We must not dump the CLI-defined webhook (nor its Authorization header)
WebhooksDumpView { webhooks: self.webhooks.get_runtime() }
}
pub fn webhooks_view(&self) -> WebhooksView {
WebhooksView { webhooks: self.webhooks.get_all() }
}
pub fn retrieve_runtime_webhooks(&self) -> RuntimeWebhooks {
self.webhooks.get_runtime()
}
pub fn embedders(
&self,
index_uid: String,
embedding_configs: Vec<IndexEmbeddingConfig>,
) -> Result<EmbeddingConfigs> {
) -> Result<RuntimeEmbedders> {
let res: Result<_> = embedding_configs
.into_iter()
.map(
|IndexEmbeddingConfig {
name,
config: milli::vector::EmbeddingConfig { embedder_options, prompt, quantized },
..
}| {
let prompt = Arc::new(
prompt
.try_into()
.map_err(meilisearch_types::milli::Error::from)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?,
);
fragments,
}|
-> Result<(String, Arc<RuntimeEmbedder>)> {
let document_template = prompt
.try_into()
.map_err(meilisearch_types::milli::Error::from)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
let fragments = fragments
.into_inner()
.into_iter()
.map(|fragment| {
let value = embedder_options.fragment(&fragment.name).unwrap();
let template = JsonTemplate::new(value.clone()).unwrap();
RuntimeFragment { name: fragment.name, id: fragment.id, template }
})
.collect();
// optimistically return existing embedder
{
let embedders = self.embedders.read().unwrap();
if let Some(embedder) = embedders.get(&embedder_options) {
return Ok((
name,
(embedder.clone(), prompt, quantized.unwrap_or_default()),
let runtime = Arc::new(RuntimeEmbedder::new(
embedder.clone(),
document_template,
fragments,
quantized.unwrap_or_default(),
));
return Ok((name, runtime));
}
}
@@ -851,11 +968,44 @@ impl IndexScheduler {
let mut embedders = self.embedders.write().unwrap();
embedders.insert(embedder_options, embedder.clone());
}
Ok((name, (embedder, prompt, quantized.unwrap_or_default())))
let runtime = Arc::new(RuntimeEmbedder::new(
embedder.clone(),
document_template,
fragments,
quantized.unwrap_or_default(),
));
Ok((name, runtime))
},
)
.collect();
res.map(EmbeddingConfigs::new)
res.map(RuntimeEmbedders::new)
}
pub fn chat_settings(&self, uid: &str) -> Result<Option<ChatCompletionSettings>> {
let rtxn = self.env.read_txn()?;
self.chat_settings.get(&rtxn, uid).map_err(Into::into)
}
/// Return true if chat workspace exists.
pub fn chat_workspace_exists(&self, name: &str) -> Result<bool> {
let rtxn = self.env.read_txn()?;
Ok(self.chat_settings.remap_data_type::<DecodeIgnore>().get(&rtxn, name)?.is_some())
}
pub fn put_chat_settings(&self, uid: &str, settings: &ChatCompletionSettings) -> Result<()> {
let mut wtxn = self.env.write_txn()?;
self.chat_settings.put(&mut wtxn, uid, settings)?;
wtxn.commit()?;
Ok(())
}
pub fn delete_chat_settings(&self, uid: &str) -> Result<bool> {
let mut wtxn = self.env.write_txn()?;
let deleted = self.chat_settings.delete(&mut wtxn, uid)?;
wtxn.commit()?;
Ok(deleted)
}
}
@@ -891,3 +1041,72 @@ pub struct IndexStats {
/// Internal stats computed from the index.
pub inner_stats: index_mapper::IndexStats,
}
/// These structures are not meant to be exposed to the end user; if needed, use the meilisearch-types::webhooks structures instead.
/// /!\ Every time you deserialize this structure, you should fill in the CLI webhook afterwards with the `with_cli` method. /!\
#[derive(Debug, Serialize, Deserialize, Default)]
#[serde(rename_all = "camelCase")]
struct Webhooks {
// The cli webhook should *never* be stored in a database.
// It represents state that only exists for this execution of Meilisearch
#[serde(skip)]
pub cli: Option<CliWebhook>,
#[serde(default)]
pub runtime: RwLock<RuntimeWebhooks>,
}
type RuntimeWebhooks = BTreeMap<Uuid, Webhook>;
impl Webhooks {
pub fn with_cli(&mut self, url: Option<String>, auth: Option<String>) {
if let Some(url) = url {
let webhook = CliWebhook { url, auth };
self.cli = Some(webhook);
}
}
pub fn from_runtime(webhooks: RuntimeWebhooks) -> Self {
Self { cli: None, runtime: RwLock::new(webhooks) }
}
pub fn into_runtime(self) -> RuntimeWebhooks {
// safe because we own self and it cannot be cloned
self.runtime.into_inner().unwrap()
}
pub fn update_runtime(&self, webhooks: RuntimeWebhooks) {
*self.runtime.write().unwrap() = webhooks;
}
/// Returns all the webhooks in a unified view. The CLI webhook is represented with the nil UUID (all zeroes)
pub fn get_all(&self) -> BTreeMap<Uuid, Webhook> {
self.cli
.as_ref()
.map(|wh| (Uuid::nil(), Webhook::from(wh)))
.into_iter()
.chain(self.runtime.read().unwrap().iter().map(|(uuid, wh)| (*uuid, wh.clone())))
.collect()
}
/// Returns all the runtime webhooks.
pub fn get_runtime(&self) -> BTreeMap<Uuid, Webhook> {
self.runtime.read().unwrap().iter().map(|(uuid, wh)| (*uuid, wh.clone())).collect()
}
}
#[derive(Debug, Serialize, Deserialize, Default, Clone, PartialEq)]
struct CliWebhook {
pub url: String,
pub auth: Option<String>,
}
impl From<&CliWebhook> for Webhook {
fn from(webhook: &CliWebhook) -> Self {
let mut headers = BTreeMap::new();
if let Some(ref auth) = webhook.auth {
headers.insert("Authorization".to_string(), auth.to_string());
}
Self { url: webhook.url.to_string(), headers }
}
}
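A usage sketch of the unified view: the CLI webhook, when present, surfaces under the nil UUID next to the persisted entries (`String` stands in for the `Webhook` struct):

use std::collections::BTreeMap;

use uuid::Uuid;

fn main() {
    // Stand-in for Webhooks::get_all(): the CLI entry, if any, keyed by the
    // nil UUID, chained with the runtime entries keyed by their real UUIDs.
    let cli = Some("https://cli.example/hook".to_string());
    let mut runtime = BTreeMap::new();
    runtime.insert(Uuid::new_v4(), "https://runtime.example/hook".to_string());

    let all: BTreeMap<Uuid, String> = cli
        .map(|url| (Uuid::nil(), url))
        .into_iter()
        .chain(runtime.into_iter())
        .collect();

    assert!(all.contains_key(&Uuid::nil()));
}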

View File

@@ -103,10 +103,12 @@ make_enum_progress! {
pub enum DumpCreationProgress {
StartTheDumpCreation,
DumpTheApiKeys,
DumpTheChatCompletionSettings,
DumpTheTasks,
DumpTheBatches,
DumpTheIndexes,
DumpTheExperimentalFeatures,
DumpTheWebhooks,
CompressTheDump,
}
}
@@ -175,8 +177,17 @@ make_enum_progress! {
}
}
make_enum_progress! {
pub enum Export {
EnsuringCorrectnessOfTheTarget,
ExportingTheSettings,
ExportingTheDocuments,
}
}
make_atomic_progress!(Task alias AtomicTaskStep => "task" );
make_atomic_progress!(Document alias AtomicDocumentStep => "document" );
make_atomic_progress!(Index alias AtomicIndexStep => "index" );
make_atomic_progress!(Batch alias AtomicBatchStep => "batch" );
make_atomic_progress!(UpdateFile alias AtomicUpdateFileStep => "update file" );

View File

@@ -179,6 +179,7 @@ impl BatchQueue {
progress: None,
details: batch.details,
stats: batch.stats,
embedder_stats: batch.embedder_stats.as_ref().into(),
started_at: batch.started_at,
finished_at: batch.finished_at,
enqueued_at: batch.enqueued_at,

View File

@@ -127,7 +127,7 @@ fn query_batches_simple() {
"startedAt": "1970-01-01T00:00:00Z",
"finishedAt": null,
"enqueuedAt": null,
"stopReason": "task with id 0 of type `indexCreation` cannot be batched"
"stopReason": "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task."
}
"###);

View File

@@ -48,8 +48,8 @@ catto: { number_of_documents: 0, field_distribution: {} }
[timestamp] [1,2,3,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, stop reason: "task with id 3 of type `taskCancelation` cannot be batched", }
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 3 of type `taskCancelation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -47,9 +47,9 @@ whalo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [2,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "task with id 1 of type `indexCreation` cannot be batched", }
2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "task with id 2 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -4,7 +4,7 @@ source: crates/index-scheduler/src/queue/batches_test.rs
### Autobatching Enabled = true
### Processing batch Some(1):
[1,]
{uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "task with id 1 of type `indexCreation` cannot be batched", }
{uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, batch_uid: 0, status: succeeded, details: { primary_key: Some("mouse") }, kind: IndexCreation { index_uid: "catto", primary_key: Some("mouse") }}
@@ -42,7 +42,7 @@ catto: { number_of_documents: 0, field_distribution: {} }
[timestamp] [0,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -47,9 +47,9 @@ doggo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [2,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "task with id 1 of type `indexCreation` cannot be batched", }
2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "task with id 2 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -52,10 +52,10 @@ doggo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [3,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "task with id 1 of type `indexCreation` cannot be batched", }
2 {uid: 2, details: {"swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexSwap":1},"indexUids":{}}, stop reason: "task with id 2 of type `indexSwap` cannot be batched", }
3 {uid: 3, details: {"swaps":[{"indexes":["catto","whalo"]}]}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexSwap":1},"indexUids":{}}, stop reason: "task with id 3 of type `indexSwap` cannot be batched", }
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
2 {uid: 2, details: {"swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexSwap":1},"indexUids":{}}, stop reason: "created batch containing only task with id 2 of type `indexSwap` that cannot be batched with any other task.", }
3 {uid: 3, details: {"swaps":[{"indexes":["catto","whalo"]}]}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexSwap":1},"indexUids":{}}, stop reason: "created batch containing only task with id 3 of type `indexSwap` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -48,8 +48,8 @@ catto: { number_of_documents: 0, field_distribution: {} }
[timestamp] [1,2,3,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, stop reason: "task with id 3 of type `taskCancelation` cannot be batched", }
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {"primaryKey":"sheep","matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query","swaps":[{"indexes":["catto","doggo"]}]}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"indexCreation":1,"indexSwap":1,"taskCancelation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 3 of type `taskCancelation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -47,9 +47,9 @@ whalo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [2,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "task with id 1 of type `indexCreation` cannot be batched", }
2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "task with id 2 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {"primaryKey":"bone"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {"primaryKey":"plankton"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
2 {uid: 2, details: {"primaryKey":"his_own_vomit"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -47,9 +47,9 @@ doggo: { number_of_documents: 0, field_distribution: {} }
[timestamp] [2,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "task with id 1 of type `indexCreation` cannot be batched", }
2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "task with id 2 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {"primaryKey":"mouse"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {"primaryKey":"sheep"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggo":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
2 {uid: 2, details: {"primaryKey":"fish"}, stats: {"totalNbTasks":1,"status":{"failed":1},"types":{"indexCreation":1},"indexUids":{"whalo":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -71,6 +71,7 @@ impl From<KindWithContent> for AutobatchKind {
KindWithContent::TaskCancelation { .. }
| KindWithContent::TaskDeletion { .. }
| KindWithContent::DumpCreation { .. }
| KindWithContent::Export { .. }
| KindWithContent::UpgradeDatabase { .. }
| KindWithContent::SnapshotCreation => {
panic!("The autobatcher should never be called with tasks that don't apply to an index.")

View File

@@ -1,4 +1,5 @@
use std::fmt;
use std::io::ErrorKind;
use meilisearch_types::heed::RoTxn;
use meilisearch_types::milli::update::IndexDocumentsMethod;
@@ -47,6 +48,9 @@ pub(crate) enum Batch {
IndexSwap {
task: Task,
},
Export {
task: Task,
},
UpgradeDatabase {
tasks: Vec<Task>,
},
@@ -103,6 +107,7 @@ impl Batch {
Batch::TaskCancelation { task, .. }
| Batch::Dump(task)
| Batch::IndexCreation { task, .. }
| Batch::Export { task }
| Batch::IndexUpdate { task, .. } => {
RoaringBitmap::from_sorted_iter(std::iter::once(task.uid)).unwrap()
}
@@ -142,6 +147,7 @@ impl Batch {
| TaskDeletions(_)
| SnapshotCreation(_)
| Dump(_)
| Export { .. }
| UpgradeDatabase { .. }
| IndexSwap { .. } => None,
IndexOperation { op, .. } => Some(op.index_uid()),
@@ -167,6 +173,7 @@ impl fmt::Display for Batch {
Batch::IndexUpdate { .. } => f.write_str("IndexUpdate")?,
Batch::IndexDeletion { .. } => f.write_str("IndexDeletion")?,
Batch::IndexSwap { .. } => f.write_str("IndexSwap")?,
Batch::Export { .. } => f.write_str("Export")?,
Batch::UpgradeDatabase { .. } => f.write_str("UpgradeDatabase")?,
};
match index_uid {
@@ -426,9 +433,10 @@ impl IndexScheduler {
/// 0. We get the *last* task to cancel.
/// 1. We get the tasks to upgrade.
/// 2. We get the *next* task to delete.
/// 3. We get the *next* snapshot to process.
/// 4. We get the *next* dump to process.
/// 5. We get the *next* tasks to process for a specific index.
/// 3. We get the *next* export to process.
/// 4. We get the *next* snapshot to process.
/// 5. We get the *next* dump to process.
/// 6. We get the *next* tasks to process for a specific index.
#[tracing::instrument(level = "trace", skip(self, rtxn), target = "indexing::scheduler")]
pub(crate) fn create_next_batch(
&self,
@@ -500,7 +508,17 @@ impl IndexScheduler {
return Ok(Some((Batch::TaskDeletions(tasks), current_batch)));
}
// 3. we batch the snapshot.
// 3. we batch the export.
let to_export = self.queue.tasks.get_kind(rtxn, Kind::Export)? & enqueued;
if !to_export.is_empty() {
let task_id = to_export.iter().next().expect("There must be at least one export task");
let mut task = self.queue.tasks.get_task(rtxn, task_id)?.unwrap();
current_batch.processing([&mut task]);
current_batch.reason(BatchStopReason::TaskKindCannotBeBatched { kind: Kind::Export });
return Ok(Some((Batch::Export { task }, current_batch)));
}
// 4. we batch the snapshot.
let to_snapshot = self.queue.tasks.get_kind(rtxn, Kind::SnapshotCreation)? & enqueued;
if !to_snapshot.is_empty() {
let mut tasks = self.queue.tasks.get_existing_tasks(rtxn, to_snapshot)?;
@@ -510,7 +528,7 @@ impl IndexScheduler {
return Ok(Some((Batch::SnapshotCreation(tasks), current_batch)));
}
// 4. we batch the dumps.
// 5. we batch the dumps.
let to_dump = self.queue.tasks.get_kind(rtxn, Kind::DumpCreation)? & enqueued;
if let Some(to_dump) = to_dump.min() {
let mut task =
@@ -523,7 +541,7 @@ impl IndexScheduler {
return Ok(Some((Batch::Dump(task), current_batch)));
}
// 5. We make a batch from the unprioritised tasks. Start by taking the next enqueued task.
// 6. We make a batch from the unprioritised tasks. Start by taking the next enqueued task.
let task_id = if let Some(task_id) = enqueued.min() { task_id } else { return Ok(None) };
let mut task =
self.queue.tasks.get_task(rtxn, task_id)?.ok_or(Error::CorruptedTaskQueue)?;
@@ -577,7 +595,11 @@ impl IndexScheduler {
.and_then(|task| task.ok_or(Error::CorruptedTaskQueue))?;
if let Some(uuid) = task.content_uuid() {
let content_size = self.queue.file_store.compute_size(uuid)?;
let content_size = match self.queue.file_store.compute_size(uuid) {
Ok(content_size) => content_size,
Err(file_store::Error::IoError(err)) if err.kind() == ErrorKind::NotFound => 0,
Err(otherwise) => return Err(otherwise.into()),
};
total_size = total_size.saturating_add(content_size);
}

View File

@@ -4,6 +4,7 @@ mod autobatcher_test;
mod create_batch;
mod process_batch;
mod process_dump_creation;
mod process_export;
mod process_index_operation;
mod process_snapshot_creation;
mod process_upgrade;
@@ -83,6 +84,9 @@ pub struct Scheduler {
///
/// 0 disables the cache.
pub(crate) embedding_cache_cap: usize,
/// Snapshot compaction status.
pub(crate) experimental_no_snapshot_compaction: bool,
}
impl Scheduler {
@@ -98,6 +102,7 @@ impl Scheduler {
auth_env: self.auth_env.clone(),
version_file_path: self.version_file_path.clone(),
embedding_cache_cap: self.embedding_cache_cap,
experimental_no_snapshot_compaction: self.experimental_no_snapshot_compaction,
}
}
@@ -114,6 +119,7 @@ impl Scheduler {
auth_env,
version_file_path: options.version_file_path.clone(),
embedding_cache_cap: options.embedding_cache_cap,
experimental_no_snapshot_compaction: options.experimental_no_snapshot_compaction,
}
}
}
@@ -370,9 +376,11 @@ impl IndexScheduler {
post_commit_dabases_sizes
.get(dbname)
.map(|post_size| {
use byte_unit::{Byte, UnitType::Binary};
use std::cmp::Ordering::{Equal, Greater, Less};
use byte_unit::Byte;
use byte_unit::UnitType::Binary;
let post = Byte::from_u64(*post_size as u64).get_appropriate_unit(Binary);
let diff_size = post_size.abs_diff(*pre_size) as u64;
let diff = Byte::from_u64(diff_size).get_appropriate_unit(Binary);
@@ -438,8 +446,7 @@ impl IndexScheduler {
Ok(())
})?;
// We shouldn't crash the tick function if we can't send data to the webhook.
let _ = self.notify_webhook(&ids);
self.notify_webhooks(ids);
#[cfg(test)]
self.breakpoint(crate::test_utils::Breakpoint::AfterProcessing);

View File

@@ -162,8 +162,13 @@ impl IndexScheduler {
.set_currently_updating_index(Some((index_uid.clone(), index.clone())));
let pre_commit_dabases_sizes = index.database_sizes(&index_wtxn)?;
let (tasks, congestion) =
self.apply_index_operation(&mut index_wtxn, &index, op, &progress)?;
let (tasks, congestion) = self.apply_index_operation(
&mut index_wtxn,
&index,
op,
&progress,
current_batch.embedder_stats.clone(),
)?;
{
progress.update_progress(FinalizingIndexStep::Committing);
@@ -238,10 +243,12 @@ impl IndexScheduler {
);
builder.set_primary_key(primary_key);
let must_stop_processing = self.scheduler.must_stop_processing.clone();
builder
.execute(
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.get(),
&|| must_stop_processing.get(),
&progress,
current_batch.embedder_stats.clone(),
)
.map_err(|e| Error::from_milli(e, Some(index_uid.to_string())))?;
index_wtxn.commit()?;
@@ -361,6 +368,46 @@ impl IndexScheduler {
task.status = Status::Succeeded;
Ok((vec![task], ProcessBatchInfo::default()))
}
Batch::Export { mut task } => {
let KindWithContent::Export { url, api_key, payload_size, indexes } = &task.kind
else {
unreachable!()
};
let ret = catch_unwind(AssertUnwindSafe(|| {
self.process_export(
url,
api_key.as_deref(),
payload_size.as_ref(),
indexes,
progress,
)
}));
let stats = match ret {
Ok(Ok(stats)) => stats,
Ok(Err(Error::AbortedTask)) => return Err(Error::AbortedTask),
Ok(Err(e)) => return Err(Error::Export(Box::new(e))),
Err(e) => {
let msg = match e.downcast_ref::<&'static str>() {
Some(s) => *s,
None => match e.downcast_ref::<String>() {
Some(s) => &s[..],
None => "Box<dyn Any>",
},
};
return Err(Error::Export(Box::new(Error::ProcessBatchPanicked(
msg.to_string(),
))));
}
};
task.status = Status::Succeeded;
if let Some(Details::Export { indexes, .. }) = task.details.as_mut() {
*indexes = stats;
}
Ok((vec![task], ProcessBatchInfo::default()))
}
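The panic-recovery chain in the `Export` arm (try `&'static str`, then `String`, then fall back to a placeholder) is reusable as a small helper; a sketch:

use std::any::Any;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Extracts a readable message from a panic payload, mirroring the
/// downcast chain used in the Export arm above.
fn panic_message(payload: Box<dyn Any + Send>) -> String {
    match payload.downcast_ref::<&'static str>() {
        Some(s) => s.to_string(),
        None => match payload.downcast_ref::<String>() {
            Some(s) => s.clone(),
            None => "Box<dyn Any>".to_string(),
        },
    }
}

fn main() {
    let result = catch_unwind(AssertUnwindSafe(|| panic!("export failed")));
    if let Err(payload) = result {
        assert_eq!(panic_message(payload), "export failed");
    }
}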
Batch::UpgradeDatabase { mut tasks } => {
let KindWithContent::UpgradeDatabase { from } = tasks.last().unwrap().kind else {
unreachable!();
@@ -708,9 +755,11 @@ impl IndexScheduler {
from.1,
from.2
);
match std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
let ret = catch_unwind(std::panic::AssertUnwindSafe(|| {
self.process_rollback(from, progress)
})) {
}));
match ret {
Ok(Ok(())) => {}
Ok(Err(err)) => return Err(Error::DatabaseUpgrade(Box::new(err))),
Err(e) => {

View File

@@ -5,6 +5,7 @@ use std::sync::atomic::Ordering;
use dump::IndexMetadata;
use meilisearch_types::milli::constants::RESERVED_VECTORS_FIELD_NAME;
use meilisearch_types::milli::index::EmbeddingsWithMetadata;
use meilisearch_types::milli::progress::{Progress, VariableNameStep};
use meilisearch_types::milli::vector::parsed_vectors::{ExplicitVectors, VectorOrArrayOfVectors};
use meilisearch_types::milli::{self};
@@ -43,7 +44,16 @@ impl IndexScheduler {
let rtxn = self.env.read_txn()?;
// 2. dump the tasks
// 2. dump the chat completion settings
// TODO should I skip the export if the chat completion has been disabled?
progress.update_progress(DumpCreationProgress::DumpTheChatCompletionSettings);
let mut dump_chat_completion_settings = dump.create_chat_completions_settings()?;
for result in self.chat_settings.iter(&rtxn)? {
let (name, chat_settings) = result?;
dump_chat_completion_settings.push_settings(name, &chat_settings)?;
}
// 3. dump the tasks
progress.update_progress(DumpCreationProgress::DumpTheTasks);
let mut dump_tasks = dump.create_tasks_queue()?;
@@ -81,7 +91,7 @@ impl IndexScheduler {
let mut dump_content_file = dump_tasks.push_task(&t.into())?;
// 2.1. Dump the `content_file` associated with the task if there is one and the task is not finished yet.
// 3.1. Dump the `content_file` associated with the task if there is one and the task is not finished yet.
if let Some(content_file) = content_file {
if self.scheduler.must_stop_processing.get() {
return Err(Error::AbortedTask);
@@ -105,7 +115,7 @@ impl IndexScheduler {
}
dump_tasks.flush()?;
// 3. dump the batches
// 4. dump the batches
progress.update_progress(DumpCreationProgress::DumpTheBatches);
let mut dump_batches = dump.create_batches_queue()?;
@@ -138,7 +148,7 @@ impl IndexScheduler {
}
dump_batches.flush()?;
// 4. Dump the indexes
// 5. Dump the indexes
progress.update_progress(DumpCreationProgress::DumpTheIndexes);
let nb_indexes = self.index_mapper.index_mapping.len(&rtxn)? as u32;
let mut count = 0;
@@ -165,9 +175,6 @@ impl IndexScheduler {
let fields_ids_map = index.fields_ids_map(&rtxn)?;
let all_fields: Vec<_> = fields_ids_map.iter().map(|(id, _)| id).collect();
let embedding_configs = index
.embedding_configs(&rtxn)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
let nb_documents = index
.number_of_documents(&rtxn)
@@ -178,7 +185,7 @@ impl IndexScheduler {
let documents = index
.all_documents(&rtxn)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
// 4.1. Dump the documents
// 5.1. Dump the documents
for ret in documents {
if self.scheduler.must_stop_processing.get() {
return Err(Error::AbortedTask);
@@ -221,16 +228,21 @@ impl IndexScheduler {
return Err(Error::from_milli(user_err, Some(uid.to_string())));
};
for (embedder_name, embeddings) in embeddings {
let user_provided = embedding_configs
.iter()
.find(|conf| conf.name == embedder_name)
.is_some_and(|conf| conf.user_provided.contains(id));
for (
embedder_name,
EmbeddingsWithMetadata { embeddings, regenerate, has_fragments },
) in embeddings
{
let embeddings = ExplicitVectors {
embeddings: Some(VectorOrArrayOfVectors::from_array_of_vectors(
embeddings,
)),
regenerate: !user_provided,
regenerate: regenerate &&
// Meilisearch does not handle dumps with fragments well: because the
// fragments are marked as user-provided, all embeddings would be
// regenerated on any settings change or document update. To prevent
// this, we mark embeddings as non-regenerate in this case.
!has_fragments,
};
vectors.insert(embedder_name, serde_json::to_value(embeddings).unwrap());
}
@@ -240,7 +252,7 @@ impl IndexScheduler {
atomic.fetch_add(1, Ordering::Relaxed);
}
// 4.2. Dump the settings
// 5.2. Dump the settings
let settings = meilisearch_types::settings::settings(
index,
&rtxn,
@@ -251,13 +263,18 @@ impl IndexScheduler {
Ok(())
})?;
// 5. Dump experimental feature settings
// 6. Dump experimental feature settings
progress.update_progress(DumpCreationProgress::DumpTheExperimentalFeatures);
let features = self.features().runtime_features();
dump.create_experimental_features(features)?;
let network = self.network();
dump.create_network(network)?;
// 7. Dump the webhooks
progress.update_progress(DumpCreationProgress::DumpTheWebhooks);
let webhooks = self.webhooks_dump_view();
dump.create_webhooks(webhooks)?;
let dump_uid = started_at.format(format_description!(
"[year repr:full][month repr:numerical][day padding:zero]-[hour padding:zero][minute padding:zero][second padding:zero][subsecond digits:3]"
)).unwrap();
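For reference, a hedged sketch of what that `format_description!` call produces with the `time` crate: a dump uid such as `20250811-152954123` (date, a dash, then time down to the millisecond).

use time::macros::format_description;
use time::OffsetDateTime;

fn dump_uid_now() -> String {
    let format = format_description!(
        "[year repr:full][month repr:numerical][day padding:zero]-[hour padding:zero][minute padding:zero][second padding:zero][subsecond digits:3]"
    );
    // e.g. "20250811-152954123"
    OffsetDateTime::now_utc().format(&format).unwrap()
}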

View File

@@ -0,0 +1,377 @@
use std::collections::BTreeMap;
use std::io::{self, Write as _};
use std::sync::atomic;
use std::time::Duration;
use backoff::ExponentialBackoff;
use byte_unit::Byte;
use flate2::write::GzEncoder;
use flate2::Compression;
use meilisearch_types::index_uid_pattern::IndexUidPattern;
use meilisearch_types::milli::constants::RESERVED_VECTORS_FIELD_NAME;
use meilisearch_types::milli::index::EmbeddingsWithMetadata;
use meilisearch_types::milli::progress::{Progress, VariableNameStep};
use meilisearch_types::milli::update::{request_threads, Setting};
use meilisearch_types::milli::vector::parsed_vectors::{ExplicitVectors, VectorOrArrayOfVectors};
use meilisearch_types::milli::{self, obkv_to_json, Filter, InternalError};
use meilisearch_types::settings::{self, SecretPolicy};
use meilisearch_types::tasks::{DetailsExportIndexSettings, ExportIndexSettings};
use serde::Deserialize;
use ureq::{json, Response};
use super::MustStopProcessing;
use crate::processing::AtomicDocumentStep;
use crate::{Error, IndexScheduler, Result};
impl IndexScheduler {
pub(super) fn process_export(
&self,
base_url: &str,
api_key: Option<&str>,
payload_size: Option<&Byte>,
indexes: &BTreeMap<IndexUidPattern, ExportIndexSettings>,
progress: Progress,
) -> Result<BTreeMap<IndexUidPattern, DetailsExportIndexSettings>> {
#[cfg(test)]
self.maybe_fail(crate::test_utils::FailureLocation::ProcessExport)?;
let indexes: Vec<_> = self
.index_names()?
.into_iter()
.flat_map(|uid| {
indexes
.iter()
.find(|(pattern, _)| pattern.matches_str(&uid))
.map(|(pattern, settings)| (pattern, uid, settings))
})
.collect();
let mut output = BTreeMap::new();
let agent = ureq::AgentBuilder::new().timeout(Duration::from_secs(5)).build();
let must_stop_processing = self.scheduler.must_stop_processing.clone();
for (i, (_pattern, uid, export_settings)) in indexes.iter().enumerate() {
if must_stop_processing.get() {
return Err(Error::AbortedTask);
}
progress.update_progress(VariableNameStep::<ExportIndex>::new(
format!("Exporting index `{uid}`"),
i as u32,
indexes.len() as u32,
));
let ExportIndexSettings { filter, override_settings } = export_settings;
let index = self.index(uid)?;
let index_rtxn = index.read_txn()?;
let bearer = api_key.map(|api_key| format!("Bearer {api_key}"));
// First, check if the index already exists
let url = format!("{base_url}/indexes/{uid}");
let response = retry(&must_stop_processing, || {
let mut request = agent.get(&url);
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
request.send_bytes(Default::default()).map_err(into_backoff_error)
});
let index_exists = match response {
Ok(response) => response.status() == 200,
Err(Error::FromRemoteWhenExporting { code, .. }) if code == "index_not_found" => {
false
}
Err(e) => return Err(e),
};
let primary_key = index
.primary_key(&index_rtxn)
.map_err(|e| Error::from_milli(e.into(), Some(uid.to_string())))?;
// Create the index
if !index_exists {
let url = format!("{base_url}/indexes");
retry(&must_stop_processing, || {
let mut request = agent.post(&url);
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
let index_param = json!({ "uid": uid, "primaryKey": primary_key });
request.send_json(&index_param).map_err(into_backoff_error)
})?;
}
// Patch the index primary key
if index_exists && *override_settings {
let url = format!("{base_url}/indexes/{uid}");
retry(&must_stop_processing, || {
let mut request = agent.patch(&url);
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
let index_param = json!({ "primaryKey": primary_key });
request.send_json(&index_param).map_err(into_backoff_error)
})?;
}
// Send the index settings
if !index_exists || *override_settings {
let mut settings =
settings::settings(&index, &index_rtxn, SecretPolicy::RevealSecrets)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
// Remove the experimental chat setting if not enabled
if self.features().check_chat_completions("exporting chat settings").is_err() {
settings.chat = Setting::NotSet;
}
// Retry logic for sending settings
let url = format!("{base_url}/indexes/{uid}/settings");
retry(&must_stop_processing, || {
let mut request = agent.patch(&url);
if let Some(bearer) = bearer.as_ref() {
request = request.set("Authorization", bearer);
}
request.send_json(settings.clone()).map_err(into_backoff_error)
})?;
}
let filter = filter
.as_ref()
.map(Filter::from_json)
.transpose()
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?
.flatten();
let filter_universe = filter
.map(|f| f.evaluate(&index_rtxn, &index))
.transpose()
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
let whole_universe = index
.documents_ids(&index_rtxn)
.map_err(|e| Error::from_milli(e.into(), Some(uid.to_string())))?;
let universe = filter_universe.unwrap_or(whole_universe);
let fields_ids_map = index.fields_ids_map(&index_rtxn)?;
let all_fields: Vec<_> = fields_ids_map.iter().map(|(id, _)| id).collect();
// We don't need to keep this one alive as we will
// spawn many threads to process the documents
drop(index_rtxn);
let total_documents = universe.len() as u32;
let (step, progress_step) = AtomicDocumentStep::new(total_documents);
progress.update_progress(progress_step);
output.insert(
IndexUidPattern::new_unchecked(uid.clone()),
DetailsExportIndexSettings {
settings: (*export_settings).clone(),
matched_documents: Some(total_documents as u64),
},
);
let limit = payload_size.map(|ps| ps.as_u64() as usize).unwrap_or(20 * 1024 * 1024); // defaults to 20 MiB
let documents_url = format!("{base_url}/indexes/{uid}/documents");
let results = request_threads()
.broadcast(|ctx| {
let index_rtxn = index
.read_txn()
.map_err(|e| Error::from_milli(e.into(), Some(uid.to_string())))?;
let mut buffer = Vec::new();
let mut tmp_buffer = Vec::new();
let mut compressed_buffer = Vec::new();
for (i, docid) in universe.iter().enumerate() {
if i % ctx.num_threads() != ctx.index() {
continue;
}
let document = index
.document(&index_rtxn, docid)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
let mut document = obkv_to_json(&all_fields, &fields_ids_map, document)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
// TODO definitely factorize this code
'inject_vectors: {
let embeddings = index
.embeddings(&index_rtxn, docid)
.map_err(|e| Error::from_milli(e, Some(uid.to_string())))?;
if embeddings.is_empty() {
break 'inject_vectors;
}
let vectors = document
.entry(RESERVED_VECTORS_FIELD_NAME)
.or_insert(serde_json::Value::Object(Default::default()));
let serde_json::Value::Object(vectors) = vectors else {
return Err(Error::from_milli(
milli::Error::UserError(
milli::UserError::InvalidVectorsMapType {
document_id: {
if let Ok(Some(Ok(index))) = index
.external_id_of(
&index_rtxn,
std::iter::once(docid),
)
.map(|it| it.into_iter().next())
{
index
} else {
format!("internal docid={docid}")
}
},
value: vectors.clone(),
},
),
Some(uid.to_string()),
));
};
for (
embedder_name,
EmbeddingsWithMetadata { embeddings, regenerate, has_fragments },
) in embeddings
{
let embeddings = ExplicitVectors {
embeddings: Some(
VectorOrArrayOfVectors::from_array_of_vectors(embeddings),
),
regenerate: regenerate &&
// Meilisearch does not handle dumps with fragments well: because the
// fragments are marked as user-provided, all embeddings would be
// regenerated on any settings change or document update. To prevent
// this, we mark embeddings as non-regenerate in this case.
!has_fragments,
};
vectors.insert(
embedder_name,
serde_json::to_value(embeddings).unwrap(),
);
}
}
tmp_buffer.clear();
serde_json::to_writer(&mut tmp_buffer, &document)
.map_err(milli::InternalError::from)
.map_err(|e| Error::from_milli(e.into(), Some(uid.to_string())))?;
// Make sure we put at least one document in the buffer even
// though we might go above the buffer limit before sending
if !buffer.is_empty() && buffer.len() + tmp_buffer.len() > limit {
// We compress the documents before sending them
let mut encoder =
GzEncoder::new(&mut compressed_buffer, Compression::default());
encoder
.write_all(&buffer)
.map_err(|e| Error::from_milli(e.into(), Some(uid.clone())))?;
encoder
.finish()
.map_err(|e| Error::from_milli(e.into(), Some(uid.clone())))?;
retry(&must_stop_processing, || {
let mut request = agent.post(&documents_url);
request = request.set("Content-Type", "application/x-ndjson");
request = request.set("Content-Encoding", "gzip");
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
request.send_bytes(&compressed_buffer).map_err(into_backoff_error)
})?;
buffer.clear();
compressed_buffer.clear();
}
buffer.extend_from_slice(&tmp_buffer);
if i > 0 && i % 100 == 0 {
step.fetch_add(100, atomic::Ordering::Relaxed);
}
}
retry(&must_stop_processing, || {
let mut request = agent.post(&documents_url);
request = request.set("Content-Type", "application/x-ndjson");
if let Some(bearer) = &bearer {
request = request.set("Authorization", bearer);
}
request.send_bytes(&buffer).map_err(into_backoff_error)
})?;
Ok(())
})
.map_err(|e| {
Error::from_milli(
milli::Error::InternalError(InternalError::PanicInThreadPool(e)),
Some(uid.to_string()),
)
})?;
for result in results {
result?;
}
step.store(total_documents, atomic::Ordering::Relaxed);
}
Ok(output)
}
}
fn retry<F>(must_stop_processing: &MustStopProcessing, send_request: F) -> Result<ureq::Response>
where
F: Fn() -> Result<ureq::Response, backoff::Error<ureq::Error>>,
{
match backoff::retry(ExponentialBackoff::default(), || {
if must_stop_processing.get() {
return Err(backoff::Error::Permanent(ureq::Error::Status(
u16::MAX,
// 444: Connection Closed Without Response
Response::new(444, "Abort", "Aborted task").unwrap(),
)));
}
send_request()
}) {
Ok(response) => Ok(response),
Err(backoff::Error::Permanent(e)) => Err(ureq_error_into_error(e)),
Err(backoff::Error::Transient { err, retry_after: _ }) => Err(ureq_error_into_error(err)),
}
}
fn into_backoff_error(err: ureq::Error) -> backoff::Error<ureq::Error> {
match err {
// These status codes must trigger an automatic retry
// <https://www.restapitutorial.com/advanced/responses/retries>
ureq::Error::Status(408 | 429 | 500 | 502 | 503 | 504, _) => {
backoff::Error::Transient { err, retry_after: None }
}
ureq::Error::Status(_, _) => backoff::Error::Permanent(err),
ureq::Error::Transport(_) => backoff::Error::Transient { err, retry_after: None },
}
}
/// Converts a `ureq::Error` into an `Error`.
fn ureq_error_into_error(error: ureq::Error) -> Error {
#[derive(Deserialize)]
struct MeiliError {
message: String,
code: String,
r#type: String,
link: String,
}
match error {
// This is a workaround to handle task abortion: the error propagation path
// makes it difficult to cleanly surface the abortion at this level.
ureq::Error::Status(u16::MAX, _) => Error::AbortedTask,
ureq::Error::Status(_, response) => match response.into_json() {
Ok(MeiliError { message, code, r#type, link }) => {
Error::FromRemoteWhenExporting { message, code, r#type, link }
}
Err(e) => e.into(),
},
ureq::Error::Transport(transport) => io::Error::new(io::ErrorKind::Other, transport).into(),
}
}
enum ExportIndex {}
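A hedged usage sketch for the two helpers above: a call site wraps its request in `retry`, mapping `ureq` errors through `into_backoff_error` so that 408/429/5xx statuses and transport failures back off exponentially while other statuses fail fast. The endpoint name is illustrative only.

fn probe_remote(
    agent: &ureq::Agent,
    must_stop_processing: &MustStopProcessing,
    base_url: &str,
) -> Result<ureq::Response> {
    retry(must_stop_processing, || {
        agent
            .get(&format!("{base_url}/health"))
            .call()
            .map_err(into_backoff_error)
    })
}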

View File

@@ -1,11 +1,13 @@
use std::sync::Arc;
use bumpalo::collections::CollectIn;
use bumpalo::Bump;
use meilisearch_types::heed::RwTxn;
use meilisearch_types::milli::documents::PrimaryKey;
use meilisearch_types::milli::progress::Progress;
use meilisearch_types::milli::progress::{EmbedderStats, Progress};
use meilisearch_types::milli::update::new::indexer::{self, UpdateByFunction};
use meilisearch_types::milli::update::DocumentAdditionResult;
use meilisearch_types::milli::{self, ChannelCongestion, Filter, ThreadPoolNoAbortBuilder};
use meilisearch_types::milli::{self, ChannelCongestion, Filter};
use meilisearch_types::settings::apply_settings_to_builder;
use meilisearch_types::tasks::{Details, KindWithContent, Status, Task};
use meilisearch_types::Index;
@@ -24,7 +26,7 @@ impl IndexScheduler {
/// The list of processed tasks.
#[tracing::instrument(
level = "trace",
skip(self, index_wtxn, index, progress),
skip(self, index_wtxn, index, progress, embedder_stats),
target = "indexing::scheduler"
)]
pub(crate) fn apply_index_operation<'i>(
@@ -33,6 +35,7 @@ impl IndexScheduler {
index: &'i Index,
operation: IndexOperation,
progress: &Progress,
embedder_stats: Arc<EmbedderStats>,
) -> Result<(Vec<Task>, Option<ChannelCongestion>)> {
let indexer_alloc = Bump::new();
let started_processing_at = std::time::Instant::now();
@@ -86,8 +89,9 @@ impl IndexScheduler {
let mut content_files_iter = content_files.iter();
let mut indexer = indexer::DocumentOperation::new();
let embedders = index
.embedding_configs()
.embedding_configs(index_wtxn)
.map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?;
.map_err(|e| Error::from_milli(e.into(), Some(index_uid.clone())))?;
let embedders = self.embedders(index_uid.clone(), embedders)?;
for operation in operations {
match operation {
@@ -113,18 +117,8 @@ impl IndexScheduler {
}
}
let local_pool;
let indexer_config = self.index_mapper.indexer_config();
let pool = match &indexer_config.thread_pool {
Some(pool) => pool,
None => {
local_pool = ThreadPoolNoAbortBuilder::new()
.thread_name(|i| format!("indexing-thread-{i}"))
.build()
.unwrap();
&local_pool
}
};
let pool = &indexer_config.thread_pool;
progress.update_progress(DocumentOperationProgress::ComputingDocumentChanges);
let (document_changes, operation_stats, primary_key) = indexer
@@ -187,6 +181,7 @@ impl IndexScheduler {
embedders,
&|| must_stop_processing.get(),
progress,
&embedder_stats,
)
.map_err(|e| Error::from_milli(e, Some(index_uid.clone())))?,
);
@@ -266,18 +261,8 @@ impl IndexScheduler {
let mut congestion = None;
if task.error.is_none() {
let local_pool;
let indexer_config = self.index_mapper.indexer_config();
let pool = match &indexer_config.thread_pool {
Some(pool) => pool,
None => {
local_pool = ThreadPoolNoAbortBuilder::new()
.thread_name(|i| format!("indexing-thread-{i}"))
.build()
.unwrap();
&local_pool
}
};
let pool = &indexer_config.thread_pool;
let candidates_count = candidates.len();
progress.update_progress(DocumentEditionProgress::ComputingDocumentChanges);
@@ -290,8 +275,9 @@ impl IndexScheduler {
})
.unwrap()?;
let embedders = index
.embedding_configs()
.embedding_configs(index_wtxn)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
.map_err(|err| Error::from_milli(err.into(), Some(index_uid.clone())))?;
let embedders = self.embedders(index_uid.clone(), embedders)?;
progress.update_progress(DocumentEditionProgress::Indexing);
@@ -308,6 +294,7 @@ impl IndexScheduler {
embedders,
&|| must_stop_processing.get(),
progress,
&embedder_stats,
)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?,
);
@@ -429,18 +416,8 @@ impl IndexScheduler {
let mut congestion = None;
if !tasks.iter().all(|res| res.error.is_some()) {
let local_pool;
let indexer_config = self.index_mapper.indexer_config();
let pool = match &indexer_config.thread_pool {
Some(pool) => pool,
None => {
local_pool = ThreadPoolNoAbortBuilder::new()
.thread_name(|i| format!("indexing-thread-{i}"))
.build()
.unwrap();
&local_pool
}
};
let pool = &indexer_config.thread_pool;
progress.update_progress(DocumentDeletionProgress::DeleteDocuments);
let mut indexer = indexer::DocumentDeletion::new();
@@ -448,8 +425,9 @@ impl IndexScheduler {
indexer.delete_documents_by_docids(to_delete);
let document_changes = indexer.into_changes(&indexer_alloc, primary_key);
let embedders = index
.embedding_configs()
.embedding_configs(index_wtxn)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
.map_err(|err| Error::from_milli(err.into(), Some(index_uid.clone())))?;
let embedders = self.embedders(index_uid.clone(), embedders)?;
progress.update_progress(DocumentDeletionProgress::Indexing);
@@ -466,6 +444,7 @@ impl IndexScheduler {
embedders,
&|| must_stop_processing.get(),
progress,
&embedder_stats,
)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?,
);
@@ -498,14 +477,11 @@ impl IndexScheduler {
}
progress.update_progress(SettingsProgress::ApplyTheSettings);
builder
.execute(
|indexing_step| tracing::debug!(update = ?indexing_step),
|| must_stop_processing.get(),
)
let congestion = builder
.execute(&|| must_stop_processing.get(), progress, embedder_stats)
.map_err(|err| Error::from_milli(err, Some(index_uid.clone())))?;
Ok((tasks, None))
Ok((tasks, congestion))
}
IndexOperation::DocumentClearAndSetting {
index_uid,
@@ -521,6 +497,7 @@ impl IndexScheduler {
tasks: cleared_tasks,
},
progress,
embedder_stats.clone(),
)?;
let (settings_tasks, _congestion) = self.apply_index_operation(
@@ -528,6 +505,7 @@ impl IndexScheduler {
index,
IndexOperation::Settings { index_uid, settings, tasks: settings_tasks },
progress,
embedder_stats,
)?;
let mut tasks = settings_tasks;
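A small sketch of why the stats handle is threaded through as an `Arc`: the clear and settings sub-operations above share a single accumulator, so embedder counters aggregate across the whole batch. The struct and field below are assumptions; the real `EmbedderStats` lives in milli's progress module and may differ.

use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

#[derive(Default)]
struct Stats {
    // assumed counter; stands in for the real EmbedderStats fields
    embed_requests: AtomicU64,
}

fn run_sub_operation(stats: Arc<Stats>) {
    stats.embed_requests.fetch_add(1, Ordering::Relaxed);
}

fn main() {
    let stats = Arc::new(Stats::default());
    run_sub_operation(Arc::clone(&stats)); // e.g. DocumentClear
    run_sub_operation(stats.clone());      // e.g. Settings
    assert_eq!(stats.embed_requests.load(Ordering::Relaxed), 2);
}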

View File

@@ -7,9 +7,73 @@ use meilisearch_types::milli::progress::{Progress, VariableNameStep};
use meilisearch_types::tasks::{Status, Task};
use meilisearch_types::{compression, VERSION_FILE_NAME};
use crate::heed::EnvOpenOptions;
use crate::processing::{AtomicUpdateFileStep, SnapshotCreationProgress};
use crate::queue::TaskQueue;
use crate::{Error, IndexScheduler, Result};
/// # Safety
///
/// See [`EnvOpenOptions::open`].
unsafe fn remove_tasks(
tasks: &[Task],
dst: &std::path::Path,
index_base_map_size: usize,
) -> Result<()> {
let env_options = EnvOpenOptions::new();
let mut env_options = env_options.read_txn_without_tls();
let env = env_options.max_dbs(TaskQueue::nb_db()).map_size(index_base_map_size).open(dst)?;
let mut wtxn = env.write_txn()?;
let task_queue = TaskQueue::new(&env, &mut wtxn)?;
// Destructuring to ensure the code below gets updated if a database gets added in the future.
let TaskQueue {
all_tasks,
status,
kind,
index_tasks: _, // snapshot creation tasks are not index tasks
canceled_by,
enqueued_at,
started_at,
finished_at,
} = task_queue;
for task in tasks {
all_tasks.delete(&mut wtxn, &task.uid)?;
let mut tasks = status.get(&wtxn, &task.status)?.unwrap_or_default();
tasks.remove(task.uid);
status.put(&mut wtxn, &task.status, &tasks)?;
let mut tasks = kind.get(&wtxn, &task.kind.as_kind())?.unwrap_or_default();
tasks.remove(task.uid);
kind.put(&mut wtxn, &task.kind.as_kind(), &tasks)?;
canceled_by.delete(&mut wtxn, &task.uid)?;
let timestamp = task.enqueued_at.unix_timestamp_nanos();
let mut tasks = enqueued_at.get(&wtxn, &timestamp)?.unwrap_or_default();
tasks.remove(task.uid);
enqueued_at.put(&mut wtxn, &timestamp, &tasks)?;
if let Some(task_started_at) = task.started_at {
let timestamp = task_started_at.unix_timestamp_nanos();
let mut tasks = started_at.get(&wtxn, &timestamp)?.unwrap_or_default();
tasks.remove(task.uid);
started_at.put(&mut wtxn, &timestamp, &tasks)?;
}
if let Some(task_finished_at) = task.finished_at {
let timestamp = task_finished_at.unix_timestamp_nanos();
let mut tasks = finished_at.get(&wtxn, &timestamp)?.unwrap_or_default();
tasks.remove(task.uid);
finished_at.put(&mut wtxn, &timestamp, &tasks)?;
}
}
wtxn.commit()?;
Ok(())
}
impl IndexScheduler {
pub(super) fn process_snapshot(
&self,
@@ -41,16 +105,33 @@ impl IndexScheduler {
progress.update_progress(SnapshotCreationProgress::SnapshotTheIndexScheduler);
let dst = temp_snapshot_dir.path().join("tasks");
fs::create_dir_all(&dst)?;
self.env.copy_to_path(dst.join("data.mdb"), CompactionOption::Disabled)?;
let compaction_option = if self.scheduler.experimental_no_snapshot_compaction {
CompactionOption::Disabled
} else {
CompactionOption::Enabled
};
self.env.copy_to_path(dst.join("data.mdb"), compaction_option)?;
// 2.2 Create a read transaction on the index-scheduler
// 2.2 Remove the current snapshot tasks
//
// This is done to ensure that the tasks are not processed again when the snapshot is imported
//
// # Safety
//
// This is safe because we open the env file we just created in a temporary directory.
// We are sure it's not being used by any other process or thread.
unsafe {
remove_tasks(&tasks, &dst, self.index_mapper.index_base_map_size)?;
}
// 2.3 Create a read transaction on the index-scheduler
let rtxn = self.env.read_txn()?;
// 2.3 Create the update files directory
// 2.4 Create the update files directory
let update_files_dir = temp_snapshot_dir.path().join("update_files");
fs::create_dir_all(&update_files_dir)?;
// 2.4 Only copy the update files of the enqueued tasks
// 2.5 Only copy the update files of the enqueued tasks
progress.update_progress(SnapshotCreationProgress::SnapshotTheUpdateFiles);
let enqueued = self.queue.tasks.get_status(&rtxn, Status::Enqueued)?;
let (atomic, update_file_progress) = AtomicUpdateFileStep::new(enqueued.len() as u32);
@@ -80,7 +161,7 @@ impl IndexScheduler {
let dst = temp_snapshot_dir.path().join("indexes").join(uuid.to_string());
fs::create_dir_all(&dst)?;
index
.copy_to_path(dst.join("data.mdb"), CompactionOption::Disabled)
.copy_to_path(dst.join("data.mdb"), compaction_option)
.map_err(|e| Error::from_milli(e, Some(name.to_string())))?;
}
@@ -90,7 +171,7 @@ impl IndexScheduler {
progress.update_progress(SnapshotCreationProgress::SnapshotTheApiKeys);
let dst = temp_snapshot_dir.path().join("auth");
fs::create_dir_all(&dst)?;
self.scheduler.auth_env.copy_to_path(dst.join("data.mdb"), CompactionOption::Disabled)?;
self.scheduler.auth_env.copy_to_path(dst.join("data.mdb"), compaction_option)?;
// 5. Copy and tarball the flat snapshot
progress.update_progress(SnapshotCreationProgress::CreateTheTarball);
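The `remove_tasks` helper above relies on an exhaustive destructuring pattern worth calling out: by naming every field and avoiding a `..` rest pattern, adding a database to `TaskQueue` becomes a compile error until the cleanup code is revisited. A minimal illustration with a hypothetical struct:

struct Queues {
    all_tasks: (),
    status: (),
}

fn cleanup(q: Queues) {
    // No `..` here: adding a field to `Queues` breaks this line at compile
    // time, forcing the author to decide how the new database is cleaned up.
    let Queues { all_tasks: _, status: _ } = q;
}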

View File

@@ -0,0 +1,17 @@
---
source: crates/index-scheduler/src/scheduler/test.rs
expression: config.embedder_options
---
{
"Rest": {
"api_key": "My super secret",
"distribution": null,
"dimensions": 4,
"url": "http://localhost:7777",
"request": "{{text}}",
"search_fragments": {},
"indexing_fragments": {},
"response": "{{embedding}}",
"headers": {}
}
}

View File

@@ -0,0 +1,12 @@
---
source: crates/index-scheduler/src/scheduler/test_embedders.rs
expression: simple_hf_config.embedder_options
---
{
"HuggingFace": {
"model": "sentence-transformers/all-MiniLM-L6-v2",
"revision": "e4ce9877abf3edfe10b0d82785e83bdcb973e22e",
"distribution": null,
"pooling": "useModel"
}
}

View File

@@ -0,0 +1,15 @@
---
source: crates/index-scheduler/src/scheduler/test_embedders.rs
expression: doc
---
{
"doggo": "Intel",
"breed": "beagle",
"_vectors": {
"noise": [
0.1,
0.2,
0.3
]
}
}

View File

@@ -0,0 +1,15 @@
---
source: crates/index-scheduler/src/scheduler/test_embedders.rs
expression: doc
---
{
"doggo": "kefir",
"breed": "patou",
"_vectors": {
"noise": [
0.1,
0.2,
0.3
]
}
}

View File

@@ -1,12 +1,17 @@
---
source: crates/index-scheduler/src/scheduler/test_embedders.rs
expression: simple_hf_config.embedder_options
expression: fakerest_config.embedder_options
---
{
"HuggingFace": {
"model": "sentence-transformers/all-MiniLM-L6-v2",
"revision": "e4ce9877abf3edfe10b0d82785e83bdcb973e22e",
"Rest": {
"api_key": "My super secret",
"distribution": null,
"pooling": "useModel"
"dimensions": 384,
"url": "http://localhost:7777",
"request": "{{text}}",
"search_fragments": {},
"indexing_fragments": {},
"response": "{{embedding}}",
"headers": {}
}
}

View File

@@ -39,7 +39,7 @@ catto [0,]
[timestamp] [0,1,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":0,"matchedTasks":1,"canceledTasks":1,"originalFilter":"test_query"}, stats: {"totalNbTasks":2,"status":{"succeeded":1,"canceled":1},"types":{"documentAdditionOrUpdate":1,"taskCancelation":1},"indexUids":{"catto":1}}, stop reason: "task with id 1 of type `taskCancelation` cannot be batched", }
0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":0,"matchedTasks":1,"canceledTasks":1,"originalFilter":"test_query"}, stats: {"totalNbTasks":2,"status":{"succeeded":1,"canceled":1},"types":{"documentAdditionOrUpdate":1,"taskCancelation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 1 of type `taskCancelation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,1,]

View File

@@ -50,7 +50,7 @@ catto: { number_of_documents: 1, field_distribution: {"id": 1} }
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"catto":1}}, stop reason: "batched all enqueued tasks for index `catto`", }
1 {uid: 1, details: {"receivedDocuments":2,"indexedDocuments":0,"matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query"}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"documentAdditionOrUpdate":2,"taskCancelation":1},"indexUids":{"beavero":1,"wolfo":1}}, stop reason: "task with id 3 of type `taskCancelation` cannot be batched", }
1 {uid: 1, details: {"receivedDocuments":2,"indexedDocuments":0,"matchedTasks":3,"canceledTasks":2,"originalFilter":"test_query"}, stats: {"totalNbTasks":3,"status":{"succeeded":1,"canceled":2},"types":{"documentAdditionOrUpdate":2,"taskCancelation":1},"indexUids":{"beavero":1,"wolfo":1}}, stop reason: "created batch containing only task with id 3 of type `taskCancelation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -38,7 +38,7 @@ canceled [0,]
[timestamp] [0,1,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"matchedTasks":1,"canceledTasks":1,"originalFilter":"cancel dump"}, stats: {"totalNbTasks":2,"status":{"succeeded":1,"canceled":1},"types":{"taskCancelation":1,"dumpCreation":1},"indexUids":{}}, stop reason: "task with id 1 of type `taskCancelation` cannot be batched", }
0 {uid: 0, details: {"matchedTasks":1,"canceledTasks":1,"originalFilter":"cancel dump"}, stats: {"totalNbTasks":2,"status":{"succeeded":1,"canceled":1},"types":{"taskCancelation":1,"dumpCreation":1},"indexUids":{}}, stop reason: "created batch containing only task with id 1 of type `taskCancelation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,1,]

View File

@@ -4,7 +4,7 @@ source: crates/index-scheduler/src/scheduler/test.rs
### Autobatching Enabled = true
### Processing batch Some(0):
[0,]
{uid: 0, details: {"dumpUid":null}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"dumpCreation":1},"indexUids":{}}, stop reason: "task with id 0 of type `dumpCreation` cannot be batched", }
{uid: 0, details: {"dumpUid":null}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"dumpCreation":1},"indexUids":{}}, stop reason: "created batch containing only task with id 0 of type `dumpCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { dump_uid: None }, kind: DumpCreation { keys: [], instance_uid: None }}

View File

@@ -40,7 +40,7 @@ catto: { number_of_documents: 0, field_distribution: {} }
[timestamp] [0,1,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":0,"matchedTasks":1,"canceledTasks":1,"originalFilter":"test_query"}, stats: {"totalNbTasks":2,"status":{"succeeded":1,"canceled":1},"types":{"documentAdditionOrUpdate":1,"taskCancelation":1},"indexUids":{"catto":1}}, stop reason: "task with id 1 of type `taskCancelation` cannot be batched", }
0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":0,"matchedTasks":1,"canceledTasks":1,"originalFilter":"test_query"}, stats: {"totalNbTasks":2,"status":{"succeeded":1,"canceled":1},"types":{"documentAdditionOrUpdate":1,"taskCancelation":1},"indexUids":{"catto":1}}, stop reason: "created batch containing only task with id 1 of type `taskCancelation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,1,]

View File

@@ -41,7 +41,7 @@ catto: { number_of_documents: 1, field_distribution: {"id": 1} }
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":1}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentAdditionOrUpdate":1},"indexUids":{"catto":1}}, stop reason: "batched all enqueued tasks", }
1 {uid: 1, details: {"matchedTasks":1,"canceledTasks":0,"originalFilter":"test_query"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"taskCancelation":1},"indexUids":{}}, stop reason: "task with id 1 of type `taskCancelation` cannot be batched", }
1 {uid: 1, details: {"matchedTasks":1,"canceledTasks":0,"originalFilter":"test_query"}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"taskCancelation":1},"indexUids":{}}, stop reason: "created batch containing only task with id 1 of type `taskCancelation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -60,9 +60,9 @@ girafos: { number_of_documents: 0, field_distribution: {} }
[timestamp] [5,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"cattos":1}}, stop reason: "task with id 1 of type `indexCreation` cannot be batched", }
2 {uid: 2, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"girafos":1}}, stop reason: "task with id 2 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"cattos":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
2 {uid: 2, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"girafos":1}}, stop reason: "created batch containing only task with id 2 of type `indexCreation` that cannot be batched with any other task.", }
3 {uid: 3, details: {"deletedDocuments":0}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentDeletion":1},"indexUids":{"doggos":1}}, stop reason: "batched all enqueued tasks for index `doggos`", }
4 {uid: 4, details: {"deletedDocuments":0}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentDeletion":1},"indexUids":{"cattos":1}}, stop reason: "batched all enqueued tasks for index `cattos`", }
5 {uid: 5, details: {"deletedDocuments":0}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"documentDeletion":1},"indexUids":{"girafos":1}}, stop reason: "batched all enqueued tasks", }

View File

@@ -41,7 +41,7 @@ doggos: { number_of_documents: 0, field_distribution: {} }
[timestamp] [0,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -42,8 +42,8 @@ doggos [0,1,2,]
[timestamp] [1,2,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {"receivedDocuments":1,"indexedDocuments":0,"deletedDocuments":0}, stats: {"totalNbTasks":2,"status":{"succeeded":2},"types":{"documentAdditionOrUpdate":1,"indexDeletion":1},"indexUids":{"doggos":2}}, stop reason: "task with id 2 deletes the index", }
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {"receivedDocuments":1,"indexedDocuments":0,"deletedDocuments":0}, stats: {"totalNbTasks":2,"status":{"succeeded":2},"types":{"documentAdditionOrUpdate":1,"indexDeletion":1},"indexUids":{"doggos":2}}, stop reason: "stopped after task with id 2 because it deletes the index", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -37,7 +37,7 @@ doggos [0,1,]
[timestamp] [0,1,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":0,"deletedDocuments":0}, stats: {"totalNbTasks":2,"status":{"succeeded":2},"types":{"documentAdditionOrUpdate":1,"indexDeletion":1},"indexUids":{"doggos":2}}, stop reason: "task with id 1 deletes the index", }
0 {uid: 0, details: {"receivedDocuments":1,"indexedDocuments":0,"deletedDocuments":0}, stats: {"totalNbTasks":2,"status":{"succeeded":2},"types":{"documentAdditionOrUpdate":1,"indexDeletion":1},"indexUids":{"doggos":2}}, stop reason: "stopped after task with id 1 because it deletes the index", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,1,]

View File

@@ -4,7 +4,7 @@ source: crates/index-scheduler/src/scheduler/test.rs
### Autobatching Enabled = true
### Processing batch Some(0):
[0,]
{uid: 0, details: {"primaryKey":"id"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"index_a":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
{uid: 0, details: {"primaryKey":"id"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"index_a":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { primary_key: Some("id") }, kind: IndexCreation { index_uid: "index_a", primary_key: Some("id") }}

View File

@@ -4,7 +4,7 @@ source: crates/index-scheduler/src/scheduler/test.rs
### Autobatching Enabled = true
### Processing batch Some(0):
[0,]
{uid: 0, details: {"primaryKey":"id"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"index_a":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
{uid: 0, details: {"primaryKey":"id"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"index_a":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { primary_key: Some("id") }, kind: IndexCreation { index_uid: "index_a", primary_key: Some("id") }}

View File

@@ -4,7 +4,7 @@ source: crates/index-scheduler/src/scheduler/test.rs
### Autobatching Enabled = true
### Processing batch Some(0):
[0,]
{uid: 0, details: {"primaryKey":"id"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"index_a":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
{uid: 0, details: {"primaryKey":"id"}, stats: {"totalNbTasks":1,"status":{"processing":1},"types":{"indexCreation":1},"indexUids":{"index_a":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### All Tasks:
0 {uid: 0, status: enqueued, details: { primary_key: Some("id") }, kind: IndexCreation { index_uid: "index_a", primary_key: Some("id") }}

View File

@@ -41,7 +41,7 @@ doggos: { number_of_documents: 0, field_distribution: {} }
[timestamp] [0,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -44,8 +44,8 @@ doggos: { number_of_documents: 0, field_distribution: {} }
[timestamp] [1,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"cattos":1}}, stop reason: "task with id 1 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"cattos":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -45,9 +45,9 @@ cattos: { number_of_documents: 0, field_distribution: {} }
[timestamp] [2,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
1 {uid: 1, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"cattos":1}}, stop reason: "task with id 1 of type `indexCreation` cannot be batched", }
2 {uid: 2, details: {"deletedDocuments":0}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexDeletion":1},"indexUids":{"doggos":1}}, stop reason: "task with id 2 deletes the index", }
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
1 {uid: 1, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"cattos":1}}, stop reason: "created batch containing only task with id 1 of type `indexCreation` that cannot be batched with any other task.", }
2 {uid: 2, details: {"deletedDocuments":0}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexDeletion":1},"indexUids":{"doggos":1}}, stop reason: "stopped after task with id 2 because it deletes the index", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

View File

@@ -42,7 +42,7 @@ doggos: { number_of_documents: 0, field_distribution: {} }
[timestamp] [0,]
----------------------------------------------------------------------
### All Batches:
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "task with id 0 of type `indexCreation` cannot be batched", }
0 {uid: 0, details: {}, stats: {"totalNbTasks":1,"status":{"succeeded":1},"types":{"indexCreation":1},"indexUids":{"doggos":1}}, stop reason: "created batch containing only task with id 0 of type `indexCreation` that cannot be batched with any other task.", }
----------------------------------------------------------------------
### Batch to tasks mapping:
0 [0,]

Some files were not shown because too many files have changed in this diff.