Commit Graph

867 Commits

Author SHA1 Message Date
cbdf80893d Merge #5422
5422: Add more progress levels to measure merging r=Kerollmops a=Kerollmops

I found out that Meilisearch was not correctly reporting the long indexing times in the progress and that a lot of time was spent on extracting words with all documents already extracted. The reason was that there was no step to report merging the cache and sending the entries to write to the writer thread. This PR adds these entries to the progress.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-03-17 12:02:46 +00:00
e2156ddfc7 Simplify the IndexingStep progress enum 2025-03-17 11:40:50 +01:00
13a88d6131 Merge #5407
5407: Geo update bug r=irevoire a=ManyTheFish

# Pull Request

## Related issue
Fixes #5380
Fixes #5399



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-03-17 10:24:33 +00:00
d9875b782d Merge #5421
5421: Accept total batch size in human size r=irevoire a=Kerollmops

This PR fixes the new `experimental-limit-batched-tasks-total-size` to accept human-defined sizes in bytes.
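As an illustration of what "human-defined sizes" means, here is a minimal sketch of turning a string such as `512MiB` into a byte count. It is not Meilisearch's parser: the helper name `parse_human_size` and the set of accepted units are assumptions made for this example only.

```python
import re

# Unit table for this sketch only (assumption): binary units (KiB, MiB, ...)
# use powers of 1024, decimal units (KB, MB, ...) use powers of 1000.
_UNITS = {
    "B": 1,
    "KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4,
    "KIB": 1024, "MIB": 1024**2, "GIB": 1024**3, "TIB": 1024**4,
}


def parse_human_size(text: str) -> int:
    """Convert a string such as '512MiB' or '2 GB' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"not a size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper() or "B"  # a bare number is read as bytes
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in {text!r}")
    return int(float(number) * _UNITS[unit])
```

With this helper, `parse_human_size("512MiB")` and `parse_human_size("536870912")` describe the same limit.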

Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-03-17 09:41:22 +00:00
cb16baab18 Add more progress levels to measure merging 2025-03-17 10:13:29 +01:00
2500e3c067 Merge #5414
5414: Update version for the next release (v1.14.0) in Cargo.toml r=Kerollmops a=meili-bot

⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging. Fixes https://github.com/meilisearch/meilisearch/issues/5268.

Co-authored-by: Kerollmops <Kerollmops@users.noreply.github.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-03-14 13:35:54 +00:00
d3e4b2dfe7 Accept total batch size in human size 2025-03-14 13:07:51 +01:00
009c36a4d0 Add support for the progress API of arroy 2025-03-13 19:00:43 +01:00
2a47e25e6d Update the upgrade path snap 2025-03-13 18:35:06 +01:00
e2d372823a Disable the cache by default and make it experimental 2025-03-13 17:22:51 +01:00
1876132172 Mutex-based implementation 2025-03-13 17:22:50 +01:00
d0b0b90d17 fixup tests, in particular foil the cache for the timeout test 2025-03-13 17:22:50 +01:00
b08544e86d Add embedding cache 2025-03-13 17:22:50 +01:00
d9111fe8ce Add lru crate to milli again 2025-03-13 17:22:50 +01:00
41d8161017 Update the versions 2025-03-13 17:22:32 +01:00
7df5715d39 Merge pull request #5406 from meilisearch/bump-heed
Bump heed to v0.22 and arroy to v0.6
2025-03-13 16:52:45 +01:00
5fe02ab5e0 Move to heed 0.22 and arroy 0.6 2025-03-13 15:48:18 +01:00
5ef7767429 Let arroy use all the memory available instead of 50% of the 70% 2025-03-13 15:06:03 +01:00
3fad48167b remove arroy dependency in the index-scheduler 2025-03-13 14:57:56 +01:00
a92a48b9b9 Do not recompute stats on dumpless upgrade
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-03-13 13:58:58 +01:00
d53225bf64 uses a random seed instead of 42 2025-03-13 12:43:31 +01:00
1af520077c Call the underlying Env::copy_to_path method 2025-03-13 11:49:25 +01:00
7e07cb9de1 Make meilitool prefer WithoutTls Env 2025-03-13 11:47:19 +01:00
a12b06d99d Merge #5369
5369: exhaustive facet search r=ManyTheFish a=ManyTheFish

Fixes #5403

This PR adds an `exhaustiveFacetCount` field to the `/facet-search` API allowing the end-user to have a better facet count when having a distinct attribute set in the index settings.

# Usage

`POST /indexes/:index_uid/facet-search`
**Body:**
```json
{
  "facetQuery": "blob",
  "facetName": "genres",
  "q": "",
  "exhaustiveFacetCount": true
}
```
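The same request can be built from a script. The sketch below only constructs the request and does not send it. The helper name `build_facet_search_request`, the base URL `http://localhost:7700`, and the index name `movies` are assumptions for illustration. The path uses the `/indexes/:index_uid/facet-search` route from the public Meilisearch API.

```python
import json
from urllib.request import Request


def build_facet_search_request(base_url, index_uid, facet_name, facet_query,
                               exhaustive=True):
    """Build (but do not send) a facet-search request with exhaustiveFacetCount."""
    body = {
        "facetQuery": facet_query,
        "facetName": facet_name,
        "q": "",
        "exhaustiveFacetCount": exhaustive,
    }
    url = f"{base_url}/indexes/{index_uid}/facet-search"
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local instance and index name.
req = build_facet_search_request("http://localhost:7700", "movies", "genres", "blob")
```

Passing `req` to `urllib.request.urlopen` would send it. An instance protected by an API key would also need an `Authorization` header.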

# Prototype Docker images

```sh
$ docker pull getmeili/meilisearch:prototype-exhaustive-facet-search-00
```

Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-03-13 10:36:04 +00:00
ef9d9f8481 set the memory in arroy 2025-03-13 11:29:00 +01:00
d3d22d8ed4 Prefer waiting for the task before getting the indexes 2025-03-13 11:29:00 +01:00
5e6abcf50c Prefer using WithoutTls for the auth env 2025-03-13 11:29:00 +01:00
a4aaf932ba Fix some test (invalid anyway) 2025-03-13 11:29:00 +01:00
55ca2c4481 Avoid opening the Auth environment multiple times 2025-03-13 11:07:49 +01:00
fedb444e66 Fix the upgrade arroy calls 2025-03-13 11:07:49 +01:00
bef5954741 Use a WithoutTls env 2025-03-13 11:07:49 +01:00
ff8cf38d6b Move to the latest version of arroy 2025-03-13 11:07:48 +01:00
566b4efb06 Dumpless upgrade from v1.13 to v1.14 2025-03-13 11:07:44 +01:00
1d499ed9b2 Use the new arroy upgrade method to move from 0.4 to 0.5 2025-03-13 11:07:44 +01:00
3bc62f0549 WIP: Still need to introduce a Env::copy_to_path method 2025-03-13 11:07:39 +01:00
21bbbdec76 Specify WithoutTls everywhere 2025-03-13 11:07:38 +01:00
78ebd8dba2 Fix the error variants 2025-03-13 11:07:38 +01:00
34df44a002 Open Env without TLS 2025-03-13 11:07:38 +01:00
48a27f669e Bump heed and other dependencies 2025-03-13 11:07:37 +01:00
e2d0ce52ba Merge #5384
5384: Get multiple documents by ids r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #5345 

## What does this PR do?
- Implements [public usage](https://www.notion.so/meilisearch/Get-documents-by-ID-1994b06b651f805ba273e1c6b75ce4d8)
- Slightly refactors error messages for the `/similar` route

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-03-12 17:26:49 +00:00
60ff1b19a8 Searching for a document that does not exist no longer raises an error 2025-03-12 11:50:39 +01:00
7df5e3f059 Fix error message
Co-authored-by: Tamo <tamo@meilisearch.com>
2025-03-12 11:48:40 +01:00
0197dc87e0 Make sure to delete useless prefixes 2025-03-12 11:24:13 +01:00
7a172b82ca Add test 2025-03-12 11:22:59 +01:00
eb3ff325d1 Add an exhaustiveFacetCount field to the facet-search API 2025-03-12 11:22:59 +01:00
d3cd5ea689 Check if the geo fields changed additionally to the other faceted fields when reindexing facets 2025-03-12 11:20:10 +01:00
3ed43f9097 add a failing test reproducing the bug 2025-03-12 11:20:10 +01:00
a2a86ef4e2 Merge #5254
5254: Granular Filterable attribute settings r=ManyTheFish a=ManyTheFish

# Related
**Issue:** https://github.com/meilisearch/meilisearch/issues/5163
**PRD:** https://meilisearch.notion.site/API-usage-Settings-to-opt-out-indexing-features-filterableAttributes-1764b06b651f80aba8bdf359b2df3ca8

# Summary
Change the `filterableAttributes` setting to let users choose which facet features to activate.
Deactivating a feature skips some database computation during indexing, which saves time and disk space.

# Example

`PATCH /indexes/:index_uid/settings`

```json
{
  "filterableAttributes": [
    {
      "patterns": [
        "cattos",
        "doggos.age"
      ],
      "features": {
        "facetSearch": false,
        "filter": {
          "equality": true,
          "comparison": false
        }
      }
    }
  ]
}
```
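When several rules share this shape, the payload can be generated rather than written by hand. The following is a minimal sketch: the helper `filterable_rule` is hypothetical, and its default values simply mirror the example above. They are not Meilisearch's defaults.

```python
import json


def filterable_rule(patterns, facet_search=False, equality=True, comparison=False):
    """Return one granular filterableAttributes rule as a plain dict."""
    return {
        "patterns": list(patterns),
        "features": {
            "facetSearch": facet_search,
            "filter": {"equality": equality, "comparison": comparison},
        },
    }


# Same settings as the JSON example: equality filters only, no facet search.
settings = {"filterableAttributes": [filterable_rule(["cattos", "doggos.age"])]}
payload = json.dumps(settings, indent=2)
```

`payload` is the string to send as the body of the `PATCH` request.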

# Impact on the codebase
- Settings API:
  - `/settings`
  - `/settings/filterable-attributes`
  - OpenAPI 
  - may impact the LocalizedAttributesRules due to the AttributePatterns factorization
- Database:
  - Filterable attributes format changed
  - Faceted field_ids are no longer stored in the database
  - FieldIdsMap no longer contains nonexistent fields
- Search:
  - Search using filters
  - Facet search
  - `Attributes` ranking rule
  - Distinct attribute
  - Facet distribution
- Settings reindexing:
  - searchable
  - facet
  - vector
  - geo
- Document indexing:
  - searchable
  - facet
  - vector
  - geo
- Dump import

# Note for the reviewers
The changes are large and have been split into separate commits, each with a dedicated explanation. I suggest reviewing the commits one by one.

Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-03-12 09:00:43 +00:00
d500c7f625 Add default deserialize value 2025-03-11 17:55:49 +01:00
ea7e299663 Update has_changed_for_fields documentation 2025-03-11 16:48:55 +01:00