Commit Graph

8611 Commits

Author SHA1 Message Date
Kerollmops
66b8cfd8c8 Introduce a way to store the HNSW on multiple LMDB entries 2023-06-27 12:32:42 +02:00
Kerollmops
ff3664431f Make rustfmt happy 2023-06-27 12:32:42 +02:00
Kerollmops
531748c536 Return a user error when the _vectors type is invalid 2023-06-27 12:32:41 +02:00
Kerollmops
7aa1275337 Display the _semanticSimilarity even if the _vectors field is not displayed 2023-06-27 12:32:41 +02:00
Kerollmops
737aec1705 Expose an _semanticSimilarity as a dot product in the documents 2023-06-27 12:32:41 +02:00
Kerollmops
3e3c743392 Make Rustfmt happy 2023-06-27 12:32:41 +02:00
Kerollmops
5c5a4e075d Make clippy happy 2023-06-27 12:32:41 +02:00
Kerollmops
ab9f2269aa Normalize the vectors during indexation and search 2023-06-27 12:32:41 +02:00
Kerollmops
321ec5f3fa Accept multiple vectors by documents using the _vectors field 2023-06-27 12:32:40 +02:00
Kerollmops
1b2923f7c0 Return the vector in the output of the search routes 2023-06-27 12:32:40 +02:00
Kerollmops
717d4fddd4 Remove the unused distance 2023-06-27 12:32:40 +02:00
Kerollmops
a7e0f0de89 Introduce a new error message for invalid vector dimensions 2023-06-27 12:32:40 +02:00
Kerollmops
3b560ef7d0 Make clippy happy 2023-06-27 12:32:40 +02:00
Kerollmops
2cf747cb89 Fix the tests 2023-06-27 12:32:40 +02:00
Kerollmops
3c31e1cdd1 Support more pages but in an ugly way 2023-06-27 12:32:39 +02:00
Kerollmops
23eaaf1001 Change the name of the distance module 2023-06-27 12:32:39 +02:00
Kerollmops
c2a402f3ae Implement an ugly deletion of values in the HNSW 2023-06-27 12:32:39 +02:00
Kerollmops
436a10bef4 Replace the euclidean with a dot product 2023-06-27 12:32:39 +02:00
Kerollmops
8debf6fe81 Use a basic euclidean distance function 2023-06-27 12:32:39 +02:00
Kerollmops
c79e82c62a Move back to the hnsw crate
This reverts commit 7a4b6c065482f988b01298642f4c18775503f92f.
2023-06-27 12:32:39 +02:00
Kerollmops
aca305bb77 Log more to make sure we insert vectors in the hgg data-structure 2023-06-27 12:32:38 +02:00
Kerollmops
5816008139 Introduce an optimized version of the euclidean distance function 2023-06-27 12:32:38 +02:00
Kerollmops
268a9ef416 Move to the hgg crate 2023-06-27 12:32:38 +02:00
Clément Renault
642b0f3a1b Expose a new vector field on the search route 2023-06-27 12:32:38 +02:00
Clément Renault
cad90e8cbc Add a vector field to the search routes 2023-06-27 12:32:38 +02:00
Clément Renault
4571e512d2 Store the vectors in an HNSW in LMDB 2023-06-27 12:32:38 +02:00
Clément Renault
7ac2f1489d Extract the vectors from the documents 2023-06-27 12:32:37 +02:00
Clément Renault
34349faeae Create a new _vector extractor 2023-06-27 12:32:37 +02:00
meili-bors[bot]
ed0a5be4b6 Merge #3853
3853: docs: fixed some broken links r=gillian-meilisearch a=0xflotus

Some of the links in the README file were broken.


Co-authored-by: 0xflotus <0xflotus@gmail.com>
2023-06-27 10:30:13 +00:00
meili-bors[bot]
f105df6599 Merge #3850
3850: Experimental features r=Kerollmops a=dureuill

# Pull Request

## Related issue

- Fixes https://github.com/meilisearch/meilisearch/issues/3857
- Related to https://github.com/meilisearch/meilisearch/issues/3771
## What does this PR do?

### Example

<details>
<summary>Using the feature to enable `scoreDetails`</summary>

```json
❯ curl \
  -X POST 'http://localhost:7700/indexes/index-word-count-10-count/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{ "q": "Batman", "limit": 1, "showRankingScoreDetails": true, "attributesToRetrieve": ["title"]}' | jsonxf

{
  "message": "Computing score details requires enabling the `score details` experimental feature. See https://github.com/meilisearch/product/discussions/674",
  "code": "feature_not_enabled",
  "type": "invalid_request",
  "link": "https://docs.meilisearch.com/errors#feature_not_enabled"
}
```

```json
❯ curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json'  \
--data-binary '{
    "scoreDetails": true
  }'
{"scoreDetails":true,"vectorSearch":false}
```

```json
❯ curl \
  -X POST 'http://localhost:7700/indexes/index-word-count-10-count/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{ "q": "Batman", "limit": 1, "showRankingScoreDetails": true, "attributesToRetrieve": ["title"]}' | jsonxf
{
  "hits": [
    {
      "title": "Batman",
      "_rankingScoreDetails": {
        "words": {
          "order": 0,
          "matchingWords": 1,
          "maxMatchingWords": 1,
          "score": 1.0
        },
        "typo": {
          "order": 1,
          "typoCount": 0,
          "maxTypoCount": 1,
          "score": 1.0
        },
        "proximity": {
          "order": 2,
          "score": 1.0
        },
        "attribute": {
          "order": 3,
          "attribute_ranking_order_score": 1.0,
          "query_word_distance_score": 1.0,
          "score": 1.0
        },
        "exactness": {
          "order": 4,
          "matchType": "exactMatch",
          "score": 1.0
        }
      }
    }
  ],
  "query": "Batman",
  "processingTimeMs": 3,
  "limit": 1,
  "offset": 0,
  "estimatedTotalHits": 46
}
```


</details>

### User standpoint

- Add new route GET/POST/PATCH/DELETE `/experimental-features` to switch on or off some of the experimental features in a manner persistent between instance restarts
- Use these new routes to allow setting on/off the following experimental features:
  - vector store **TODO:** fill in issue 
  - score details (related to https://github.com/meilisearch/meilisearch/issues/3771)
- Make the way of checking feature availability and error message uniform for the Prometheus metrics experimental feature
- Save the enabled features in dump, restore from dumps
- **TODO:** tests:
  - Test new security permissions (do they allow access with ALL, do they prevent access when missing)
  - Test dump behavior, in particular ability to import existing v6 dumps
  - Test basic behavior when calling the rule 

### Implementation standpoint

- New DB "experimental-features"
- dumps are modified to save the state of that new DB as a `experimental-features.json` file, that is then loaded back when importing the dump. This doesn't change the dump version, as the file is optional and it missing will not cause the dump to fail

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-26 15:13:43 +00:00
Louis Dureuil
13e9b4c2e5 Add dump support 2023-06-26 16:29:43 +02:00
Louis Dureuil
5a83cecb0f fix tests 2023-06-26 16:29:43 +02:00
Louis Dureuil
cca6e47ec1 Errors when GETting metrics without the feature gate 2023-06-26 16:29:43 +02:00
Louis Dureuil
6196a53668 Gate score_details behind a runtime experimental feature flag 2023-06-26 16:29:43 +02:00
Louis Dureuil
bb6448dc2e Compute instance features from CLI options 2023-06-26 16:29:43 +02:00
Louis Dureuil
eef9293630 New route to set some experimental features 2023-06-26 16:29:43 +02:00
Louis Dureuil
dac77dfd14 Add new permissions for experimental-features route 2023-06-26 16:29:43 +02:00
Louis Dureuil
072d81843f Persistently save to DB the status of experimental features 2023-06-26 16:29:43 +02:00
Louis Dureuil
29ec02d4d4 Add meilisearch_types::features module 2023-06-26 16:09:03 +02:00
ManyTheFish
9d2a12821d Use insta snapshot 2023-06-26 14:56:19 +02:00
ManyTheFish
63ca25290b Take into account small Review requests 2023-06-26 14:56:19 +02:00
ManyTheFish
59f64a5256 Return an error when an attribute is not searchable 2023-06-26 14:56:19 +02:00
ManyTheFish
dc391deca0 Reverse assert comparison to have a consistent error message 2023-06-26 14:55:57 +02:00
ManyTheFish
114f878205 Rename restrictSearchableAttributes into attributesToSearchOn 2023-06-26 14:55:57 +02:00
ManyTheFish
42709ea9a5 Fix clippy warnings 2023-06-26 14:55:57 +02:00
ManyTheFish
993b0d012c Remove proximity_ranking_rule_order test, fixing this test would force us to create a fid_word_pair_proximity_docids and a fid_word_prefix_pair_proximity_docids databases which may multiply the keys of word_pair_proximity_docids and word_prefix_pair_proximity_docids by the number of attributes in the searchable_attributes list 2023-06-26 14:55:57 +02:00
ManyTheFish
fb8fa07169 Restrict field ids in search context 2023-06-26 14:55:57 +02:00
ManyTheFish
0ccf1e2e40 Allow the search cache to store owned values 2023-06-26 14:55:57 +02:00
ManyTheFish
9680e1e41f Introduce a BytesDecodeOwned trait in heed_codecs 2023-06-26 14:55:14 +02:00
ManyTheFish
a61ca4066e Add tests 2023-06-26 14:55:14 +02:00