Commit Graph

8229 Commits

Author SHA1 Message Date
34b2e98fe9 Expose a sortFacetValuesBy parameter to the user 2023-06-29 14:33:00 +02:00
80bbd4b6f3 Clean and make the facet order configurable internally 2023-06-29 14:31:17 +02:00
f42bef2f66 Make the search to always return the facets ordered by count 2023-06-29 14:31:17 +02:00
bd3c026406 First to-test version of the algorithm 2023-06-29 14:31:17 +02:00
84f8938f33 Rename facet distribution to be explicit on the order to find them 2023-06-29 14:31:15 +02:00
34a07110de Merge #3864
3864: Remove `/experimental-features` verbs that weren't in the PRD r=dureuill a=dureuill

Removes:

- POST `/experimental-features`
- DELETE `/experimental-features`

keeping only:

- PATCH `/experimental-features`
- GET `/experimental-features`

The two routes that are described in the PRD.

Following `@guimachiavelli's` [question](https://github.com/meilisearch/documentation/issues/2482#issuecomment-1611845372) about the POST route.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-29 09:43:14 +00:00
73bb080a26 Merge #3699
3699: Search for Facet Values r=Kerollmops a=Kerollmops

This PR introduces the first version of [the _Search for Facet Values_ feature](https://github.com/meilisearch/product/discussions/515) that allows a user to search for facets, by optionally using a prefix string and optionally specifying the `q` and `filter` original search parameters to restrict the candidates to search in.

The steps to merge it into Meilisearch will first start by providing prototype Docker images. This way users will be able to test the prototypes before using them.

The current route to use the _Search for Facet Values_ feature is the `POST /indexes/{index}/facet-search` where the body is a JSON object that looks like the following:
```json5
{
  "q": "spiderman", // optional
  "filter": "rating > 10", // optional
  "facetName": "genres",
  "facetQuery": "a" // optional
}
```

## What is missing?

 - [x] Send some analytics.
 - [x] Support the `matchingStrategy` parameter.
 - [x] Make sure that the errors are the right ones.
 - [x] Use the [Index typo tolerance settings](https://www.meilisearch.com/docs/learn/configuration/typo_tolerance#minwordsizefortypos) when matching facet values.
    - [x] minWordSizeForTypos.oneTypo
    - [x] minWordSizeForTypos.twoTypo
 - [x] Add tests
 - [x] Log the time it took to compute the results.
 - [x] Fix the compilation warnings.
 - [x] [Create an issue to fix potential performance issues when indexing](https://github.com/meilisearch/meilisearch/issues/3862).


Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2023-06-29 09:08:55 +00:00
44b5b9e1a7 Improve the documentation of the FacetSearchQuery struct 2023-06-29 10:28:23 +02:00
68356869c0 Remove /experimental-features verbs that weren't in the PRD 2023-06-29 10:02:55 +02:00
605c1dd54a Fix analytics 2023-06-28 16:41:56 +02:00
3e3f73ba1e Fix the analytics 2023-06-28 15:45:09 +02:00
efbe7ce78b Clean the facet string FSTs when we clear the documents 2023-06-28 15:36:32 +02:00
82e1f59f1e Add attributes_to_search_on 2023-06-28 15:28:24 +02:00
362e9ff845 Add more tests 2023-06-28 15:28:24 +02:00
32f2556d22 Move the additional_search_parameters_provided analytic inside facets 2023-06-28 15:06:09 +02:00
63fd10aaa5 Fix the invalid facet name field error code 2023-06-28 15:06:09 +02:00
29b40295b8 Ignore unknown facet search query parameters 2023-06-28 15:06:09 +02:00
26f0fa678d Change the error message when a facet is not searchable 2023-06-28 15:06:09 +02:00
60ddd53439 Return one of the original facet values when doing a facet search 2023-06-28 15:06:09 +02:00
2bcd8d2983 Make sure the facet queries are normalized 2023-06-28 15:06:09 +02:00
09079a4e88 Remove useless InvalidSearchFacet error 2023-06-28 15:06:09 +02:00
904f6574bf Make rustfmt happy 2023-06-28 15:06:08 +02:00
6fb8af423c Rename the hits and query output into facetHits and facetQuery respectively 2023-06-28 15:06:08 +02:00
cb0bb399fa Fix the error code returned when the facetName field is missing 2023-06-28 15:06:08 +02:00
41760a9306 Introduce a new invalid_facet_search_facet_name error code 2023-06-28 15:06:07 +02:00
e9a3029c30 Use the right field id to write the string facet values FST 2023-06-28 15:01:51 +02:00
ed0ff47551 Return an empty list of results if attribute is set as filterable 2023-06-28 15:01:51 +02:00
e1b8fb48ee Use the minWordSizeForTypos index settings 2023-06-28 15:01:51 +02:00
87e22e436a Fix compilation issues 2023-06-28 15:01:51 +02:00
0252cfe8b6 Simplify the placeholder search of the facet-search route 2023-06-28 15:01:50 +02:00
f35ad96afa Use the disableOnAttributes parameter on the facet-search route 2023-06-28 15:01:50 +02:00
2ceb781c73 Use the disableOnWords parameter on the facet-search route 2023-06-28 15:01:50 +02:00
7bd67543dd Support the typoTolerant.enabled parameter 2023-06-28 15:01:50 +02:00
8e86eb91bb Log an error when a facet value is missing from the database 2023-06-28 15:01:50 +02:00
55c17aa38b Rename the SearchForFacetValues struct 2023-06-28 15:01:50 +02:00
aadbe88048 Return an internal error when a field id is missing 2023-06-28 15:01:50 +02:00
f36de2115f Make clippy happy 2023-06-28 15:01:50 +02:00
702041b7e1 Improve the returned errors from the facet-search route 2023-06-28 15:01:48 +02:00
a05074e675 Fix the max number of facets to be returned to 100 2023-06-28 14:58:42 +02:00
93f30e65a9 Return the correct response JSON object from the facet-search route 2023-06-28 14:58:42 +02:00
893592c5e9 Send analytics about the facet-search route 2023-06-28 14:58:42 +02:00
e81809aae7 Make the search for facet work 2023-06-28 14:58:41 +02:00
ce7e7f12c8 Introduce the facet search route 2023-06-28 14:58:41 +02:00
addb21f110 Restrict the number of facet search results to 1000 2023-06-28 14:58:41 +02:00
c34de05106 Introduce the SearchForFacetValue struct 2023-06-28 14:58:41 +02:00
15a4c05379 Store the facet string values in multiple FSTs 2023-06-28 14:58:41 +02:00
9deeec88e0 Merge #3861
3861: Add "meilisearch" prefix to last metrics that were missing it r=Kerollmops a=dureuill

# Pull Request

## Related issue
Related to #3790 

## What does this PR do?
- change implementation to follow the spec on metrics name
- regenerate grafana dashboard from the code

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-06-28 09:28:31 +00:00
167ac55a2d Update dashboard generated from grafana 2023-06-28 11:22:16 +02:00
ea68ccd034 prefix http_* metrics by meilisearch 2023-06-28 11:21:50 +02:00
d4f10800f2 Merge #3834
3834: Define searchable fields at runtime r=Kerollmops a=ManyTheFish

## Summary
This feature allows the end-user to search in one or multiple attributes using the search parameter `attributesToSearchOn`:

```json
{
  "q": "Captain Marvel",
  "attributesToSearchOn": ["title"]
}
```

This feature act like a filter, forcing Meilisearch to only return the documents containing the requested words in the attributes-to-search-on. Note that, with the matching strategy `last`, Meilisearch will only ensure that the first word is in the attributes-to-search-on, but, the retrieved documents will be ordered taking into account the word contained in the attributes-to-search-on. 

## Trying the prototype

A dedicated docker image has been released for this feature:

#### last prototype version:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-1
```

#### others prototype versions:

```bash
docker pull getmeili/meilisearch:prototype-define-searchable-fields-at-search-time-0
```

## Technical Detail

The attributes-to-search-on list is given to the search context, then, the search context uses the `fid_word_docids`database using only the allowed field ids instead of the global `word_docids` database. This is the same for the prefix databases.
The database cache is updated with the merged values, meaning that the union of the field-id-database values is only made if the requested key is missing from the cache.

### Relevancy limits

Almost all ranking rules behave as expected when ordering the documents.
Only `proximity` could miss-order documents if all the searched words are in the restricted attribute but a better proximity is found in an ignored attribute in a document that should be ranked lower. I put below a failing test showing it:
```rust
#[actix_rt::test]
async fn proximity_ranking_rule_order() {
    let server = Server::new().await;
    let index = index_with_documents(
        &server,
        &json!([
        {
            "title": "Captain super mega cool. A Marvel story",
            // Perfect distance between words in an ignored attribute
            "desc": "Captain Marvel",
            "id": "1",
        },
        {
            "title": "Captain America from Marvel",
            "desc": "a Shazam ersatz",
            "id": "2",
        }]),
    )
    .await;

    // Document 2 should appear before document 1.
    index
        .search(json!({"q": "Captain Marvel", "attributesToSearchOn": ["title"], "attributesToRetrieve": ["id"]}), |response, code| {
            assert_eq!(code, 200, "{}", response);
            assert_eq!(
                response["hits"],
                json!([
                    {"id": "2"},
                    {"id": "1"},
                ])
            );
        })
        .await;
}
```

Fixing this would force us to create a `fid_word_pair_proximity_docids` and a `fid_word_prefix_pair_proximity_docids` databases which may multiply the keys of `word_pair_proximity_docids` and `word_prefix_pair_proximity_docids` by the number of attributes in the searchable_attributes list. If we think we should fix this test, I'll suggest doing it in another PR.

## Related

Fixes #3772

Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-06-28 08:19:23 +00:00