Commit Graph

619 Commits

Author SHA1 Message Date
aa414565bb Fix proximity graph edge builder to include all proximities 2023-03-20 09:41:56 +01:00
1db152046e WIP on split words and synonyms support 2023-03-20 09:41:56 +01:00
c27ea2677f Rewrite cheapest path algorithm and empty path cache
It is now much simpler and has much better performance.
2023-03-20 09:41:56 +01:00
caa1e1b923 Add typo ranking rule to new search impl 2023-03-20 09:41:56 +01:00
71f18e4379 Add sort ranking rule to new search impl 2023-03-20 09:41:56 +01:00
600e3dd1c5 Remove warnings 2023-03-20 09:41:56 +01:00
362eb0de86 Add support for filters 2023-03-20 09:41:56 +01:00
998d46ac10 Add support for search offset and limit 2023-03-20 09:41:56 +01:00
6c85c0d95e Fix more bugs + visual empty path cache logging 2023-03-20 09:41:56 +01:00
0e1fbbf7c6 Fix bugs in query graph's "remove word" and "cheapest paths" algos 2023-03-20 09:41:56 +01:00
6806640ef0 Fix d2 description of paths map 2023-03-20 09:41:56 +01:00
173e37584c Improve the visual/detailed search logger 2023-03-20 09:41:55 +01:00
6ba4d5e987 Add a search logger 2023-03-20 09:41:55 +01:00
dd12d44134 Support swapped word pairs in new proximity ranking rule impl 2023-03-20 09:41:55 +01:00
c8e251bf24 Remove noise in codebase 2023-03-20 09:41:55 +01:00
a938fbde4a Use a cache when resolving the query graph 2023-03-20 09:41:55 +01:00
dcf3f1d18a Remove EdgeIndex and NodeIndex types, prefer u32 instead 2023-03-20 09:41:55 +01:00
66d0c63694 Add some documentation and use bitmaps instead of hashmaps when possible 2023-03-20 09:41:55 +01:00
132191360b Introduce the sort ranking rule working with the new search structures 2023-03-20 09:41:55 +01:00
345c99d5bd Introduce the words ranking rule working with the new search structures 2023-03-20 09:41:55 +01:00
89d696c1e3 Introduce the proximity ranking rule as a graph-based ranking rule 2023-03-20 09:41:55 +01:00
c645853529 Introduce a generic graph-based ranking rule 2023-03-20 09:41:55 +01:00
a70ab8b072 Introduce a function to find the K shortest paths in a graph 2023-03-20 09:41:55 +01:00
48aae76b15 Introduce a function to find the docids of a set of paths in a graph 2023-03-20 09:41:55 +01:00
23bf572dea Introduce cache structures used with ranking rule graphs 2023-03-20 09:41:55 +01:00
864f6410ed Introduce a structure to represent a set of graph paths efficiently 2023-03-20 09:41:55 +01:00
c9bf6bb2fa Introduce a structure to implement ranking rules with graph algorithms 2023-03-20 09:41:55 +01:00
46249ea901 Implement a function to find a QueryGraph's docids 2023-03-20 09:41:55 +01:00
ce0d1e0e13 Introduce a common way to manage the coordination between ranking rules 2023-03-20 09:41:55 +01:00
5065d8b0c1 Introduce a DatabaseCache to memorize the addresses of LMDB values 2023-03-20 09:41:55 +01:00
a83007c013 Introduce structure to represent search queries as graphs 2023-03-20 09:41:55 +01:00
79e0a6dd4e Introduce a new search module, eventually meant to replace the old one
The code here does not compile, because I am merely splitting one giant
commit into smaller ones where each commit explains a single file.
2023-03-20 09:41:55 +01:00
2d88089129 Remove unused term matching strategies 2023-03-20 09:41:55 +01:00
4f1ccbc495 Merge #3525
3525: Fix phrase search containing stop words r=ManyTheFish a=ManyTheFish

# Summary
A search with a phrase containing only stop words was returning an HTTP error 500,
this PR filters the phrase containing only stop words dropping them before the search starts, a query with a phrase containing only stop words now behaves like a placeholder search.

fixes https://github.com/meilisearch/meilisearch/issues/3521

related v1.0.2 PR on milli: https://github.com/meilisearch/milli/pull/779



Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-03-02 10:55:37 +00:00
37489fd495 Return an internal error in the case of matching word is invalid 2023-03-01 19:05:16 +01:00
ac5a1e4c4b Merge #3423
3423: Add min and max facet stats r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #3426

## What does this PR do?

### User standpoint

- When using a `facets` parameter in search, the facets that have numeric values are displayed in a new section of the response called `facetStats` that contains, per facet, the numeric min and max value of the hits returned by the search.

<details>
<summary>
Sample request/response
</summary>

```json
❯ curl \
  -X POST 'http://localhost:7700/indexes/meteorites/search?facets=mass' \
  -H 'Content-Type: application/json' \
  --data-binary '{ "q": "LL6", "facets":["mass", "recclass"], "limit": 5 }' | jsonxf
{
  "hits": [
    {
      "name": "Niger (LL6)",
      "id": "16975",
      "nametype": "Valid",
      "recclass": "LL6",
      "mass": 3.3,
      "fall": "Fell"
    },
    {
      "name": "Appley Bridge",
      "id": "2318",
      "nametype": "Valid",
      "recclass": "LL6",
      "mass": 15000,
      "fall": "Fell",
      "_geo": {
        "lat": 53.58333,
        "lng": -2.71667
      }
    },
    {
      "name": "Athens",
      "id": "4885",
      "nametype": "Valid",
      "recclass": "LL6",
      "mass": 265,
      "fall": "Fell",
      "_geo": {
        "lat": 34.75,
        "lng": -87.0
      }
    },
    {
      "name": "Bandong",
      "id": "4935",
      "nametype": "Valid",
      "recclass": "LL6",
      "mass": 11500,
      "fall": "Fell",
      "_geo": {
        "lat": -6.91667,
        "lng": 107.6
      }
    },
    {
      "name": "Benguerir",
      "id": "30443",
      "nametype": "Valid",
      "recclass": "LL6",
      "mass": 25000,
      "fall": "Fell",
      "_geo": {
        "lat": 32.25,
        "lng": -8.15
      }
    }
  ],
  "query": "LL6",
  "processingTimeMs": 15,
  "limit": 5,
  "offset": 0,
  "estimatedTotalHits": 42,
  "facetDistribution": {
    "mass": {
      "110000": 1,
      "11500": 1,
      "1161": 1,
      "12000": 1,
      "1215.5": 1,
      "127000": 1,
      "15000": 1,
      "1676": 1,
      "1700": 1,
      "1710.5": 1,
      "18000": 1,
      "19000": 1,
      "220000": 1,
      "2220": 1,
      "22300": 1,
      "25000": 2,
      "265": 1,
      "271000": 1,
      "2840": 1,
      "3.3": 1,
      "3000": 1,
      "303": 1,
      "32000": 1,
      "34000": 1,
      "36.1": 1,
      "45000": 1,
      "460": 1,
      "478": 1,
      "483": 1,
      "5500": 2,
      "600": 1,
      "6000": 1,
      "67.8": 1,
      "678": 1,
      "680.5": 1,
      "6930": 1,
      "8": 1,
      "8300": 1,
      "840": 1,
      "8400": 1
    },
    "recclass": {
      "L/LL6": 3,
      "LL6": 39
    }
  },
  "facetStats": {
    "mass": {
      "min": 3.3,
      "max": 271000.0
    }
  }
}
```

</details>

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-02-22 13:06:43 +00:00
900bae3d9d keep phrases that has at least one word 2023-02-21 18:16:51 +01:00
8aa808d51b Merge branch 'main' into enhance-language-detection 2023-02-20 18:14:34 +01:00
119e6d8811 Update milli/src/search/mod.rs
Co-authored-by: Tamo <tamo@meilisearch.com>
2023-02-20 15:33:10 +01:00
eb28d4c525 add facet test 2023-02-20 13:52:28 +01:00
9ac981d025 Remove some clippy type complexity warns by deboxing iters 2023-02-20 13:52:27 +01:00
74859ecd61 Add min and max facet stats 2023-02-20 13:52:27 +01:00
8ae441a4db Update usage of iterators 2023-02-20 13:52:27 +01:00
042d86cbb3 facet sort ascending/descending now also return the values 2023-02-20 13:52:27 +01:00
143e3cf948 Merge #3490
3490: Fix attributes set candidates r=curquiza a=ManyTheFish

# Pull Request

Fix attributes set candidates for v1.1.0

## details

The attribute criterion was not returning the remaining candidates when its internal algorithm was been exhausted.
We had a loss of candidates by the attribute criterion leading to the bug reported in the issue linked below.
After some investigation, it seems that it was the only criterion that had this behavior.

We are now returning the remaining candidates instead of an empty bitmap.

## Related issue

Fixes #3483
PR on milli for v1.0.1: https://github.com/meilisearch/milli/pull/777


Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-02-15 17:38:07 +00:00
a53536836b fmt 2023-02-14 17:04:22 +01:00
d7ad39ad77 fix: clippy error 2023-02-14 00:15:35 +01:00
7481559e8b move BadGeo to FilterError 2023-02-14 00:15:35 +01:00
83c765ce6c implement From<ParseGeoError> for FilterError 2023-02-14 00:15:35 +01:00
825923f6fc export ParseGeoError 2023-02-14 00:15:35 +01:00