Commit Graph

1836 Commits

Author SHA1 Message Date
Loïc Lecrenier
5758268866 Don't compute split_words for phrases 2023-05-16 17:01:18 +02:00
Loïc Lecrenier
3e19702de6 Update snapshot tests 2023-05-16 12:22:46 +02:00
meili-bors[bot]
1e762d151f Merge #3755
3755: Re-add final dot r=curquiza a=ManyTheFish

I removed the final dot of the error message in my last PR, this one re-adds it.

related to https://github.com/meilisearch/meilisearch/pull/3749

> Oups 😬 

Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-05-16 10:10:58 +00:00
Loïc Lecrenier
f6524a6858 Adjust costs of edges in position ranking rule
To ensure good performance
2023-05-16 11:28:56 +02:00
meili-bors[bot]
65ad8cce36 Merge #3741
3741: Add ngram support to the highlighter r=ManyTheFish a=loiclec

This PR fixes a bug introduced by the search refactor, where ngrams were not highlighted. 

The solution was to add the ngrams to the vector of `LocatedQueryTerm` that is given to the `MatchingWords` structure.

Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2023-05-16 09:03:31 +00:00
ManyTheFish
42650f82e8 Re-add final dot 2023-05-16 10:57:26 +02:00
Loïc Lecrenier
a37da36766 Implement words as a graph-based ranking rule and fix some bugs 2023-05-16 10:42:11 +02:00
Loïc Lecrenier
85d96d35a8 Highlight ngram matches as well 2023-05-16 10:39:36 +02:00
meili-bors[bot]
bf66e97b48 Merge #3749
3749: Fix back: sort error message r=ManyTheFish a=ManyTheFish

This PR reintroduces the error message modified in https://github.com/meilisearch/milli/pull/375.
However, this added double-quotes around `sort` in the message. I don't think another message contains double-quotes, so I have added a separate commit replacing the double-quotes with back-ticks, which seems more consistent with the other error messages, this last change can be reverted easily.

## Detailed changes
#### v1.2-rc0
```
The sort ranking rule must be specified in the ranking rules settings to use the sort parameter at search time.
```
#### [Reintroduce fix (previous and expected behavior)](23d1c86825)
```
You must specify where "sort" is listed in the rankingRules setting to use the sort parameter at search time
```
#### [Replace double-quotes with back-ticks (my suggestion)](4d691d071a)
```
You must specify where `sort` is listed in the rankingRules setting to use the sort parameter at search time
```

## Related

Fixes #3722

## Reviewers

- technical review: `@irevoire`
- to validate the replacement: `@macraig`

Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-05-15 14:55:51 +00:00
Kerollmops
1a79fd0c3c Use the new heed v0.12.6 2023-05-15 11:42:30 +02:00
Kerollmops
f759ec7fad Expose a flag to enable the MDB_WRITEMAP flag 2023-05-15 11:38:43 +02:00
ManyTheFish
4d691d071a Change double-quotes by back-ticks in sort error message 2023-05-15 11:10:36 +02:00
ManyTheFish
23d1c86825 Re-introduce the sort error message fix 2023-05-15 11:07:23 +02:00
Kerollmops
c4a40e7110 Use the writemap flag to reduce the memory usage 2023-05-15 10:15:33 +02:00
Loïc Lecrenier
4d352a21ac Compute split words derivations of terms that don't accept typos 2023-05-10 13:31:19 +02:00
Loïc Lecrenier
3625389057 Highlight ngram matches as well 2023-05-08 15:35:41 +02:00
meili-bors[bot]
eace6df91b Merge #3726
3726: Fix prefix highlighting r=loiclec a=ManyTheFish

The prefix queries were not properly highlighted, this PR now highlights only the start of a word when it matched with a prefix

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
2023-05-08 07:46:46 +00:00
Loïc Lecrenier
83ab8cf4e5 Remove dbg!(..) expression in highlighter tests 2023-05-08 09:45:23 +02:00
ManyTheFish
cd2573fcc3 Fix prefix highlighting 2023-05-04 16:53:50 +02:00
meili-bors[bot]
9f7981df28 Merge #3687
3687: Allow to disable specialized tokenizations (again) r=Kerollmops a=jirutka

In PR #2773, I added the `chinese`, `hebrew`, `japanese` and `thai` feature flags to allow melisearch to be built without huge specialed tokenizations that took up 90% of the melisearch binary size. Unfortunately, due to some recent changes, this doesn't work anymore. The problem lies in excessive use of the `default` feature flag, which infects the dependency graph.

Instead of adding `default-features = false` here and there, it's easier and more future-proof to not declare `default` in `milli` and `meilisearch-types`. I've renamed it to `all-tokenizers`, which also makes it a bit clearer what it's about.


Co-authored-by: Jakub Jirutka <jakub@jirutka.cz>
2023-05-04 14:48:01 +00:00
Jakub Jirutka
e615fa5ec6 Fix unused_imports warning in milli when japanese is not enabled 2023-05-04 15:46:11 +02:00
Jakub Jirutka
13f1277637 Allow to disable specialized tokenizations (again)
In PR #2773, I added the `chinese`, `hebrew`, `japanese` and `thai`
feature flags to allow melisearch to be built without huge specialed
tokenizations that took up 90% of the melisearch binary size.
Unfortunately, due to some recent changes, this doesn't work anymore.
The problem lies in excessive use of the `default` feature flag, which
infects the dependency graph.

Instead of adding `default-features = false` here and there, it's easier
and more future-proof to not declare `default` in `milli` and
`meilisearch-types`. I've renamed it to `all-tokenizers`, which also
makes it a bit clearer what it's about.
2023-05-04 15:45:40 +02:00
Louis Dureuil
a35d3fc708 Add Index::iter_documents 2023-05-04 15:31:54 +02:00
Louis Dureuil
732c52093d Processing time without autobatching implementation 2023-05-03 17:41:48 +02:00
Louis Dureuil
f8f190cd40 Update exactness tests following charabia camelCase tokenization 2023-05-03 14:45:09 +02:00
Louis Dureuil
3a408e8287 Increase map size for tests following charabia camelCase tokenization 2023-05-03 14:44:48 +02:00
Louis Dureuil
d3e5b10e23 fix nb of dbs 2023-05-03 14:11:20 +02:00
Louis Dureuil
1aaf24ccbf Cargo fmt 2023-05-03 12:21:58 +02:00
Louis Dureuil
90bc230820 Merge remote-tracking branch 'origin/main' into search-refactor
Conflicts | resolution
----------|-----------
Cargo.lock | added mimalloc
Cargo.toml |  took origin/main version
milli/src/search/criteria/exactness.rs | deleted after checking it was only clippy changes
milli/src/search/query_tree.rs | deleted after checking it was only clippy changes
2023-05-03 12:19:06 +02:00
Louis Dureuil
342c4ff85d geosort: Remove rtree unwrap 2023-05-03 09:52:16 +02:00
Tamo
c85392ce40 make the descendent geosort fast 2023-05-03 09:13:12 +02:00
Tamo
8875d24a48 deserialize the rtree only when its needed, and keep it in memory once it has been deserialized 2023-05-03 09:13:12 +02:00
Tamo
c470b67fa2 revamp the test to use execute_iterative_and_rtree_returns_the_same 2023-05-03 09:13:12 +02:00
meili-bors[bot]
c0e081cd98 Merge #3702 #3710
3702: Update charabia v0.7.2 r=curquiza a=ManyTheFish

fixes #3701
fixes #3689
fixes #3285 

3710: Updated messages pointing to the docs website r=curquiza a=roy9495

# Pull Request

Fixes partially #3668

## What does this PR do?
- ...Any messages referencing this docs site https://docs.meilisearch.com has been changed to this docs site https://meilisearch.com/docs .
 Thanks.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: TATHAGATA ROY <98920199+roy9495@users.noreply.github.com>
2023-05-02 17:27:57 +00:00
Louis Dureuil
b60840ebff Remove self.iterating from words 2023-05-02 18:54:23 +02:00
Louis Dureuil
fdc1763838 Use MultiOps for resolve_query_graph 2023-05-02 18:54:09 +02:00
Louis Dureuil
75819bc940 Remove too many arguments on resolve_maximally_reduced_query_graph 2023-05-02 18:53:40 +02:00
Louis Dureuil
7b8cc25625 rename located_query_terms_from_string -> located_query_terms_from_tokens 2023-05-02 18:53:01 +02:00
Loïc Lecrenier
aa63091752 Fix bug in exact_attribute 2023-05-02 10:48:32 +02:00
Loïc Lecrenier
58735d6d8f Fix outdated relevancy test 2023-05-02 10:48:32 +02:00
Loïc Lecrenier
1b514517f5 Fix bug in computation of query term at a position 2023-05-02 10:48:32 +02:00
Loïc Lecrenier
11f814821d Minor cleanup 2023-05-02 10:48:32 +02:00
Loïc Lecrenier
30fb1153cc Speed up graph based ranking rule when a lot of different costs exist 2023-05-02 09:59:42 +02:00
Loïc Lecrenier
3b2c8b9f25 Improve performance of position rr 2023-05-02 09:59:42 +02:00
Loïc Lecrenier
2a7f9adf78 Build query graph more correctly from paths
Update snapshots
2023-05-02 09:59:42 +02:00
Loïc Lecrenier
608ceea440 Fix bug in position rr 2023-05-02 09:59:42 +02:00
Loïc Lecrenier
79001b9c97 Improve performance of the cheapest path finder algorithm 2023-05-02 09:59:42 +02:00
Loïc Lecrenier
59b12fca87 Fix errors, clippy warnings, and add review comments 2023-04-29 11:48:11 +02:00
Loïc Lecrenier
48f5bb1693 Implements the geo-sort ranking rule 2023-04-29 11:02:16 +02:00
Loïc Lecrenier
93188b3c88 Fix indexing of word_prefix_fid_docids 2023-04-29 10:56:48 +02:00