12664 Commits

Author SHA1 Message Date
bors[bot]
f30979d021 Merge #662
662: Enhance word splitting strategy r=ManyTheFish a=akki1306

# Pull Request

## Related issue
Fixes #648 

## What does this PR do?
- Change [split_best_frequency](55d889522b/milli/src/search/query_tree.rs (L282-L301)) to use the frequency of word pairs at a proximity of 1 instead of the frequency of the individual words. The split whose word pair has the highest frequency is selected.
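
A minimal sketch of the idea, not the actual milli implementation: the in-memory `HashMap` and the standalone `word_pair_frequency` helper below are hypothetical stand-ins for the index lookup, and the function signature is assumed for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the index lookup: how many documents contain
/// `left` immediately followed by `right` (proximity 1).
fn word_pair_frequency(index: &HashMap<(String, String), u64>, left: &str, right: &str) -> u64 {
    index
        .get(&(left.to_string(), right.to_string()))
        .copied()
        .unwrap_or(0)
}

/// Try every split point of `word` and keep the pair with the highest
/// pair frequency. Returns `None` when no split occurs in the index.
fn split_best_frequency<'a>(
    index: &HashMap<(String, String), u64>,
    word: &'a str,
) -> Option<(&'a str, &'a str)> {
    let mut best: Option<(u64, &'a str, &'a str)> = None;
    // Skip the first char boundary so the left half is never empty.
    for (i, _) in word.char_indices().skip(1) {
        let (left, right) = word.split_at(i);
        let freq = word_pair_frequency(index, left, right);
        if freq > 0 && best.map_or(true, |(max, _, _)| freq > max) {
            best = Some((freq, left, right));
        }
    }
    best.map(|(_, left, right)| (left, right))
}

fn main() {
    let mut index = HashMap::new();
    index.insert(("sun".to_string(), "flower".to_string()), 12);
    index.insert(("sunf".to_string(), "lower".to_string()), 3);
    println!("{:?}", split_best_frequency(&index, "sunflower"));
}
```

Ranking by pair frequency rather than by each half's own frequency avoids picking two words that are individually common but never appear next to each other.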

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

Co-authored-by: Akshay Kulkarni <akshayk.gj@gmail.com>
2022-10-13 08:14:22 +00:00
Akshay Kulkarni
85f3028317 remove underscore and introduce back word_documents_count 2022-10-13 13:21:59 +05:30
Akshay Kulkarni
8195fc6141 revert removal of word_documents_count method 2022-10-13 13:14:27 +05:30
Akshay Kulkarni
32f825d442 move default implementation of word_pair_frequency to TestContext 2022-10-13 12:57:50 +05:30
Akshay Kulkarni
ff8b2d4422 formatting 2022-10-13 12:44:08 +05:30
Akshay Kulkarni
6cb8b46900 use word_pair_frequency and remove word_documents_count 2022-10-13 12:43:11 +05:30
Andrey "MOU" Larionov
b69f8d67c3 Added test to verify response encoding
Alongside request encoding (compression) support, it is helpful to verify that the server respects `Accept-Encoding` headers and applies the corresponding compression to responses.
2022-10-13 00:56:57 +02:00
Andrey "MOU" Larionov
99e2788ee7 Fix Cargo.toml formatting 2022-10-12 21:12:18 +02:00
Akshay Kulkarni
8c9245149e format file 2022-10-12 15:27:56 +05:30
bors[bot]
2000f7958d Merge #604
604: Speed up debug builds r=Kerollmops a=loiclec

Note: this draft PR is based on https://github.com/meilisearch/milli/pull/601 , for no particular reason.

## What does this PR do?
Make a series of changes with the goal of speeding up debug builds:

1. Add an `all_languages` feature which compiles charabia with its `default` features activated.
The `all_languages` feature is activated by default. But running:
```
cargo build --no-default-features
```
on `milli` is now much faster.

2. Reduce the debug optimisation level from 3 to 0, except for a few critical dependencies.

3.  Compile the build dependencies quicker as well. Previously, all build dependencies were compiled with `opt-level = 3`. Now, only the critical build dependencies are compiled with optimisations.

4. Reduce the amount of code generated by the `documents!` macro

5. Make the "progress update" closure provided to indexing functions a trait object instead of a generic parameter. This avoids monomorphising the indexing code multiple times needlessly.
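
Point 5 can be illustrated with a small sketch. The function names and the `usize` progress value are hypothetical and are not milli's actual indexing API; the sketch only contrasts a generic callback parameter with a trait object.

```rust
/// Generic version: the compiler emits a separate copy of this function
/// for every distinct closure type it is called with.
fn index_generic<F: Fn(usize)>(docs: &[&str], progress: F) -> usize {
    for (i, _doc) in docs.iter().enumerate() {
        progress(i + 1);
    }
    docs.len()
}

/// Trait-object version: compiled once; the callback is invoked through
/// dynamic dispatch, whatever closure the caller passes.
fn index_dyn(docs: &[&str], progress: &dyn Fn(usize)) -> usize {
    for (i, _doc) in docs.iter().enumerate() {
        progress(i + 1);
    }
    docs.len()
}

fn main() {
    let docs = ["a", "b", "c"];
    let seen = std::cell::Cell::new(0);
    let count = index_dyn(&docs, &|done| seen.set(done));
    println!("indexed {} documents, last progress update: {}", count, seen.get());
    // The generic version behaves the same, but is monomorphised per closure.
    index_generic(&docs, |done| seen.set(done));
}
```

The dynamic call adds a small indirection per progress update, which is usually negligible next to the indexing work itself.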

## Results
Initial build times on my computer before and after these changes:
|        | cargo check | cargo check --no-default-features | cargo test | cargo test --lib | cargo test --no-default-features | cargo test --lib --no-default-features |
|--------|-------------|-----------------------------------|------------|------------------|----------------------------------|----------------------------------------|
| before | 1m05s       | 1m05s                             | 2m06s      | 1m47s            | 2m06s                            | 1m47s                                  |
| after  | 28.9s       | 13.1s                             | 40s        | 38s              | 23s                              | 21s                                    |



Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
2022-10-12 08:54:48 +00:00
Akshay Kulkarni
63e79a9039 update comment 2022-10-12 13:36:48 +05:30
Akshay Kulkarni
7f9680f0a0 Enhance word splitting strategy 2022-10-12 13:18:23 +05:30
Clémentine Urquizar - curqui
a9e6a8901b Fix CI to send signal to Cloud team 2022-10-12 09:36:01 +02:00
Loïc Lecrenier
53503f09ca Make milli's default features optional in other executable targets 2022-10-12 09:22:05 +02:00
Loïc Lecrenier
6fbf5dac68 Simplify documents! macro to reduce compile times 2022-10-12 09:22:05 +02:00
Loïc Lecrenier
98fc093823 Optimize a few performance sensitive dependencies on debug builds 2022-10-12 09:22:05 +02:00
Loïc Lecrenier
5cfb5df31e Set opt-level to 0 for debug builds
But speed up compile times by optimising build dependencies of lindera
2022-10-12 09:22:05 +02:00
Lawrence Chou
3c3ae3ff98 Improve invalid config_file_path handling
1. Besides opt.config_file_path, also consider MEILI_CONFIG_FILE_PATH in the Err path because they are both user input.
2. Print out the incorrect file path in error message.
3. Add tests
https://github.com/meilisearch/meilisearch/pull/2804#discussion_r991999888
2022-10-12 12:04:48 +08:00
bors[bot]
343828a76e Merge #2890
2890: fix: add handle dumpCreation query on tasks request r=Kerollmops a=washbin

# Pull Request

## Related issue
Fixes #2874 

## What does this PR do?
- add missing `DumpCreation` type in tasks route

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Co-authored-by: washbin <76929116+washbin@users.noreply.github.com>
2022-10-11 19:09:34 +00:00
washbin
72c1aef1c4 fix: add handle dumpCreation query on tasks request 2022-10-11 19:36:04 +05:45
Lawrence Chou
91accc0194 Fix default config file path typo 2022-10-11 21:36:17 +08:00
bors[bot]
0f024e7d97 Merge #2882
2882: Bring back v0.29.1 changes into `main` r=Kerollmops a=curquiza

Following v0.29.1 release

Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
2022-10-10 16:27:22 +00:00
bors[bot]
55d889522b Merge #658
658: Add proximity calculation for the same word r=ManyTheFish a=msvaljek

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/milli/issues/647

## What does this PR do?
- During [the increase of the current word position](d94339a858/milli/src/update/index_documents/extract/extract_word_pair_proximity_docids.rs (L129-L135)) we extract the proximity between the current position and the next one.

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: msvaljek <marko.svaljek@commercetools.com>
2022-10-10 13:33:58 +00:00
Clémentine Urquizar
3ebd88c03b Revert "Comment cache steps in jobs"
This reverts commit f513ac1233.
v0.29.1
2022-10-10 14:46:54 +02:00
bors[bot]
c958097e99 Merge #2862
2862: Use Ubuntu 18.04 for all CI tasks that previously used Ubuntu 20.04 r=curquiza a=loiclec

This is to prevent linking with a version of glibc that is too recent.

With meilisearch v0.29.0 we inadvertently bumped the minimum supported glibc version to 2.29, which means it couldn't be run from Debian 10 (for example) anymore. By using Ubuntu 18.04, which uses glibc 2.27, we restore support for older Linux distros.

Fixes #2850

Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-10-10 14:42:18 +02:00
Clémentine Urquizar
f513ac1233 Comment cache steps in jobs 2022-10-10 14:25:24 +02:00
Loïc Lecrenier
97c202db51 Update version for next release (v0.29.1) 2022-10-10 14:25:24 +02:00
Loïc Lecrenier
a5e23aa6e4 Use Ubuntu 18.04 for all CI tasks that previously used Ubuntu 20.04
This is to prevent linking with a version of glibc that is too recent.

With meilisearch v0.29.0 we inadvertently bumped the minimum supported
glibc version to 2.29, which means it couldn't be run from Debian 10
(for example) anymore. By using Ubuntu 18.04, which uses glibc 2.27, we
restore support for older Linux distros.
2022-10-10 14:25:18 +02:00
bors[bot]
c5cd743eb6 Merge #2868
2868: Uncomment cache steps in Github CI r=curquiza a=AM1TB

# Pull Request

Uncomment cache steps as they were previously affected by an issue with Github actions: https://www.githubstatus.com/incidents/gq1x0j8bv67v

## Related issue
Fixes #2864 

## What does this PR do?
- Reintroduce the rust-cache steps within Github CI.

## PR checklist
Please check if your PR fulfills the following requirements:
- [X] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [X] Have you read the contributing guidelines?
- [X] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Amit Banerjee <amit.banerjee.jsr@gmail.com>
2022-10-10 11:33:37 +00:00
Andrey "MOU" Larionov
343a677566 Fix formatting and apply clippy suggestion 2022-10-10 11:04:46 +02:00
Andrey "MOU" Larionov
9dbc71cb6d Added support for encoded payload
Actix provides different content encodings out of the box, but only if we use built-in content wrappers and containers. This patch wraps its own Payload implementation with an actix decoder, which enables request compression support.
2022-10-09 22:09:30 +02:00
Andrey "MOU" Larionov
11b986a81d Added support for specifying compression in tests
Refactored the test code to allow specifying the compression (content-encoding) algorithm.

Added tests to verify that actix actually handles different content encodings properly.
2022-10-09 22:09:29 +02:00
Andrey "MOU" Larionov
7607a62531 Split tests over two modules
Currently, `add_documents` contains a number of `update` tests. This change aligns the test structure with the `index` module.
2022-10-09 22:03:22 +02:00
bors[bot]
c883b23cca Merge #2861
2861: Change default bind address to localhost r=Kerollmops a=Fall1ngStar

# Pull Request

## Related issue
Fixes #2782

## What does this PR do?
- Change the default bind address to `localhost` so that it can be accessed with IPv6

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Fall1ngStar <fall1ngstar.public@gmail.com>
2022-10-07 19:49:39 +00:00
msvaljek
762e320c35 Add proximity calculation for the same word 2022-10-07 12:59:12 +02:00
Fall1ngStar
d1c10d6d68 Fix segment_analytics default_http_addr import 2022-10-06 22:42:20 -04:00
Amit Banerjee
1f40c3e48c Uncomment cache steps in jobs 2022-10-06 22:32:29 +05:30
Lawrence Chou
2681e92d4e Support MEILI_CONFIG_FILE_PATH to define config file path
Close #2800

This is an alternative to the `--config-file-path` option. If both `--config-file-path` and `MEILI_CONFIG_FILE_PATH` are present, `--config-file-path` takes precedence according to the "Priority order" section of #2558.
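
A minimal sketch of the precedence rule described above. The `resolve_config_file_path` function is hypothetical and is not the code in this commit; it only shows the command-line value winning over the environment variable.

```rust
use std::path::PathBuf;

/// Pick the config file path: the CLI value wins, the environment
/// variable is only used as a fallback. (Hypothetical helper.)
fn resolve_config_file_path(cli: Option<PathBuf>, env: Option<String>) -> Option<PathBuf> {
    cli.or_else(|| env.map(PathBuf::from))
}

fn main() {
    // Assumption: in the real binary this value would come from parsing
    // `--config-file-path`; here it is simply absent.
    let cli = None;
    let env = std::env::var("MEILI_CONFIG_FILE_PATH").ok();
    println!("{:?}", resolve_config_file_path(cli, env));
}
```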
2022-10-07 00:41:14 +08:00
Lawrence Chou
da25328c2b Fix clap ArgGroup typos
Not sure why, but the compiler didn't catch this until clap was upgraded to v4.
The following is the error from 'cargo test':

running 2 tests
test routes::indexes::search::test::test_fix_sort_query_parameters ... ok
test option::test::test_valid_opt ... FAILED

failures:

---- option::test::test_valid_opt stdout ----
thread 'option::test::test_valid_opt' panicked at 'Command meilisearch-http: Argument or group 'import-snapshot' specified in 'requires*' for 'ignore_missing_snapshot' does not exist', /Users/ychou/.cargo/registry/src/github.com-1ecc6299db9ec823/clap-4.0.9/src/builder/debug_asserts.rs:152:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

failures:
    option::test::test_valid_opt

test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
2022-10-07 00:32:26 +08:00
Lawrence Chou
9e5ef8eb69 Upgrade clap to v4
Close #2846

4.0.0 changelog: 'https://github.com/clap-rs/clap/blob/master/CHANGELOG.md#400---2022-09-28'

I followed the [Migrating steps](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md#migrating) and the only issues I encountered are:
1. The typo problem in previous commit "Fix clap ArgGroup typo"
2. I can't say I am 100% sure every [Subtle changes](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md#breaking-changes) is fine for our use case, but at least after a quick read I didn't notice anything actionable.
2022-10-07 00:32:25 +08:00
Lawrence Chou
6285c5949c Fix clap v4 deprecation warning
Following the 3rd step of the [clap v4 migration instructions](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md#migrating)
The result for 'cargo check --features clap/deprecated' is https://user-images.githubusercontent.com/12410942/193825216-ac680574-f53b-49c0-88c4-8bc42c4c6381.png
2022-10-07 00:15:53 +08:00
Lawrence Chou
b55ec7db4d Upgrade clap to 3.2.8
Upgrade to the latest version of v3 before upgrading to v4
2022-10-07 00:04:21 +08:00
bors[bot]
1b72eba1f3 Merge #2867
2867: Bring back `stable` into `main` r=Kerollmops a=curquiza

Following hotfix for v0.29.1

Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
2022-10-06 14:14:23 +00:00
bors[bot]
c98d3209ad Merge #2839
2839: Update internal CLI documentation r=curquiza a=jeertmans

# Pull Request

## Related issue
Fixes #2810

## What does this PR do?
- Make internal CLI documentation match the online one

## Remarks
- Scope is limited to `meilisearch/meilisearch-http/src/option.rs`, not the flattened structs: should I also take care of them?
- Could not find online docs for `enable_metrics_route`
- Max column width wrapping was done by hand, so may not be perfect
- I removed the links from the internal doc: should I put them back?

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?


Co-authored-by: Jérome Eertmans <jeertmans@icloud.com>
2022-10-06 13:32:32 +00:00
Jérome Eertmans
abb0233077 chore: add docs of flattened structs 2022-10-06 15:14:56 +02:00
Jérome Eertmans
8c526c31da Update meilisearch-http/src/option.rs
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-10-06 15:08:37 +02:00
bors[bot]
ed6c2fb22c Merge #2862
2862: Use Ubuntu 18.04 for all CI tasks that previously used Ubuntu 20.04 r=curquiza a=loiclec

This is to prevent linking with a version of glibc that is too recent.

With meilisearch v0.29.0 we inadvertently bumped the minimum supported glibc version to 2.29, which means it couldn't be run from Debian 10 (for example) anymore. By using Ubuntu 18.04, which uses glibc 2.27, we restore support for older Linux distros.

Fixes #2850 

Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-10-06 12:18:38 +00:00
Clémentine Urquizar
ab17c0acd5 Comment cache steps in jobs 2022-10-06 14:03:04 +02:00
bors[bot]
425692287d Merge #2841
2841: Bail if config file contains 'config_file_path' r=Kerollmops a=arriven

# Pull Request

## Related issue
Fixes #2801

## What does this PR do?
- Return an error if config file contains 'config_file_path'

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?


Co-authored-by: arriven <20084245+Arriven@users.noreply.github.com>
2022-10-06 11:58:28 +00:00
Loïc Lecrenier
05af8f0e46 Update version for next release (v0.29.1) 2022-10-06 10:27:11 +02:00