Commit Graph

10663 Commits

Author SHA1 Message Date
ManyTheFish
cbb3b25459 Fix(Search): Fix phrase search candidates computation
This bug is an old bug but was hidden by the proximity criterion,
Phrase search were always returning an empty candidates list.

Before the fix, we were trying to find any words[n] near words[n]
instead of finding  any words[n] near words[n+1], for example:

for a phrase search '"Hello world"' we were searching for "hello" near "hello" first, instead of "hello" near "world".
2022-07-21 10:04:30 +02:00
bors[bot]
941af58239 Merge #561
561: Enriched documents batch reader r=curquiza a=Kerollmops

~This PR is based on #555 and must be rebased on main after it has been merged to ease the review.~
This PR contains the work in #555 and can be merged on main as soon as reviewed and approved.

- [x] Create an `EnrichedDocumentsBatchReader` that contains the external documents id.
- [x] Extract the primary key name and make it accessible in the `EnrichedDocumentsBatchReader`.
- [x] Use the external id from the `EnrichedDocumentsBatchReader` in the `Transform::read_documents`.
- [x] Remove the `update_primary_key` from the _transform.rs_ file.
- [x] Really generate the auto-generated documents ids.
- [x] Insert the (auto-generated) document ids in the document while processing it in `Transform::read_documents`.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-07-21 07:08:50 +00:00
bors[bot]
78c5826c57 Merge #2631
2631: Update mini-dashboard to v0.2.1 r=curquiza a=mdubus

# Pull Request

## What does this PR do?
Fixes #2629

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2022-07-21 06:37:57 +00:00
Morgane Dubus
7e6f3274fa Update mini-dashboard to v0.2.1 2022-07-21 09:47:07 +04:00
Loïc Lecrenier
41a0ce07cb Add a code comment, as suggested in PR review
Co-authored-by: Many the fish <many@meilisearch.com>
2022-07-20 16:20:35 +02:00
bors[bot]
94ef326be3 Merge #2630
2630: Update version for next release (v0.28.1) r=ManyTheFish a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-07-20 13:42:51 +00:00
Clémentine Urquizar
d01a3ab889 Update version for next release (v0.28.1) 2022-07-20 15:46:53 +04:00
Loïc Lecrenier
1506683705 Avoid using too much memory when indexing facet-exists-docids 2022-07-19 14:42:35 +02:00
Loïc Lecrenier
d0eee5ff7a Fix compiler error 2022-07-19 13:54:30 +02:00
bors[bot]
ad494b6f77 Merge #2625
2625: Update link to Cloud beta form r=curquiza a=davelarkan

# Pull Request

## What does this PR do?
Fixes #2624

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Dave Larkan <davelarkan@gmail.com>
2022-07-19 11:09:12 +00:00
Dave Larkan
2f11686c81 Update link to Cloud beta form 2022-07-19 11:10:16 +01:00
Loïc Lecrenier
aed8c69bcb Refactor indexation of the "facet-id-exists-docids" database
The idea is to directly create a sorted and merged list of bitmaps
in the form of a BTreeMap<FieldId, RoaringBitmap> instead of creating
a grenad::Reader where the keys are field_id and the values are docids.

Then we send that BTreeMap to the thing that handles TypedChunks, which
inserts its content into the database.
2022-07-19 10:07:33 +02:00
Loïc Lecrenier
1eb1e73bb3 Add integration tests for the EXISTS filter 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
4f0bd317df Remove custom implementation of BytesEncode/Decode for the FieldId 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
80b962b4f4 Run cargo fmt 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
ea0642c32d Make filter parser more strict regarding spacing around operators
OR, AND, NOT, TO must now be followed by spaces
2022-07-19 10:07:33 +02:00
Loïc Lecrenier
c17d616250 Refactor index_documents_check_exists_database tests 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
30bd4db0fc Simplify indexing task for facet_exists_docids database 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
392472f4bb Apply suggestions from code review
Co-authored-by: Tamo <tamo@meilisearch.com>
2022-07-19 10:07:33 +02:00
Loïc Lecrenier
bd15f5625a Fix compiler warning 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
722db7b088 Ignore target directory of filter-parser/fuzz crate 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
a5c9162250 Improve parser for NOT EXISTS filter
Allow multiple spaces between NOT and EXISTS
2022-07-19 10:07:33 +02:00
Loïc Lecrenier
0388b2d463 Run cargo fmt 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
dc64170a69 Improve syntax of EXISTS filter, allow “value NOT EXISTS” 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
72452f0cb2 Implements the EXIST filter operator 2022-07-19 10:07:33 +02:00
Loïc Lecrenier
a8641b42a7 Modify flatten_serde_json to keep dummy value for all object keys
Example:
```json
{
    "id": 0,
    "colour" : { "green": 1 }
}
```
becomes:
```json
{
    "id": 0,
    "colour" : [],
    "colour.green": 1
}
```
to retain the information the key "colour" exists in the original
json value.
2022-07-19 10:07:33 +02:00
Loïc Lecrenier
453d593ce8 Add a database containing the docids where each field exists 2022-07-19 10:07:33 +02:00
bors[bot]
5704235521 Merge #584
584: Chores: Enhance smart-crop code comments r=curquiza a=ManyTheFish

Enhance explanation around smart crop algorithms

Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2022-07-19 07:08:14 +00:00
bors[bot]
f6415b679f Merge #588
588: Fix name of "release_date" facet in movies benchmarks r=ManyTheFish a=loiclec

## What does this PR do?
The `movies.json` file in the benchmark datasets contains a filterable field called "release_date", but the indexing benchmarks wrongly called the field "released_date" instead. This PR fixes that.


Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
2022-07-18 15:51:09 +00:00
Many the fish
2d79720f5d Update milli/src/search/matches/mod.rs 2022-07-18 17:48:04 +02:00
Many the fish
8ddb4e750b Update milli/src/search/matches/mod.rs 2022-07-18 17:47:39 +02:00
Many the fish
a277daa1f2 Update milli/src/search/matches/mod.rs 2022-07-18 17:47:13 +02:00
Many the fish
fb794c6b5e Update milli/src/search/matches/mod.rs 2022-07-18 17:46:00 +02:00
Many the fish
1237cfc249 Update milli/src/search/matches/mod.rs 2022-07-18 17:45:37 +02:00
Many the fish
d7fd5c58cd Update milli/src/search/matches/mod.rs 2022-07-18 17:45:06 +02:00
Loïc Lecrenier
fc9f3f31e7 Change DocumentsBatchReader to access cursor and index at same time
Otherwise it is not possible to iterate over all documents while
using the fields index at the same time.
2022-07-18 16:08:14 +02:00
Loïc Lecrenier
ab1571cdec Simplify Transform::read_documents, enabled by enriched documents reader 2022-07-18 12:45:47 +02:00
Loïc Lecrenier
8270e2b768 Fix name of "release_date" facet in movies benchmarks 2022-07-18 10:34:12 +02:00
Many the fish
e261ef64d7 Update milli/src/search/matches/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-07-18 10:18:51 +02:00
Many the fish
1da4ab5918 Update milli/src/search/matches/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-07-18 10:18:03 +02:00
Kerollmops
448114cc1c Fix the benchmarks with the new indexation API 2022-07-12 15:22:09 +02:00
Kerollmops
25e768f31c Fix another issue with the nested primary key selector 2022-07-12 15:14:07 +02:00
Kerollmops
192793ee38 Add some tests to check for the nested documents ids 2022-07-12 15:14:07 +02:00
Kerollmops
a892a4a79c Introduce a function to extend from a JSON array of objects 2022-07-12 15:14:06 +02:00
Kerollmops
dc61105554 Fix the nested document id fetching function 2022-07-12 15:14:06 +02:00
Kerollmops
2eec290424 Check the validity of the latitute and longitude numbers 2022-07-12 15:14:06 +02:00
Kerollmops
5d149d631f Remove tests for a function that no more exists 2022-07-12 15:14:06 +02:00
Kerollmops
0bbcc7b180 Expose the DocumentId struct to be sure to inject the generated ids 2022-07-12 15:14:06 +02:00
Kerollmops
d1a4da9812 Generate a real UUIDv4 when ids are auto-generated 2022-07-12 15:14:06 +02:00
Kerollmops
c8ebf0de47 Rename the validate function as an enriching function 2022-07-12 15:14:06 +02:00