Commit Graph

1810 Commits

Author SHA1 Message Date
ManyTheFish
a16de5de84 Symplify format and remove intermediate function 2022-04-08 11:20:41 +02:00
ManyTheFish
a769e09dfa Make token_crop_bounds more rust idiomatic 2022-04-07 20:15:14 +02:00
bors[bot]
9ac2fd1c37 Merge #487
487: Update version (v0.26.0) r=Kerollmops a=curquiza

breaking because of #458 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-07 17:10:24 +00:00
bors[bot]
80ae020bee Merge #458
458: Nested fields r=Kerollmops a=irevoire

For the following document:
```json
{
  "id": 1,
  "person": {
    "name": "tamo",
    "age": 25,
  }
}
```
Suppose the user sets `person` as a filterable attribute. We need to store `person` in the filterable _obviously_. But we also need to keep track of `person.name` and `person.age` somewhere.
That’s where I changed a little bit the logic of the engine.

Currently, we have a function called `faceted_field` that returns the union of the filterable and sortable.
I renamed this function in `user_defined_faceted_field`. And now, when we finish indexing documents, we look at all the fields and see if they « match » a `user_defined_faceted_field`.
So in our case:
- does `id` match `person`: 🔴 
- does `person.name` match `person`: 🟢 
- does `person.age` match `person`: 🟢 

And thus, we insert in the database the following faceted fields: `person, person.name, person.age`.

The good thing about that solution is that we generate everything during the indexing phase, and then during the search, we can access our field without recomputing too much globbing.

-----

Now the bad thing is that I had to create a new db.

And if that was only one db, that would be ok, but actually, I need to do the same for the:
- Displayed attributes
- Attributes to retrieve
- Attributes to highlight
- Attribute to crop

`@Kerollmops` 
Do you think there is a better way to do it?
Apart from all the code, can we have a problem because we have too many dbs?

Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-07 16:26:09 +00:00
Tamo
bab898ce86 move the flatten-serde-json crate inside of milli 2022-04-07 18:20:44 +02:00
ManyTheFish
c8ed1675a7 Add some documentation 2022-04-07 17:32:13 +02:00
ManyTheFish
b1905dfa24 Make split_best_frequency returns references instead of owned data 2022-04-07 17:05:44 +02:00
Tamo
ab458d8840 fix tests after rebase 2022-04-07 17:00:00 +02:00
Irevoire
4f3ce6d9cd nested fields 2022-04-07 16:58:46 +02:00
Clémentine Urquizar
ee1d627803 Update version (v0.26.0) 2022-04-07 15:56:10 +02:00
bors[bot]
4ae7aea3b2 Merge #486
486: Update version (v0.25.0) r=curquiza a=curquiza

v0.25.0 will be released once #478 is merged

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-06 11:40:41 +00:00
bors[bot]
aadb0c58c9 Merge #478
478: Disable typo on attribute r=Kerollmops a=MarinPostma

disable typo on attributes


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-05 23:45:35 +00:00
ad hoc
86249e2ae4 add missing \t in cli update display
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-04-05 21:35:06 +02:00
ad hoc
b799f3326b rename merge_nothing to merge_ignore_values 2022-04-05 18:44:35 +02:00
ManyTheFish
fa7d3a37c0 Make some cleaning and add comments 2022-04-05 17:48:56 +02:00
ManyTheFish
3bb1e35ada Fix match count 2022-04-05 17:48:45 +02:00
ManyTheFish
56e0edd621 Put crop markers direclty around words 2022-04-05 17:41:32 +02:00
ManyTheFish
a93cd8c61c Fix prefix highlight with special chars 2022-04-05 17:41:32 +02:00
ManyTheFish
b3f0f39106 Make some cleaning 2022-04-05 17:41:32 +02:00
ManyTheFish
6dc345bc53 Test and Fix prefix highlight 2022-04-05 17:41:32 +02:00
ManyTheFish
bd30ee97b8 Keep separators at start of the croped string 2022-04-05 17:41:32 +02:00
ManyTheFish
29c5f76d7f Use new matcher in http-ui 2022-04-05 17:41:32 +02:00
ManyTheFish
734d0899d3 Publish Matcher 2022-04-05 17:41:32 +02:00
ManyTheFish
4428cb5909 Add some tests and fix some corner cases 2022-04-05 17:41:32 +02:00
ManyTheFish
844f546a8b Add matches algorithm V1 2022-04-05 17:41:32 +02:00
ManyTheFish
3be1790803 Add crop algorithm with naive match algorithm 2022-04-05 17:41:32 +02:00
ManyTheFish
d96e72e5dc Create formater with some tests 2022-04-05 17:41:32 +02:00
ad hoc
201fea0fda limit extract_word_docids memory usage 2022-04-05 14:14:15 +02:00
ad hoc
5cfd3d8407 add exact attributes documentation 2022-04-05 14:10:22 +02:00
Clémentine Urquizar
9eec44dd98 Update version (v0.25.0) 2022-04-05 12:06:42 +02:00
ad hoc
b85cd4983e remove field_id_from_position 2022-04-05 09:50:34 +02:00
ad hoc
dac81b2d44 add missing \n in cli settings 2022-04-05 09:48:56 +02:00
ad hoc
ab185a59b5 fix infos 2022-04-05 09:46:56 +02:00
ad hoc
59e41d98e3 add comments to integration test 2022-04-04 21:17:06 +02:00
ad hoc
1810927dbd rephrase exact_attributes doc 2022-04-04 21:04:49 +02:00
ad hoc
b7694c34f5 remove println 2022-04-04 21:00:07 +02:00
ad hoc
6cabd47c32 fix typo in comment 2022-04-04 20:59:20 +02:00
ad hoc
9963f11172 fix infos crate compilation issue 2022-04-04 20:54:03 +02:00
ad hoc
c8d3a09af8 add integration test for disabel typo on attributes 2022-04-04 20:54:03 +02:00
ad hoc
bfd81ce050 add exact atttributes to cli settings 2022-04-04 20:54:03 +02:00
ad hoc
6b2c2509b2 fix bug in exact search 2022-04-04 20:54:03 +02:00
ad hoc
56b4f5dce2 add exact prefix to query_docids 2022-04-04 20:54:03 +02:00
ad hoc
21ae4143b1 add exact_word_prefix to Context 2022-04-04 20:54:03 +02:00
ad hoc
e8f06f6c06 extract exact_word_prefix_docids 2022-04-04 20:54:03 +02:00
ad hoc
6dd2e4ffbd introduce exact_word_prefix database in index 2022-04-04 20:54:03 +02:00
ad hoc
ba0bb29cd8 refactor WordPrefixDocids to take dbs instead of indexes 2022-04-04 20:54:02 +02:00
ad hoc
c4c6e35352 query exact_word_docids in resolve_query_tree 2022-04-04 20:54:02 +02:00
ad hoc
8d46a5b0b5 extract exact word docids 2022-04-04 20:54:02 +02:00
ad hoc
5451c64d5d increase criteria asc desc test map size 2022-04-04 20:54:02 +02:00
ad hoc
0a77be4ec0 introduce exact_word_docids db 2022-04-04 20:54:02 +02:00