123 Commits

Author SHA1 Message Date
9ac2fd1c37 Merge #487
487: Update version (v0.26.0) r=Kerollmops a=curquiza

breaking because of #458 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-07 17:10:24 +00:00
bab898ce86 move the flatten-serde-json crate inside of milli 2022-04-07 18:20:44 +02:00
4f3ce6d9cd nested fields 2022-04-07 16:58:46 +02:00
ee1d627803 Update version (v0.26.0) 2022-04-07 15:56:10 +02:00
9eec44dd98 Update version (v0.25.0) 2022-04-05 12:06:42 +02:00
ddf78a735b Update version (v0.24.1) 2022-03-24 16:39:45 +01:00
86dd88698d bump tokenizer 2022-03-23 14:25:58 +01:00
5dc464b9a7 rollback meilisearch-tokenizer version 2022-03-21 17:29:10 +01:00
08a06b49f0 Bump version to 0.23.1 2022-03-15 15:50:28 +01:00
63682c2c9a Upgrade the dependencies 2022-03-15 11:17:44 +01:00
288a879411 Remove three useless dependencies 2022-03-15 11:17:44 +01:00
d9ed9de2b0 Update heed link in cargo toml 2022-03-01 19:45:29 +01:00
25123af3b8 Merge #436
436: Speed up the word prefix databases computation time r=Kerollmops a=Kerollmops

This PR depends on the fixes done in #431 and must be merged after it.

In this PR we will bring the `WordPrefixPairProximityDocids`, `WordPrefixDocids`, and `WordPrefixPositionDocids` update structures to a new era, a better era, where computing the word prefix pair proximities costs far fewer CPU cycles, an era where this update structure can use the previously computed set of new word docids from the newly indexed batch of documents.

---

The `WordPrefixPairProximityDocids` is an update structure, which means that it is an object that we feed with some parameters and which modifies the LMDB database of an index when asked to. This structure specifically computes the list of word prefix pair proximities, which corresponds to a list of pairs of words associated with a proximity (the distance between both words) where the second word is not a word but a prefix, e.g. `s`, `se`, `a`. Each word prefix pair proximity is associated with the list of document ids that contain the word and prefix pair at the given proximity.
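
To make the shape of this data concrete, here is a minimal in-memory sketch using only the standard library. It is not the actual milli code: the function name, the `BTreeMap` standing in for the LMDB database, and the `u32` document ids are all assumptions made for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Key of one entry: (first word, prefix of the second word, proximity).
type PrefixPairKey = (String, String, u8);

/// Hypothetical helper (not milli's API): derives the word prefix pair
/// proximity entries from full word pair proximity entries.
/// `word_pairs` maps (word1, word2, proximity) to the ids of the documents
/// containing that pair at that proximity.
fn word_prefix_pair_proximity_docids(
    word_pairs: &BTreeMap<(String, String, u8), BTreeSet<u32>>,
    prefixes: &[&str],
) -> BTreeMap<PrefixPairKey, BTreeSet<u32>> {
    let mut out: BTreeMap<PrefixPairKey, BTreeSet<u32>> = BTreeMap::new();
    for ((w1, w2, prox), docids) in word_pairs {
        for &prefix in prefixes {
            // The second word is replaced by every known prefix it starts with.
            if w2.starts_with(prefix) {
                out.entry((w1.clone(), prefix.to_string(), *prox))
                    .or_default()
                    .extend(docids.iter().copied());
            }
        }
    }
    out
}
```

With the pairs `("the", "search", 1)` and `("the", "sea", 1)` and the prefixes `s`, `se`, `a`, this yields two entries, `("the", "s", 1)` and `("the", "se", 1)`, each holding the union of the document ids of both pairs.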

The performance issue that this struct brings comes from the fact that it starts its job from the beginning: it clears the LMDB database before rewriting everything from scratch, using the other LMDB databases to achieve that. I hope you understand that this is absolutely not an optimized way of doing things.
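
The difference between the two strategies can be sketched as follows. This is a simplified model, not the implementation in this PR: the `Db` alias, the function names, and the in-memory maps are hypothetical, and the sketch ignores deletions and changes to the set of prefixes.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// In-memory stand-in for the LMDB database (an assumption for illustration):
/// (word, prefix or word, proximity) -> document ids.
type Db = BTreeMap<(String, String, u8), BTreeSet<u32>>;

/// Merges the prefix entries derived from `pairs` into `db`.
fn merge_pairs(db: &mut Db, pairs: &Db, prefixes: &[&str]) {
    for ((w1, w2, prox), docids) in pairs {
        for &prefix in prefixes {
            if w2.starts_with(prefix) {
                db.entry((w1.clone(), prefix.to_string(), *prox))
                    .or_default()
                    .extend(docids.iter().copied());
            }
        }
    }
}

/// Old behaviour: clear everything and recompute from all the word pairs.
fn rebuild_from_scratch(db: &mut Db, all_pairs: &Db, prefixes: &[&str]) {
    db.clear();
    merge_pairs(db, all_pairs, prefixes);
}

/// Cheaper alternative: keep the existing entries and merge only the
/// word pairs coming from the newly indexed batch.
fn update_incrementally(db: &mut Db, new_pairs: &Db, prefixes: &[&str]) {
    merge_pairs(db, new_pairs, prefixes);
}
```

In this model both strategies end up with the same entries, but the incremental one only iterates over the pairs of the new batch instead of the whole index.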

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-02-16 15:41:14 +00:00
f367cc2e75 Finally bump grenad to v0.4.1 2022-02-16 15:28:48 +01:00
0defeb268c bump milli 2022-02-16 13:27:41 +01:00
48542ac8fd get rid of chrono in favor of time 2022-02-15 11:41:55 +01:00
d03b3ceb58 Update version for the next release (v0.22.1) 2022-02-07 18:39:29 +01:00
367f403693 bump milli 2022-01-17 16:41:34 +01:00
c10f58b7bd Update tokenizer to v0.2.7 2022-01-17 13:02:00 +05:30
1b3923b5ce Update all packages to 0.21.0 2021-11-29 12:17:59 +01:00
64ef5869d7 Update tokenizer v0.2.6 2021-11-18 16:56:05 +01:00
f28600031d Rename the filter_parser crate into filter-parser
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-11-09 16:41:10 +01:00
6831c23449 merge with main 2021-11-06 16:34:30 +01:00
a58bc5bebb update milli with the new parser_filter 2021-11-04 15:02:36 +01:00
743ed9f57f Bump milli version 2021-11-04 14:04:21 +01:00
702589104d Update version for the next release (v0.20.1) 2021-11-03 14:20:01 +01:00
056ff13c4d Update version for the next release (v0.20.0) 2021-10-28 14:52:57 +02:00
d7943fe225 Merge #402
402: Optimize document transform r=MarinPostma a=MarinPostma

This PR optimizes the transform of document additions into the obkv format. Instead of accepting any serializable object, we treat JSON and CSV specifically:
- For JSON, we build a serde `Visitor` that transforms the JSON straight into obkv without an intermediate representation.
- For CSV, we directly write the lines into the obkv, applying other optimizations as well.
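
As a rough illustration of the CSV path only, the sketch below writes each line straight into a key-value byte buffer without building an intermediate document. It uses only the standard library. The byte layout is an illustrative stand-in and not the real obkv encoding, the `FieldIds`, `write_record` and `csv_to_records` names are hypothetical, and the comma split deliberately ignores quoting and escaping.

```rust
use std::collections::BTreeMap;

/// Maps field names to small numeric ids, assigned in order of first appearance.
#[derive(Default)]
struct FieldIds {
    ids: BTreeMap<String, u16>,
}

impl FieldIds {
    fn id_for(&mut self, name: &str) -> u16 {
        let next = self.ids.len() as u16;
        *self.ids.entry(name.to_string()).or_insert(next)
    }
}

/// Appends one CSV line to `out` as a sequence of
/// `field id (u16 BE) | value length (u32 BE) | value bytes` entries.
/// Assumed layout for illustration; the real obkv format differs.
fn write_record(header_ids: &[u16], line: &str, out: &mut Vec<u8>) {
    // Naive split: quoted fields and escaped commas are not handled.
    for (id, value) in header_ids.iter().zip(line.split(',')) {
        out.extend_from_slice(&id.to_be_bytes());
        out.extend_from_slice(&(value.len() as u32).to_be_bytes());
        out.extend_from_slice(value.as_bytes());
    }
}

/// Converts a CSV payload (header line first) into one buffer per document.
fn csv_to_records(csv: &str, fields: &mut FieldIds) -> Vec<Vec<u8>> {
    let mut lines = csv.lines();
    let header_ids: Vec<u16> = match lines.next() {
        Some(header) => header.split(',').map(|name| fields.id_for(name)).collect(),
        None => return Vec::new(),
    };
    lines
        .map(|line| {
            let mut buf = Vec::new();
            write_record(&header_ids, line, &mut buf);
            buf
        })
        .collect()
}
```

The header is resolved to field ids once, so each following line only costs a split and a few buffer writes.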

Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 09:55:28 +00:00
15c29cdd9b Merge #401
401: Update version for the next release (v0.19.0) r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-25 12:49:53 +00:00
208903ddde Revert "Replacing pest with nom " 2021-10-25 11:58:00 +02:00
679fe18b17 Update version for the next release (v0.19.0) 2021-10-25 11:52:17 +02:00
0f86d6b28f implement csv serialization 2021-10-25 10:26:42 +02:00
efb2f8b325 convert the errors 2021-10-22 16:38:35 +02:00
c27870e765 integrate a first version without any error handling 2021-10-22 14:33:18 +02:00
01dedde1c9 update some names and move some parser out of the lib.rs 2021-10-22 01:59:38 +02:00
f8fe9316c0 Update version for the next release (v0.18.1) 2021-10-21 11:56:14 +02:00
2209acbfe2 Update version for the next release (v0.18.2) 2021-10-18 13:45:48 +02:00
59cc59e93e Merge #358
358: Replacing pest with nom  r=Kerollmops a=CNLHC



Co-authored-by: 刘瀚骋 <cn_lhc@qq.com>
2021-10-16 20:44:38 +00:00
7666e4f34a follow the suggestions 2021-10-14 21:37:59 +08:00
c7db4176f3 Merge #384
384: Replace memmap with memmap2 r=Kerollmops a=palfrey

[memmap is unmaintained](https://rustsec.org/advisories/RUSTSEC-2020-0077.html) and needs replacing. memmap2 is a drop-in replacement fork that's well maintained. Note that the version numbers got reset on fork, hence the lower values.

Co-authored-by: Tom Parker-Shemilt <palfrey@tevp.net>
2021-10-13 13:47:23 +00:00
f7796edc7e remove everything about pest 2021-10-12 13:30:40 +08:00
8748df2ca4 draft without error handling 2021-10-12 13:30:40 +08:00
dd56e82dba Update version for the next release (v0.17.2) 2021-10-11 15:20:35 +02:00
2dfe24f067 memmap -> memmap2 2021-10-10 22:47:12 +01:00
05d8a33a28 Update version for the next release (v0.17.1) 2021-10-02 16:21:31 +02:00
0e8665bf18 Update version for the next release (v0.17.0) 2021-09-28 19:38:12 +02:00
1eacab2169 Update version for the next release (v0.15.1) 2021-09-22 17:18:54 +02:00
f8ecbc28e2 Update version for the next release (v0.15.0) 2021-09-21 18:09:14 +02:00
aa6c5df0bc Implement documents format
document reader transform

remove update format

support document sequences

fix document transform

clean transform

improve error handling

add documents! macro

fix transform bug

fix tests

remove csv dependency

Add comments on the transform process

replace search cli

fmt

review edits

fix http ui

fix clippy warnings

Revert "fix clippy warnings"

This reverts commit a1ce3cd96e603633dbf43e9e0b12b2453c9c5620.

fix review comments

remove smallvec in transform loop

review edits
2021-09-21 16:58:33 +02:00
94764e5c7c Merge #360
360: Update version for the next release (v0.14.0) r=Kerollmops a=curquiza

Release containing the geosearch, cf #322 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-21 08:43:27 +00:00