140 Commits

Author SHA1 Message Date
1bfdcfc84f Bump uuid to 1.1.2 2022-07-05 16:23:36 +02:00
446439e8be bump charabia 2022-07-05 12:19:30 +02:00
cc48992e79 Bump the milli version to 0.31.1 2022-06-22 17:05:51 +02:00
f5c3b951bc Bump the milli version to 0.31.0 2022-06-22 12:08:16 +02:00
31f749b5d8 Update version for next release (v0.30.0) 2022-06-20 12:09:57 +02:00
676187ba43 bump milli version 2022-06-09 16:53:32 +02:00
56ee9cc21f Bump the version to 0.29.2 2022-06-08 16:00:06 +02:00
478dbfa45a Update version for next release (v0.29.1) 2022-06-07 18:59:33 +02:00
6ce1c6487a Update version for next release (v0.29.0) 2022-06-02 18:07:55 +02:00
192e024ada Add Charabia in Cargo.toml 2022-06-02 16:59:07 +02:00
c19c17eddb Update version to v0.28.1 2022-06-01 18:31:02 +02:00
895f5d8a26 Bump milli version 2022-05-18 10:37:12 +02:00
484a9ddb27 Simplify the error creation with thiserror and a smol friendly macro 2022-05-04 17:24:00 +02:00
d138b3c704 Update version 2022-04-25 18:43:46 +02:00
8d630a6f62 Update version for the next release (v0.26.1) 2022-04-14 11:44:06 +02:00
399fba16bb only flatten an object if it's nested 2022-04-14 11:14:08 +02:00
ee64f4a936 Use smartstring to store the external id in our hashmap
We need to store all the external ids (primary keys) in a hashmap
associated with their internal ids during.
The smartstring removes heap allocations / memory usage and should
improve the cache locality.
2022-04-13 21:22:07 +02:00
9ac2fd1c37 Merge #487
487: Update version (v0.26.0) r=Kerollmops a=curquiza

breaking because of #458 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-07 17:10:24 +00:00
bab898ce86 move the flatten-serde-json crate inside of milli 2022-04-07 18:20:44 +02:00
4f3ce6d9cd nested fields 2022-04-07 16:58:46 +02:00
ee1d627803 Update version (v0.26.0) 2022-04-07 15:56:10 +02:00
9eec44dd98 Update version (v0.25.0) 2022-04-05 12:06:42 +02:00
ddf78a735b Update version (v0.24.1) 2022-03-24 16:39:45 +01:00
86dd88698d bump tokenizer 2022-03-23 14:25:58 +01:00
5dc464b9a7 rollback meilisearch-tokenizer version 2022-03-21 17:29:10 +01:00
08a06b49f0 Bump version to 0.23.1 2022-03-15 15:50:28 +01:00
63682c2c9a Upgrade the dependencies 2022-03-15 11:17:44 +01:00
288a879411 Remove three useless dependencies 2022-03-15 11:17:44 +01:00
d9ed9de2b0 Update heed link in cargo toml 2022-03-01 19:45:29 +01:00
25123af3b8 Merge #436
436: Speed up the word prefix databases computation time r=Kerollmops a=Kerollmops

This PR depends on the fixes done in #431 and must be merged after it.

In this PR we will bring the `WordPrefixPairProximityDocids`, `WordPrefixDocids`, and `WordPrefixPositionDocids` update structures to a new era, a better era, where computing the word prefix pair proximities costs far fewer CPU cycles, an era where this update structure can use the previously computed set of new word docids from the newly indexed batch of documents.

---

The `WordPrefixPairProximityDocids` is an update structure, which means that it is an object that we feed with some parameters and which modifies the LMDB database of an index when asked to. This structure specifically computes the list of word prefix pair proximities, which corresponds to a list of pairs of words associated with a proximity (the distance between both words) where the second word is not a word but a prefix, e.g. `s`, `se`, `a`. Each word prefix pair proximity is associated with the list of document ids that contain the pair of word and prefix at the given proximity.

The performance issue with this struct comes from the fact that it starts its job from the beginning: it clears the LMDB database before rewriting everything from scratch, using the other LMDB databases to do so. I hope you understand that this is absolutely not an optimized way of doing things.

Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-02-16 15:41:14 +00:00
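
The word prefix pair proximity mapping described in #436 can be illustrated with a small in-memory sketch. This is not milli's implementation: the function name `word_prefix_pair_proximity_docids`, the `PrefixPairKey` alias, the use of plain positional distance as the proximity, and the `max_proximity` cutoff are all assumptions made for illustration, and the real structure writes to LMDB rather than to a `BTreeMap`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Key of the mapping: (first word, prefix of the second word, proximity).
/// Hypothetical alias, not a type taken from milli.
type PrefixPairKey = (String, String, u8);

/// Builds, from already tokenized documents, the set of document ids for each
/// (word, prefix, proximity) triple. Proximity is simplified here to the
/// distance between the two word positions.
fn word_prefix_pair_proximity_docids(
    docs: &[(u32, Vec<&str>)],
    prefixes: &[&str],
    max_proximity: u8,
) -> BTreeMap<PrefixPairKey, BTreeSet<u32>> {
    let mut out: BTreeMap<PrefixPairKey, BTreeSet<u32>> = BTreeMap::new();
    for (doc_id, words) in docs {
        for (i, w1) in words.iter().enumerate() {
            for (j, w2) in words.iter().enumerate().skip(i + 1) {
                let distance = j - i;
                if distance > max_proximity as usize {
                    break; // later words are only farther away
                }
                for prefix in prefixes {
                    // The second element of the pair is matched as a prefix.
                    if w2.starts_with(*prefix) {
                        out.entry((w1.to_string(), prefix.to_string(), distance as u8))
                            .or_default()
                            .insert(*doc_id);
                    }
                }
            }
        }
    }
    out
}
```

Calling this on the full document set mirrors the "rewrite everything from scratch" behavior criticized above. Calling it only on a newly indexed batch and merging the result into the existing entries is the general shape of the incremental approach the PR describes.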
f367cc2e75 Finally bump grenad to v0.4.1 2022-02-16 15:28:48 +01:00
0defeb268c bump milli 2022-02-16 13:27:41 +01:00
48542ac8fd get rid of chrono in favor of time 2022-02-15 11:41:55 +01:00
d03b3ceb58 Update version for the next release (v0.22.1) 2022-02-07 18:39:29 +01:00
367f403693 bump milli 2022-01-17 16:41:34 +01:00
c10f58b7bd Update tokenizer to v0.2.7 2022-01-17 13:02:00 +05:30
1b3923b5ce Update all packages to 0.21.0 2021-11-29 12:17:59 +01:00
64ef5869d7 Update tokenizer v0.2.6 2021-11-18 16:56:05 +01:00
f28600031d Rename the filter_parser crate into filter-parser
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-11-09 16:41:10 +01:00
6831c23449 merge with main 2021-11-06 16:34:30 +01:00
a58bc5bebb update milli with the new parser_filter 2021-11-04 15:02:36 +01:00
743ed9f57f Bump milli version 2021-11-04 14:04:21 +01:00
702589104d Update version for the next release (v0.20.1) 2021-11-03 14:20:01 +01:00
056ff13c4d Update version for the next release (v0.20.0) 2021-10-28 14:52:57 +02:00
d7943fe225 Merge #402
402: Optimize document transform r=MarinPostma a=MarinPostma

This PR optimizes the transformation of document additions into the obkv format. Instead of accepting any serializable object, we treat JSON and CSV specifically:
- For JSON, we build a serde `Visitor` that transforms the JSON straight into obkv without an intermediate representation.
- For CSV, we write the lines directly into the obkv, applying other optimizations as well.

Co-authored-by: marin postma <postma.marin@protonmail.com>
2021-10-26 09:55:28 +00:00
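
The CSV path described in #402 can be sketched roughly as follows. This is only an illustration under stated assumptions: `Record` is a hypothetical stand-in for an obkv record (a map from field id to raw value bytes), `csv_to_records` is a made-up name, and the parsing is deliberately naive (comma splitting, no quoting or escaping). It does not reproduce the obkv binary format or milli's actual CSV handling.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an obkv record: field id -> raw value bytes,
/// kept ordered by field id.
type Record = BTreeMap<u16, Vec<u8>>;

/// Assigns a field id to each header (its column index) and writes every CSV
/// line straight into a `Record`, without building an intermediate
/// JSON-like value first.
fn csv_to_records(input: &str) -> (Vec<String>, Vec<Record>) {
    let mut lines = input.lines();
    let fields: Vec<String> = match lines.next() {
        Some(header) => header.split(',').map(|name| name.to_string()).collect(),
        None => return (Vec::new(), Vec::new()),
    };
    let records: Vec<Record> = lines
        .filter(|line| !line.is_empty())
        .map(|line| {
            line.split(',')
                .enumerate()
                // Ignore extra columns that have no header.
                .take(fields.len())
                .map(|(field_id, value)| (field_id as u16, value.as_bytes().to_vec()))
                .collect()
        })
        .collect();
    (fields, records)
}
```

The index of a name in the returned `fields` vector is the field id used as the key in each record.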
15c29cdd9b Merge #401
401: Update version for the next release (v0.19.0) r=curquiza a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-25 12:49:53 +00:00
208903ddde Revert "Replacing pest with nom " 2021-10-25 11:58:00 +02:00
679fe18b17 Update version for the next release (v0.19.0) 2021-10-25 11:52:17 +02:00
0f86d6b28f implement csv serialization 2021-10-25 10:26:42 +02:00
efb2f8b325 convert the errors 2021-10-22 16:38:35 +02:00