meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2025-09-12 15:56:34 +00:00

Author	SHA1	Message	Date
bors[bot]	04381011b0	Merge #2336 2336: Move permissive-json-pointer in the meilisearch repository r=Kerollmops a=irevoire Move the permissive-json-pointer crate in the meilisearch repository. Co-authored-by: Tamo <tamo@meilisearch.com>	2022-04-20 17:25:44 +00:00
Tamo	1ef87cc6d0	chore: move permissive-json-pointer in the meilisearch repository Update permissive-json-pointer/src/lib.rs Co-authored-by: Clément Renault <clement@meilisearch.com>	2022-04-20 19:24:41 +02:00
bors[bot]	4a9000bb96	Merge #2332 2332: fix(search): formatted field r=curquiza a=irevoire fix #2318 Co-authored-by: Irevoire <tamo@meilisearch.com> v0.27.0rc2	2022-04-20 14:59:41 +00:00
Tamo	d81a3f4a74	improve the fuzzer of the flatten crate	2022-04-20 16:11:23 +02:00
bors[bot]	754c49f991	Merge #2326 2326: rename min word length for typo r=irevoire a=MarinPostma rename `minWordLengthForTypo` to `minWordSizeForTypos` as specified. discussed here: https://github.com/meilisearch/specifications/pull/117#discussion_r850795714 Co-authored-by: ad hoc <postma.marin@protonmail.com>	2022-04-20 11:54:10 +00:00
bors[bot]	97adef6bfc	Merge #2335 2335: Fix typo reset by upgrading Milli to v0.26.2 r=MarinPostma a=curquiza Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>	2022-04-20 10:49:57 +00:00
Clémentine Urquizar	a7fd199ded	Fix typo resetting by upgrading milli to v0.26.2	2022-04-20 12:24:46 +02:00
bors[bot]	2692b8c960	Merge #2334 2334: Update dashboard to v.0.1.10 r=curquiza a=mdubus Closes #2322 Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>	2022-04-20 10:14:46 +00:00
Irevoire	58a1124e9a	fix(search): formatted field	2022-04-20 11:30:01 +02:00
Morgane Dubus	b57ad15a24	Update dashboard to v.0.1.10	2022-04-20 11:14:42 +02:00
bors[bot]	c7d0097c97	Merge #498 498: Get rid of the threshold when comparing benchmarks r=curquiza a=irevoire It just hides things Co-authored-by: Tamo <tamo@meilisearch.com>	2022-04-19 14:04:11 +00:00
Tamo	152a10344c	Get rid of the threshold when comparing benchmarks It just hides things	2022-04-19 15:39:58 +02:00
bors[bot]	04eb32e539	Merge #499 499: fix min-word-len-for-typo not reset properly r=Kerollmops a=MarinPostma fix min word len for typo not resetting properly, as reported in https://github.com/meilisearch/meilisearch/issues/2330 Co-authored-by: ad hoc <postma.marin@protonmail.com>	2022-04-19 13:22:19 +00:00
ad hoc	8b14090927	fix min-word-len-for-typo not reset properly	2022-04-19 15:20:16 +02:00
bors[bot]	ea4bb9402f	Merge #483 483: Enhance matching words r=Kerollmops a=ManyTheFish # Summary Enhance milli word-matcher making it handle match computing and cropping. # Implementation ## Computing best matches for cropping Before we were considering that the first match of the attribute was the best one, this was accurate when only one word was searched but was missing the target when more than one word was searched. Now we are searching for the best matches interval to crop around, the chosen interval is the one: 1) that has the highest count of unique matches > for example, if we have a query `split the world`, then the interval `the split the split the` has 5 matches but only 2 unique matches (1 for `split` and 1 for `the`) where the interval `split of the world` has 3 matches and 3 unique matches. So the interval `split of the world` is considered better. 2) that has the minimum distance between matches > for example, if we have a query `split the world`, then the interval `split of the world` has a distance of 3 (2 between `split` and `the`, and 1 between `the` and `world`) where the interval `split the world` has a distance of 2. So the interval `split the world` is considered better. 3) that has the highest count of ordered matches > for example, if we have a query `split the world`, then the interval `the world split` has 2 ordered words where the interval `split the world` has 3. So the interval `split the world` is considered better. ## Cropping around the best matches interval Before we were cropping around the interval without checking the context. Now we are cropping around words in the same context as matching words. This means that we will keep words that are farther from the matching words but are in the same phrase, rather than words that are nearer but separated by a dot. > For instance, for the matching word `Split` the text: `Natalie risk her future. Split The World is a book written by Emily Henry. I never read it.` will be cropped like: `…. Split The World is a book written by Emily Henry. …` and not like: `Natalie risk her future. Split The World is a book …` Co-authored-by: ManyTheFish <many@meilisearch.com>	2022-04-19 11:42:32 +00:00
ManyTheFish	f1115e274f	Use Copy impl of FormatOption instead of cloning	2022-04-19 10:35:50 +02:00
ad hoc	9b064e53e7	fix(http, lib): rename_min_word_length_for_typo into rename_min_word_size_for_typo	2022-04-17 10:02:56 +02:00
bors[bot]	289bfd46ff	Merge #2321 2321: Bump milli r=curquiza a=irevoire Co-authored-by: Irevoire <tamo@meilisearch.com> v0.27.0rc1	2022-04-14 11:51:15 +00:00
Irevoire	64b0a50a58	chore: bump milli	2022-04-14 12:12:54 +02:00
Clémentine Urquizar - curqui	a68e3a79fb	Merge pull request #497 from meilisearch/v0.26.1 Update version for the next release (v0.26.1)	2022-04-14 11:53:31 +02:00
bors[bot]	b1333ab5b0	Merge #2320 2320: chore(http, lib): rename typo to typo_tolerance r=irevoire a=MarinPostma fix #2319 Co-authored-by: ad hoc <postma.marin@protonmail.com>	2022-04-14 09:50:39 +00:00
Clémentine Urquizar	8d630a6f62	Update version for the next release (v0.26.1)	2022-04-14 11:44:06 +02:00
Clémentine Urquizar - curqui	d362278a41	Merge pull request #494 from meilisearch/flatten-what-is-needed Only flatten the required objects	2022-04-14 11:43:28 +02:00
Tamo	00f78d6b5a	Apply code suggestions Co-authored-by: Clément Renault <clement@meilisearch.com>	2022-04-14 11:14:08 +02:00
Tamo	399fba16bb	only flatten an object if it's nested	2022-04-14 11:14:08 +02:00
Tamo	c2469b6765	create the json-depth-checker crate	2022-04-14 11:14:08 +02:00
ad hoc	276dc6043a	chore(http, lib): rename typo to typo_tolerance	2022-04-14 10:42:06 +02:00
bors[bot]	7791ef90e7	Merge #493 493: Use smartstring to store the external id in our hashmap r=Kerollmops a=irevoire We need to store all the external ids (primary key) in a hashmap associated to their internal id. The smartstring removes heap allocation / memory usage and should improve the cache locality. I ran the benchmarks to measure the impact of this PR on the indexing time. I think we should merge it whatever happens though because it'll decrease the memory consumption. --------- This improves really sliiiiiightly the performances but improves the memory usage thus it should be merged. ``` group indexing_main_6b073738 indexing_use-smartsring_3f343511 ----- ---------------------- -------------------------------- indexing/Indexing geo_point 1.02 25.2±0.20s ? ?/sec 1.00 24.8±0.13s ? ?/sec indexing/Indexing movies in three batches 1.00 18.2±0.10s ? ?/sec 1.00 18.2±0.23s ? ?/sec indexing/Indexing movies with default settings 1.00 17.5±0.09s ? ?/sec 1.01 17.7±0.11s ? ?/sec indexing/Indexing songs in three batches with default settings 1.00 68.3±1.01s ? ?/sec 1.00 68.0±0.95s ? ?/sec indexing/Indexing songs with default settings 1.00 63.2±0.78s ? ?/sec 1.00 63.0±0.58s ? ?/sec indexing/Indexing songs without any facets 1.02 59.6±1.00s ? ?/sec 1.00 58.5±1.03s ? ?/sec indexing/Indexing songs without faceted numbers 1.00 62.8±0.38s ? ?/sec 1.00 62.6±1.02s ? ?/sec indexing/Indexing wiki 1.01 1009.2±25.25s ? ?/sec 1.00 998.1±11.27s ? ?/sec indexing/Indexing wiki in three batches 1.01 1142.0±9.97s ? ?/sec 1.00 1134.4±11.21s ? ?/sec ``` Co-authored-by: Tamo <tamo@meilisearch.com>	2022-04-13 20:28:28 +00:00
Tamo	ee64f4a936	Use smartstring to store the external id in our hashmap We need to store all the external ids (primary key) in a hashmap associated to their internal id during. The smartstring removes heap allocation / memory usage and should improve the cache locality.	2022-04-13 21:22:07 +02:00
bors[bot]	b9e676b8ca	Merge #2316 2316: Add version flag r=Kerollmops a=sanders41 # Pull Request ## What does this PR do? Fixes #2315 ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Paul Sanders <psanders1@gmail.com>	2022-04-13 17:24:09 +00:00
bors[bot]	6c06fb226d	Merge #2307 2307: Feat(Analytics): Add analytics for search format options r=irevoire a=ManyTheFish Specification: [#120](https://github.com/meilisearch/specifications/pull/120) ([f5c6a8e](`f5c6a8e183`)) fix #2308 Co-authored-by: ManyTheFish <many@meilisearch.com>	2022-04-13 12:01:52 +00:00
bors[bot]	456887a54a	Merge #496 496: Improve the performances of the flattening subcrate r=irevoire a=Kerollmops This PR adds some benchmarks to the _flatten-serde-json_ crate, this crate is responsible for transforming the original documents into flat versions that the engine can understand. It can probably be sped up and this is why I added benchmarks to it. I made some interesting performance improvements when I replaced the `json!` macro calls. ``` flatten/simple time: [452.44 ns 453.31 ns 454.18 ns] change: [-15.036% -14.751% -14.473%] (p = 0.00 < 0.05) Performance has improved. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild Benchmarking flatten/complex: Collecting 100 samples in estimated 5.0007 s (4.9M i flatten/complex time: [1.0101 us 1.0131 us 1.0160 us] change: [-18.001% -17.775% -17.536%] (p = 0.00 < 0.05) Performance has improved. Found 6 outliers among 100 measurements (6.00%) 5 (5.00%) high mild 1 (1.00%) high severe ``` --- _I removed this particular commit from this PR._ The reason is that the two other commits were enough for this PR to give enough impact and be merged. We will continue to explore where we can get performances later. But when I changed the flattening function to accept an owned version of the objects, we lost a lot of performances. Yes, I rewrote the benchmarks (locally) to clone the input object (and measured both, previous and new versions, with the cloning benchmarks). Maybe cloning the benchmark inputs is not the right thing to do... ``` Benchmarking flatten/simple: Collecting 100 samples in estimated 5.0005 s (6.7M it flatten/simple time: [746.46 ns 749.59 ns 752.70 ns] change: [+40.082% +40.714% +41.347%] (p = 0.00 < 0.05) Performance has regressed. Benchmarking flatten/complex: Collecting 100 samples in estimated 5.0047 s (2.9M i flatten/complex time: [1.7311 us 1.7342 us 1.7368 us] change: [+40.976% +41.398% +41.807%] (p = 0.00 < 0.05) Performance has regressed. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild ``` Co-authored-by: Kerollmops <clement@meilisearch.com>	2022-04-13 11:14:29 +00:00
Kerollmops	b3cec1a383	Prefer using direct method calls instead of using the json macros	2022-04-13 13:12:57 +02:00
Kerollmops	436d2032c4	Add benchmarks to the flatten-serde-json subcrate	2022-04-13 13:12:57 +02:00
bors[bot]	3828635fb2	Merge #489 489: fix distinct count bug r=curquiza a=MarinPostma fix https://github.com/meilisearch/meilisearch/issues/2152 I think the issue was that we didn't take off the excluded candidates from the initial candidates when returning the candidates with the search result. Co-authored-by: ad hoc <postma.marin@protonmail.com>	2022-04-13 10:15:30 +00:00
ad hoc	dda28d7415	exclude excluded candidates from search result candidates	2022-04-13 12:10:35 +02:00
ad hoc	cd83014fff	add test for distinct nb hits	2022-04-13 12:10:35 +02:00
ad hoc	bbb6728d2f	add distinct attributes to cli	2022-04-13 12:10:35 +02:00
bors[bot]	49fbbacafc	Merge #492 492: Add the new `Specify breaking` check to bors.toml r=curquiza a=curquiza Should prevent this problem: https://github.com/meilisearch/milli/pull/489#issuecomment-1094988060 Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>	2022-04-13 08:59:40 +00:00
Clémentine Urquizar - curqui	7ad582f39f	Update bors.toml	2022-04-13 10:56:56 +02:00
Clémentine Urquizar - curqui	aa896f0e7a	Update bors.toml	2022-04-13 10:56:56 +02:00
Clémentine Urquizar - curqui	0261a0e3cf	Add the new `Specify breaking` check to bors.toml	2022-04-13 10:56:55 +02:00
Paul Sanders	41249be274	Add version flag	2022-04-12 15:22:36 -04:00
ManyTheFish	5809d3ae0d	Add first benchmarks on formatting	2022-04-12 16:31:58 +02:00
bors[bot]	049cf0fcee	Merge #2313 2313: fix(search): remove the back and forth between the IndexMap and the serde_json::Map r=irevoire a=irevoire This is ok because we're using the preserve_order feature in serde_json which is already internally using an IndexMap. See https://github.com/meilisearch/meilisearch/pull/2298#discussion_r845228412_ Co-authored-by: Tamo <tamo@meilisearch.com>	2022-04-12 14:17:26 +00:00
Tamo	2ee210483f	fix(search): remove the back and forth between the IndexMap and the serde_json::Map This is ok because we're using the preserve_order feature in serde_json which is already internally using an IndexMap.	2022-04-12 16:12:52 +02:00
ManyTheFish	827cedcd15	Add format option structure	2022-04-12 13:42:14 +02:00
ManyTheFish	011f8210ed	Make compute_matches more rust idiomatic	2022-04-12 10:19:02 +02:00
bors[bot]	6b0737384b	Merge #491 491: remove the unused key warning r=curquiza a=irevoire When I copy-pasted my flatten crate I forgot to remove the key used to publish the package and that threw a warning. Co-authored-by: Tamo <tamo@meilisearch.com>	2022-04-11 16:55:25 +00:00
bors[bot]	13205066f3	Merge #2311 2311: Change version for the next release (v0.27.0) r=irevoire a=curquiza Fixes #2310 Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>	2022-04-11 14:49:33 +00:00
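
The three ranking rules listed in the message of merge #483 (unique matches first, then smallest distance, then ordered matches) can be illustrated with a small sketch. This is not milli's actual matcher: the names `score_interval` and `best_interval`, the word-slice input, and the case-insensitive exact comparison are all assumptions made for illustration. It uses only the standard library.

```rust
use std::cmp::Reverse;
use std::collections::HashSet;

/// Hypothetical score, compared lexicographically:
/// (unique matches, smaller distance is better, ordered matches).
type Score = (usize, Reverse<usize>, usize);

/// Scores one candidate interval (a slice of words) against the query words.
fn score_interval(query: &[&str], interval: &[&str]) -> Score {
    // Each match is (position in the interval, index of the matched query word).
    let matches: Vec<(usize, usize)> = interval
        .iter()
        .enumerate()
        .filter_map(|(pos, word)| {
            query
                .iter()
                .position(|q| q.eq_ignore_ascii_case(word))
                .map(|query_index| (pos, query_index))
        })
        .collect();

    // Rule 1: how many distinct query words appear in the interval.
    let unique = matches
        .iter()
        .map(|&(_, query_index)| query_index)
        .collect::<HashSet<_>>()
        .len();

    // Rule 2: sum of the gaps between consecutive matches.
    let distance: usize = matches.windows(2).map(|w| w[1].0 - w[0].0).sum();

    // Rule 3: the first match counts as ordered, then every match whose
    // query word comes later in the query than the previous match's word.
    let ordered = if matches.is_empty() {
        0
    } else {
        1 + matches.windows(2).filter(|w| w[1].1 > w[0].1).count()
    };

    (unique, Reverse(distance), ordered)
}

/// Returns the index of the best-scoring candidate, if there is any.
fn best_interval(query: &[&str], candidates: &[Vec<&str>]) -> Option<usize> {
    (0..candidates.len()).max_by_key(|&i| score_interval(query, &candidates[i]))
}
```

With the query `split the world`, this sketch scores `split of the world` as 3 unique matches, distance 3, and 3 ordered matches, which agrees with the figures quoted in the commit message.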

7375 Commits
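
The idea behind commits `399fba16bb` ("only flatten an object if it's nested") and the flattening subcrate discussed in merge #496 can be sketched as follows. This is a simplified illustration, not the flatten-serde-json implementation: the `Json` enum stands in for `serde_json::Value`, the dot-separated key format is an assumption, arrays are left out entirely, and the function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value (assumption: the real crates use serde_json).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Number(i64),
    String(String),
    Object(BTreeMap<String, Json>),
}

type Object = BTreeMap<String, Json>;

/// Cheap check: does any top-level value contain another object?
fn is_nested(object: &Object) -> bool {
    object.values().any(|value| matches!(value, Json::Object(_)))
}

/// Copies every leaf value into `out`, joining nested keys with a dot.
fn insert_flat(object: &Object, prefix: Option<&str>, out: &mut Object) {
    for (key, value) in object {
        let full_key = match prefix {
            Some(p) => format!("{}.{}", p, key),
            None => key.clone(),
        };
        match value {
            Json::Object(inner) => insert_flat(inner, Some(&full_key), out),
            leaf => {
                out.insert(full_key, leaf.clone());
            }
        }
    }
}

/// Flattens only when needed, so already-flat documents are returned untouched.
fn flatten_if_nested(object: Object) -> Object {
    if !is_nested(&object) {
        return object;
    }
    let mut out = Object::new();
    insert_flat(&object, None, &mut out);
    out
}
```

In this sketch a document such as `{"id": 1, "user": {"name": "kero"}}` becomes `{"id": 1, "user.name": "kero"}`, while a document with no nested object skips the flattening pass and is returned as-is.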