meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2025-10-25 13:06:27 +00:00

Author	SHA1	Message	Date
tamo	15cce89a45	update the README with instructions to download the dataset	2021-06-02 11:05:07 +02:00
tamo	e425f70ef9	let criterion decide how much iteration it wants to do in 10s	2021-06-02 11:05:07 +02:00
tamo	4fdbfd6048	push a first version of the benchmark for the typo	2021-06-02 11:05:07 +02:00
bors[bot]	270da98c46	Merge #202 202: Add field id word count docids database r=Kerollmops a=LegendreM This PR introduces a new database, `field_id_word_count_docids`, that maps the number of words in an attribute with a list of document ids. This relation is limited to attributes that contain less than 11 words. This database is used by the exactness criterion to know if a document has an attribute that contains exactly the query without any additional word. Fix #165 Fix #196 Related to [specifications:#36](https://github.com/meilisearch/specifications/pull/36) Co-authored-by: many <maxime@meilisearch.com> Co-authored-by: Many <legendre.maxime.isn@gmail.com>	2021-06-01 16:09:48 +00:00
many	e857ca4d7d	Fix PR comments	2021-06-01 18:06:46 +02:00
Many	ab2cf69e8d	Update milli/src/update/delete_documents.rs Co-authored-by: Clément Renault <clement@meilisearch.com>	2021-06-01 17:04:10 +02:00
Many	8e6d1ff0dc	Update milli/src/update/index_documents/store.rs Co-authored-by: Clément Renault <clement@meilisearch.com>	2021-06-01 17:04:02 +02:00
bors[bot]	168fe0aa28	Merge #206 206: Fix http-ui r=Kerollmops a=irevoire I just noticed that `http-ui` was not compiling on `main`. I'm not sure this is the best fix, but it works 👀 Co-authored-by: Tamo <irevoire@hotmail.fr>	2021-06-01 14:31:32 +00:00
Tamo	608c5bad24	fix http-ui	2021-06-01 16:24:46 +02:00
bors[bot]	7d36d664a7	Merge #203 203: Make the MatchingWords return the number of matching bytes r=Kerollmops a=LegendreM Make the MatchingWords return the number of matching bytes using a custom Levenshtein algorithm. Fix #138 Co-authored-by: many <maxime@meilisearch.com>	2021-06-01 12:00:33 +00:00
many	225ae6fd25	Resolve PR comments	2021-06-01 11:53:09 +02:00
bors[bot]	2f9f6a1f21	Merge #169 169: Optimize roaring codec r=Kerollmops a=MarinPostma Optimize the `BoRoaringBitmapCodec` by preventing it from emitting useless errors that caused allocations. On my flamegraph, the byte_decode function went from 4.13% to 1.70% (of transplant graph). This may not be the greatest optimization ever, but hey, this was a low-hanging fruit. before: ![image](https://user-images.githubusercontent.com/28804882/116241125-17018880-a754-11eb-9f9d-a67418d100e1.png) after: ![image](https://user-images.githubusercontent.com/28804882/116241167-21bc1d80-a754-11eb-9afc-d9d72727477c.png) Co-authored-by: Marin Postma <postma.marin@protonmail.com>	2021-06-01 06:30:25 +00:00
Marin Postma	984dc7c1ed	rewrite roaring codec without byteorder.	2021-05-31 22:15:39 +02:00
Marin Postma	1373637da1	optimize roaring codec	2021-05-31 22:15:35 +02:00
many	1df68d342a	Make the MatchingWords return the number of matching bytes	2021-05-31 18:22:29 +02:00
many	b8e6db0feb	Add database in infos crate	2021-05-31 16:29:27 +02:00
many	c701f8bf36	Use field id word count database in exactness criterion	2021-05-31 16:27:28 +02:00
many	4ddf008be2	add field id word count database	2021-05-31 16:27:28 +02:00
bors[bot]	2f5e61bacb	Merge #184 184: Transfer numbers and strings facets into the appropriate facet databases r=Kerollmops a=Kerollmops This pull request is related to https://github.com/meilisearch/milli/issues/152 and changes the layout of the facet values: numbers and strings are now in dedicated databases, and the user no longer needs to define the type of the fields. No conversion between the two types is done anymore: numbers (floats and integers converted to f64) go to the facet float database and strings go to the strings facet database. There is one related issue that I found regarding CSVs: the values in a CSV are always considered to be strings. [meilisearch/specifications#28](`d916b57d74/text/0028-indexing-csv.md`) fixes this issue by allowing the user to define the field types using `:` in the "CSV Formatting Rules" section. All previous tests on facets have been modified to pass again and I have also done hand-driven tests with the 115m songs dataset. Everything seems to be good! Fixes #192. Co-authored-by: Clément Renault <clement@meilisearch.com> Co-authored-by: Kerollmops <clement@meilisearch.com>	2021-05-31 13:32:58 +00:00
Kerollmops	1c0a5cd136	Resolve code modification suggestions	2021-05-31 15:22:50 +02:00
bors[bot]	76b9178b16	Merge #200 200: Fix plane sweep algorithm r=Kerollmops a=LegendreM Fix plane sweep algorithm after creating some tests on proximity. Co-authored-by: many <maxime@meilisearch.com>	2021-05-26 11:36:24 +00:00
many	a5e98cf46d	Fix plane sweep algorithm	2021-05-25 18:21:55 +02:00
Kerollmops	5012cc3a32	Fix the http-ui crate to support split facet databases	2021-05-25 11:31:06 +02:00
Kerollmops	28bd9e183e	Fix the infos crate to support split facet databases	2021-05-25 11:31:06 +02:00
Clément Renault	3a4a150ef0	Fix the tests and remaining warnings	2021-05-25 11:31:06 +02:00
Clément Renault	02c655ff1a	Refine the facet distribution to use both databases	2021-05-25 11:30:00 +02:00
Clément Renault	79efded841	Refine the FacetCondition from_array constructor	2021-05-25 11:30:00 +02:00
Clément Renault	f7efde11d9	Refine the facet condition to use both facet databases	2021-05-25 11:30:00 +02:00
Clément Renault	e62b89a2ed	Make the facet distinct work with the new split facets	2021-05-25 11:30:00 +02:00
Clément Renault	bd7b285bae	Split the update side to use the number and the strings facet databases	2021-05-25 11:30:00 +02:00
Clément Renault	038e03a4e4	Use both facet databases in the FacetIter type	2021-05-25 11:30:00 +02:00
Clément Renault	597144b0b9	Use both number and string facet databases in the distinct system	2021-05-25 11:29:59 +02:00
Clément Renault	837c1041c7	Clear and delete the documents from the facet database	2021-05-25 11:28:36 +02:00
Clément Renault	a56c46b6f1	Explode the string and f64 facet databases into two	2021-05-25 11:28:36 +02:00
Clément Renault	df7a32e3d0	Move the creation date initialization into a function	2021-05-25 11:28:35 +02:00
bors[bot]	49bee2ebc5	Merge #190 190: Make bucket candidates optionals r=Kerollmops a=LegendreM Previously, the bucket candidates were the result of either the facet filters or the query tree. They will now be only the result of the query tree, making the number of candidates more consistent between the same request with or without facet filters. Fix some clippy warnings. Fix #186 Co-authored-by: many <maxime@meilisearch.com>	2021-05-24 11:19:32 +00:00
many	a3944a7083	Introduce a filtered_candidates field	2021-05-11 11:37:40 +02:00
many	efba662ca6	Fix clippy warnings in criteria	2021-05-10 10:27:18 +02:00
many	e923d51b8f	Make bucket candidates optionals	2021-05-10 10:27:04 +02:00
Many	c620626515	Merge pull request #188 from meilisearch/exactness-criterion Exactness criterion	2021-05-06 17:56:21 +02:00
Many	44b6843de7	Fix pull request reviews Update milli/src/fields_ids_map.rs Update milli/src/search/criteria/exactness.rs Update milli/src/search/criteria/mod.rs	2021-05-06 14:31:03 +02:00
many	c1ce4e4ca9	Introduce mocked ExactAttribute step in exactness criterion	2021-05-06 14:28:31 +02:00
many	a3f8686fbf	Introduce exactness criterion	2021-05-06 14:28:30 +02:00
bors[bot]	25f75d4d03	Merge #189 189: Update version for the next release (v0.2.1) r=Kerollmops a=curquiza Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>	2021-05-05 15:28:56 +00:00
bors[bot]	7e63e32960	Merge #187 187: Fix fields distribution after documents merge r=Kerollmops a=shekhirin Resolves https://github.com/meilisearch/milli/issues/174 The problem was with calculation of fields distribution before the merge in `output_from_sorter()`. So if you'd import two documents with the same primary key value, fields distribution will count it as two documents, while `output_from_sorter()` will merge these documents into one. --- ```console ➜ Downloads cat short_movies.json [ {"id":"47474","title":"The Serpent's Egg","poster":"https://image.tmdb.org/t/p/w500/n7z0doFkXHcvo8QQWHLFnkEPXRU.jpg","overview":"The Serpent's Egg follows a week in the life of Abel Rosenberg, an out-of-work American circus acrobat living in poverty-stricken Berlin following Germany's defeat in World War I.","release_date":246844800,"genres":["Thriller","Drama","Mystery"]}, {"id":"47474","title":"The Serpent's Egg","poster":"https://image.tmdb.org/t/p/w500/n7z0doFkXHcvo8QQWHLFnkEPXRU.jpg","overview":"The Serpent's Egg follows a week in the life of Abel Rosenberg, an out-of-work American circus acrobat living in poverty-stricken Berlin following Germany's defeat in World War I.","release_date":246844800,"genres":["Thriller","Drama","Mystery"]} ] ➜ Downloads curl -X POST -H "Content-Type: text/json" --data-binary @short_movies.json 127.0.0.1:7700/indexes/movies/documents {"updateId":0} ``` ## Before ```console ➜ Downloads curl -s 127.0.0.1:7700/indexes/movies/stats | jq { "numberOfDocuments": 1, "isIndexing": false, "fieldsDistribution": { "release_date": 2, "poster": 2, "title": 2, "overview": 2, "genres": 2, "id": 2 } } ``` ## After ```console ➜ Downloads curl -s 127.0.0.1:7700/indexes/movies/stats | jq { "numberOfDocuments": 1, "isIndexing": false, "fieldsDistribution": { "poster": 1, "release_date": 1, "title": 1, "genres": 1, "id": 1, "overview": 1 } } ``` Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>	2021-05-05 14:45:08 +00:00
Clémentine Urquizar	1e11578ef0	Update version for the next release (v0.2.1)	2021-05-05 14:57:34 +02:00
Alexey Shekhirin	f8d0f5265f	fix(update): fields distribution after documents merge	2021-05-04 22:12:20 +03:00
bors[bot]	1207a058d0	Merge #185 185: Provide an iterator over all the documents in a milli index r=Kerollmops a=irevoire Co-authored-by: tamo <tamo@meilisearch.com>	2021-05-04 14:04:16 +00:00
tamo	d61566787e	provide an iterator over all the documents in a milli index	2021-05-04 11:23:51 +02:00
bors[bot]	c08f4599f2	Merge #183 183: remove tests on main r=Kerollmops a=MarinPostma remove testing on main since we now use bors for merging. Co-authored-by: Marin Postma <postma.marin@protonmail.com>	2021-05-03 15:06:28 +00:00
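
The fix described in Merge #187 above (count fields only after documents sharing a primary key have been merged) can be illustrated with a minimal Rust sketch. This is not milli's code: the `Document` alias and the `merge_by_primary_key`, `fields_distribution`, and `sample_movie` functions are hypothetical names introduced here, and a document is simplified to a map of field name to string value.

```rust
use std::collections::BTreeMap;

/// Hypothetical simplification: a document is a map of field name -> raw value.
type Document = BTreeMap<String, String>;

/// Merge documents that share the same primary key value.
/// Fields from later documents overwrite fields from earlier ones.
fn merge_by_primary_key(docs: &[Document], primary_key: &str) -> BTreeMap<String, Document> {
    let mut merged: BTreeMap<String, Document> = BTreeMap::new();
    for doc in docs {
        // Documents without the primary key are skipped in this sketch.
        if let Some(id) = doc.get(primary_key) {
            merged.entry(id.clone()).or_default().extend(doc.clone());
        }
    }
    merged
}

/// Count, for each field, how many *merged* documents contain it.
fn fields_distribution(merged: &BTreeMap<String, Document>) -> BTreeMap<String, u64> {
    let mut distribution = BTreeMap::new();
    for doc in merged.values() {
        for field in doc.keys() {
            *distribution.entry(field.clone()).or_insert(0) += 1;
        }
    }
    distribution
}

/// Hypothetical sample document, reduced to two fields of the movie from #187.
fn sample_movie() -> Document {
    [("id", "47474"), ("title", "The Serpent's Egg")]
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

fn main() {
    // Two documents with the same primary key, as in the reproduction from #187.
    let docs = vec![sample_movie(), sample_movie()];
    let merged = merge_by_primary_key(&docs, "id");
    let distribution = fields_distribution(&merged);
    println!("{} document(s): {:?}", merged.len(), distribution);
}
```

Because the distribution is computed from the merged map rather than from the raw input, each field of the duplicated movie is counted once, which matches the "After" output quoted in the pull request.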


805 Commits