meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2025-10-25 13:06:27 +00:00

Author	SHA1	Message	Date
meili-bors[bot]	40e13ceef3	Merge #4892 4892: Add a documentTemplateMaxBytes parameter to limit the max length of document templates r=ManyTheFish a=dureuill # Pull Request ## Related issue Fixes #4885 See [public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#a3d63628129e40adba943ae7b8ec06c2) Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-09-03 11:50:07 +00:00
Louis Dureuil	18a2c13e4e	add analytics	2024-09-03 12:07:59 +02:00
Louis Dureuil	ed19b7c3c3	Only reindex if the size increased	2024-09-03 12:07:59 +02:00
Louis Dureuil	66bda2ce8a	fix tests	2024-09-03 12:07:58 +02:00
Louis Dureuil	1ac008926b	Add maxBytes parameter	2024-09-03 12:07:15 +02:00
Louis Dureuil	c49d892c82	Changes to prompt	2024-09-03 12:07:10 +02:00
Louis Dureuil	de962a26f3	New error type when maxBytes is null	2024-09-03 12:01:04 +02:00
meili-bors[bot]	80408c92dc	Merge #4906 4906: Add searchable fields to template r=dureuill a=dureuill # Pull Request ## Related issue Fixes #4886 See [public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#1dd6f0eee5a1422888e1c5d48e107cd1) ## What does this PR do? - `Prompt::render` now requires and uses metadata to indicate if the fields are searchable or not - Changes default template - Updated tests - Correctly reindex vectors when the list of searchable fields changes in a settings update. Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-09-03 07:14:58 +00:00
Louis Dureuil	24ace5c381	Add reindexing test	2024-09-02 13:37:01 +02:00
Louis Dureuil	21296190a3	Reindex embedders	2024-09-02 13:00:53 +02:00
Louis Dureuil	03fda78901	update other tests	2024-09-02 11:31:31 +02:00
Louis Dureuil	30a143f149	Test new facilities	2024-09-02 11:31:23 +02:00
Louis Dureuil	4464d319af	Change default template to use the new facility	2024-09-02 11:30:59 +02:00
Louis Dureuil	580ea2f450	Pass the fields <-> ids map with metadata to render	2024-09-02 11:30:10 +02:00
Louis Dureuil	915cf4bae5	Add field.is_searchable property to fields	2024-09-02 11:28:53 +02:00
meili-bors[bot]	9a756cf2c5	Merge #4888 4888: bring back v1.10.0 into main r=Kerollmops a=ManyTheFish Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-08-27 14:02:08 +00:00
meili-bors[bot]	36d8684dc8	Merge #4881 4881: Infer locales from index settings r=curquiza a=ManyTheFish # Pull Request ## Related issue Fixes #4828 Fixes #4816 ## What does this PR do? - Add some tests using `AttributesToSearchOn` - Make the search infer the language based on the index settings when the `locales` field is not specified CI is now working: https://github.com/meilisearch/meilisearch/actions/runs/10490050545/job/29055955667 Co-authored-by: ManyTheFish <many@meilisearch.com> v1.10.0-rc.3 v1.10.0	2024-08-21 14:18:16 +00:00
ManyTheFish	b12e997c8a	Add pinyin flag	2024-08-21 14:38:04 +02:00
ManyTheFish	8bf89ec394	Infer locales from index settings	2024-08-21 10:47:40 +02:00
meili-bors[bot]	ee62d9ce30	Merge #4845 4845: Fix perf regression facet strings r=ManyTheFish a=dureuill Benchmarks between v1.9 and v1.10 show a performance regression of about x2 (+3dB regression) for most indexing workloads (+44s for hackernews). [Benchmark interpretation in the engine weekly meeting](https://www.notion.so/meilisearch/Engine-weekly-4d49560d374c4a87b4e3d126a261d4a0?pvs=4#98a709683276450295fcfe1f8ea5cef3). - Initial investigation pointed to #4819 as the origin of the regression. - Further investigation points towards the hypernormalization of each facet value in `extract_facet_string_docids` - Most of the slowdown is in `normalize_facet_strings`, and precisely in `detection.language()`. This PR improves the situation (-10s compared with `main` for hackernews, so only +34s regression compared with `v1.9`) by skipping normalization when it can be skipped. I'm not sure how to fix the root cause though. Should we skip facet locale normalization for now? Cc `@ManyTheFish` --- Tentative resolution options: 1. remove locale normalization from facet. I'm not sure why this is required, I believe we weren't doing this before, so maybe we can stop doing that again. 2. don't do language detection when it can be avoided: won't help with the regressions in benchmark, but maybe we can skip language detection when the locales contain only one language? 3. use a faster language detection library: `@Kerollmops` told me about https://github.com/quickwit-oss/whichlang which boasts x10 to x100 throughput compared with whatlang. Should we consider replacing whatlang with whichlang? Now I understand whichlang supports fewer languages than whatlang, so I also suggest: 4. use whichlang when the list of locales is empty (autodetection), or when it only contains locales that whichlang can detect. If the list of locales contains locales that whichlang cannot detect, then use whatlang instead. --- > [!CAUTION] > this PR contains a commit that adds detailed spans, that were used to detect which part of `extract_facet_string_docids` was taking too much time. As this commit adds spans that are called too often and adds 7s overhead, it should be removed before landing. Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-08-19 06:29:48 +00:00
ManyTheFish	0f965d3574	Remove hotloop's spans	2024-08-14 14:33:36 +02:00
ManyTheFish	ade54493ab	Only detect language for a facet if several locales have been specified by the user in the settings	2024-08-14 12:03:52 +02:00
meili-bors[bot]	07c8ed0459	Merge #4864 4864: Don't remove facet value when multiple original values map to the same normalized value r=ManyTheFish a=dureuill # Pull Request ## Related issue Fixes #4860 > [!WARNING] > This PR contains a fix to the immediate issue, but it looks like the underlying data model is faulty: there is only one possible "original" value for each normalized value in a facet of a document, while because of array values (or manually written nested fields, if you're evil), it is technically possible to have multiple, distinct original values mapping to the same normalized value. Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-08-13 14:04:17 +00:00
Louis Dureuil	c3cdc407ec	Avoid unnecessary clone()	2024-08-08 14:57:02 +02:00
Louis Dureuil	2f10273d14	Group by normalized values, make sure you don't remove a value where there remains at least one value that normalizes towards it	2024-08-08 14:02:53 +02:00
meili-bors[bot]	321639364f	Merge #4861 4861: Make sure the index scheduler never stops running r=irevoire a=irevoire # Pull Request ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4748 ## What does this PR do? - Whatever happens, we always try to process tasks once every minute (if no tasks are enqueued that's practically free) Co-authored-by: Tamo <tamo@meilisearch.com>	2024-08-07 16:21:54 +00:00
Tamo	442d06dce7	ensure the run function doesn't panic even if the tick function does	2024-08-07 17:50:32 +02:00
Tamo	8f6a98df07	make sure the index scheduler never stops running	2024-08-07 17:06:43 +02:00
meili-bors[bot]	b44e17c4c3	Merge #4858 4858: also intersect the universe for searchOnAttributes r=irevoire a=dureuill # Pull Request ## Related issue Fixes #4857 ## What does this PR do? - intersect with the universe (which does not contain the filtered out ids) when looking up documents for words, even when using `searchOnAttributes` Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-08-07 13:15:26 +00:00
Louis Dureuil	e3ef0ae19e	also intersect the universe for searchOnAttributes	2024-08-06 14:06:56 +02:00
meili-bors[bot]	57f7af77c7	Merge #4846 4846: Add OpenAI tests r=dureuill a=dureuill # Pull Request ## Related issue Part of fixing #4757 ## What does this PR do? - OpenAI embedder: don't pass apiKey when it is empty (slightly improves error messages) - rest embedder and rest-based embedders: specialize the authorization denied error message depending on the configuration source - fix existing tests - Adds assets containing prerecorded texts to embed and the embeddings obtained from OpenAI - Adds an asset containing a tokenized long document and the embedding obtained from OpenAI for this token - Uses the wiremock crate to mock the OpenAI API: parse the openai request, lookup the response in assets, craft an openai response Co-authored-by: Louis Dureuil <louis@meilisearch.com> v1.10.0-rc.2	2024-08-05 10:49:28 +00:00
meili-bors[bot]	2d16d0aea1	Merge #4839 4839: In prometheus metrics return the route pattern instead of the real route when returning the HTTP requests total r=irevoire a=irevoire # Pull Request ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4825 ## What does this PR do? - return the route pattern instead of the real route when returning the HTTP requests total Co-authored-by: Tamo <tamo@meilisearch.com>	2024-08-05 10:14:51 +00:00
meili-bors[bot]	c817718e07	Merge #4853 4853: Fix rhai deletion r=irevoire a=dureuill # Pull Request ## Related issue Fixes #4849 ## What does this PR do? - insert inside of the bitmap instead of pushing into it. Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-08-01 16:34:31 +00:00
Louis Dureuil	e64d0e0ca8	use insert instead of push for bitmaps	2024-08-01 18:32:45 +02:00
Louis Dureuil	21aa430b5e	Fix openai tests	2024-07-31 17:57:55 +02:00
Louis Dureuil	8535dc0be2	Fix existing tests	2024-07-31 17:57:32 +02:00
Louis Dureuil	72b9005344	Redact uid for Value	2024-07-31 17:57:13 +02:00
meili-bors[bot]	420c33132c	Merge #4850 4850: Use a fixed date format regardless of features r=irevoire a=dureuill # Pull Request ## Related issue Fixes #4844 ## What does this PR do? Given the following script: ``` cargo run -- --db-path meili.ms sleep 3 curl -s -X POST http://127.0.0.1:7700/indexes -H 'Content-Type: application/json' --data-binary '{"uid": "movies", "primaryKey": "id"}' sleep 3 cargo run -p meilisearch --db-path meili.ms sleep 3 curl -s -X POST http://127.0.0.1:7700/indexes/movies/search -H 'Content-Type: application/json' --data-binary '{}' ``` - Before this PR, the final search returns a decoding error. - After this PR, the search completes successfully ### Technical standpoint This PR fixes two locations where the formatting of dates was dependent on the feature set of the `time` crate. 1. The `IndexStats` had two fields without the serialization format specified 2. More subtly, the index dates (`createdAt`, `updatedAt`) were using value remapping in the main DB to `SerdeJson<OffsetDateTime>`, which was using whatever default format was available. This was fixed by creating a local `OffsetDateTime` wrapper that would specify the serialization format Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-07-31 15:32:26 +00:00
Louis Dureuil	9ef710cad4	Use wrapper that forces the desired date format	2024-07-31 17:12:19 +02:00
Louis Dureuil	48f7329a83	Specify index_mapper on `IndexStats`	2024-07-31 17:11:28 +02:00
Louis Dureuil	ab1ec9ca21	Add tokenized test	2024-07-31 15:03:45 +02:00
Louis Dureuil	9d6efd92d2	new assets for tokenized test	2024-07-31 15:03:45 +02:00
Louis Dureuil	abdb337fd6	Add openai tests	2024-07-31 15:03:45 +02:00
Louis Dureuil	1c755c8899	Add openai responses	2024-07-31 15:03:45 +02:00
Louis Dureuil	3a42c3134e	update tests after changing authorized error message	2024-07-31 15:03:45 +02:00
Louis Dureuil	5aa6cb3600	Specialize authorized error message depending on config source	2024-07-31 15:03:44 +02:00
Louis Dureuil	9b7764575b	openai: don't pass apiKey when it is empty	2024-07-31 15:03:44 +02:00
Louis Dureuil	0e68718027	Add detailed spans	2024-07-31 13:05:47 +02:00
Louis Dureuil	7c3fc8c655	Split settings and document facet string extractions	2024-07-31 10:57:46 +02:00
Louis Dureuil	8acd3f50bb	skip normalization when the locales and values are the same	2024-07-31 09:53:00 +02:00

9847 Commits