Clément Renault 
							
						 
					 
					
						
						
							
						
						013acb3d93 
					 
					
						
						
							
							Measure merger writer channel contention  
						
						 
						
						
						
						
					 
					
						2024-09-23 11:07:59 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								meili-bors[bot] 
							
						 
					 
					
						
						
							
						
						462a2329f1 
					 
					
						
						
							
							Merge  #4941  
						
						 
						
						
						
						
						4941: Implement the binary quantization in meilisearch r=irevoire a=irevoire
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4873 
## What does this PR do?
- Add a setting for binary quantization
- Once enabled, binary quantization (bq) cannot be disabled
TODO:
- [ ] Missing a bunch of tests
Co-authored-by: Tamo <tamo@meilisearch.com>
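As a rough illustration of why binary quantization is one-way, here is a minimal sketch of the general technique: each dimension is reduced to a single sign bit and vectors are compared with a Hamming distance. The function names `binary_quantize` and `hamming` are hypothetical and this is not the code used by arroy or Meilisearch, only an assumption-laden example of the idea.

```rust
/// Quantize an f32 embedding to one bit per dimension (hypothetical helper):
/// the bit is 1 when the component is strictly positive, 0 otherwise.
/// Bits are packed into 64-bit words.
fn binary_quantize(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Hamming distance between two quantized vectors of the same dimension:
/// the number of dimensions whose sign bits differ.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

fn main() {
    let a = binary_quantize(&[0.9, -0.2, 0.1, -0.7]);
    let b = binary_quantize(&[0.3, 0.4, -0.5, -0.1]);
    println!("hamming distance = {}", hamming(&a, &b));

    // Two different embeddings with the same signs become indistinguishable,
    // so the original floats cannot be recovered from the bits alone.
    assert_eq!(binary_quantize(&[0.9, -0.2]), binary_quantize(&[0.1, -5.0]));
}
```

Under this sketch the magnitudes are discarded at quantization time, which is consistent with the setting being impossible to turn off without re-embedding the documents.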
						
						
					 
					
						2024-09-19 15:50:24 +00:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						f6483cf15d 
					 
					
						
						
							
							apply review comment  
						
						 
						
						
						
						
					 
					
						2024-09-19 16:47:06 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								meili-bors[bot] 
							
						 
					 
					
						
						
							
						
						bd34ed01d9 
					 
					
						
						
							
							Merge  #4945  
						
						 
						
						
						
						
						4945: Add swedish in default pipelines r=dureuill a=ManyTheFish
# Summary
## Fix Swedish support
In Swedish, the characters `å`/`ä`/`ö` are completely different from `a` or `o` and should not be normalized to the same character.
Because the Swedish specialized pipeline was not activated by default, these characters were normalized even with the following settings:
```json
{
  "localizedAttributes": [ { "locales": ["swe"], "attributePatterns": ["*"] } ]
}
```
## Update Charabia adding German support
German segmentation will now be activated using the setting:
```json
{
  "localizedAttributes": [ { "locales": ["deu"], "attributePatterns": ["*"] } ]
}
```
# TODO
- [x] Activate Swedish Pipeline
- [x] Add a test to avoid future regressions
- [x] Update Charabia
Co-authored-by: ManyTheFish <many@meilisearch.com>
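To make the Swedish issue concrete, here is a small sketch of locale-dependent folding. The `fold` and `normalize` functions and the `swedish` flag are hypothetical and only handle three characters; Charabia's real normalizers work differently and cover far more cases.

```rust
/// Fold one character (hypothetical helper). When the Swedish pipeline is
/// not active, `å`/`ä` collapse to `a` and `ö` collapses to `o`; when it is
/// active, they are kept as distinct letters.
fn fold(c: char, swedish: bool) -> char {
    match c {
        'å' | 'ä' if !swedish => 'a',
        'ö' if !swedish => 'o',
        other => other,
    }
}

/// Lowercase the input, then fold each character.
fn normalize(text: &str, swedish: bool) -> String {
    text.to_lowercase().chars().map(|c| fold(c, swedish)).collect()
}

fn main() {
    // "får" (sheep) and "far" (father) are different Swedish words.
    println!("default pipeline: {}", normalize("Får", false));
    println!("swedish pipeline: {}", normalize("Får", true));
}
```

With the generic folding both words end up as the same token, which is the behaviour the specialized pipeline is meant to avoid.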
						
						
					 
					
						2024-09-19 14:42:03 +00:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						74199f328d 
					 
					
						
						
							
							Make clippy happy  
						
						 
						
						
						
						
					 
					
						2024-09-19 16:27:34 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						1113c42de0 
					 
					
						
						
							
							fix broken comments  
						
						 
						
						
						
						
					 
					
						2024-09-19 16:18:36 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						7d6768e4c4 
					 
					
						
						
							
							Add german tokenization pipeline  
						
						 
						
						
						
						
					 
					
						2024-09-19 16:09:01 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						f77661ec44 
					 
					
						
						
							
							Update Charabia v0.9.1  
						
						 
						
						
						
						
					 
					
						2024-09-19 16:08:59 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						b8fd85a46d 
					 
					
						
						
							
							Get rid of useless collect before an iteration on the readers  
						
						 
						
						
						
						
					 
					
						2024-09-19 15:57:38 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						fd43c6c404 
					 
					
						
						
							
							Improve the error message explaining you can't un-bq an embedder  
						
						 
						
						
						
						
					 
					
						2024-09-19 15:51:29 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						2564ec1496 
					 
					
						
						
							
							Update milli/src/index.rs  
						
						 
						
						
						
						
						Co-authored-by: Louis Dureuil <louis@meilisearch.com>
						
						
					 
					
						2024-09-19 15:41:44 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						b6b73fe41c 
					 
					
						
						
							
							Update milli/src/update/settings.rs  
						
						 
						
						
						
						
						Co-authored-by: Louis Dureuil <louis@meilisearch.com>
						
						
					 
					
						2024-09-19 15:41:14 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						6dde41cc46 
					 
					
						
						
							
							stop using a local version of arroy and instead point to the git repo with the rev  
						
						 
						
						
						
						
					 
					
						2024-09-19 15:25:38 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						163f8023a1 
					 
					
						
						
							
							remove debug println  
						
						 
						
						
						
						
					 
					
						2024-09-19 12:13:25 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						633537ccd7 
					 
					
						
						
							
							fix updating documents without updating the settings  
						
						 
						
						
						
						
					 
					
						2024-09-19 12:00:58 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						3f6301dbc9 
					 
					
						
						
							
							fix the missing embedder name in the error message when trying to disable the binary quantization  
						
						 
						
						
						
						
					 
					
						2024-09-19 12:00:58 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						2b6952eda1 
					 
					
						
						
							
							rename the ArroyReader to an ArroyWrapper since it can read and write  
						
						 
						
						
						
						
					 
					
						2024-09-19 12:00:58 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						79f29eed3c 
					 
					
						
						
							
							fix the tests and the arroy_readers method  
						
						 
						
						
						
						
					 
					
						2024-09-19 12:00:58 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tamo 
							
						 
					 
					
						
						
							
						
						cc45e264ca 
					 
					
						
						
							
							implement the binary quantization in meilisearch  
						
						 
						
						
						
						
					 
					
						2024-09-19 12:00:56 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								meili-bors[bot] 
							
						 
					 
					
						
						
							
						
						5f474a640d 
					 
					
						
						
							
							Merge  #4938  
						
						 
						
						
						
						
						4938: Remove default embedder r=ManyTheFish a=dureuill
# Pull Request
## Related issue
Fixes #4738
## What does this PR do?
[See public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#1044b06b651f80edb9d4ef6dc367bad0)
- Remove `hybrid.embedder` boolean from analytics because embedder is now mandatory and so the boolean would always be `true`
- Rework search kind so that a search without query but with vector is a vector search regardless of (non-zero) semantic ratio
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
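The search kind rule described in this PR can be sketched as a small decision function. Only the first match arm comes from the description above; the enum, the function signature and every other arm are assumptions made for illustration and may not match the actual Meilisearch code.

```rust
/// Hypothetical search kinds, named for this sketch only.
#[derive(Debug, PartialEq)]
enum SearchKind {
    KeywordOnly,
    SemanticOnly,
    Hybrid { semantic_ratio: f32 },
}

/// Pick a search kind from the request parameters (hypothetical signature).
fn search_kind(query: Option<&str>, vector: Option<&[f32]>, semantic_ratio: f32) -> SearchKind {
    match (query, vector) {
        // From the PR description: no query but a vector means a vector
        // search, regardless of the (non-zero) semantic ratio.
        (None, Some(_)) if semantic_ratio > 0.0 => SearchKind::SemanticOnly,
        // Assumption: with neither a query nor a vector, fall back to keyword search.
        (None, None) => SearchKind::KeywordOnly,
        // Assumption: the extreme ratios select a single search mode.
        _ if semantic_ratio == 0.0 => SearchKind::KeywordOnly,
        _ if semantic_ratio == 1.0 => SearchKind::SemanticOnly,
        // Assumption: anything in between mixes both result sets.
        _ => SearchKind::Hybrid { semantic_ratio },
    }
}

fn main() {
    let vector = [0.1_f32, 0.2];
    println!("{:?}", search_kind(None, Some(&vector[..]), 0.3));
    println!("{:?}", search_kind(Some("batman"), Some(&vector[..]), 0.3));
}
```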
						
						
					 
					
						2024-09-19 09:17:14 +00:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						bbaee3dbc6 
					 
					
						
						
							
							Add Swedish pipeline in all-tokenization feature  
						
						 
						
						
						
						
					 
					
						2024-09-19 08:34:51 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								meili-bors[bot] 
							
						 
					 
					
						
						
							
						
						ff523a2357 
					 
					
						
						
							
							Merge  #4939  
						
						 
						
						
						
						
						4939: Introduce the `STARTS WITH` filter operator r=irevoire a=Kerollmops
This PR fixes #4872 by introducing the `STARTS WITH` filter operator and gating it under the _contains filter_ experimental feature along with the `CONTAINS` one. I also updated [the experimental feature discussion page](https://github.com/orgs/meilisearch/discussions/763).
Co-authored-by: Clément Renault <clement@meilisearch.com>
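The difference between the two operators can be sketched as follows. The `Op` enum and the `matches` function are hypothetical, and the case-insensitive comparison is an assumption of this example; the real implementation evaluates filters against indexed facet values rather than scanning strings.

```rust
/// Hypothetical representation of the two gated operators.
enum Op<'a> {
    Contains(&'a str),
    StartsWith(&'a str),
}

/// Check whether a facet value satisfies the operator.
/// Assumption: both sides are compared after lowercasing.
fn matches(value: &str, op: &Op) -> bool {
    let value = value.to_lowercase();
    match op {
        Op::Contains(needle) => value.contains(needle.to_lowercase().as_str()),
        Op::StartsWith(prefix) => value.starts_with(prefix.to_lowercase().as_str()),
    }
}

fn main() {
    let values = ["Kefir", "Echo", "Intel"];
    // Keep only the values that start with the given prefix.
    let prefixed: Vec<&str> = values
        .iter()
        .copied()
        .filter(|v| matches(v, &Op::StartsWith("ke")))
        .collect();
    println!("{:?}", prefixed);
}
```

A prefix test is a strict subset of a substring test: every value accepted by `StartsWith` is also accepted by `Contains` with the same argument, but not the other way around.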
						
						
					 
					
						2024-09-18 10:19:48 +00:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						9f1fb4b425 
					 
					
						
						
							
							Introduce the STARTS WITH filter operator gated under an experimental feature  
						
						 
						
						
						
						
					 
					
						2024-09-17 16:44:11 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Louis Dureuil 
							
						 
					 
					
						
						
							
						
						3c5e363554 
					 
					
						
						
							
							Remove default embedders  
						
						 
						
						
						
						
					 
					
						2024-09-17 16:30:43 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						f4ab1f168e 
					 
					
						
						
							
							Prefer using Rc<str> than String when cloning a lot  
						
						 
						
						
						
						
					 
					
						2024-09-16 15:41:29 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						1a0e962299 
					 
					
						
						
							
							Replace hashmap by vectors in wpp  
						
						 
						
						
						
						
					 
					
						2024-09-16 15:01:20 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						f13e076b8a 
					 
					
						
						
							
							Use hashmap instead of Btree in wpp extractor  
						
						 
						
						
						
						
					 
					
						2024-09-16 14:40:40 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						7ba49b849e 
					 
					
						
						
							
							Extract and write facet databases  
						
						 
						
						
						
						
					 
					
						2024-09-16 09:35:16 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						f7652186e1 
					 
					
						
						
							
							WIP geo fields  
						
						 
						
						
						
						
					 
					
						2024-09-12 18:01:02 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Louis Dureuil 
							
						 
					 
					
						
						
							
						
						23e14138bb 
					 
					
						
						
							
							facet distribution: implement Display for OrderBy  
						
						 
						
						
						
						
					 
					
						2024-09-12 17:43:50 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Louis Dureuil 
							
						 
					 
					
						
						
							
						
						e44325683a 
					 
					
						
						
							
							Facet distribution: fix issue where truncated facet distribution would have a wrong order  
						
						 
						
						
						
						
					 
					
						2024-09-12 17:43:49 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						b2f4e67c9a 
					 
					
						
						
							
							Do not store useless updates  
						
						 
						
						
						
						
					 
					
						2024-09-12 15:38:31 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						ff5d3b59f5 
					 
					
						
						
							
							Move the document id extraction to the primary key code  
						
						 
						
						
						
						
					 
					
						2024-09-12 12:01:42 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						aa69308e45 
					 
					
						
						
							
							Use a bufWriter to build word FSTs  
						
						 
						
						
						
						
					 
					
						2024-09-12 11:48:00 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						eb9a20ff0b 
					 
					
						
						
							
							Fix fid_word_docids extraction  
						
						 
						
						
						
						
					 
					
						2024-09-12 11:08:18 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						3e9198ebaa 
					 
					
						
						
							
							Support guessing primary key again  
						
						 
						
						
						
						
					 
					
						2024-09-11 17:25:40 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						2a0ad0982f 
					 
					
						
						
							
							Fix the document counter  
						
						 
						
						
						
						
					 
					
						2024-09-11 15:59:36 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						2b317c681b 
					 
					
						
						
							
							Build mergers in parallel  
						
						 
						
						
						
						
					 
					
						2024-09-11 11:49:26 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						39b5990f64 
					 
					
						
						
							
							Mutualize tokenization  
						
						 
						
						
						
						
					 
					
						2024-09-11 10:22:38 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						8287c2644f 
					 
					
						
						
							
							Support CSV again  
						
						 
						
						
						
						
					 
					
						2024-09-10 21:10:28 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						c1c44a0b81 
					 
					
						
						
							
							Impl serialize on TopLevelMap  
						
						 
						
						
						
						
					 
					
						2024-09-10 19:32:03 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						04596f3616 
					 
					
						
						
							
							Move the TopLevelMap into a dedicated module  
						
						 
						
						
						
						
					 
					
						2024-09-10 18:01:17 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						24cb5839ad 
					 
					
						
						
							
							Move the document changes sorting logic to a new trait  
						
						 
						
						
						
						
					 
					
						2024-09-10 17:37:52 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						f69688e8f7 
					 
					
						
						
							
							Fix several warnings in extractors and remove unreachable macros  
						
						 
						
						
						
						
					 
					
						2024-09-09 14:52:50 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Louis Dureuil 
							
						 
					 
					
						
						
							
						
						f18e9cb7b3 
					 
					
						
						
							
							Change openai default model  
						
						 
						
						
						
						
					 
					
						2024-09-09 13:09:35 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						8fd0afaaaa 
					 
					
						
						
							
							Make sure we iterate over the payload documents in order  
						
						 
						
						
						
						
					 
					
						2024-09-06 08:09:08 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						72c6a21a30 
					 
					
						
						
							
							Use raw JSON to read the payloads  
						
						 
						
						
						
						
					 
					
						2024-09-05 20:08:23 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Clément Renault 
							
						 
					 
					
						
						
							
						
						8412be4a7d 
					 
					
						
						
							
							Cleanup CowStr and TopLevelMap struct  
						
						 
						
						
						
						
					 
					
						2024-09-05 18:32:55 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Louis Dureuil 
							
						 
					 
					
						
						
							
						
						10f09c531f 
					 
					
						
						
							
							add some commented code to read from json with raw values  
						
						 
						
						
						
						
					 
					
						2024-09-05 18:22:16 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ManyTheFish 
							
						 
					 
					
						
						
							
						
						8fd99b111b 
					 
					
						
						
							
							Add tracing timers logs  
						
						 
						
						
						
						
					 
					
						2024-09-05 18:00:22 +02:00