Kerollmops 
							
						 
					 
					
						
						
							
						
						11c7fef80a 
					 
					
						
						
							
							Implement a memory dumper  
						
						
						
						
						It moves the in-memory HashMaps used when indexing to a disk-based MTBL file. 
						
						
					 
					
						2020-07-07 16:48:49 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						b12bfcb03b 
					 
					
						
						
							
							Reduce the deepness of the word position document ids  
						
						
						
						
						This helps reduce the number of allocations. 
						
						
					 
					
						2020-07-07 12:30:05 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						7178b6c2c4 
					 
					
						
						
							
							First basic version using MTBL again  
						
						
						
						
					 
					
						2020-07-07 11:32:33 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						adb1038b26 
					 
					
						
						
							
							Add a jobs parameter to set the number of threads the indexer uses  
						
						
						
						
					 
					
						2020-07-06 12:17:17 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						ec1023e790 
					 
					
						
						
							
							Intersect document ids by inverse popularity of the words  
						
						
						
						
						This reduces our worst request, which took 56s, to 3s ("the best of the do"). 
						
						
					 
					
						2020-07-05 19:33:51 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						cd7e64b2b3 
					 
					
						
						
							
							Allow users to set the arc cache size when indexing  
						
						
						
						
					 
					
						2020-07-04 18:12:41 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						ac8353a64f 
					 
					
						
						
							
							Merge pre-computed word attribute documents ids  
						
						
						
						
					 
					
						2020-07-04 17:02:27 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						fea7cac206 
					 
					
						
						
							
							Display the time it took to compute the word attribute documents ids  
						
						
						
						
					 
					
						2020-07-04 15:18:38 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						46ced5c828 
					 
					
						
						
							
							Introduce the RwIter append heed API  
						
						
						
						
					 
					
						2020-07-04 12:34:10 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						7e7440c431 
					 
					
						
						
							
							Finalize the LMDB indexing design  
						
						
						
						
					 
					
						2020-07-01 22:45:43 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						2ae3f40971 
					 
					
						
						
							
							Make the indexer ignore certain words  
						
						
						
						
						This prepares for making the indexing fully parallel: the indexer on each
thread is only aware of certain words, which avoids postings list conflicts
on any given word. 
						
						
					 
					
						2020-07-01 17:49:46 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						a3ac2623d5 
					 
					
						
						
							
							Introduce multiple functions to clean up the code  
						
						
						
						
					 
					
						2020-07-01 17:24:55 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						ac5cc7ddad 
					 
					
						
						
							
							Introduce an Iterator yielding owned entries for the LruCache  
						
						
						
						
					 
					
						2020-07-01 17:21:52 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						014a25697d 
					 
					
						
						
							
							Use only one ARC cache based on the words  
						
						
						
						
					 
					
						2020-07-01 12:03:18 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						fc4013a43f 
					 
					
						
						
							
							Fix the ARC cache  
						
						
						
						
					 
					
						2020-07-01 10:35:07 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						2fcae719ad 
					 
					
						
						
							
							Use another LRU impl which uses hashbrown  
						
						
						
						
					 
					
						2020-06-29 22:26:06 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						f98b615bf3 
					 
					
						
						
							
							Replace the LRU by an Arc cache  
						
						
						
						
					 
					
						2020-06-29 20:48:57 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						07abebfc46 
					 
					
						
						
							
							Introduce a (too big) LRU cache  
						
						
						
						
					 
					
						2020-06-29 18:15:03 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						5f0088594b 
					 
					
						
						
							
							Index by writing directly into LMDB  
						
						
						
						
					 
					
						2020-06-29 13:54:47 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						63cbeca64e 
					 
					
						
						
							
							Skip all derived words when too short  
						
						
						
						
					 
					
						2020-06-28 12:13:12 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						736f0f7560 
					 
					
						
						
							
							Use the proximity instead of the attributes when searching for <= 7 proximities  
						
						
						
						
					 
					
						2020-06-28 12:13:12 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						fe3be8f18a 
					 
					
						
						
							
							Replace the HashMap by a Vec for attributes documents ids  
						
						
						
						
					 
					
						2020-06-28 12:13:12 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						6a2834f2b0 
					 
					
						
						
							
							Add a jobs parameter to set the number of threads the indexer uses  
						
						
						
						
					 
					
						2020-06-28 12:13:10 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						7e16afbdce 
					 
					
						
						
							
							Ignore documents which are not part of the candidates when exploring with A*  
						
						
						
						
					 
					
						2020-06-24 15:06:45 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						1c7a9a4132 
					 
					
						
						
							
							Remove the found documents from the candidates list  
						
						
						
						
					 
					
						2020-06-24 15:00:26 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						50169b9798 
					 
					
						
						
							
							Compute the full list of ids we are willing to find by attribute  
						
						
						
						
					 
					
						2020-06-24 14:48:04 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						374ec6773f 
					 
					
						
						
							
							Introduce a database to store all docids for a word and attribute  
						
						
						
						
					 
					
						2020-06-22 19:24:20 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						a044cb6cc8 
					 
					
						
						
							
							Clean up the warnings for prefix postings  
						
						
						
						
					 
					
						2020-06-22 18:10:31 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						ba3e805981 
					 
					
						
						
							
							Document the Index types and the internal LMDB databases  
						
						
						
						
					 
					
						2020-06-22 18:09:22 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						2f0e1afd16 
					 
					
						
						
							
							Introduce the roaring bitmap heed codec  
						
						
						
						
					 
					
						2020-06-22 17:56:07 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						8148210860 
					 
					
						
						
							
							Use the cache when retrieving the documents at the end  
						
						
						
						
					 
					
						2020-06-21 12:25:19 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						1628a31efa 
					 
					
						
						
							
							Cache the unions of the derived words positions  
						
						
						
						
					 
					
						2020-06-20 15:38:10 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						115e0142d9 
					 
					
						
						
							
							Add a feature flag to enable the export of stats  
						
						
						
						
					 
					
						2020-06-20 13:25:42 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						beb49b24f6 
					 
					
						
						
							
							Skip looking at connections for proximity 0  
						
						
						
						
					 
					
						2020-06-20 13:19:03 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						c84012d655 
					 
					
						
						
							
							Accept queries from standard input when not given as argument  
						
						
						
						
					 
					
						2020-06-20 12:01:15 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						55a8941922 
					 
					
						
						
							
							Optimize things  
						
						
						
						
					 
					
						2020-06-19 17:48:17 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						a3ca80d20d 
					 
					
						
						
							
							Ignore every proximity greater than or equal to 8  
						
						
						
						
					 
					
						2020-06-18 15:42:46 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						3577de04b8 
					 
					
						
						
							
							Reduce the number of KV lookups to the successful ones only  
						
						
						
						
					 
					
						2020-06-16 12:58:29 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						e974e6b3c9 
					 
					
						
						
							
							Acquire search intersections metrics  
						
						
						
						
					 
					
						2020-06-16 12:10:23 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						8db16ff306 
					 
					
						
						
							
							Add a cache to the contains_documents success function  
						
						
						
						
					 
					
						2020-06-14 13:39:39 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						a8cda248b4 
					 
					
						
						
							
							Introduce a customized A* algorithm.  
						
						
						
						
						This custom algo lazily computes the intersections between words, to avoid too many set operations and database reads. 
						
						
					 
					
						2020-06-14 12:51:57 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						69285b22d3 
					 
					
						
						
							
							Check that an edges combination contains results  
						
						
						
						
					 
					
						2020-06-13 11:16:02 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						b9cc6c10af 
					 
					
						
						
							
							Introduce a function to ignore useless paths  
						
						
						
						
					 
					
						2020-06-13 00:17:43 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						d02c5cb023 
					 
					
						
						
							
							Fix node skipping by computing the accumulated proximity  
						
						
						
						
					 
					
						2020-06-12 14:08:46 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						37a48489da 
					 
					
						
						
							
							Reworked the best proximity algo a little bit  
						
						
						
						
					 
					
						2020-06-12 12:53:08 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						302866ad73 
					 
					
						
						
							
							Make the algo work without an astar  
						
						
						
						
					 
					
						2020-06-11 17:43:06 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						0a83a86e65 
					 
					
						
						
							
							Fix multiple bugs  
						
						
						
						
					 
					
						2020-06-11 11:55:03 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						4e86ecf807 
					 
					
						
						
							
							Retrieve the words before the intersect loops  
						
						
						
						
					 
					
						2020-06-10 22:05:01 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						6ca3579cc0 
					 
					
						
						
							
							Add more time debug measurements  
						
						
						
						
					 
					
						2020-06-10 21:35:01 +02:00 
						 
				 
			
				
					
						
							
							
								Kerollmops 
							
						 
					 
					
						
						
							
						
						66a4b26811 
					 
					
						
						
							
							Introduce a proximity based documents retriever  
						
						
						
						
					 
					
						2020-06-10 16:54:28 +02:00