| 
							
							
								 Kerollmops | bb15f16d8c | Merge other databases content while writing into LMDB at the same time | 2020-10-05 16:35:10 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 9af946a306 | Merging the main, word docids and words pairs proximity docids in parallel | 2020-10-04 18:40:34 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 99705deb7d | Directly use a writer for the docid word positions | 2020-10-04 18:17:53 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 67577a3760 | It is an error to merge docid word positions | 2020-10-04 17:31:12 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | ce8e56ee18 | Rewrite the indexer to use one MTBL by database. This allows us to avoid prefixing keys and appending into LMDB databases | 2020-10-04 17:04:33 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 770f29fd05 | Bump the oxidized-mtbl dependency | 2020-10-04 17:04:33 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | acd2a63879 | Introduce a simple FST based chinese word segmenter | 2020-10-04 17:04:33 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 6cc6addc2f | Increase the CboRoaringBitmapCodec threshold | 2020-10-02 17:06:17 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | e41a3822a6 | Add a simple test for the CboRoaringBitmapCodec | 2020-10-02 16:52:36 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | c4b0c57059 | Reduce the default indexer max-memory parameter | 2020-10-02 16:47:41 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 007e647462 | Introduce the Mdfs Iterator that explores the proximity graph using a mana DFS | 2020-10-02 16:46:07 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | d4e80407e5 | Introduce the mana depth first search algorithm | 2020-10-02 16:46:07 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | f6a8096720 | Rename the quartile as percentiles 25th, 50th and 75th | 2020-10-02 16:46:07 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 891e0188dd | Introduce the database-stats infos subcommand | 2020-10-02 16:46:07 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 079742b4d3 | Clean up the stats and size of database infos subcommands | 2020-10-02 16:46:06 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | d0c73564b1 | Use the CboRoaringBitmapCodec for the word pair proximity docids | 2020-10-02 16:46:06 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 5a6a698e1d | Introduce the CboRoaringBitmapCodec | 2020-10-02 16:46:06 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 4eda149ffa | Rename the BoRoaringBitmap codec | 2020-10-02 16:46:06 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | ac84db2506 | Move the words pairs proximities average into the stats infos subcommand | 2020-10-02 16:46:06 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 30755e31e7 | Introduce the words pairs proximities stats info subcommand | 2020-10-02 16:46:06 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | bc35c9a598 | Introduce the size_of_database infos subcommand | 2020-10-02 16:46:05 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | c6b883289c | Remove the unused fetch_keywords function | 2020-09-30 15:41:23 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 58237bd67f | Introduce the average-number-of-document-by-word-pair-proximity infos subcommand | 2020-09-29 18:32:48 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 991be8950e | Rename the subcommand into average-number-of-positions-by-word-by-doc | 2020-09-29 18:15:44 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 54370e228a | Search for documents with longer proximities until we find enough | 2020-09-29 17:37:14 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | f277ea134f | Simplify some search function by reducing the number of parameters | 2020-09-29 16:08:58 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 68f4af7d2e | Improve the display of the number of processed documents | 2020-09-29 16:08:58 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 59a127d022 | Improve the indexing process. We now store the words pairs proximity in a cache and only compute the shortest proximity between pairs of words in a document. | 2020-09-29 15:09:18 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 6ddb3e722c | Depth-first search cache the docids unions | 2020-09-28 16:55:21 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | a3821a0b33 | Introduce the depth_first_search path resolution function | 2020-09-28 16:34:12 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 51c237f9d8 | Fix the benchmarks compilation | 2020-09-28 13:39:17 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | d8354f6f02 | Fix the word_docids capacity limit detection | 2020-09-27 11:52:05 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 25b2853b70 | Move the words pairs proximities compute into the write document function | 2020-09-23 15:02:40 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | ed05999f63 | Replace the arc cache by a simple linked hash map | 2020-09-23 14:50:52 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 4d22d80281 | Display only the key on heed error | 2020-09-23 14:13:51 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 5178b3d59d | Make the search system be aware of query words typos | 2020-09-23 12:01:39 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | b597a92487 | Add a default max-memory value to the indexer | 2020-09-23 12:00:36 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 1f6e00878d | Use the words pair proximities in the search algorithm | 2020-09-22 18:47:55 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 31224a8425 | Index the word pair proximities for both orders of the pair | 2020-09-22 14:49:22 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | a58ae5eb2a | Introduce the word-pair-proximities-docids infos subcommand | 2020-09-22 14:04:34 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | d6fa9c0414 | Index the intra documents word pair proximities | 2020-09-22 14:04:33 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 7b67ae6972 | Introduce the StrStrU8 heed codec | 2020-09-22 12:44:17 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | e34437b2d7 | Move the proximity function to a module | 2020-09-22 10:54:59 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 15208c7d3d | Simplify the indexer record loop | 2020-09-22 10:33:30 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | e5adfaade0 | Replace the token filter by a filter mapper | 2020-09-22 10:24:31 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | d21c80b865 | Apply the chunk compression parameters on all the MTBL writers | 2020-09-21 18:30:54 +02:00 |  | 
			
				
					| 
							
							
								 Clément Renault | 944df52e2a | Simplify the indexer main loop | 2020-09-21 14:59:48 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 3ded98e5fa | Bump the roaring version that fixes a deserialization bug | 2020-09-10 22:37:51 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | d5e5baa20f | Bump the oxidized-mtbl dependency | 2020-09-10 13:29:12 +02:00 |  | 
			
				
					| 
							
							
								 Kerollmops | 0fb086f241 | Use the crates.io roaring library | 2020-09-08 15:16:04 +02:00 |  |