119 Commits

SHA1 Message Date
fae694a102 Put the documents into an MTBL database 2020-08-07 12:14:40 +02:00
d5a356902a Update oxidized-mtbl 2020-08-07 12:14:03 +02:00
405a71d3a4 Accept csv from stdin 2020-08-06 13:38:21 +02:00
d3b1096510 Compute the word attribute postings lists on each thread 2020-08-06 11:50:27 +02:00
8d734941af Clean up some lines 2020-08-06 10:20:26 +02:00
a4e3c7c37c Force the Papa parse delimiter 2020-08-05 14:11:46 +02:00
6508d497ce Replace the regex highlighting by a simple algorithm 2020-08-05 13:52:27 +02:00
4873abe145 Introduce option flags to toggle the indexing engine 2020-08-05 12:10:41 +02:00
bd4b18541c Introduce a new indexer which uses an MTBL sorter 2020-08-04 15:44:37 +02:00
3f21760d56 Update README.md 2020-08-04 15:40:37 +02:00
bc3a0ac6a3 Display the milli logo and update the description 2020-08-04 15:40:02 +02:00
d7d8f38fb7 Update bulma to spread the logo more 2020-07-16 23:45:02 +02:00
ee305c9284 Replace the title by the milli logo 2020-07-15 23:55:28 +02:00
9ade00e27b Highlight all the matching words 2020-07-14 11:53:21 +02:00
085c376655 Use the regex crate to highlight "hello" 2020-07-14 11:28:40 +02:00
dd385ad05b Customize the mark tag css 2020-07-14 11:03:21 +02:00
aa92311d4e Add a dark theme to the dashboard 2020-07-13 23:51:41 +02:00
3d144e62c4 Search for best proximities in multiple attributes 2020-07-13 19:06:56 +02:00
576dd011a1 Compute the candidates but not by attribute 2020-07-13 18:16:05 +02:00
6b14b20369 Introduce a method to retrieve the number of attributes of the documents 2020-07-13 17:50:16 +02:00
54afec58a3 Add a fade in/out animation while the server is processing 2020-07-12 11:34:48 +02:00
92c2b1dd2d Refine the help message of the binaries 2020-07-12 11:06:45 +02:00
f757df5dfd Introduce the stderr logger to the project 2020-07-12 11:04:35 +02:00
12358476da Use the log crate instead of stderr 2020-07-12 10:55:09 +02:00
2c62eeea3c Rename the project milli 2020-07-12 00:16:41 +02:00
d31da26a51 Avoid cloning RoaringBitmaps when unnecessary 2020-07-11 23:51:32 +02:00
b8a1fc0126 Clean up the CSS style custom bulma rules 2020-07-11 14:51:59 +02:00
f6eae91c7d Pretty print the new dashboard numbers 2020-07-11 14:17:37 +02:00
d44428fa90 Display more information on the dashboard 2020-07-11 11:51:56 +02:00
11c7fef80a Implement a memory dumper 2020-07-07 16:48:49 +02:00
    It moves the in memory HashMaps used when indexing to a disk based MTBL file
b12bfcb03b Reduce the depth of the word position document ids 2020-07-07 12:30:05 +02:00
    This helps reduce the number of allocations.
7178b6c2c4 First basic version using MTBL again 2020-07-07 11:32:33 +02:00
45d0d7c3d4 Clean up the README 2020-07-06 17:38:22 +02:00
adb1038b26 Add a jobs parameter to set the number of threads the indexer uses 2020-07-06 12:17:17 +02:00
2a3b03138b Use heed 0.8.1 with the RwIter append method 2020-07-05 19:50:28 +02:00
ec1023e790 Intersect document ids by inverse popularity of the words 2020-07-05 19:33:51 +02:00
    This reduces our worst request ("the best of the do") from 56s to 3s.
cd7e64b2b3 Allow users to set the arc cache size when indexing 2020-07-04 18:12:41 +02:00
ac8353a64f Merge pre-computed word attribute documents ids 2020-07-04 17:02:27 +02:00
fea7cac206 Display the time it took to compute the word attribute documents ids 2020-07-04 15:18:38 +02:00
46ced5c828 Introduce the RwIter append heed API 2020-07-04 12:34:10 +02:00
7e7440c431 Finalize the LMDB indexing design 2020-07-01 22:45:43 +02:00
2ae3f40971 Make the indexer ignore certain words 2020-07-01 17:49:46 +02:00
    This prepares for making the indexing fully parallel: each thread's
    indexer is only aware of certain words, which avoids postings list
    conflicts for each word.
a3ac2623d5 Introduce multiple functions to clean up the code 2020-07-01 17:24:55 +02:00
ac5cc7ddad Introduce an Iterator yielding owned entries for the LruCache 2020-07-01 17:21:52 +02:00
014a25697d Use only one ARC cache based on the words 2020-07-01 12:03:18 +02:00
fc4013a43f Fix the ARC cache 2020-07-01 10:35:07 +02:00
2fcae719ad Use another LRU impl which uses hashbrown 2020-06-29 22:26:06 +02:00
f98b615bf3 Replace the LRU by an Arc cache 2020-06-29 20:48:57 +02:00
07abebfc46 Introduce a (too big) LRU cache 2020-06-29 18:15:03 +02:00
5f0088594b Index by writing directly into LMDB 2020-06-29 13:54:47 +02:00