40 Commits

SHA1 Message Date
e2bc054604 Update extract_facet_string_docids to support deladd obkvs 2023-10-30 11:32:36 +01:00
fcd3a1434d Update extract_facet_number_docids to support deladd obkvs 2023-10-30 11:31:04 +01:00
313b16bec2 Support diff indexing on extract_docid_word_positions 2023-10-30 11:24:19 +01:00
1dd97578a8 Make the transform struct return diff-based documents obkvs 2023-10-30 11:22:07 +01:00
1c5705c164 clean PR warnings 2023-10-30 11:22:05 +01:00
db1ca21231 add puffin in sorter into reeder function 2023-10-30 11:15:00 +01:00
17b647dfe5 Wip 2023-10-30 11:13:08 +01:00
c0f2724c2d get rids of the new introduced error code in favor of an io::Error 2023-10-10 15:12:23 +02:00
d772073dfa use a bufreader everytime there is a grenad<file> 2023-10-10 15:00:30 +02:00
b45c36cd71 Merge branch 'main' into tmp-release-v1.3.0 2023-08-01 15:05:17 +02:00
df528b41d8 Normalize for the search the facets values 2023-07-20 17:57:07 +02:00
eef95de30e First iteration on exposing puffin profiling 2023-07-18 17:38:13 +02:00
530a3e2df3 fix some typos
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-06-22 21:59:00 +08:00
8628a0c856 Remove docid_word_positions_db + fix deletion bug
That would happen when a word was deleted from all exact attributes
but not all regular attributes.
2023-06-07 10:52:50 +02:00
895ab2906c apply review suggestions 2023-02-16 18:42:47 +01:00
93f130a400 fix all warnings 2023-02-08 20:57:35 +01:00
89675e5f15 clippy: Replace seek 0 by rewind 2023-01-31 09:32:40 +01:00
8d0ace2d64 Avoid creating a MatchingWord for words that exceed the length limit 2022-11-28 10:20:13 +01:00
ac3baafbe8 Truncate facet values that are too long before indexing them 2022-11-17 11:29:42 +01:00
51961e1064 Polish some details 2022-10-26 13:47:04 +02:00
9026867d17 Give same interface to bulk and incremental facet indexing types
+ cargo fmt, oops, sorry for the bad history :(
2022-10-26 13:47:04 +02:00
c3f49f766d Prepare refactor of facets database
2022-10-26 13:46:14 +02:00
beb987d3d1 Fixing piles of clippy errors.
Most of these are calling clone when the struct supports Copy.

Many are using & and &mut on `self` when the function they are called
from already has an immutable or mutable borrow so this isn't needed.

I tried to stay away from actual changes or places where I'd have to
name fresh variables.
2022-10-13 22:02:54 +02:00
3794962330 Use an unstable algorithm for grenad::Sorter when possible 2022-09-13 14:49:53 +02:00
fe3973a51c Make sure that long words are correctly skipped 2022-09-07 15:03:32 +02:00
b799f3326b rename merge_nothing to merge_ignore_values 2022-04-05 18:44:35 +02:00
0a77be4ec0 introduce exact_word_docids db 2022-04-04 20:54:02 +02:00
5f9f82757d refactor spawn_extraction_task 2022-04-04 20:54:02 +02:00
04b1bbf932 Reintroduce appending sorted entries when possible 2022-02-24 14:50:45 +01:00
ff8d7a810d Change the behavior of the as_cloneable_grenad by taking a ref 2022-02-16 15:40:08 +01:00
f367cc2e75 Finally bump grenad to v0.4.1 2022-02-16 15:28:48 +01:00
51d1e64b23 Remove, now useless, the WriteMethod enum 2022-01-27 10:08:35 +01:00
d59e559317 Fix the computation of the newly added and common prefix words 2022-01-27 10:08:34 +01:00
5404bc02dd Move the fst_stream_into_hashset method in the helper methods 2022-01-27 10:06:00 +01:00
2dfe24f067 memmap -> memmap2 2021-10-10 22:47:12 +01:00
d18ee58ab9 Check if key are not empty in validator 2021-09-08 15:25:23 +02:00
741a4444a9 Remove log in chunk generator 2021-09-02 16:57:46 +02:00
db0c681bae Fix Pr comments 2021-09-02 15:17:52 +02:00
9452fabfb2 Optimize cbo roaring bitmaps merge 2021-09-01 16:48:40 +02:00
1d314328f0 Plug new indexer 2021-09-01 16:48:36 +02:00