Commit Graph

760 Commits

Author SHA1 Message Date
Tamo
b906e3ed70 improve the way we access the mutex 2024-11-20 10:51:06 +01:00
Tamo
4abcd9c04e add some stats on the batches 2024-11-20 10:51:06 +01:00
Tamo
229fa0f902 implements the batch details 2024-11-20 10:51:06 +01:00
Tamo
5d10c2312b remove unused file 2024-11-20 10:51:06 +01:00
Tamo
f1d38581e5 add the front end tests on the batches routes 2024-11-20 10:51:06 +01:00
Tamo
62646af7b9 implements the automatic batch deletion 2024-11-20 10:51:06 +01:00
Tamo
1fcb9526f5 fix the task cancelation 2024-11-20 10:51:06 +01:00
Tamo
15eefa4fcc fixes a lot of small issues, the test about the cancellation is still failing 2024-11-20 10:51:05 +01:00
Tamo
ad9763ffcd copy multiple task query tests to batches. Currently, they fail 2024-11-20 10:49:25 +01:00
Tamo
d489f5635f add the mapping between the task and batches 2024-11-20 10:49:23 +01:00
Tamo
a1251c3c83 Implements the get all batches route with filters working 2024-11-20 10:42:55 +01:00
Tamo
6062914654 add the batch_id to the tasks 2024-11-20 10:42:54 +01:00
Lukas Kalbertodt
057fcb3993 Add indices field to _matchesPosition to specify where in an array a match comes from (#5005)
* Remove unreachable code

* Add `indices` field to `MatchBounds`

For matches inside arrays, this field holds the indices of the array
elements that matched. For example, searching for `cat` inside
`{ "a": ["dog", "cat", "fox"] }` would return `indices: [1]`. For nested
arrays, this contains multiple indices, starting with the one for the
top-most array. For matches in fields without arrays, `indices` is not
serialized (does not exist) to save space.
2024-11-20 01:00:43 +01:00
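
The `indices` behavior described in commit 057fcb3993 can be illustrated with a minimal Python sketch. This is not the project's actual implementation: the function names `match_indices` and `matches_position` are hypothetical, plain substring matching stands in for real tokenization, and any other fields of `MatchBounds` are left out. It only shows how an index path is built for nested arrays, top-most array first, and how the `indices` key is omitted for fields without arrays.

```python
from typing import Any, Dict, Iterator, List, Tuple


def match_indices(value: Any, query: str, path: Tuple[int, ...] = ()) -> Iterator[List[int]]:
    """Yield the array-index path of every string leaf that contains `query`.

    Hypothetical helper: the path starts with the index in the top-most array.
    """
    if isinstance(value, str):
        if query in value:
            yield list(path)
    elif isinstance(value, list):
        for i, item in enumerate(value):
            # Extend the path with this element's index and recurse.
            yield from match_indices(item, query, path + (i,))


def matches_position(document: Dict[str, Any], query: str) -> Dict[str, List[Dict[str, List[int]]]]:
    """Build a simplified per-field match report (assumed shape, for illustration only)."""
    result: Dict[str, List[Dict[str, List[int]]]] = {}
    for field, value in document.items():
        entries = []
        for indices in match_indices(value, query):
            entry: Dict[str, List[int]] = {}
            if indices:
                # Only matches inside arrays carry `indices`;
                # for plain fields the key is omitted entirely.
                entry["indices"] = indices
            entries.append(entry)
        if entries:
            result[field] = entries
    return result


flat = matches_position({"a": ["dog", "cat", "fox"]}, "cat")
nested = matches_position({"a": [["dog"], ["fox", "cat"]]}, "cat")
plain = matches_position({"title": "cat"}, "cat")
```

Under these assumptions, `flat` holds `indices: [1]` for field `a`, `nested` holds the two-element path `[1, 1]`, and the single entry in `plain` has no `indices` key at all.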
ManyTheFish
41dbdd2d18 Fix filtered_placeholder_search_should_not_return_deleted_documents and word_scale_set_and_reset 2024-11-19 16:08:25 +01:00
Louis Dureuil
bfefaf71c2 Progress displayed in logs 2024-11-19 09:32:52 +01:00
Louis Dureuil
c782c09208 Move step to a dedicated mod and replace it with an enum 2024-11-18 18:22:13 +01:00
Louis Dureuil
75943a5a9b Add TODO to remember replacing steps with an enum 2024-11-18 17:40:51 +01:00
Louis Dureuil
04c38220ca Move MostlySend, ThreadLocal, FullySend to their own commit 2024-11-18 16:43:05 +01:00
Louis Dureuil
5f93651cef fixes 2024-11-18 16:23:11 +01:00
ManyTheFish
510ca99996 Fixes #4974 2024-11-18 16:08:55 +01:00
ManyTheFish
8924d486db Add a test reproducing the bug 2024-11-18 16:08:55 +01:00
ManyTheFish
e0c3f3d560 Fix #4984 2024-11-18 16:08:53 +01:00
Louis Dureuil
0a21d9bfb3 Fix double borrow of new fields id map 2024-11-18 15:56:01 +01:00
Louis Dureuil
1f8b01a598 Fix snap since _vectors is no longer part of the field distributions 2024-11-18 12:50:59 +01:00
Louis Dureuil
e736a74729 Remove infinite loop in import_vectors 2024-11-18 12:50:56 +01:00
Louis Dureuil
e9d17136b2 Add deadline of 3 seconds to embedding requests made in the context of hybrid search 2024-11-18 12:15:11 +01:00
Louis Dureuil
a05e448cf8 Add test 2024-11-18 12:15:11 +01:00
ManyTheFish
cd796b0f4b Fix SDK test 2024-11-18 11:46:00 +01:00
Louis Dureuil
6570da3bcb Retry in case where the JSON deserialization fails 2024-11-18 11:33:09 +01:00
Clément Renault
5b4c06c24c Plug the grenad max memory parameter 2024-11-18 11:28:04 +01:00
Louis Dureuil
3a8051866a Use return_keyword_results function instead of returning raw keyword results when the embedder is broken 2024-11-18 11:17:15 +01:00
Louis Dureuil
9150c8f052 Accept changes to vector format 2024-11-18 11:04:57 +01:00
Louis Dureuil
c202f3dbe2 fix tests and revert change in behavior when primary_key_from_op != primary_key_from_db && index.is_empty() 2024-11-18 10:59:05 +01:00
Clément Renault
677d7293f5 Fix a lot of primary key related tests 2024-11-18 10:59:05 +01:00
Clément Renault
bd31ea2174 Check for at least one valid task after setting their statuses 2024-11-18 10:59:05 +01:00
Clément Renault
83865d2ebd Expose intermediate errors when processing batches 2024-11-18 10:59:05 +01:00
ManyTheFish
72ba353498 reproduce sdk fail 2024-11-18 10:03:23 +01:00
ManyTheFish
4ff2b3c2ee Fix test on locales 2024-11-14 15:45:04 +01:00
ManyTheFish
91c58cfa38 Fix positional databases 2024-11-14 11:40:12 +01:00
Clément Renault
9e8367f1e6 Move the rayon thread pool outside the extract method 2024-11-14 10:40:32 +01:00
ManyTheFish
0dd321afc7 reproduce #4984 2024-11-14 10:02:51 +01:00
Louis Dureuil
0e3c5d91ab Document deletion test passes 2024-11-14 08:42:56 +01:00
Louis Dureuil
695c2c6b99 Cosmetic fix 2024-11-14 08:42:39 +01:00
Louis Dureuil
40dd25d6b2 Fix issue with Replace document method when adding and deleting a document in the same batch 2024-11-13 22:10:00 +01:00
Clément Renault
8e5b1a3ec1 Compute the field distribution and convert _geo into an f64s 2024-11-13 17:44:05 +01:00
ManyTheFish
e627e182ce Fix facet strings 2024-11-13 17:43:02 +01:00
ManyTheFish
51b6293738 Add linear facet databases 2024-11-13 17:43:02 +01:00
Clément Renault
b17896d899 Finalize the GeoExtractor 2024-11-13 17:43:02 +01:00
Louis Dureuil
a01bc7b454 Fix error_document_field_limit_reached_in_one_document test 2024-11-13 10:34:54 +01:00
Louis Dureuil
7accfea624 Don't short circuit when we encounter a semantic error while extracting fields and external docid 2024-11-13 10:33:59 +01:00