Commit Graph

1856 Commits

Author SHA1 Message Date
a892a4a79c Introduce a function to extend from a JSON array of objects 2022-07-12 15:14:06 +02:00
dc61105554 Fix the nested document id fetching function 2022-07-12 15:14:06 +02:00
2eec290424 Check the validity of the latitute and longitude numbers 2022-07-12 15:14:06 +02:00
5d149d631f Remove tests for a function that no more exists 2022-07-12 15:14:06 +02:00
0bbcc7b180 Expose the DocumentId struct to be sure to inject the generated ids 2022-07-12 15:14:06 +02:00
d1a4da9812 Generate a real UUIDv4 when ids are auto-generated 2022-07-12 15:14:06 +02:00
c8ebf0de47 Rename the validate function as an enriching function 2022-07-12 15:14:06 +02:00
905af2a2e9 Use the primary key and external id in the transform 2022-07-12 15:14:05 +02:00
742543091e Constify the default primary key name 2022-07-12 14:55:52 +02:00
5f1bfb73ee Extract the primary key name and make it accessible 2022-07-12 14:55:52 +02:00
6a0a0ae94f Make the Transform read from an EnrichedDocumentsBatchReader 2022-07-12 14:55:52 +02:00
ea852200bb Fix the format used for a geo deleting benchmark 2022-07-12 14:55:52 +02:00
dc3f092d07 Do not leak an internal grenad Error 2022-07-12 14:55:52 +02:00
8ebf5eed0d Make the nested primary key work 2022-07-12 14:55:52 +02:00
19eb3b4708 Make sur that we do not accept floats as documents ids 2022-07-12 14:55:52 +02:00
2ceeb51c37 Support the auto-generated ids when validating documents 2022-07-12 14:55:51 +02:00
399eec5c01 Fix the indexation tests 2022-07-12 14:55:51 +02:00
fcfc4caf8c Move the Object type in the lib.rs file and use it everywhere 2022-07-12 14:55:51 +02:00
0146175fe6 Introduce the validate_documents_batch function 2022-07-12 14:55:51 +02:00
cefffde9af Improve the .gitignore of the fuzz crate 2022-07-12 14:55:51 +02:00
bdc4263883 Introduce the validate_documents_batch function 2022-07-12 14:55:51 +02:00
a97d4d63b9 Fix the benchmarks 2022-07-12 14:55:50 +02:00
f29114f94a Fix http-ui to fit with the new DocumentsBatchBuilder/Reader structs 2022-07-12 14:52:56 +02:00
a4ceef9624 Fix the cli for the new DocumentsBatchBuilder/Reader structs 2022-07-12 14:52:56 +02:00
6d0498df24 Fix the fuzz tests 2022-07-12 14:52:56 +02:00
e8297ad27e Fix the tests for the new DocumentsBatchBuilder/Reader 2022-07-12 14:52:56 +02:00
419ce3966c Rework the DocumentsBatchBuilder/Reader to use grenad 2022-07-12 14:52:55 +02:00
eb63af1f10 Update grenad to 0.4.2 2022-07-12 14:52:55 +02:00
048e174efb Do not allocate when parsing CSV headers 2022-07-12 14:52:55 +02:00
ce90fc628a Merge #583
583: Use BufReader to read datasets in benchmarks r=ManyTheFish a=loiclec

## What does this PR do?
Ensure that the datasets used by the benchmarks are read efficiently by using a `BufReader`.

## Why?
Using a `BufReader` is more representative of how `meilisearch` works. It will also make performance comparisons between different branches of `milli` more  accurate.




Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
2022-07-07 08:13:07 +00:00
aae03356cb Use BufReader to read datasets in benchmarks 2022-07-06 18:20:15 +02:00
ebddfdb9a3 Merge #578
578: Bump uuid to 1.1.2 r=ManyTheFish a=Kerollmops

Just to [align the version with Meilisearch](https://github.com/meilisearch/meilisearch/pull/2584).

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-07-05 14:56:08 +00:00
eeba196053 Merge #572
572: Add reindexing benchmarks r=Kerollmops a=irevoire

With #557 coming, we should add benchmarks that measure our impact on the reindexing process.

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-07-05 14:43:01 +00:00
1bfdcfc84f Bump uuid to 1.1.2 2022-07-05 16:23:36 +02:00
dd1e606f13 Merge #557
557: Fasten documents deletion and update r=Kerollmops a=irevoire

When a document deletion occurs, instead of deleting the document we mark it as deleted in the new “soft deleted” bitmap. It is then removed from the search and all the other endpoints.

I ran the benchmarks against main;
```
% ./compare.sh indexing_main_83ad1aaf.json indexing_fasten-document-deletion_abab51fb.json
group                                                                     indexing_fasten-document-deletion_abab51fb    indexing_main_83ad1aaf
-----                                                                     ------------------------------------------    ----------------------
indexing/-geo-delete-facetedNumber-facetedGeo-searchable-                 1.05      2.0±0.40ms        ? ?/sec           1.00  1904.9±190.00µs        ? ?/sec
indexing/-movies-delete-facetedString-facetedNumber-searchable-           1.00     10.3±2.64ms        ? ?/sec           961.61      9.9±0.12s        ? ?/sec
indexing/-movies-delete-facetedString-facetedNumber-searchable-nested-    1.00     15.1±3.90ms        ? ?/sec           554.63      8.4±0.12s        ? ?/sec
indexing/-songs-delete-facetedString-facetedNumber-searchable-            1.00     45.1±7.53ms        ? ?/sec           710.15     32.0±0.10s        ? ?/sec
indexing/-wiki-delete-searchable-                                         1.00    277.8±7.97ms        ? ?/sec           1946.57    540.8±3.15s        ? ?/sec
indexing/Indexing geo_point                                               1.00      12.0±0.20s        ? ?/sec           1.03      12.4±0.19s        ? ?/sec
indexing/Indexing movies in three batches                                 1.00      19.3±0.30s        ? ?/sec           1.01      19.4±0.16s        ? ?/sec
indexing/Indexing movies with default settings                            1.00      18.8±0.09s        ? ?/sec           1.00      18.9±0.10s        ? ?/sec
indexing/Indexing nested movies with default settings                     1.00      25.9±0.19s        ? ?/sec           1.00      25.9±0.12s        ? ?/sec
indexing/Indexing nested movies without any facets                        1.00      24.8±0.17s        ? ?/sec           1.00      24.8±0.18s        ? ?/sec
indexing/Indexing songs in three batches with default settings            1.00      65.9±0.96s        ? ?/sec           1.03      67.8±0.82s        ? ?/sec
indexing/Indexing songs with default settings                             1.00      58.8±1.11s        ? ?/sec           1.02      59.9±2.09s        ? ?/sec
indexing/Indexing songs without any facets                                1.00      53.4±0.72s        ? ?/sec           1.01      54.2±0.88s        ? ?/sec
indexing/Indexing songs without faceted numbers                           1.00      57.9±1.17s        ? ?/sec           1.01      58.3±1.20s        ? ?/sec
indexing/Indexing wiki                                                    1.00   1065.2±13.26s        ? ?/sec           1.00   1065.8±12.66s        ? ?/sec
indexing/Indexing wiki in three batches                                   1.00    1182.4±6.20s        ? ?/sec           1.01    1190.8±8.48s        ? ?/sec
```

Most things do not change, we lost 0.1ms on the indexing of geo point (I don’t get why), and then we are between 500 and 1900 times faster when we delete documents.


Co-authored-by: Tamo <tamo@meilisearch.com>
2022-07-05 14:14:38 +00:00
250be9fe6c put the threshold back to 10k 2022-07-05 15:57:44 +02:00
62692c171d Merge #577
577: Fix deserialisation of NDJson documents in benchmarks r=irevoire a=loiclec

Previously, the first document in the NDJson file was read over and over again. So the `geo_point` benchmark was not working properly: it only indexed one document.

Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
2022-07-05 13:54:47 +00:00
9bc7627e27 Fix deserialisation of NDJson documents in benchmarks 2022-07-05 15:51:06 +02:00
b61efd09fc Makes the internal soft deleted error a UserError 2022-07-05 15:34:45 +02:00
eaf28b0628 Apply review suggestions
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-07-05 15:30:33 +02:00
3b309f654a Fasten the document deletion
When a document deletion occurs, instead of deleting the document we mark it as deleted
in the new “soft deleted” bitmap. It is then removed from the search, and all the other
endpoints.
2022-07-05 15:30:33 +02:00
2700d8dc67 Add reindexing benchmarks 2022-07-05 14:46:46 +02:00
77c837fc1b Merge #575
575: Bump charabia r=loiclec a=irevoire

This fix #573

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-07-05 11:53:57 +00:00
446439e8be bump charabia 2022-07-05 12:19:30 +02:00
c6f4775fde Merge #568
568: Fix not equal filter when field contains both number and strings r=Kerollmops a=GraDKh

Related to https://github.com/meilisearch/meilisearch/issues/2516
Looks like the issue should be moved to this repo, but I'm not sure what the right procedure for it.

Co-authored-by: Dmytro Gordon <dmytro@bigstream.co>
2022-06-28 08:46:23 +00:00
3ff03a3f5f Fix not equal filter when field contains both number and strings 2022-06-27 15:55:17 +03:00
83ad1aaf05 Merge #567
567: Bump the milli version to 0.31.1 r=curquiza a=Kerollmops



Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-06-22 15:07:03 +00:00
cc48992e79 Bump the milli version to 0.31.1 2022-06-22 17:05:51 +02:00
68bb170732 Merge #566
566: Introduce the copy_to_path method on the Index r=irevoire a=Kerollmops

Meilisearch needs this method to do snapshots.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-06-22 14:52:19 +00:00
238692a8e7 Introduce the copy_to_path method on the Index 2022-06-22 16:49:47 +02:00