Commit Graph

119 Commits

Author SHA1 Message Date
Louis Dureuil
90e6b6416f new extractor bugfixes:
- fix old_has_fragments
- new_is_user_provided is always false when generating fragments,
  even if no fragment ever matches
2025-07-03 14:35:02 +02:00
Louis Dureuil
735634e998 Send owned metadata and clear inputs in case of error 2025-07-03 10:32:57 +02:00
Louis Dureuil
a06cb1bfd6 Remove Embed::process_embeddings and have it be an inherent function of the type that uses it 2025-07-03 10:02:16 +02:00
Louis Dureuil
b086c51a23 new settings indexer 2025-07-02 00:05:13 +02:00
Louis Dureuil
f8232976ed Implement in new document indexer 2025-07-02 00:05:12 +02:00
ManyTheFish
e414284335 Clippy too many arguments 2025-06-30 14:25:28 +02:00
ManyTheFish
7a204609fe Move document context and identifiers in document.rs 2025-06-30 14:21:46 +02:00
ManyTheFish
6db5939f84 Re-integrate embedder stats 2025-06-30 09:52:06 +02:00
ManyTheFish
d35b2d8d33 minor fixes 2025-06-30 09:52:06 +02:00
ManyTheFish
0687cf058a Avoid rewritting documents that don't change
Ensure being on a reindex action before getting embedder_category_id

Fix document skip function
2025-06-30 09:52:06 +02:00
ManyTheFish
77802dabf6 rename DocumentChangeContext into DocumentContext 2025-06-26 18:14:48 +02:00
ManyTheFish
900be0ccad Extract or regenerate vectors related to settings changes 2025-06-26 18:14:48 +02:00
ManyTheFish
51a087b764 Write back user provided vectors from deleted embedders 2025-06-26 18:14:48 +02:00
Mubelotix
29f6eeff8f Remove lots of Arcs 2025-06-26 12:15:08 +02:00
Mubelotix
d08e89ea3d Remove options 2025-06-24 15:10:15 +02:00
Mubelotix
695877043a Fix warnings 2025-06-24 14:53:39 +02:00
Mubelotix
4925b30196 Move embedder stats out of progress 2025-06-23 15:24:14 +02:00
Mubelotix
4cadc8113b Add embedder stats in batches 2025-06-20 12:42:22 +02:00
Clément Renault
abb399b802 Merge pull request #5674 from meilisearch/release-v1.15.2
Bring back v1.15.2 to main
2025-06-16 11:36:07 +00:00
Louis Dureuil
7200437246 Comment the cases 2025-06-12 15:55:52 +02:00
Louis Dureuil
68e7bfb37f Don't fail if you cannot render previous version 2025-06-12 15:55:33 +02:00
Louis Dureuil
209c4bfc18 Switch the versions of the documents for rendering :/ 2025-06-12 15:47:47 +02:00
Louis Dureuil
396d76046d Regenerate embeddings more often:
- When `regenerate` was previously `false` and became `true`
- When rendering the old version of the docs failed
2025-06-12 15:41:53 +02:00
Clément Renault
9bda9a9a64 Merge remote-tracking branch 'origin/main' into tmp-release-v1.15.1 2025-06-12 10:21:07 +02:00
Clément Renault
82313a4444 Cargo fmt 2025-06-03 15:39:26 +02:00
nnethercott
1811168b96 remove duplicated check on geo field changes 2025-05-28 15:45:13 +02:00
Nate Nethercott
b06cc1e0a2 Update crates/milli/src/update/new/extract/faceted/extract_facets.rs
Co-authored-by: Many the fish <many@meilisearch.com>
2025-05-28 15:38:23 +02:00
Nate Nethercott
44f812c36d Update crates/milli/src/update/new/extract/faceted/extract_facets.rs
Co-authored-by: Many the fish <many@meilisearch.com>
2025-05-28 15:38:12 +02:00
nnethercott
9ad43b6841 rename has_changed to has_changed_for_facets 2025-05-26 18:37:20 +02:00
nnethercott
c9ec502ed9 refactor for readability 2025-05-26 18:32:59 +02:00
nnethercott
18aed75d3b fix logic 2025-05-26 18:20:55 +02:00
nnethercott
6738a4f6ee feat: mettre a jour the insta snapshots 2025-05-26 16:36:36 +02:00
nnethercott
95821d0bde refactor: update macro 2025-05-26 10:07:13 +02:00
nnethercott
f690fa0686 feat: add macro_rules to factorize 2025-05-26 09:46:14 +02:00
nnethercott
24e94b28c1 feat: uncouple geo extraction from full doc 2025-05-26 09:22:20 +02:00
ManyTheFish
63a4dfa2a8 Add disableOnNumber setting 2025-04-23 16:57:50 +02:00
Tamo
b025f1bcf1 Merge branch 'main' into release-v1.14.0-tmp 2025-04-14 12:35:47 +02:00
Clément Renault
a0bfcf8872 Make cargo fmt happy 2025-04-01 11:27:41 +02:00
Clément Renault
64477aac60 Box the large GeoError error variant 2025-04-01 11:26:34 +02:00
Clément Renault
4d90e3d2ec Make Cargo and Clippy happy 2025-04-01 11:26:34 +02:00
Louis Dureuil
f729864466 Check dimension mismatch at insertion time 2025-03-31 15:27:49 +02:00
ManyTheFish
d3cd5ea689 Check if the geo fields changed additionally to the other faceted fields when reindexing facets 2025-03-12 11:20:10 +01:00
ManyTheFish
8790880589 Fix clippy 2025-03-11 15:22:39 +01:00
ManyTheFish
6d52c6e711 Merge branch 'main' into granular-filterable-attributes 2025-03-11 10:05:58 +01:00
ManyTheFish
689e69d6d2 Take into account PR messages 2025-03-10 13:46:33 +01:00
meili-bors[bot]
3fd86e8d76 Merge #5371
Some checks failed
Test suite / Tests on ubuntu-22.04 (push) Failing after 12s
Test suite / Tests almost all features (push) Has been skipped
Test suite / Test with Ollama (push) Failing after 8s
Test suite / Test disabled tokenization (push) Has been skipped
Test suite / Run tests in debug (push) Failing after 11s
Test suite / Run Clippy (push) Successful in 7m1s
Test suite / Run Rustfmt (push) Successful in 2m44s
Run the indexing fuzzer / Setup the action (push) Successful in 1h5m40s
Indexing bench (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of indexing (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of search for geo (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of search for songs (push) / Run and upload benchmarks (push) Has been cancelled
Benchmarks of search for Wikipedia articles (push) / Run and upload benchmarks (push) Has been cancelled
Test suite / Tests on macos-13 (push) Has been cancelled
Test suite / Tests on windows-2022 (push) Has been cancelled
5371: Composite embedders r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #5343 

## What does this PR do?
- Implement [public usage](https://www.notion.so/meilisearch/Composite-embedder-usage-14a4b06b651f81859dc3df21e8cd02a0)
- Refactor the way we check if a parameter is mandatory/allowed/disallowed for a given source
- Take the "nesting context" into account for computer if a parameter is mandatory/allowed/disallowed
- Add tests checking all parameters with all sources, and made sure the results didn't change compared with v1.13

## Dumpless Upgrade

- This adds a new value for an existing parameter => compatible without change
- This adds new optional parameters => compatible without change

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-03-05 17:18:11 +00:00
ManyTheFish
ae8d453868 Refactor Document indexing process (searchables)
**Changes:**
The searchable database extraction is now relying on the AttributePatterns and FieldIdMapWithMetadata to match the field to extract.
Remove the SearchableExtractor trait to make the code less complex.

**Impact:**
- Document Addition/modification searchable indexing
- Document deletion searchable indexing
2025-03-03 10:32:42 +01:00
ManyTheFish
95bccaf5f5 Refactor Document indexing process (Facets)
**Changes:**
The Documents changes now take a selector closure instead of a list of field to match the field to extract.
The seek_leaf_values_in_object function now uses a selector closure of a list of field to match the field to extract
The facet database extraction is now relying on the FilterableAttributesRule to match the field to extract.
The facet-search database extraction is now relying on the FieldIdMapWithMetadata to select the field to index.
The facet level database extraction is now relying on the FieldIdMapWithMetadata to select the field to index.

**Important:**
Because the filterable attributes are patterns now,
the fieldIdMap will only register the fields that exists in at least one document.
if a field doesn't exist in any document, it will not be registered even if it has been specified in the filterable fields.

**Impact:**
- Document Addition/modification facet indexing
- Document deletion facet indexing
2025-03-03 10:32:03 +01:00
ManyTheFish
9f3663e768 Implement Incremental document database stats computing 2025-02-26 17:01:35 +01:00
Louis Dureuil
4a2643daa2 Rename embed_one to embed_search and embed_chunks* to embed_index* 2025-02-24 13:58:26 +01:00