The tables below were generated by running `cargo run --bin embedder_settings`.
## List of the embedder settings
<table>
<thead>
<tr>
<th>Setting</th>
<th>Description</th>
<th>Type</th>
<th>Default Value</th>
<th>Regenerate on Change</th>
</tr>
</thead>
<tbody>
<tr>
<td>
`source`
</td>
<td>
The source used to provide the embeddings.
The value of this setting determines which embedder parameters are available and which are mandatory.
</td>
<td>
"openAi" | "huggingFace" | "userProvided" | "ollama" | "rest" | "composite"
</td>
<td>
"openAi"
</td>
<td>
🏗️ Always
</td>
</tr>
<tr>
<td>
`model`
</td>
<td>
The name of the model to use.
</td>
<td>
string
</td>
<td>
- For source `openAi`, defaults to "text-embedding-3-small"
- For source `huggingFace`, defaults to "BAAI/bge-base-en-v1.5"
</td>
<td>
🏗️ Always
</td>
</tr>
<tr>
<td>
`revision`
</td>
<td>
The revision (commit SHA1) of the model to use.
If unspecified, Meilisearch picks the latest revision of the model.
</td>
<td>
string
</td>
<td>
- When `model` is set to default, defaults to "617ca489d9e86b49b8167676d8220688b99db36e"
- Otherwise, defaults to `null`
</td>
<td>
🏗️ Always
</td>
</tr>
<tr>
<td>
`pooling`
</td>
<td>
The pooling method to use.
</td>
<td>
"useModel" | "forceCls" | "forceMean"
</td>
<td>
"useModel"
</td>
<td>
🏗️ Always
</td>
</tr>
<tr>
<td>
`apiKey`
</td>
<td>
The API key to pass to the remote embedder while making requests.
</td>
<td>
string
</td>
<td>
`null`
</td>
<td>
🌱 Never
</td>
</tr>
<tr>
<td>
`dimensions`
</td>
<td>
The expected dimensions of the embeddings produced by this embedder.
</td>
<td>
number
</td>
<td>
`null`
</td>
<td>
- 🏗️ When the source is `openAi`, changing the value of this parameter always regenerates embeddings
- 🌱 For other sources, changing the value of this parameter never regenerates embeddings
</td>
</tr>
<tr>
<td>
`documentTemplate`
</td>
<td>
A Liquid template used to render each document as text that can be embedded.
Meilisearch interpolates the template for each document and sends the resulting text to the embedder.
The embedder then generates document vectors based on this text.
</td>
<td>
string
</td>
<td>
{% for field in fields %}{% if field.is_searchable and field.value != nil %}{{ field.name }}: {{ field.value }}
{% endif %}{% endfor %}
</td>
<td>
- 🏗️ When modified, embeddings are regenerated for documents whose rendering through the template produces a different text.
</td>
</tr>
<tr>
<td>
`documentTemplateMaxBytes`
</td>
<td>
Rendered texts are truncated to this size, in bytes, before embedding.
</td>
<td>
number
</td>
<td>
400
</td>
<td>
- 🏗️ When increased, embeddings are regenerated for documents whose rendering through the template produces a different text.
- 🌱 When decreased, embeddings are never regenerated
</td>
</tr>
<tr>
<td>
`url`
</td>
<td>
URL to reach the remote embedder.
</td>
<td>
string
</td>
<td>
`null`
</td>
<td>
- 🌱 When modified for source `openAi`, embeddings are never regenerated
- 🏗️ When modified for sources `ollama` and `rest`, embeddings are always regenerated
</td>
</tr>
<tr>
<td>
`request`
</td>
<td>
Template request to send to the remote embedder.
</td>
<td>
any
</td>
<td>
`null`
</td>
<td>
🏗️ Always
</td>
</tr>
<tr>
<td>
`response`
</td>
<td>
Template response indicating how to find the embeddings in the response from the remote embedder.
</td>
<td>
any
</td>
<td>
`null`
</td>
<td>
🏗️ Always
</td>
</tr>
<tr>
<td>
`headers`
</td>
<td>
Additional headers to send to the remote embedder.
</td>
<td>
object
</td>
<td>
`null`
</td>
<td>
🌱 Never
</td>
</tr>
<tr>
<td>
`searchEmbedder`
</td>
<td>
Embedder settings for the embedder used at search time.
</td>
<td>
object
</td>
<td>
`null`
</td>
<td>
🌱 Never
</td>
</tr>
<tr>
<td>
`indexingEmbedder`
</td>
<td>
Embedder settings for the embedder used at indexing time.
</td>
<td>
object
</td>
<td>
`null`
</td>
<td>
- Embeddings are regenerated when the settings modified in the indexing embedder require regeneration.
</td>
</tr>
<tr>
<td>
`distribution`
</td>
<td>
Affine transformation applied to the semantic score to make it more comparable to the ranking score.
</td>
<td>
object
</td>
<td>
`null`
</td>
<td>
🌱 Never
</td>
</tr>
<tr>
<td>
`binaryQuantized`
</td>
<td>
Whether to binary quantize the embeddings of this embedder.
Binary quantized embeddings are smaller than regular embeddings, which reduces
disk usage and speeds up retrieval, at the cost of relevancy.
</td>
<td>
boolean
</td>
<td>
`false`
</td>
<td>
- Embeddings are not regenerated, but the binary quantization takes time during indexing.
</td>
</tr>
</tbody>
</table>
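
The "Regenerate on Change" column can be read as a small decision function. The sketch below is only an illustration of the table above, not Meilisearch's implementation: the function name `regeneration` and the return labels (`"always"`, `"never"`, `"changed-documents"`, `"nested"`) are hypothetical names chosen for this example.

```python
from typing import Optional

# Settings whose change always regenerates embeddings, per the table above.
ALWAYS = {"source", "model", "revision", "pooling", "request", "response"}
# Settings whose change never regenerates embeddings, per the table above.
# binaryQuantized is listed here because embeddings are not regenerated,
# even though quantization itself takes time during indexing.
NEVER = {"apiKey", "headers", "searchEmbedder", "distribution", "binaryQuantized"}


def regeneration(setting: str, source: str,
                 old: Optional[int] = None, new: Optional[int] = None) -> str:
    """Return a hypothetical label describing the effect of changing `setting`."""
    if setting in ALWAYS:
        return "always"
    if setting in NEVER:
        return "never"
    if setting == "dimensions":
        return "always" if source == "openAi" else "never"
    if setting == "url":
        if source == "openAi":
            return "never"
        if source in ("ollama", "rest"):
            return "always"
        raise ValueError(f"`url` is not documented for source {source!r}")
    if setting == "documentTemplate":
        # Only documents whose rendered text differs are re-embedded.
        return "changed-documents"
    if setting == "documentTemplateMaxBytes":
        if old is None or new is None:
            raise ValueError("old and new sizes are required for this setting")
        # Increasing the limit may change rendered texts; decreasing never regenerates.
        return "changed-documents" if new > old else "never"
    if setting == "indexingEmbedder":
        # Depends on which settings changed inside the indexing embedder.
        return "nested"
    raise ValueError(f"unknown setting {setting!r}")
```

For instance, `regeneration("dimensions", "openAi")` yields `"always"`, while `regeneration("documentTemplateMaxBytes", "rest", 800, 400)` yields `"never"`.
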
## Availability of the settings depending on the selected source
<table>
<thead>
<tr>
<th>Setting</th>
<th>openAi</th>
<th>huggingFace</th>
<th>ollama</th>
<th>userProvided</th>
<th>rest</th>
<th>composite</th>
</tr>
</thead>
<tbody>
<tr>
<td>
`model`
</td>
<td>
✅ Allowed
</td>
<td>
✅ Allowed
</td>
<td>
🔐 **Mandatory**
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`revision`
</td>
<td>
🚫 Disallowed
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`pooling`
</td>
<td>
🚫 Disallowed
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`apiKey`
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`dimensions`
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
<td>
✅ Allowed
</td>
<td>
🔐 **Mandatory**
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`documentTemplate`
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, ✅ Allowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, ✅ Allowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, ✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, ✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`documentTemplateMaxBytes`
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, ✅ Allowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, ✅ Allowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, ✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, ✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`url`
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🔐 **Mandatory**
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`request`
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🔐 **Mandatory**
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`response`
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🔐 **Mandatory**
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`headers`
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
✅ Allowed
</td>
<td>
🚫 Disallowed
</td>
</tr>
<tr>
<td>
`searchEmbedder`
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🔐 **Mandatory**
</td>
</tr>
<tr>
<td>
`indexingEmbedder`
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🚫 Disallowed
</td>
<td>
🔐 **Mandatory**
</td>
</tr>
<tr>
<td>
`distribution`
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
</tr>
<tr>
<td>
`binaryQuantized`
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
<td>
- Usually, ✅ Allowed
- When used in `searchEmbedder` in a `composite` embedder, 🚫 Disallowed
- When used in `indexingEmbedder` in a `composite` embedder, 🚫 Disallowed
</td>
</tr>
</tbody>
</table>
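
As a rough illustration of how the availability table can be applied, the following sketch checks a top-level embedder configuration against it. It is not Meilisearch's validation code: the names `ALLOWED`, `MANDATORY`, and `check` are hypothetical, and the sketch covers only the "Usually" case, not the extra restrictions that apply inside `searchEmbedder` and `indexingEmbedder`.

```python
# Settings that every source usually allows, per the table above.
COMMON = {"distribution", "binaryQuantized"}
TEMPLATE = {"documentTemplate", "documentTemplateMaxBytes"}

# Allowed settings per source (mandatory ones included), transcribed from the table.
ALLOWED = {
    "openAi": {"model", "apiKey", "dimensions", "url"} | TEMPLATE | COMMON,
    "huggingFace": {"model", "revision", "pooling"} | TEMPLATE | COMMON,
    "ollama": {"model", "apiKey", "dimensions", "url"} | TEMPLATE | COMMON,
    "userProvided": {"dimensions"} | COMMON,
    "rest": {"apiKey", "dimensions", "url", "request", "response", "headers"}
            | TEMPLATE | COMMON,
    "composite": {"searchEmbedder", "indexingEmbedder"} | COMMON,
}

# Mandatory settings per source, transcribed from the table.
MANDATORY = {
    "ollama": {"model"},
    "userProvided": {"dimensions"},
    "rest": {"url", "request", "response"},
    "composite": {"searchEmbedder", "indexingEmbedder"},
}


def check(settings: dict) -> list:
    """Return a list of problems found in a top-level embedder configuration."""
    # `source` defaults to "openAi", as stated in the settings table.
    source = settings.get("source", "openAi")
    if source not in ALLOWED:
        return [f"unknown source: {source}"]
    keys = set(settings) - {"source"}
    problems = [f"missing mandatory setting: {k}"
                for k in sorted(MANDATORY.get(source, set()) - keys)]
    problems += [f"setting not allowed for {source}: {k}"
                 for k in sorted(keys - ALLOWED[source])]
    return problems
```

A `userProvided` embedder without `dimensions` is reported as incomplete, and `apiKey` on a `huggingFace` embedder is reported as not allowed:

```python
print(check({"source": "userProvided"}))
print(check({"source": "huggingFace", "apiKey": "placeholder"}))
```
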