Compare commits

36 Commits

Author SHA1 Message Date
Clément Renault
6c1739218c Settings in queue 2025-05-21 18:34:11 +02:00
Clément Renault
72d4998dce Correctly list the chat settings key actions 2025-05-21 16:24:51 +02:00
Clément Renault
fde11573da Always use the frequency matching strategy 2025-05-21 16:18:37 +02:00
Clément Renault
41220f786b Remove templating validation 2025-05-21 16:10:31 +02:00
Clément Renault
4d59fdb65d Correctly support document templates on the chat API 2025-05-21 15:32:34 +02:00
Clément Renault
3e51c0a4c1 Introduce the new index chat settings 2025-05-21 11:07:06 +02:00
Clément Renault
91c6ab8392 Make sure errorneous calls are handled and forwarded to the LLM 2025-05-20 18:01:08 +02:00
Clément Renault
beff6adeb1 Catch invalid argument calls to search function 2025-05-20 17:55:21 +02:00
Clément Renault
18eab165a7 Support multiple indexes and not only main 2025-05-20 17:43:24 +02:00
Clément Renault
5c6b63df65 Limit the number of internal loop calls and change the function name 2025-05-20 16:44:28 +02:00
Clément Renault
7266aed770 Correctly support tenant tokens and filters 2025-05-20 16:15:49 +02:00
Clément Renault
bae6c98aa3 Stream errors 2025-05-20 12:23:22 +02:00
Clément Renault
42c95cf3c4 Stop the stream when the connexion stops and chnage the events 2025-05-20 12:05:51 +02:00
Clément Renault
4f919db344 Generate a new default chat API key 2025-05-20 11:00:19 +02:00
Clément Renault
295840d07a Change the /chat route to /chat/completions to be OpenAI-compatible 2025-05-20 10:14:56 +02:00
Clément Renault
c0c3bddda8 Better stop the stream 2025-05-16 17:12:48 +02:00
Clément Renault
10b5fcd4ba Update the streaming detection to work with Mistral 2025-05-16 15:17:01 +02:00
Clément Renault
8113d4a52e Make it compatible with the Mistral API 2025-05-16 14:33:53 +02:00
Clément Renault
5964289284 Support base_api in the settings 2025-05-15 18:28:02 +02:00
Clément Renault
6b81854d48 Make clippy happy 2025-05-15 18:16:06 +02:00
Clément Renault
9e5b466426 Display pre-query prompt in search tool response 2025-05-15 18:10:09 +02:00
Clément Renault
b43ffd8fac Commit when putting stuff in LMDB 2025-05-15 18:03:26 +02:00
Clément Renault
43da2bcb8c Remove useless function 2025-05-15 17:52:26 +02:00
Clément Renault
5e3b126d73 Expose new chat settings routes 2025-05-15 17:48:10 +02:00
Clément Renault
6c034754ca Factorise a bit the code 2025-05-15 15:39:38 +02:00
Clément Renault
6329cf7ed6 Display the different tool calls we need to do 2025-05-15 11:17:34 +02:00
Clément Renault
e0c8c11a94 Send an event with the content of the tool calling 2025-05-14 17:15:32 +02:00
Clément Renault
6e8b371111 Streaming supports tool calling 2025-05-14 14:58:01 +02:00
Clément Renault
da7d651f4b Nearly support tools on the streaming route 2025-05-14 14:29:41 +02:00
Clément Renault
24050f06e4 Return the right message format 2025-05-14 12:03:43 +02:00
Clément Renault
af482d8ee9 Aggregate tool calls and display the calls to make. 2025-05-14 11:53:03 +02:00
Clément Renault
7d62307739 Implement a first version of a streamed chat API 2025-05-14 11:18:21 +02:00
Clément Renault
3a71df7b5a Make it work by retrieving content from the index 2025-05-13 16:35:46 +02:00
Clément Renault
ac39a436d9 Support overwriten prompts of the search query 2025-05-13 16:33:58 +02:00
Clément Renault
e5c963a170 Support querying the index named main 2025-05-13 15:26:24 +02:00
Clément Renault
9baf2ce1a6 Introduce the first version of the /chat route that mimics the OpenAI API 2025-05-13 11:19:32 +02:00
52 changed files with 1904 additions and 3131 deletions

169
CLAUDE.md

@@ -1,169 +0,0 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Repository Overview
Meilisearch is a lightning-fast search engine written in Rust. It's organized as a Rust workspace with multiple crates that handle different aspects of the search engine functionality.
## Architecture
### Core Crates Structure
- **`crates/meilisearch/`** - Main HTTP server implementing the REST API with Actix Web
- **`crates/milli/`** - Core search engine library (indexing, search algorithms, ranking)
- **`crates/index-scheduler/`** - Task scheduling, batching, and index lifecycle management
- **`crates/meilisearch-auth/`** - Authentication and API key management
- **`crates/meilisearch-types/`** - Shared types and data structures
- **`crates/dump/`** - Database dump and restore functionality
- **`crates/meilitool/`** - CLI tool for maintenance operations
### Key Architectural Patterns
1. **Data Flow**:
- Write: HTTP Request → Task Creation → Index Scheduler → Milli Engine → LMDB Storage
- Read: HTTP Request → Search Queue → Milli Engine → Response
2. **Concurrency Model**:
- Single writer, multiple readers for index operations
- Task batching for improved throughput
- Search queue for managing concurrent requests
3. **Storage**: LMDB (Lightning Memory-Mapped Database) with separate environments for tasks, auth, and indexes
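The "task batching" item above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a batch is a run of consecutive tasks targeting the same index; the `Task` type and the `batch_consecutive` function are hypothetical and are not taken from `crates/index-scheduler/`, whose real batching rules are likely more involved.

```rust
/// Hypothetical, simplified task: only the fields needed to illustrate batching.
#[derive(Debug, Clone, PartialEq)]
struct Task {
    id: u32,
    index_uid: String,
}

/// Groups consecutive tasks that target the same index into one batch,
/// so that a single writer can process each batch in one pass.
/// (Assumption: "same index and adjacent in the queue" is the only batching criterion.)
fn batch_consecutive(tasks: &[Task]) -> Vec<Vec<Task>> {
    let mut batches: Vec<Vec<Task>> = Vec::new();
    for task in tasks {
        // Does the current (last) batch target the same index as this task?
        let same_index = batches
            .last()
            .map_or(false, |batch| batch[0].index_uid == task.index_uid);
        if same_index {
            // Safe to unwrap: `same_index` is only true when a last batch exists.
            batches.last_mut().unwrap().push(task.clone());
        } else {
            batches.push(vec![task.clone()]);
        }
    }
    batches
}
```

With tasks targeting `movies`, `movies`, `books`, `movies` in that order, this sketch produces three batches of sizes 2, 1, and 1: the last `movies` task is not merged with the first two because a `books` task sits between them.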
## Development Commands
### Building and Running
```bash
# Development
cargo run
# Production build with optimizations
cargo run --release
# Build specific crates
cargo build --release -p meilisearch -p meilitool
# Build without default features
cargo build --locked --release --no-default-features --all
```
### Testing
```bash
# Run all tests
cargo test
# Run tests with release optimizations
cargo test --locked --release --all
# Run a specific test
cargo test test_name
# Run tests in a specific crate
cargo test -p milli
```
### Benchmarking
```bash
# List available features
cargo xtask list-features
# Run workload-based benchmarks
cargo xtask bench -- workloads/hackernews.json
# Run benchmarks without dashboard
cargo xtask bench --no-dashboard -- workloads/hackernews.json
# Run criterion benchmarks
cd crates/benchmarks && cargo bench
```
### Performance Optimizations
```bash
# Speed up builds with lindera cache
export LINDERA_CACHE=$HOME/.cache/lindera
# Prevent rebuilds on directory changes (development only)
export MEILI_NO_VERGEN=1
# Enable full snapshot creation for debugging tests
export MEILI_TEST_FULL_SNAPS=true
```
## Testing Strategy
- **Unit tests**: Colocated with source code using `#[cfg(test)]` modules
- **Integration tests**: Located in `crates/meilisearch/tests/`
- **Snapshot testing**: Using `insta` for deterministic testing
- **Test organization**: By feature (auth, documents, search, settings, index operations)
## Important Files and Directories
- `Cargo.toml` - Workspace configuration
- `rust-toolchain.toml` - Rust version (1.85.1)
- `crates/meilisearch/src/main.rs` - Server entry point
- `crates/milli/src/lib.rs` - Core engine entry point
- `crates/meilisearch-mcp/` - MCP server implementation
- `workloads/` - Benchmark workload definitions
- `assets/` - Static assets and demo files
## Feature Flags
Key features that can be enabled/disabled:
- Language-specific tokenizations (chinese, hebrew, japanese, thai, greek, khmer, vietnamese)
- `mini-dashboard` - Web UI for testing
- `metrics` - Prometheus metrics
- `vector-hnsw` - Vector search with CUDA support
- `mcp` - Model Context Protocol server for AI assistants
## Logging and Profiling
The codebase uses `tracing` for structured logging with these conventions:
- Regular logging spans
- Profiling spans (TRACE level, prefixed with `indexing::` or `search::`)
- Benchmarking spans
For indexing profiling, enable the `exportPuffinReports` experimental feature to generate `.puffin` files.
## Common Development Tasks
### Adding a New Route
1. Add route handler in `crates/meilisearch/src/routes/`
2. Update OpenAPI documentation if API changes
3. Add integration tests in `crates/meilisearch/tests/`
4. If MCP is enabled, routes are automatically exposed via MCP
### Modifying Index Operations
1. Core logic lives in `crates/milli/src/update/`
2. Task scheduling in `crates/index-scheduler/src/`
3. HTTP handlers in `crates/meilisearch/src/routes/indexes/`
### Working with Search
1. Search algorithms in `crates/milli/src/search/`
2. Query parsing in `crates/filter-parser/`
3. Search handlers in `crates/meilisearch/src/routes/indexes/search.rs`
### Working with MCP Server
1. MCP implementation in `crates/meilisearch-mcp/`
2. Tools are auto-generated from OpenAPI specification
3. Enable with `--features mcp` flag
4. Access via `/mcp` endpoint (SSE or POST)
## CI/CD and Git Workflow
- Main branch: `main`
- GitHub Merge Queue enforces rebasing and test passing
- Benchmarks run automatically on push to `main`
- Manual benchmark runs: comment `/bench workloads/*.json` on PRs
## Environment Variables
Key environment variables for development:
- `MEILI_NO_ANALYTICS` - Disable telemetry
- `MEILI_DB_PATH` - Database storage location
- `MEILI_HTTP_ADDR` - Server binding address
- `MEILI_MASTER_KEY` - Master API key for authentication

654
Cargo.lock generated

File diff suppressed because it is too large


@@ -5,7 +5,6 @@ members = [
"crates/meilitool",
"crates/meilisearch-types",
"crates/meilisearch-auth",
"crates/meilisearch-mcp",
"crates/meili-snap",
"crates/index-scheduler",
"crates/dump",


@@ -89,53 +89,6 @@ We also offer a wide range of dedicated guides to all Meilisearch features, such
Finally, for more in-depth information, refer to our articles explaining fundamental Meilisearch concepts such as [documents](https://www.meilisearch.com/docs/learn/core_concepts/documents?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=advanced) and [indexes](https://www.meilisearch.com/docs/learn/core_concepts/indexes?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=advanced).
## 🤖 MCP (Model Context Protocol) Server
Meilisearch now supports the [Model Context Protocol](https://modelcontextprotocol.io/), allowing AI assistants and LLM applications to directly interact with your search engine.
### Enabling MCP Server
To enable the MCP server, compile Meilisearch with the `mcp` feature:
```bash
cargo build --release --features mcp
```
The MCP server will be available at `/mcp` endpoint, supporting both SSE (Server-Sent Events) and regular HTTP POST requests.
### Features
- **Automatic Tool Discovery**: All Meilisearch API endpoints are automatically exposed as MCP tools
- **Full API Coverage**: Search, index management, document operations, and more
- **Authentication Support**: Works with existing Meilisearch API keys
- **Streaming Support**: Long-running operations can stream progress updates
### Example Usage
AI assistants can discover available tools:
```json
{
"method": "tools/list"
}
```
And call Meilisearch operations:
```json
{
"method": "tools/call",
"params": {
"name": "searchDocuments",
"arguments": {
"indexUid": "movies",
"q": "science fiction",
"limit": 10
}
}
}
```
For more details on MCP integration, see the [MCP documentation](crates/meilisearch-mcp/README.md).
## 📊 Telemetry
Meilisearch collects **anonymized** user data to help us improve our product. You can [deactivate this](https://www.meilisearch.com/docs/learn/what_is_meilisearch/telemetry?utm_campaign=oss&utm_source=github&utm_medium=meilisearch&utm_content=telemetry#how-to-disable-data-collection) whenever you want.


@@ -31,7 +31,7 @@ anyhow = "1.0.95"
bytes = "1.9.0"
convert_case = "0.6.0"
flate2 = "1.0.35"
reqwest = { version = "0.12.12", features = ["blocking", "rustls-tls"], default-features = false }
reqwest = { version = "0.12.15", features = ["blocking", "rustls-tls"], default-features = false }
[features]
default = ["milli/all-tokenizations"]


@@ -398,6 +398,7 @@ impl<T> From<v5::Settings<T>> for v6::Settings<v6::Unchecked> {
search_cutoff_ms: v6::Setting::NotSet,
facet_search: v6::Setting::NotSet,
prefix_search: v6::Setting::NotSet,
chat: v6::Setting::NotSet,
_kind: std::marker::PhantomData,
}
}


@@ -28,6 +28,7 @@ mod lru;
mod processing;
mod queue;
mod scheduler;
mod settings;
#[cfg(test)]
mod test_utils;
pub mod upgrade;
@@ -53,8 +54,8 @@ use flate2::Compression;
use meilisearch_types::batches::Batch;
use meilisearch_types::features::{InstanceTogglableFeatures, Network, RuntimeTogglableFeatures};
use meilisearch_types::heed::byteorder::BE;
use meilisearch_types::heed::types::I128;
use meilisearch_types::heed::{self, Env, RoTxn, WithoutTls};
use meilisearch_types::heed::types::{SerdeJson, Str, I128};
use meilisearch_types::heed::{self, Database, Env, RoTxn, Unspecified, WithoutTls};
use meilisearch_types::milli::index::IndexEmbeddingConfig;
use meilisearch_types::milli::update::IndexerConfig;
use meilisearch_types::milli::vector::{Embedder, EmbedderOptions, EmbeddingConfigs};
@@ -142,6 +143,8 @@ pub struct IndexScheduler {
/// The list of tasks currently processing
pub(crate) processing_tasks: Arc<RwLock<ProcessingTasks>>,
/// The main database that also has the chat settings.
pub main: Database<Str, Unspecified>,
/// A database containing only the version of the index-scheduler
pub version: versioning::Versioning,
/// The queue containing both the tasks and the batches.
@@ -196,7 +199,7 @@ impl IndexScheduler {
version: self.version.clone(),
queue: self.queue.private_clone(),
scheduler: self.scheduler.private_clone(),
main: self.main.clone(),
index_mapper: self.index_mapper.clone(),
cleanup_enabled: self.cleanup_enabled,
webhook_url: self.webhook_url.clone(),
@@ -267,6 +270,7 @@ impl IndexScheduler {
let features = features::FeatureData::new(&env, &mut wtxn, options.instance_features)?;
let queue = Queue::new(&env, &mut wtxn, &options)?;
let index_mapper = IndexMapper::new(&env, &mut wtxn, &options, budget)?;
let chat_settings = env.create_database(&mut wtxn, Some("chat-settings"))?;
wtxn.commit()?;
// allow unreachable_code to get rid of the warning in the case of a test build.
@@ -290,6 +294,7 @@ impl IndexScheduler {
#[cfg(test)]
run_loop_iteration: Arc::new(RwLock::new(0)),
features,
chat_settings,
};
this.run();
@@ -857,6 +862,18 @@ impl IndexScheduler {
.collect();
res.map(EmbeddingConfigs::new)
}
pub fn chat_settings(&self) -> Result<Option<serde_json::Value>> {
let rtxn = self.env.read_txn().map_err(Error::HeedTransaction)?;
self.chat_settings.get(&rtxn, "main").map_err(Into::into)
}
pub fn put_chat_settings(&self, settings: &serde_json::Value) -> Result<()> {
let mut wtxn = self.env.write_txn().map_err(Error::HeedTransaction)?;
self.chat_settings.put(&mut wtxn, "main", settings)?;
wtxn.commit().map_err(Error::HeedTransaction)?;
Ok(())
}
}
/// The outcome of calling the [`IndexScheduler::tick`] function.


@@ -0,0 +1,432 @@
use std::collections::{BTreeMap, BTreeSet};
use std::convert::Infallible;
use std::fmt;
use std::marker::PhantomData;
use std::num::NonZeroUsize;
use std::ops::{ControlFlow, Deref};
use std::str::FromStr;
use deserr::{DeserializeError, Deserr, ErrorKind, MergeWithError, ValuePointerRef};
use fst::IntoStreamer;
use milli::disabled_typos_terms::DisabledTyposTerms;
use milli::index::{IndexEmbeddingConfig, PrefixSearch};
use milli::proximity::ProximityPrecision;
use milli::update::Setting;
use milli::{FilterableAttributesRule, Index};
use serde::{Deserialize, Serialize, Serializer};
use utoipa::ToSchema;
use crate::deserr::DeserrJsonError;
use crate::error::deserr_codes::*;
use crate::heed::RoTxn;
use crate::IndexScheduler;
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, Deserr, ToSchema)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
#[deserr(deny_unknown_fields, rename_all = camelCase, where_predicate = __Deserr_E: deserr::MergeWithError<DeserrJsonError<InvalidSettingsTypoTolerance>>)]
pub struct PromptsSettings {
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
pub system: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsTypoTolerance>)]
#[schema(value_type = Option<String>)]
pub search_description: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
pub search_q_param: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
pub pre_query: Setting<String>,
}
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, Deserr, ToSchema)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
pub enum ChatSource {
#[default]
OpenAi,
}
/// Holds all the chat settings. `T` can either be `Checked` if they represent settings
/// whose validity is guaranteed, or `Unchecked` if they need to be validated. In the latter case, a
/// call to `check` will return a `ChatSettings<Checked>` from a `ChatSettings<Unchecked>`.
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, Deserr, ToSchema)]
#[serde(
deny_unknown_fields,
rename_all = "camelCase",
bound(serialize = "T: Serialize", deserialize = "T: Deserialize<'static>")
)]
#[deserr(error = DeserrJsonError, rename_all = camelCase, deny_unknown_fields)]
#[schema(rename_all = "camelCase")]
pub struct ChatSettings<T> {
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsDisplayedAttributes>)]
#[schema(value_type = Option<String>)]
pub source: Setting<ChatSource>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsSearchableAttributes>)]
#[schema(value_type = Option<String>)]
pub base_api: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsSearchableAttributes>)]
#[schema(value_type = Option<String>)]
pub api_key: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsFilterableAttributes>)]
#[schema(value_type = Option<PromptsSettings>)]
pub prompts: Setting<PromptsSettings>,
#[serde(skip)]
#[deserr(skip)]
pub _kind: PhantomData<T>,
}
impl<T> ChatSettings<T> {
pub fn hide_secrets(&mut self) {
// Only a set API key holds a secret to mask; `Reset` and `NotSet` carry no value.
match &mut self.api_key {
Setting::Set(key) => Self::hide_secret(key),
Setting::Reset | Setting::NotSet => (),
}
}
fn hide_secret(secret: &mut String) {
match secret.len() {
x if x < 10 => {
secret.replace_range(.., "XXX...");
}
x if x < 20 => {
secret.replace_range(2.., "XXXX...");
}
x if x < 30 => {
secret.replace_range(3.., "XXXXX...");
}
_x => {
secret.replace_range(5.., "XXXXXX...");
}
}
}
}
impl ChatSettings<Checked> {
pub fn cleared() -> ChatSettings<Checked> {
ChatSettings {
source: Setting::Reset,
base_api: Setting::Reset,
api_key: Setting::Reset,
prompts: Setting::Reset,
_kind: PhantomData,
}
}
pub fn into_unchecked(self) -> ChatSettings<Unchecked> {
let Self { source, base_api, api_key, prompts, _kind } = self;
ChatSettings { source, base_api, api_key, prompts, _kind: PhantomData }
}
}
impl ChatSettings<Unchecked> {
pub fn check(self) -> ChatSettings<Checked> {
ChatSettings {
source: self.source,
base_api: self.base_api,
api_key: self.api_key,
prompts: self.prompts,
_kind: PhantomData,
}
}
pub fn validate(self) -> Result<Self, milli::Error> {
self.validate_prompt_settings()?;
self.validate_global_settings()
}
fn validate_global_settings(mut self) -> Result<Self, milli::Error> {
// Check that the ApiBase is a valid URL
Ok(self)
}
fn validate_prompt_settings(mut self) -> Result<Self, milli::Error> {
// TODO
// let Setting::Set(mut configs) = self.embedders else { return Ok(self) };
// for (name, config) in configs.iter_mut() {
// let config_to_check = std::mem::take(config);
// let checked_config =
// milli::update::validate_embedding_settings(config_to_check.inner, name)?;
// *config = SettingEmbeddingSettings { inner: checked_config };
// }
// self.embedders = Setting::Set(configs);
Ok(self)
}
pub fn merge(&mut self, other: &Self) {
// For most settings only the latest version is kept
*self = Self {
source: other.source.or(self.source),
base_api: other.base_api.or(self.base_api),
api_key: other.api_key.or(self.api_key),
prompts: match (self.prompts, other.prompts) {
(Setting::NotSet, set) | (set, Setting::NotSet) => set,
(Setting::Set(_) | Setting::Reset, Setting::Reset) => Setting::Reset,
(Setting::Reset, Setting::Set(set)) => Setting::Set(set),
// If both are set we must merge the prompts settings
(Setting::Set(this), Setting::Set(other)) => Setting::Set(PromptsSettings {
system: other.system.or(system),
search_description: other.search_description.or(search_description),
search_q_param: other.search_q_param.or(search_q_param),
pre_query: other.pre_query.or(pre_query),
}),
},
_kind: PhantomData,
}
}
}
pub fn apply_settings_to_builder(
settings: &ChatSettings<Checked>,
// TODO we must not store this into milli but in the index scheduler
builder: &mut milli::update::Settings,
) {
let ChatSettings { source, base_api, api_key, prompts, _kind } = settings;
match source.deref() {
Setting::Set(ref names) => builder.set_searchable_fields(names.clone()),
Setting::Reset => builder.reset_searchable_fields(),
Setting::NotSet => (),
}
match displayed_attributes.deref() {
Setting::Set(ref names) => builder.set_displayed_fields(names.clone()),
Setting::Reset => builder.reset_displayed_fields(),
Setting::NotSet => (),
}
match filterable_attributes {
Setting::Set(ref facets) => {
builder.set_filterable_fields(facets.clone().into_iter().collect())
}
Setting::Reset => builder.reset_filterable_fields(),
Setting::NotSet => (),
}
match sortable_attributes {
Setting::Set(ref fields) => builder.set_sortable_fields(fields.iter().cloned().collect()),
Setting::Reset => builder.reset_sortable_fields(),
Setting::NotSet => (),
}
match ranking_rules {
Setting::Set(ref criteria) => {
builder.set_criteria(criteria.iter().map(|c| c.clone().into()).collect())
}
Setting::Reset => builder.reset_criteria(),
Setting::NotSet => (),
}
match stop_words {
Setting::Set(ref stop_words) => builder.set_stop_words(stop_words.clone()),
Setting::Reset => builder.reset_stop_words(),
Setting::NotSet => (),
}
match non_separator_tokens {
Setting::Set(ref non_separator_tokens) => {
builder.set_non_separator_tokens(non_separator_tokens.clone())
}
Setting::Reset => builder.reset_non_separator_tokens(),
Setting::NotSet => (),
}
match separator_tokens {
Setting::Set(ref separator_tokens) => {
builder.set_separator_tokens(separator_tokens.clone())
}
Setting::Reset => builder.reset_separator_tokens(),
Setting::NotSet => (),
}
match dictionary {
Setting::Set(ref dictionary) => builder.set_dictionary(dictionary.clone()),
Setting::Reset => builder.reset_dictionary(),
Setting::NotSet => (),
}
match synonyms {
Setting::Set(ref synonyms) => builder.set_synonyms(synonyms.clone().into_iter().collect()),
Setting::Reset => builder.reset_synonyms(),
Setting::NotSet => (),
}
match distinct_attribute {
Setting::Set(ref attr) => builder.set_distinct_field(attr.clone()),
Setting::Reset => builder.reset_distinct_field(),
Setting::NotSet => (),
}
match proximity_precision {
Setting::Set(ref precision) => builder.set_proximity_precision((*precision).into()),
Setting::Reset => builder.reset_proximity_precision(),
Setting::NotSet => (),
}
match localized_attributes_rules {
Setting::Set(ref rules) => builder
.set_localized_attributes_rules(rules.iter().cloned().map(|r| r.into()).collect()),
Setting::Reset => builder.reset_localized_attributes_rules(),
Setting::NotSet => (),
}
match typo_tolerance {
Setting::Set(ref value) => {
match value.enabled {
Setting::Set(val) => builder.set_autorize_typos(val),
Setting::Reset => builder.reset_authorize_typos(),
Setting::NotSet => (),
}
match value.min_word_size_for_typos {
Setting::Set(ref setting) => {
match setting.one_typo {
Setting::Set(val) => builder.set_min_word_len_one_typo(val),
Setting::Reset => builder.reset_min_word_len_one_typo(),
Setting::NotSet => (),
}
match setting.two_typos {
Setting::Set(val) => builder.set_min_word_len_two_typos(val),
Setting::Reset => builder.reset_min_word_len_two_typos(),
Setting::NotSet => (),
}
}
Setting::Reset => {
builder.reset_min_word_len_one_typo();
builder.reset_min_word_len_two_typos();
}
Setting::NotSet => (),
}
match value.disable_on_words {
Setting::Set(ref words) => {
builder.set_exact_words(words.clone());
}
Setting::Reset => builder.reset_exact_words(),
Setting::NotSet => (),
}
match value.disable_on_attributes {
Setting::Set(ref words) => {
builder.set_exact_attributes(words.iter().cloned().collect())
}
Setting::Reset => builder.reset_exact_attributes(),
Setting::NotSet => (),
}
match value.disable_on_numbers {
Setting::Set(val) => builder.set_disable_on_numbers(val),
Setting::Reset => builder.reset_disable_on_numbers(),
Setting::NotSet => (),
}
}
Setting::Reset => {
// all typo settings need to be reset here.
builder.reset_authorize_typos();
builder.reset_min_word_len_one_typo();
builder.reset_min_word_len_two_typos();
builder.reset_exact_words();
builder.reset_exact_attributes();
}
Setting::NotSet => (),
}
match faceting {
Setting::Set(FacetingSettings { max_values_per_facet, sort_facet_values_by }) => {
match max_values_per_facet {
Setting::Set(val) => builder.set_max_values_per_facet(*val),
Setting::Reset => builder.reset_max_values_per_facet(),
Setting::NotSet => (),
}
match sort_facet_values_by {
Setting::Set(val) => builder.set_sort_facet_values_by(
val.iter().map(|(name, order)| (name.clone(), (*order).into())).collect(),
),
Setting::Reset => builder.reset_sort_facet_values_by(),
Setting::NotSet => (),
}
}
Setting::Reset => {
builder.reset_max_values_per_facet();
builder.reset_sort_facet_values_by();
}
Setting::NotSet => (),
}
match pagination {
Setting::Set(ref value) => match value.max_total_hits {
Setting::Set(val) => builder.set_pagination_max_total_hits(val),
Setting::Reset => builder.reset_pagination_max_total_hits(),
Setting::NotSet => (),
},
Setting::Reset => builder.reset_pagination_max_total_hits(),
Setting::NotSet => (),
}
match embedders {
Setting::Set(value) => builder.set_embedder_settings(
value.iter().map(|(k, v)| (k.clone(), v.inner.clone())).collect(),
),
Setting::Reset => builder.reset_embedder_settings(),
Setting::NotSet => (),
}
match search_cutoff_ms {
Setting::Set(cutoff) => builder.set_search_cutoff(*cutoff),
Setting::Reset => builder.reset_search_cutoff(),
Setting::NotSet => (),
}
match prefix_search {
Setting::Set(prefix_search) => {
builder.set_prefix_search(PrefixSearch::from(*prefix_search))
}
Setting::Reset => builder.reset_prefix_search(),
Setting::NotSet => (),
}
match facet_search {
Setting::Set(facet_search) => builder.set_facet_search(*facet_search),
Setting::Reset => builder.reset_facet_search(),
Setting::NotSet => (),
}
match chat {
Setting::Set(chat) => builder.set_chat(chat.clone()),
Setting::Reset => builder.reset_chat(),
Setting::NotSet => (),
}
}
pub enum SecretPolicy {
RevealSecrets,
HideSecrets,
}
pub fn settings(
index_scheduler: &IndexScheduler,
rtxn: &RoTxn,
secret_policy: SecretPolicy,
) -> Result<Settings<Checked>, milli::Error> {
let mut settings = index_scheduler.chat_settings(rtxn)?;
if let SecretPolicy::HideSecrets = secret_policy {
settings.hide_secrets()
}
Ok(settings)
}
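
The masking performed by `hide_secret` in the file above can be tried in isolation. The following is a minimal standalone sketch, not the committed code: the `Setting` enum here is a simplified stand-in for `milli::update::Setting`, `hide_api_key` is a hypothetical helper, and the sketch assumes ASCII keys, since `String::replace_range` panics when the offset does not fall on a character boundary.

```rust
/// Simplified stand-in for `milli::update::Setting` (assumption: three states only).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Setting<T> {
    Set(T),
    Reset,
    NotSet,
}

/// Masks a secret in place, keeping a short prefix whose length grows with the secret.
/// Assumes an ASCII secret so that every byte offset is a valid char boundary.
fn hide_secret(secret: &mut String) {
    let (keep, mask) = match secret.len() {
        0..=9 => (0, "XXX..."),
        10..=19 => (2, "XXXX..."),
        20..=29 => (3, "XXXXX..."),
        _ => (5, "XXXXXX..."),
    };
    // Replace everything after the kept prefix with the mask.
    secret.replace_range(keep.., mask);
}

/// Hypothetical helper: masks the key only when one is actually set.
fn hide_api_key(api_key: &mut Setting<String>) {
    if let Setting::Set(key) = api_key {
        hide_secret(key);
    }
}
```

Under these assumptions, a 12-character key such as `abcdefghijkl` becomes `abXXXX...`, while any key shorter than 10 characters is replaced entirely by `XXX...`. `Reset` and `NotSet` are left untouched because they carry no secret.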


@@ -0,0 +1,3 @@
mod chat;
pub use chat::ChatSettings;


@@ -351,6 +351,7 @@ pub struct IndexSearchRules {
fn generate_default_keys(store: &HeedAuthStore) -> Result<()> {
store.put_api_key(Key::default_admin())?;
store.put_api_key(Key::default_search())?;
store.put_api_key(Key::default_chat())?;
Ok(())
}


@@ -1,34 +0,0 @@
[package]
name = "meilisearch-mcp"
version = "1.13.0"
authors = ["Clément Renault <clement@meilisearch.com>"]
description = "MCP (Model Context Protocol) server for Meilisearch"
homepage = "https://www.meilisearch.com"
readme = "README.md"
edition = "2021"
license = "MIT"
[dependencies]
actix-web = { version = "4.8.0", default-features = false }
anyhow = "1.0.86"
async-stream = "0.3.5"
async-trait = "0.1.81"
futures = "0.3.30"
# Removed meilisearch dependency to avoid cyclic dependency
meilisearch-auth = { path = "../meilisearch-auth" }
meilisearch-types = { path = "../meilisearch-types" }
serde = { version = "1.0.204", features = ["derive"] }
serde_json = { version = "1.0.120", features = ["preserve_order"] }
thiserror = "1.0.61"
tokio = { version = "1.38.0", features = ["full"] }
tracing = "0.1.40"
utoipa = { version = "5.3.1", features = ["actix_extras", "time"] }
uuid = { version = "1.10.0", features = ["serde", "v4"] }
regex = "1.10.2"
reqwest = { version = "0.12.5", features = ["json"] }
[dev-dependencies]
insta = "1.39.0"
tokio = { version = "1.38.0", features = ["test-util"] }
actix-web = { version = "4.8.0", features = ["macros"] }
actix-rt = "2.10.0"


@@ -1,159 +0,0 @@
# Meilisearch MCP Server
This crate implements a Model Context Protocol (MCP) server for Meilisearch, enabling AI assistants and LLM applications to interact with Meilisearch through a standardized protocol.
## Overview
The MCP server automatically exposes all Meilisearch HTTP API endpoints as MCP tools, allowing AI assistants to:
- Search documents
- Manage indexes
- Add, update, or delete documents
- Configure settings
- Monitor tasks
- And more...
## Architecture
### Dynamic Tool Generation
The server dynamically generates MCP tools from Meilisearch's OpenAPI specification. This ensures:
- Complete API coverage
- Automatic updates when new endpoints are added
- Consistent parameter validation
- Type-safe operations
### Components
1. **Protocol Module** (`protocol.rs`): Defines MCP protocol types and messages
2. **Registry Module** (`registry.rs`): Converts OpenAPI specs to MCP tools
3. **Server Module** (`server.rs`): Handles MCP requests and SSE communication
4. **Integration Module** (`integration.rs`): Connects with the main Meilisearch server
## Usage
### Enabling the MCP Server
The MCP server is an optional feature. To enable it:
```bash
cargo build --release --features mcp
```
### Accessing the MCP Server
Once enabled, the MCP server is available at:
- SSE endpoint: `GET /mcp`
- HTTP endpoint: `POST /mcp`
### Authentication
The MCP server integrates with Meilisearch's existing authentication:
```json
{
"method": "tools/call",
"params": {
"name": "searchDocuments",
"arguments": {
"_auth": {
"apiKey": "your-api-key"
},
"indexUid": "movies",
"q": "search query"
}
}
}
```
## Protocol Flow
1. **Initialize**: Client establishes connection and negotiates protocol version
2. **List Tools**: Client discovers available Meilisearch operations
3. **Call Tools**: Client executes Meilisearch operations through MCP tools
4. **Stream Results**: Server streams responses, especially for long-running operations
## Example Interactions
### Initialize Connection
```json
{
"method": "initialize",
"params": {
"protocol_version": "2024-11-05",
"capabilities": {},
"client_info": {
"name": "my-ai-assistant",
"version": "1.0.0"
}
}
}
```
### List Available Tools
```json
{
"method": "tools/list"
}
```
Response includes tools like:
- `searchDocuments` - Search within an index
- `createIndex` - Create a new index
- `addDocuments` - Add documents to an index
- `getTask` - Check task status
- And many more...
### Search Documents
```json
{
"method": "tools/call",
"params": {
"name": "searchDocuments",
"arguments": {
"indexUid": "products",
"q": "laptop",
"filter": "price < 1000",
"limit": 20,
"attributesToRetrieve": ["name", "price", "description"]
}
}
}
```
## Testing
The crate includes comprehensive tests:
```bash
# Run all tests
cargo test -p meilisearch-mcp
# Run specific test categories
cargo test -p meilisearch-mcp conversion_tests
cargo test -p meilisearch-mcp integration_tests
cargo test -p meilisearch-mcp e2e_tests
```
## Development
### Adding New Features
Since tools are generated dynamically from the OpenAPI specification, new Meilisearch endpoints are automatically available through MCP without code changes.
### Customizing Tool Names
Tool names are generated automatically from endpoint paths and HTTP methods. The naming convention:
- `GET /indexes` → `getIndexes`
- `POST /indexes/{index_uid}/search` → `searchDocuments`
- `DELETE /indexes/{index_uid}` → `deleteIndex`
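
The three mappings above can be reproduced with a few simple rules: pick a verb from the HTTP method, take the last static path segment as the noun, and singularize it when the path ends in a parameter. The sketch below is a simplified guess at such rules, not the crate's actual name generator; `sketch_tool_name` and `singularize` are hypothetical names, and the `search` special case is hard-coded.

```rust
/// Naive singularization, sufficient for the resource names in this sketch.
fn singularize(noun: &str) -> &str {
    if let Some(stem) = noun.strip_suffix("xes") {
        // "indexes" -> "index": keep the "x", drop only the trailing "es".
        &noun[..stem.len() + 1]
    } else {
        noun.strip_suffix('s').unwrap_or(noun)
    }
}

/// Hypothetical name generator covering the documented mappings.
fn sketch_tool_name(path: &str, method: &str) -> String {
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let ends_with_param = segments.last().map_or(false, |s| s.starts_with('{'));
    // Last segment that is not a `{parameter}`.
    let resource = segments
        .iter()
        .rev()
        .find(|s| !s.starts_with('{'))
        .copied()
        .unwrap_or("");

    // Special case taken from the naming list.
    if resource == "search" {
        return "searchDocuments".to_string();
    }

    let verb = match method.to_uppercase().as_str() {
        "GET" => "get",
        "POST" => "create",
        "PUT" | "PATCH" => "update",
        "DELETE" => "delete",
        _ => "call",
    };
    let noun = if ends_with_param { singularize(resource) } else { resource };

    // Capitalize the first letter of the noun and append it to the verb.
    let mut chars = noun.chars();
    match chars.next() {
        Some(first) => format!("{}{}{}", verb, first.to_uppercase(), chars.as_str()),
        None => verb.to_string(),
    }
}
```

These rules are deliberately small and do not try to cover every endpoint; names such as `addDocuments` or `multiSearch` would need additional special cases.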
## Future Enhancements
- WebSocket support for bidirectional communication
- Tool result caching
- Batch operations
- Custom tool aliases
- Rate limiting per MCP client


@@ -1,358 +0,0 @@
use crate::registry::{McpTool, McpToolRegistry};
use serde_json::json;
use utoipa::openapi::{OpenApi, PathItem};
#[test]
fn test_convert_simple_get_endpoint() {
let tool = McpTool::from_openapi_path(
"/indexes/{index_uid}",
"GET",
&create_mock_path_item_get(),
);
assert_eq!(tool.name, "getIndex");
assert_eq!(tool.description, "Get information about an index");
assert_eq!(tool.http_method, "GET");
assert_eq!(tool.path_template, "/indexes/{index_uid}");
let schema = &tool.input_schema;
assert_eq!(schema["type"], "object");
assert_eq!(schema["required"], json!(["indexUid"]));
assert_eq!(schema["properties"]["indexUid"]["type"], "string");
}
#[test]
fn test_convert_search_endpoint_with_query_params() {
let tool = McpTool::from_openapi_path(
"/indexes/{index_uid}/search",
"POST",
&create_mock_search_path_item(),
);
assert_eq!(tool.name, "searchDocuments");
assert_eq!(tool.description, "Search for documents in an index");
assert_eq!(tool.http_method, "POST");
let schema = &tool.input_schema;
assert_eq!(schema["type"], "object");
assert_eq!(schema["required"], json!(["indexUid"]));
assert!(schema["properties"]["q"].is_object());
assert!(schema["properties"]["limit"].is_object());
assert!(schema["properties"]["offset"].is_object());
assert!(schema["properties"]["filter"].is_object());
}
#[test]
fn test_convert_document_addition_endpoint() {
let tool = McpTool::from_openapi_path(
"/indexes/{index_uid}/documents",
"POST",
&create_mock_add_documents_path_item(),
);
assert_eq!(tool.name, "addDocuments");
assert_eq!(tool.description, "Add or replace documents in an index");
assert_eq!(tool.http_method, "POST");
let schema = &tool.input_schema;
assert_eq!(schema["type"], "object");
assert_eq!(schema["required"], json!(["indexUid", "documents"]));
assert_eq!(schema["properties"]["documents"]["type"], "array");
}
#[test]
fn test_registry_deduplication() {
let mut registry = McpToolRegistry::new();
let tool1 = McpTool {
name: "searchDocuments".to_string(),
description: "Search documents".to_string(),
input_schema: json!({}),
http_method: "POST".to_string(),
path_template: "/indexes/{index_uid}/search".to_string(),
};
let tool2 = McpTool {
name: "searchDocuments".to_string(),
description: "Updated description".to_string(),
input_schema: json!({"updated": true}),
http_method: "POST".to_string(),
path_template: "/indexes/{index_uid}/search".to_string(),
};
registry.register_tool(tool1);
registry.register_tool(tool2);
assert_eq!(registry.list_tools().len(), 1);
assert_eq!(registry.get_tool("searchDocuments").unwrap().description, "Updated description");
}
#[test]
fn test_openapi_to_mcp_tool_conversion() {
let openapi = create_mock_openapi();
let registry = McpToolRegistry::from_openapi(&openapi);
let tools = registry.list_tools();
assert!(!tools.is_empty());
let search_tool = registry.get_tool("searchDocuments");
assert!(search_tool.is_some());
let index_tool = registry.get_tool("getIndex");
assert!(index_tool.is_some());
}
#[test]
fn test_tool_name_generation() {
let test_cases = vec![
("/indexes", "GET", "getIndexes"),
("/indexes", "POST", "createIndex"),
("/indexes/{index_uid}", "GET", "getIndex"),
("/indexes/{index_uid}", "PUT", "updateIndex"),
("/indexes/{index_uid}", "DELETE", "deleteIndex"),
("/indexes/{index_uid}/documents", "GET", "getDocuments"),
("/indexes/{index_uid}/documents", "POST", "addDocuments"),
("/indexes/{index_uid}/documents", "DELETE", "deleteDocuments"),
("/indexes/{index_uid}/search", "POST", "searchDocuments"),
("/indexes/{index_uid}/settings", "GET", "getSettings"),
("/indexes/{index_uid}/settings", "PATCH", "updateSettings"),
("/tasks", "GET", "getTasks"),
("/tasks/{task_uid}", "GET", "getTask"),
("/keys", "GET", "getApiKeys"),
("/keys", "POST", "createApiKey"),
("/multi-search", "POST", "multiSearch"),
("/swap-indexes", "POST", "swapIndexes"),
];
for (path, method, expected_name) in test_cases {
let name = McpTool::generate_tool_name(path, method);
assert_eq!(name, expected_name, "Path: {}, Method: {}", path, method);
}
}
#[test]
fn test_parameter_extraction() {
let tool = McpTool::from_openapi_path(
"/indexes/{index_uid}/documents/{document_id}",
"GET",
&create_mock_get_document_path_item(),
);
let schema = &tool.input_schema;
assert_eq!(schema["required"], json!(["indexUid", "documentId"]));
assert_eq!(schema["properties"]["indexUid"]["type"], "string");
assert_eq!(schema["properties"]["documentId"]["type"], "string");
}
fn create_mock_path_item_get() -> PathItem {
serde_json::from_value(json!({
"get": {
"summary": "Get information about an index",
"parameters": [
{
"name": "index_uid",
"in": "path",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": {
"200": {
"description": "Index information"
}
}
}
}))
.unwrap()
}
fn create_mock_search_path_item() -> PathItem {
serde_json::from_value(json!({
"post": {
"summary": "Search for documents in an index",
"parameters": [
{
"name": "index_uid",
"in": "path",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": {
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"q": {
"type": "string",
"description": "Search query"
},
"limit": {
"type": "integer",
"default": 20
},
"offset": {
"type": "integer",
"default": 0
},
"filter": {
"type": "string"
}
}
}
}
}
},
"responses": {
"200": {
"description": "Search results"
}
}
}
}))
.unwrap()
}
fn create_mock_add_documents_path_item() -> PathItem {
serde_json::from_value(json!({
"post": {
"summary": "Add or replace documents in an index",
"parameters": [
{
"name": "index_uid",
"in": "path",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": {
"content": {
"application/json": {
"schema": {
"type": "array",
"items": {
"type": "object"
}
}
}
}
},
"responses": {
"202": {
"description": "Accepted"
}
}
}
}))
.unwrap()
}
fn create_mock_get_document_path_item() -> PathItem {
serde_json::from_value(json!({
"get": {
"summary": "Get a specific document",
"parameters": [
{
"name": "index_uid",
"in": "path",
"required": true,
"schema": {
"type": "string"
}
},
{
"name": "document_id",
"in": "path",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": {
"200": {
"description": "Document found"
}
}
}
}))
.unwrap()
}
fn create_mock_openapi() -> OpenApi {
serde_json::from_value(json!({
"openapi": "3.1.0",
"info": {
"title": "Meilisearch API",
"version": "1.0.0"
},
"paths": {
"/indexes": {
"get": {
"summary": "List all indexes",
"responses": {
"200": {
"description": "List of indexes"
}
}
},
"post": {
"summary": "Create an index",
"responses": {
"202": {
"description": "Index created"
}
}
}
},
"/indexes/{index_uid}": {
"get": {
"summary": "Get information about an index",
"parameters": [
{
"name": "index_uid",
"in": "path",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": {
"200": {
"description": "Index information"
}
}
}
},
"/indexes/{index_uid}/search": {
"post": {
"summary": "Search for documents in an index",
"parameters": [
{
"name": "index_uid",
"in": "path",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": {
"200": {
"description": "Search results"
}
}
}
}
}
}))
.unwrap()
}


@@ -1,285 +0,0 @@
use actix_web::{test, web, App};
use serde_json::json;
#[actix_rt::test]
async fn test_mcp_server_sse_communication() {
let app = test::init_service(
App::new()
.app_data(web::Data::new(crate::server::McpServer::new(
crate::registry::McpToolRegistry::new(),
)))
.route("/mcp", web::get().to(crate::server::mcp_sse_handler)),
)
.await;
let req = test::TestRequest::get()
.uri("/mcp")
.insert_header(("Accept", "text/event-stream"))
.to_request();
let resp = test::call_service(&app, req).await;
assert!(resp.status().is_success());
assert_eq!(
resp.headers().get("Content-Type").unwrap(),
"text/event-stream"
);
}
#[actix_rt::test]
async fn test_mcp_full_workflow() {
// This test simulates a complete MCP client-server interaction
let registry = create_test_registry();
let server = crate::server::McpServer::new(registry);
// 1. Initialize
let init_request = crate::protocol::JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "initialize".to_string(),
params: Some(json!({
"protocol_version": "2024-11-05",
"capabilities": {},
"client_info": {
"name": "test-client",
"version": "1.0.0"
}
})),
id: json!(1),
};
let init_response = server.handle_json_rpc_request(init_request).await;
assert!(matches!(init_response, crate::protocol::JsonRpcResponse::Success { .. }));
// 2. List tools
let list_request = crate::protocol::JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/list".to_string(),
params: None,
id: json!(2),
};
let list_response = server.handle_json_rpc_request(list_request).await;
let tools = match list_response {
crate::protocol::JsonRpcResponse::Success { result, .. } => {
let list_result: crate::protocol::ListToolsResult = serde_json::from_value(result).unwrap();
list_result.tools
},
_ => panic!("Expected success response"),
};
assert!(!tools.is_empty());
// 3. Call a tool
let call_request = crate::protocol::JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/call".to_string(),
params: Some(json!({
"name": tools[0].name.clone(),
"arguments": {
"indexUid": "test-index"
}
})),
id: json!(3),
};
let call_response = server.handle_json_rpc_request(call_request).await;
assert!(matches!(call_response, crate::protocol::JsonRpcResponse::Success { .. }));
}
#[actix_rt::test]
async fn test_mcp_authentication_integration() {
let registry = create_test_registry();
let server = crate::server::McpServer::new(registry);
// Test with valid API key
let request_with_auth = crate::protocol::JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/call".to_string(),
params: Some(json!({
"name": "getStats",
"arguments": {
"_auth": {
"apiKey": "test-api-key"
}
}
})),
id: json!(1),
};
let response = server.handle_json_rpc_request(request_with_auth).await;
// Depending on auth implementation, this should either succeed or fail appropriately
assert!(matches!(response,
crate::protocol::JsonRpcResponse::Success { .. } |
crate::protocol::JsonRpcResponse::Error { .. }
));
}
#[actix_rt::test]
async fn test_mcp_tool_execution_with_params() {
let registry = create_test_registry();
let server = crate::server::McpServer::new(registry);
// Test tool with complex parameters
let request = crate::protocol::JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/call".to_string(),
params: Some(json!({
"name": "searchDocuments",
"arguments": {
"indexUid": "products",
"q": "laptop",
"limit": 10,
"offset": 0,
"filter": "price > 500",
"sort": ["price:asc"],
"facets": ["brand", "category"]
}
})),
id: json!(1),
};
let response = server.handle_json_rpc_request(request).await;
match response {
crate::protocol::JsonRpcResponse::Success { result, .. } => {
let call_result: crate::protocol::CallToolResult = serde_json::from_value(result).unwrap();
assert!(!call_result.content.is_empty());
assert_eq!(call_result.content[0].content_type, "text");
// Verify the response contains search-related content
assert!(call_result.content[0].text.contains("search") ||
call_result.content[0].text.contains("products"));
}
_ => panic!("Expected success response"),
}
}
#[actix_rt::test]
async fn test_mcp_error_handling() {
let registry = create_test_registry();
let server = crate::server::McpServer::new(registry);
// Test with non-existent tool
let request = crate::protocol::JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/call".to_string(),
params: Some(json!({
"name": "nonExistentTool",
"arguments": {}
})),
id: json!(1),
};
let response = server.handle_json_rpc_request(request).await;
match response {
crate::protocol::JsonRpcResponse::Error { error, .. } => {
assert_eq!(error.code, crate::protocol::METHOD_NOT_FOUND);
assert!(error.message.contains("Tool not found"));
}
_ => panic!("Expected error response"),
}
// Test with invalid parameters
let request = crate::protocol::JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/call".to_string(),
params: Some(json!({
"name": "searchDocuments",
"arguments": {
// Missing required indexUid parameter
"q": "test"
}
})),
id: json!(2),
};
let response = server.handle_json_rpc_request(request).await;
match response {
crate::protocol::JsonRpcResponse::Error { error, .. } => {
assert_eq!(error.code, crate::protocol::INVALID_PARAMS);
assert!(error.message.contains("Invalid parameters") ||
error.message.contains("required"));
}
_ => panic!("Expected error response"),
}
}
#[actix_rt::test]
async fn test_mcp_protocol_version_negotiation() {
let server = crate::server::McpServer::new(crate::registry::McpToolRegistry::new());
// Test with different protocol versions
let request = crate::protocol::JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "initialize".to_string(),
params: Some(json!({
"protocol_version": "2024-01-01", // Old version
"capabilities": {},
"client_info": {
"name": "test-client",
"version": "1.0.0"
}
})),
id: json!(1),
};
let response = server.handle_json_rpc_request(request).await;
match response {
crate::protocol::JsonRpcResponse::Success { result, .. } => {
let init_result: crate::protocol::InitializeResult = serde_json::from_value(result).unwrap();
// Server should respond with its supported version
assert_eq!(init_result.protocol_version, "2024-11-05");
}
_ => panic!("Expected success response"),
}
}
fn create_test_registry() -> crate::registry::McpToolRegistry {
let mut registry = crate::registry::McpToolRegistry::new();
// Add test tools
registry.register_tool(crate::registry::McpTool {
name: "getStats".to_string(),
description: "Get server statistics".to_string(),
input_schema: json!({
"type": "object",
"properties": {},
"required": []
}),
http_method: "GET".to_string(),
path_template: "/stats".to_string(),
});
registry.register_tool(crate::registry::McpTool {
name: "searchDocuments".to_string(),
description: "Search for documents in an index".to_string(),
input_schema: json!({
"type": "object",
"properties": {
"indexUid": {
"type": "string",
"description": "The index UID"
},
"q": {
"type": "string",
"description": "Query string"
},
"limit": {
"type": "integer",
"description": "Maximum number of results"
},
"offset": {
"type": "integer",
"description": "Number of results to skip"
}
},
"required": ["indexUid"]
}),
http_method: "POST".to_string(),
path_template: "/indexes/{indexUid}/search".to_string(),
});
registry
}


@@ -1,49 +0,0 @@
use thiserror::Error;

#[derive(Debug, Error)]
pub enum Error {
    #[error("Protocol error: {0}")]
    Protocol(String),
    #[error("Tool not found: {0}")]
    ToolNotFound(String),
    #[error("Invalid parameters: {0}")]
    InvalidParameters(String),
    #[error("Authentication failed: {0}")]
    AuthenticationFailed(String),
    #[error("Internal error: {0}")]
    Internal(#[from] anyhow::Error),
    #[error("JSON error: {0}")]
    Json(#[from] serde_json::Error),
    #[error("Meilisearch error: {0}")]
    Meilisearch(String),
}

impl Error {
    pub fn to_mcp_error(&self) -> serde_json::Value {
        serde_json::json!({
            "jsonrpc": "2.0",
            "error": {
                "code": self.error_code(),
                "message": self.to_string(),
            }
        })
    }

    fn error_code(&self) -> i32 {
        match self {
            Error::Protocol(_) => -32700,
            Error::ToolNotFound(_) => -32601,
            Error::InvalidParameters(_) => -32602,
            Error::AuthenticationFailed(_) => -32000,
            Error::Internal(_) => -32603,
            Error::Json(_) => -32700,
            Error::Meilisearch(_) => -32000,
        }
    }
}


@@ -1,129 +0,0 @@
use crate::registry::McpToolRegistry;
use crate::server::{McpServer, MeilisearchClient};
use crate::Error;
use actix_web::{web, HttpResponse};
use serde_json::Value;
use utoipa::openapi::OpenApi;
pub struct MeilisearchMcpClient {
base_url: String,
client: reqwest::Client,
}
impl MeilisearchMcpClient {
pub fn new(base_url: String) -> Self {
Self {
base_url,
client: reqwest::Client::new(),
}
}
}
#[async_trait::async_trait]
impl MeilisearchClient for MeilisearchMcpClient {
async fn call_endpoint(
&self,
method: &str,
path: &str,
body: Option<Value>,
auth_header: Option<String>,
) -> Result<Value, Error> {
let url = format!("{}{}", self.base_url, path);
let mut request = match method {
"GET" => self.client.get(&url),
"POST" => self.client.post(&url),
"PUT" => self.client.put(&url),
"DELETE" => self.client.delete(&url),
"PATCH" => self.client.patch(&url),
_ => return Err(Error::Protocol(format!("Unsupported method: {}", method))),
};
if let Some(auth) = auth_header {
request = request.header("Authorization", auth);
}
if let Some(body) = body {
request = request.json(&body);
}
let response = request
.send()
.await
.map_err(|e| Error::Internal(e.into()))?;
if response.status().is_success() {
response
.json()
.await
.map_err(|e| Error::Internal(e.into()))
} else {
let status = response.status();
let error_body = response
.text()
.await
.unwrap_or_else(|_| "Failed to read error response".to_string());
Err(Error::Meilisearch(format!(
"Request failed with status {}: {}",
status, error_body
)))
}
}
}
pub fn create_mcp_server_from_openapi(openapi: OpenApi) -> McpServer {
// Create registry from OpenAPI
let registry = McpToolRegistry::from_openapi(&openapi);
// Create MCP server
McpServer::new(registry)
}
pub fn configure_mcp_route(cfg: &mut web::ServiceConfig, openapi: OpenApi) {
let server = create_mcp_server_from_openapi(openapi);
cfg.app_data(web::Data::new(server))
.service(
web::resource("/mcp")
.route(web::get().to(crate::server::mcp_sse_handler))
.route(web::post().to(mcp_post_handler))
.route(web::method(actix_web::http::Method::OPTIONS).to(mcp_options_handler))
);
}
async fn mcp_post_handler(
req_body: web::Json<crate::protocol::JsonRpcRequest>,
server: web::Data<McpServer>,
) -> Result<HttpResponse, actix_web::Error> {
let response = server.handle_json_rpc_request(req_body.into_inner()).await;
Ok(HttpResponse::Ok()
.insert_header(("Access-Control-Allow-Origin", "*"))
.insert_header(("Access-Control-Allow-Headers", "*"))
.json(response))
}
async fn mcp_options_handler() -> Result<HttpResponse, actix_web::Error> {
Ok(HttpResponse::Ok()
.insert_header(("Access-Control-Allow-Origin", "*"))
.insert_header(("Access-Control-Allow-Methods", "GET, POST, OPTIONS"))
.insert_header(("Access-Control-Allow-Headers", "*"))
.finish())
}
#[cfg(test)]
mod tests {
use super::*;
use utoipa::openapi::{OpenApiBuilder, InfoBuilder};
#[test]
fn test_create_mcp_server() {
let openapi = OpenApiBuilder::new()
.info(InfoBuilder::new()
.title("Test API")
.version("1.0")
.build())
.build();
// Building a server from a minimal spec must not panic.
let _server = create_mcp_server_from_openapi(openapi);
}
}


@@ -1,283 +0,0 @@
use crate::protocol::*;
use crate::server::McpServer;
use crate::registry::McpToolRegistry;
use serde_json::json;
#[tokio::test]
async fn test_mcp_initialize_request() {
let server = McpServer::new(McpToolRegistry::new());
let request = JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "initialize".to_string(),
params: Some(json!({
"protocol_version": "2024-11-05",
"capabilities": {},
"client_info": {
"name": "test-client",
"version": "1.0.0"
}
})),
id: json!(1),
};
let response = server.handle_json_rpc_request(request).await;
match response {
JsonRpcResponse::Success { result, .. } => {
let init_result: InitializeResult = serde_json::from_value(result).unwrap();
assert_eq!(init_result.protocol_version, "2024-11-05");
assert_eq!(init_result.server_info.name, "meilisearch-mcp");
assert!(init_result.capabilities.tools.list_changed);
}
_ => panic!("Expected success response"),
}
}
#[tokio::test]
async fn test_mcp_list_tools_request() {
let mut registry = McpToolRegistry::new();
registry.register_tool(crate::registry::McpTool {
name: "searchDocuments".to_string(),
description: "Search for documents".to_string(),
input_schema: json!({
"type": "object",
"properties": {
"indexUid": { "type": "string" },
"q": { "type": "string" }
},
"required": ["indexUid"]
}),
http_method: "POST".to_string(),
path_template: "/indexes/{index_uid}/search".to_string(),
});
let server = McpServer::new(registry);
let request = JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/list".to_string(),
params: None,
id: json!(2),
};
let response = server.handle_json_rpc_request(request).await;
match response {
JsonRpcResponse::Success { result, .. } => {
let list_result: ListToolsResult = serde_json::from_value(result).unwrap();
assert_eq!(list_result.tools.len(), 1);
assert_eq!(list_result.tools[0].name, "searchDocuments");
assert_eq!(list_result.tools[0].description, "Search for documents");
assert!(list_result.tools[0].input_schema["type"] == "object");
}
_ => panic!("Expected success response"),
}
}
#[tokio::test]
async fn test_mcp_call_tool_request_success() {
let mut registry = McpToolRegistry::new();
registry.register_tool(crate::registry::McpTool {
name: "getStats".to_string(),
description: "Get server statistics".to_string(),
input_schema: json!({
"type": "object",
"properties": {},
}),
http_method: "GET".to_string(),
path_template: "/stats".to_string(),
});
let server = McpServer::new(registry);
let request = JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/call".to_string(),
params: Some(json!({
"name": "getStats",
"arguments": {}
})),
id: json!(1),
};
let response = server.handle_json_rpc_request(request).await;
match response {
JsonRpcResponse::Success { result, .. } => {
let call_result: CallToolResult = serde_json::from_value(result).unwrap();
assert!(!call_result.content.is_empty());
assert_eq!(call_result.content[0].content_type, "text");
assert!(!call_result.is_error.unwrap_or(false));
}
_ => panic!("Expected success response"),
}
}
#[tokio::test]
async fn test_mcp_call_unknown_tool() {
let server = McpServer::new(McpToolRegistry::new());
let request = JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/call".to_string(),
params: Some(json!({
"name": "unknownTool",
"arguments": {}
})),
id: json!(1),
};
let response = server.handle_json_rpc_request(request).await;
match response {
JsonRpcResponse::Error { error, .. } => {
assert_eq!(error.code, crate::protocol::METHOD_NOT_FOUND);
assert!(error.message.contains("Tool not found"));
}
_ => panic!("Expected error response"),
}
}
#[tokio::test]
async fn test_mcp_call_tool_with_invalid_params() {
let mut registry = McpToolRegistry::new();
registry.register_tool(crate::registry::McpTool {
name: "searchDocuments".to_string(),
description: "Search for documents".to_string(),
input_schema: json!({
"type": "object",
"properties": {
"indexUid": { "type": "string" },
"q": { "type": "string" }
},
"required": ["indexUid"]
}),
http_method: "POST".to_string(),
path_template: "/indexes/{index_uid}/search".to_string(),
});
let server = McpServer::new(registry);
let request = JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "tools/call".to_string(),
params: Some(json!({
"name": "searchDocuments",
"arguments": {} // Missing required indexUid
})),
id: json!(1),
};
let response = server.handle_json_rpc_request(request).await;
match response {
JsonRpcResponse::Error { error, .. } => {
assert_eq!(error.code, crate::protocol::INVALID_PARAMS);
assert!(error.message.contains("Invalid parameters"));
}
_ => panic!("Expected error response"),
}
}
#[tokio::test]
async fn test_protocol_version_negotiation() {
let server = McpServer::new(McpToolRegistry::new());
let test_versions = vec![
"2024-11-05",
"2024-11-01", // Older version
"2025-01-01", // Future version
];
for version in test_versions {
let request = JsonRpcRequest {
jsonrpc: "2.0".to_string(),
method: "initialize".to_string(),
params: Some(json!({
"protocol_version": version,
"capabilities": {},
"client_info": {
"name": "test-client",
"version": "1.0.0"
}
})),
id: json!(1),
};
let response = server.handle_json_rpc_request(request).await;
match response {
JsonRpcResponse::Success { result, .. } => {
let init_result: InitializeResult = serde_json::from_value(result).unwrap();
// Server should always return its supported version
assert_eq!(init_result.protocol_version, "2024-11-05");
}
_ => panic!("Expected success response"),
}
}
}
#[tokio::test]
async fn test_json_rpc_response_serialization() {
let response = JsonRpcResponse::Success {
jsonrpc: "2.0".to_string(),
result: json!({
"protocol_version": "2024-11-05",
"capabilities": {
"tools": {
"list_changed": true
},
"experimental": {}
},
"server_info": {
"name": "meilisearch-mcp",
"version": env!("CARGO_PKG_VERSION")
}
}),
id: json!(1),
};
let serialized = serde_json::to_string(&response).unwrap();
let deserialized: JsonRpcResponse = serde_json::from_str(&serialized).unwrap();
match deserialized {
JsonRpcResponse::Success { result, .. } => {
assert_eq!(result["protocol_version"], "2024-11-05");
assert_eq!(result["server_info"]["name"], "meilisearch-mcp");
}
_ => panic!("Deserialization failed"),
}
}
#[tokio::test]
async fn test_tool_result_formatting() {
let result = CallToolResult {
content: vec![
ToolContent {
content_type: "text".to_string(),
text: "Success: Index created".to_string(),
},
],
is_error: None,
};
let serialized = serde_json::to_string(&result).unwrap();
assert!(serialized.contains("\"type\":\"text\""));
assert!(serialized.contains("Success: Index created"));
// The field is renamed to camelCase on serialization, so check for `isError`.
assert!(!serialized.contains("isError"));
}
#[tokio::test]
async fn test_error_response_formatting() {
let error_response = JsonRpcResponse::Error {
jsonrpc: "2.0".to_string(),
error: JsonRpcError {
code: -32601,
message: "Method not found".to_string(),
data: Some(json!({ "method": "unknownMethod" })),
},
id: json!(1),
};
let serialized = serde_json::to_string(&error_response).unwrap();
assert!(serialized.contains("\"code\":-32601"));
assert!(serialized.contains("Method not found"));
assert!(serialized.contains("unknownMethod"));
}


@@ -1,16 +0,0 @@
pub mod error;
pub mod integration;
pub mod protocol;
pub mod registry;
pub mod server;
#[cfg(test)]
mod conversion_tests;
#[cfg(test)]
mod integration_tests;
#[cfg(test)]
mod e2e_tests;
pub use error::Error;
pub use registry::McpToolRegistry;
pub use server::McpServer;


@@ -1,147 +0,0 @@
use serde::{Deserialize, Serialize};
use serde_json::Value;
// JSON-RPC 2.0 wrapper types
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct JsonRpcRequest {
pub jsonrpc: String,
pub method: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub params: Option<Value>,
pub id: Value,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(untagged)]
pub enum JsonRpcResponse {
Success {
jsonrpc: String,
result: Value,
id: Value,
},
Error {
jsonrpc: String,
error: JsonRpcError,
id: Value,
},
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct JsonRpcError {
pub code: i32,
pub message: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub data: Option<Value>,
}
// MCP-specific request types
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "method")]
pub enum McpRequest {
#[serde(rename = "initialize")]
Initialize {
#[serde(default)]
params: InitializeParams,
},
#[serde(rename = "tools/list")]
ListTools,
#[serde(rename = "tools/call")]
CallTool {
params: CallToolParams,
},
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
#[serde(rename_all = "camelCase")]
pub struct InitializeParams {
pub protocol_version: String,
pub capabilities: ClientCapabilities,
pub client_info: ClientInfo,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct ClientCapabilities {
#[serde(default)]
pub experimental: Value,
#[serde(default)]
pub sampling: Value,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
#[serde(rename_all = "camelCase")]
pub struct ClientInfo {
pub name: String,
pub version: String,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CallToolParams {
pub name: String,
#[serde(default)]
pub arguments: Value,
}
// Response types are now just the result objects, wrapped in JsonRpcResponse
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct InitializeResult {
pub protocol_version: String,
pub capabilities: ServerCapabilities,
pub server_info: ServerInfo,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ServerCapabilities {
pub tools: ToolsCapability,
#[serde(default)]
pub experimental: Value,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct ToolsCapability {
pub list_changed: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct ServerInfo {
pub name: String,
pub version: String,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ListToolsResult {
pub tools: Vec<Tool>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Tool {
pub name: String,
pub description: String,
#[serde(rename = "inputSchema")]
pub input_schema: Value,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct CallToolResult {
pub content: Vec<ToolContent>,
#[serde(skip_serializing_if = "Option::is_none")]
pub is_error: Option<bool>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ToolContent {
#[serde(rename = "type")]
pub content_type: String,
pub text: String,
}
// Standard JSON-RPC error codes
pub const PARSE_ERROR: i32 = -32700;
pub const INVALID_REQUEST: i32 = -32600;
pub const METHOD_NOT_FOUND: i32 = -32601;
pub const INVALID_PARAMS: i32 = -32602;
pub const INTERNAL_ERROR: i32 = -32603;


@@ -1,381 +0,0 @@
use crate::protocol::Tool;
use serde::{Deserialize, Serialize};
use serde_json::{json, Value};
use std::collections::HashMap;
use utoipa::openapi::{OpenApi, PathItem};
use utoipa::openapi::path::Operation;
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct McpTool {
pub name: String,
pub description: String,
#[serde(rename = "inputSchema")]
pub input_schema: Value,
pub http_method: String,
pub path_template: String,
}
pub struct McpToolRegistry {
tools: HashMap<String, McpTool>,
}
impl McpToolRegistry {
pub fn new() -> Self {
Self {
tools: HashMap::new(),
}
}
pub fn from_openapi(openapi: &OpenApi) -> Self {
let mut registry = Self::new();
// openapi.paths is of type Paths
for (path, path_item) in openapi.paths.paths.iter() {
registry.process_path_item(path, path_item);
}
registry
}
pub fn register_tool(&mut self, tool: McpTool) {
self.tools.insert(tool.name.clone(), tool);
}
pub fn get_tool(&self, name: &str) -> Option<&McpTool> {
self.tools.get(name)
}
pub fn list_tools(&self) -> Vec<Tool> {
self.tools
.values()
.map(|mcp_tool| Tool {
name: mcp_tool.name.clone(),
description: mcp_tool.description.clone(),
input_schema: mcp_tool.input_schema.clone(),
})
.collect()
}
fn process_path_item(&mut self, path: &str, path_item: &PathItem) {
let methods = [
("GET", &path_item.get),
("POST", &path_item.post),
("PUT", &path_item.put),
("DELETE", &path_item.delete),
("PATCH", &path_item.patch),
];
for (method_type, operation) in methods {
if let Some(op) = operation {
if let Some(tool) = McpTool::from_operation(path, method_type, op) {
self.register_tool(tool);
}
}
}
}
}
impl McpTool {
pub fn from_openapi_path(
path: &str,
method: &str,
path_item: &PathItem,
) -> Self {
// Get the operation based on method
let operation = match method.to_uppercase().as_str() {
"GET" => path_item.get.as_ref(),
"POST" => path_item.post.as_ref(),
"PUT" => path_item.put.as_ref(),
"DELETE" => path_item.delete.as_ref(),
"PATCH" => path_item.patch.as_ref(),
_ => None,
};
if let Some(op) = operation {
Self::from_operation(path, method, op).unwrap_or_else(|| {
// Fallback if operation parsing fails
let name = Self::generate_tool_name(path, method);
let description = format!("{} {}", method, path);
Self {
name,
description,
input_schema: json!({
"type": "object",
"properties": {},
"required": []
}),
http_method: method.to_string(),
path_template: path.to_string(),
}
})
} else {
// No operation found, use basic extraction
let name = Self::generate_tool_name(path, method);
let description = format!("{} {}", method, path);
// Extract path parameters from the path template
let mut properties = serde_json::Map::new();
let mut required = Vec::new();
// Find parameters in curly braces
let re = regex::Regex::new(r"\{([^}]+)\}").unwrap();
for cap in re.captures_iter(path) {
let param_name = &cap[1];
let camel_name = to_camel_case(param_name);
properties.insert(
camel_name.clone(),
json!({
"type": "string",
"description": format!("The {}", param_name.replace('_', " "))
}),
);
required.push(camel_name);
}
Self {
name,
description,
input_schema: json!({
"type": "object",
"properties": properties,
"required": required
}),
http_method: method.to_string(),
path_template: path.to_string(),
}
}
}
fn from_operation(path: &str, method: &str, operation: &Operation) -> Option<Self> {
let name = Self::generate_tool_name(path, method);
let description = operation
.summary
.as_ref()
.or(operation.description.as_ref())
.cloned()
.unwrap_or_else(|| format!("{} {}", method, path));
let mut properties = serde_json::Map::new();
let mut required = Vec::new();
// Extract path parameters
if let Some(params) = &operation.parameters {
for param in params {
let camel_name = to_camel_case(&param.name);
properties.insert(
camel_name.clone(),
json!({
"type": "string",
"description": param.description.as_deref().unwrap_or("")
}),
);
if matches!(param.required, utoipa::openapi::Required::True) {
required.push(camel_name);
}
}
}
// Extract request body schema
if let Some(request_body) = &operation.request_body {
if let Some(content) = request_body.content.get("application/json") {
if let Some(_schema) = &content.schema {
// Special handling for known endpoints
if path.contains("/documents") && method == "POST" {
// Document addition endpoint expects an array
properties.insert(
"documents".to_string(),
json!({
"type": "array",
"items": {"type": "object"},
"description": "Array of documents to add or update"
}),
);
required.push("documents".to_string());
} else if path.contains("/search") {
// Search endpoint has specific properties
properties.insert("q".to_string(), json!({"type": "string", "description": "Query string"}));
properties.insert("limit".to_string(), json!({"type": "integer", "description": "Maximum number of results", "default": 20}));
properties.insert("offset".to_string(), json!({"type": "integer", "description": "Number of results to skip", "default": 0}));
properties.insert("filter".to_string(), json!({"type": "string", "description": "Filter expression"}));
} else {
// Generic request body handling
properties.insert(
"body".to_string(),
json!({
"type": "object",
"description": "Request body"
}),
);
}
}
}
}
let input_schema = json!({
"type": "object",
"properties": properties,
"required": required,
});
Some(Self {
name,
description,
input_schema,
http_method: method.to_string(),
path_template: path.to_string(),
})
}
pub fn generate_tool_name(path: &str, method: &str) -> String {
let parts: Vec<&str> = path
.split('/')
.filter(|s| !s.is_empty() && !s.starts_with('{'))
.collect();
let resource = parts.last().unwrap_or(&"resource");
// Check if the path ends with a resource name (not a parameter)
let ends_with_param = path.ends_with('}');
match method.to_uppercase().as_str() {
"GET" => {
if ends_with_param {
// Getting a single resource by ID
format!("get{}", to_pascal_case(&singularize(resource)))
} else {
// Getting a collection
if resource == &"keys" {
"getApiKeys".to_string()
} else if resource.ends_with('s') {
format!("get{}", to_pascal_case(resource))
} else {
format!("get{}", to_pascal_case(&pluralize(resource)))
}
}
}
"POST" => {
if resource == &"search" {
"searchDocuments".to_string()
} else if resource == &"multi-search" {
"multiSearch".to_string()
} else if resource == &"swap-indexes" {
"swapIndexes".to_string()
} else if resource == &"documents" {
"addDocuments".to_string()
} else if resource == &"keys" {
"createApiKey".to_string()
} else {
format!("create{}", to_pascal_case(&singularize(resource)))
}
}
"PUT" => format!("update{}", to_pascal_case(&singularize(resource))),
"DELETE" => {
if resource == &"documents" && !ends_with_param {
"deleteDocuments".to_string()
} else {
format!("delete{}", to_pascal_case(&singularize(resource)))
}
},
"PATCH" => {
if resource == &"settings" {
"updateSettings".to_string()
} else {
format!("update{}", to_pascal_case(&singularize(resource)))
}
},
_ => format!("{}{}", method.to_lowercase(), to_pascal_case(resource)),
}
}
}
fn to_camel_case(s: &str) -> String {
let parts: Vec<&str> = s.split(&['_', '-'][..]).collect();
if parts.is_empty() {
return String::new();
}
let mut result = parts[0].to_lowercase();
for part in &parts[1..] {
result.push_str(&to_pascal_case(part));
}
result
}
fn to_pascal_case(s: &str) -> String {
s.split(&['_', '-'][..])
.map(|part| {
let mut chars = part.chars();
chars
.next()
.map(|c| c.to_uppercase().collect::<String>() + chars.as_str().to_lowercase().as_str())
.unwrap_or_default()
})
.collect()
}
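fn singularize(word: &str) -> String {
// Suffixes after which a trailing "es" is a plural marker ("indexes", "batches").
const ES_PLURALS: [&str; 5] = ["xes", "zes", "ches", "shes", "sses"];
if word.ends_with("ies") {
word[..word.len() - 3].to_string() + "y"
} else if ES_PLURALS.iter().any(|suffix| word.ends_with(*suffix)) {
word[..word.len() - 2].to_string()
} else if word.ends_with('s') {
// Covers plain plurals, including words such as "features" whose
// singular keeps the final "e".
word[..word.len() - 1].to_string()
} else {
word.to_string()
}
}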
fn pluralize(word: &str) -> String {
if word.ends_with('y') {
word[..word.len() - 1].to_string() + "ies"
} else if word.ends_with('s') || word.ends_with('x') || word.ends_with("ch") {
word.to_string() + "es"
} else {
word.to_string() + "s"
}
}
fn extract_schema_properties(schema: &utoipa::openapi::RefOr<utoipa::openapi::Schema>) -> Option<serde_json::Map<String, Value>> {
// This is a simplified extraction - in a real implementation,
// we would properly handle $ref resolution and nested schemas
match schema {
utoipa::openapi::RefOr::T(_schema) => {
// Extract properties from the schema
// This would need proper implementation based on the schema type
Some(serde_json::Map::new())
}
utoipa::openapi::RefOr::Ref { .. } => {
// Handle schema references
Some(serde_json::Map::new())
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_tool_name_generation() {
assert_eq!(McpTool::generate_tool_name("/indexes", "GET"), "getIndexes");
assert_eq!(McpTool::generate_tool_name("/indexes/{index_uid}", "GET"), "getIndex");
assert_eq!(McpTool::generate_tool_name("/indexes/{index_uid}/search", "POST"), "searchDocuments");
}
#[test]
fn test_camel_case_conversion() {
assert_eq!(to_camel_case("index_uid"), "indexUid");
assert_eq!(to_camel_case("document-id"), "documentId");
assert_eq!(to_camel_case("simple"), "simple");
}
#[test]
fn test_pascal_case_conversion() {
assert_eq!(to_pascal_case("index"), "Index");
assert_eq!(to_pascal_case("multi-search"), "MultiSearch");
assert_eq!(to_pascal_case("api_key"), "ApiKey");
}
}
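
The fallback branch of `McpTool::from_openapi_path` above compiles a regex on every call just to find the `{param}` segments of a path template. As a minimal sketch, assuming only the standard library is wanted, the same extraction can be done with `str::find`. The helper names `path_params` and `camel_case` are hypothetical and are not part of the crate shown in this diff.

```rust
/// Hypothetical helper: collects the names found between `{` and `}`
/// in a path template such as `/indexes/{index_uid}`.
fn path_params(template: &str) -> Vec<String> {
    let mut params = Vec::new();
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        let after_open = &rest[open + 1..];
        match after_open.find('}') {
            Some(close) => {
                let name = &after_open[..close];
                if !name.is_empty() {
                    params.push(name.to_string());
                }
                rest = &after_open[close + 1..];
            }
            // Unbalanced brace: stop scanning instead of guessing.
            None => break,
        }
    }
    params
}

/// Hypothetical helper: converts `snake_case` or `kebab-case` to `camelCase`.
fn camel_case(s: &str) -> String {
    let mut out = String::new();
    let mut upper_next = false;
    for ch in s.chars() {
        if ch == '_' || ch == '-' {
            // A leading separator does not trigger capitalization.
            upper_next = !out.is_empty();
        } else if upper_next {
            out.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            out.extend(ch.to_lowercase());
        }
    }
    out
}
```

Mapping `path_params(path)` through `camel_case` would give the property names that the fallback branch inserts into `properties` and `required`.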

View File

@@ -1,325 +0,0 @@
use crate::error::Error;
use crate::protocol::*;
use crate::registry::McpToolRegistry;
use actix_web::{web, HttpRequest, HttpResponse};
use async_stream::try_stream;
use futures::stream::{StreamExt, TryStreamExt};
use serde_json::{json, Value};
use std::sync::Arc;
pub struct McpServer {
registry: Arc<McpToolRegistry>,
meilisearch_client: Option<Arc<dyn MeilisearchClient>>,
}
#[async_trait::async_trait]
pub trait MeilisearchClient: Send + Sync {
async fn call_endpoint(
&self,
method: &str,
path: &str,
body: Option<Value>,
auth_header: Option<String>,
) -> Result<Value, Error>;
}
impl McpServer {
pub fn new(registry: McpToolRegistry) -> Self {
Self {
registry: Arc::new(registry),
meilisearch_client: None,
}
}
pub fn with_client(mut self, client: Arc<dyn MeilisearchClient>) -> Self {
self.meilisearch_client = Some(client);
self
}
pub async fn handle_json_rpc_request(&self, request: JsonRpcRequest) -> JsonRpcResponse {
// Parse the method and params
let result = match request.method.as_str() {
"initialize" => {
let params: InitializeParams = match request.params {
Some(p) => match serde_json::from_value(p) {
Ok(params) => params,
Err(e) => return self.error_response(request.id, INVALID_PARAMS, &format!("Invalid params: {}", e)),
},
None => InitializeParams::default(),
};
self.handle_initialize(params)
}
"tools/list" => self.handle_list_tools(),
"tools/call" => {
let params: CallToolParams = match request.params {
Some(p) => match serde_json::from_value(p) {
Ok(params) => params,
Err(e) => return self.error_response(request.id, INVALID_PARAMS, &format!("Invalid params: {}", e)),
},
None => return self.error_response(request.id, INVALID_PARAMS, "Missing params"),
};
self.handle_call_tool(params).await
}
_ => return self.error_response(request.id, METHOD_NOT_FOUND, &format!("Method not found: {}", request.method)),
};
match result {
Ok(value) => JsonRpcResponse::Success {
jsonrpc: "2.0".to_string(),
result: value,
id: request.id,
},
Err((code, message, data)) => JsonRpcResponse::Error {
jsonrpc: "2.0".to_string(),
error: JsonRpcError { code, message, data },
id: request.id,
},
}
}
fn error_response(&self, id: Value, code: i32, message: &str) -> JsonRpcResponse {
JsonRpcResponse::Error {
jsonrpc: "2.0".to_string(),
error: JsonRpcError {
code,
message: message.to_string(),
data: None,
},
id,
}
}
fn handle_initialize(&self, _params: InitializeParams) -> Result<Value, (i32, String, Option<Value>)> {
let result = InitializeResult {
protocol_version: "2024-11-05".to_string(),
capabilities: ServerCapabilities {
tools: ToolsCapability {
list_changed: true,
},
experimental: json!({}),
},
server_info: ServerInfo {
name: "meilisearch-mcp".to_string(),
version: env!("CARGO_PKG_VERSION").to_string(),
},
};
Ok(serde_json::to_value(result).unwrap())
}
fn handle_list_tools(&self) -> Result<Value, (i32, String, Option<Value>)> {
let tools = self.registry.list_tools();
let result = ListToolsResult { tools };
Ok(serde_json::to_value(result).unwrap())
}
async fn handle_call_tool(&self, params: CallToolParams) -> Result<Value, (i32, String, Option<Value>)> {
// Get the tool definition
let tool = match self.registry.get_tool(&params.name) {
Some(tool) => tool,
None => {
return Err((
METHOD_NOT_FOUND,
format!("Tool not found: {}", params.name),
None,
));
}
};
// Validate parameters
if let Err(e) = self.validate_parameters(&params.arguments, &tool.input_schema) {
return Err((
INVALID_PARAMS,
format!("Invalid parameters: {}", e),
Some(json!({ "schema": tool.input_schema })),
));
}
// Execute the tool
match self.execute_tool(tool, params.arguments).await {
Ok(result_text) => {
let result = CallToolResult {
content: vec![ToolContent {
content_type: "text".to_string(),
text: result_text,
}],
is_error: None,
};
Ok(serde_json::to_value(result).unwrap())
}
Err(e) => Err((
INTERNAL_ERROR,
format!("Tool execution failed: {}", e),
None,
)),
}
}
fn validate_parameters(&self, args: &Value, schema: &Value) -> Result<(), String> {
// Check if args is an object
if !args.is_object() {
return Err("Arguments must be an object".to_string());
}
// Basic validation - check required fields
if let (Some(args_obj), Some(schema_obj)) = (args.as_object(), schema.as_object()) {
if let Some(required) = schema_obj.get("required").and_then(|r| r.as_array()) {
for req_field in required {
if let Some(field_name) = req_field.as_str() {
if !args_obj.contains_key(field_name) {
return Err(format!("Missing required field: {}", field_name));
}
}
}
}
}
Ok(())
}
async fn execute_tool(
&self,
tool: &crate::registry::McpTool,
mut arguments: Value,
) -> Result<String, Error> {
// Extract authentication if provided
let auth_header = arguments
.as_object_mut()
.and_then(|obj| obj.remove("_auth"))
.and_then(|auth| {
auth.get("apiKey")
.and_then(|k| k.as_str())
.map(|s| s.to_string())
})
.map(|key| format!("Bearer {}", key));
// Build the actual path by replacing parameters
let mut path = tool.path_template.clone();
if let Some(args_obj) = arguments.as_object() {
for (key, value) in args_obj {
let param_pattern = format!("{{{}}}", camel_to_snake_case(key));
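// Accept string and numeric values so that integer IDs also fill the path.
let val_str = match value {
Value::String(s) => s.clone(),
Value::Number(n) => n.to_string(),
_ => continue,
};
path = path.replace(&param_pattern, &val_str);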
}
}
// Prepare request body for POST/PUT/PATCH methods
let body = match tool.http_method.as_str() {
"POST" | "PUT" | "PATCH" => {
// Remove path parameters from body
if let Some(args_obj) = arguments.as_object_mut() {
let mut body_obj = args_obj.clone();
// Remove any parameters that were used in the path
for (key, _) in args_obj.iter() {
let param_pattern = format!("{{{}}}", camel_to_snake_case(key));
if tool.path_template.contains(&param_pattern) {
body_obj.remove(key);
}
}
Some(Value::Object(body_obj))
} else {
Some(arguments.clone())
}
}
_ => None,
};
// Execute the request
if let Some(client) = &self.meilisearch_client {
match client.call_endpoint(&tool.http_method, &path, body, auth_header).await {
Ok(response) => Ok(serde_json::to_string_pretty(&response)?),
Err(e) => Err(e),
}
} else {
// Mock response for testing
Ok(json!({
"status": "success",
"message": format!("Executed {} {}", tool.http_method, path)
})
.to_string())
}
}
}
pub async fn mcp_sse_handler(
req: HttpRequest,
_server: web::Data<McpServer>,
) -> Result<HttpResponse, actix_web::Error> {
// MCP SSE transport implementation
// This endpoint handles server-to-client messages via SSE
// Client-to-server messages come via POST requests
// Check for session ID header
let session_id = req.headers()
.get("Mcp-Session-Id")
.and_then(|h| h.to_str().ok())
.map(|s| s.to_string())
.unwrap_or_else(|| uuid::Uuid::new_v4().to_string());
// Check for Last-Event-ID header for resumability
let _last_event_id = req.headers()
.get("Last-Event-ID")
.and_then(|h| h.to_str().ok())
.and_then(|s| s.parse::<u64>().ok());
// Create a channel for this SSE connection
let (_tx, mut rx) = tokio::sync::mpsc::unbounded_channel::<String>();
// Store the sender for this session (in a real implementation, you'd use a shared state)
// For now, we'll just keep the connection open
let stream = try_stream! {
// Always send the endpoint event first
yield format!("event: endpoint\ndata: {{\"uri\": \"/mcp\"}}\n\n");
// Keep connection alive and handle any messages
loop {
tokio::select! {
Some(message) = rx.recv() => {
yield message;
}
_ = tokio::time::sleep(tokio::time::Duration::from_secs(30)) => {
yield ": keepalive\n\n".to_string();
}
}
}
};
let mut response = HttpResponse::Ok();
response.content_type("text/event-stream");
response.insert_header(("Cache-Control", "no-cache"));
response.insert_header(("Connection", "keep-alive"));
response.insert_header(("X-Accel-Buffering", "no"));
response.insert_header(("Access-Control-Allow-Origin", "*"));
response.insert_header(("Access-Control-Allow-Headers", "*"));
response.insert_header(("Mcp-Session-Id", session_id));
Ok(response.streaming(stream.map(|result: Result<String, anyhow::Error>| {
result.map(actix_web::web::Bytes::from)
}).map_err(actix_web::error::ErrorInternalServerError)))
}
fn camel_to_snake_case(s: &str) -> String {
let mut result = String::new();
for (i, ch) in s.chars().enumerate() {
if ch.is_uppercase() && i > 0 {
result.push('_');
}
result.push(ch.to_lowercase().next().unwrap());
}
result
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_camel_to_snake_case() {
assert_eq!(camel_to_snake_case("indexUid"), "index_uid");
assert_eq!(camel_to_snake_case("documentId"), "document_id");
assert_eq!(camel_to_snake_case("simple"), "simple");
}
}
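
`execute_tool` above fills `{snake_case}` placeholders from camelCase arguments and then sends the remaining arguments as the request body. The sketch below isolates that step with the standard library only, assuming all argument values are already strings. `expand_path` is a hypothetical helper, and its two checks (rejecting `/`, `{` and `}` in path values, and failing on unresolved placeholders) are assumptions added for illustration; the original code performs neither.

```rust
use std::collections::BTreeMap;

/// Converts `camelCase` to `snake_case`, mirroring `camel_to_snake_case` above.
fn camel_to_snake_case(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 4);
    for (i, ch) in s.chars().enumerate() {
        if ch.is_uppercase() && i > 0 {
            out.push('_');
        }
        out.extend(ch.to_lowercase());
    }
    out
}

/// Hypothetical helper: fills `{snake_case}` placeholders from camelCase
/// arguments and returns the path plus the arguments that were not used.
fn expand_path(
    template: &str,
    args: &BTreeMap<String, String>,
) -> Result<(String, BTreeMap<String, String>), String> {
    let mut path = template.to_string();
    let mut rest = BTreeMap::new();
    for (key, value) in args {
        let placeholder = format!("{{{}}}", camel_to_snake_case(key));
        if path.contains(&placeholder) {
            // Assumption: path values must not change the route structure.
            if value.contains(|c: char| matches!(c, '{' | '}' | '/')) {
                return Err(format!("unsafe path value for {}", key));
            }
            path = path.replace(&placeholder, value);
        } else {
            rest.insert(key.clone(), value.clone());
        }
    }
    // Any placeholder left means a path argument was missing.
    if path.contains('{') {
        return Err(format!("unresolved placeholder in {}", path));
    }
    Ok((path, rest))
}
```

With the template `/indexes/{index_uid}/search`, an `indexUid` argument would end up in the path and a `q` argument would stay in the leftover map that becomes the body.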

View File

@@ -387,7 +387,8 @@ VectorEmbeddingError , InvalidRequest , BAD_REQUEST ;
NotFoundSimilarId , InvalidRequest , BAD_REQUEST ;
InvalidDocumentEditionContext , InvalidRequest , BAD_REQUEST ;
InvalidDocumentEditionFunctionFilter , InvalidRequest , BAD_REQUEST ;
- EditDocumentsByFunctionError , InvalidRequest , BAD_REQUEST
+ EditDocumentsByFunctionError , InvalidRequest , BAD_REQUEST ;
+ InvalidSettingsIndexChat , InvalidRequest , BAD_REQUEST
}
impl ErrorCode for JoinError {

View File

@@ -158,6 +158,21 @@ impl Key {
updated_at: now,
}
}
pub fn default_chat() -> Self {
let now = OffsetDateTime::now_utc();
let uid = Uuid::new_v4();
Self {
name: Some("Default Chat API Key".to_string()),
description: Some("Use it to chat and search from the frontend".to_string()),
uid,
actions: vec![Action::Chat, Action::Search],
indexes: vec![IndexUidPattern::all()],
expires_at: None,
created_at: now,
updated_at: now,
}
}
}
fn parse_expiration_date(
@@ -308,6 +323,18 @@ pub enum Action {
#[serde(rename = "network.update")]
#[deserr(rename = "network.update")]
NetworkUpdate,
#[serde(rename = "chat.get")]
#[deserr(rename = "chat.get")]
Chat,
#[serde(rename = "chatSettings.*")]
#[deserr(rename = "chatSettings.*")]
ChatSettingsAll,
#[serde(rename = "chatSettings.get")]
#[deserr(rename = "chatSettings.get")]
ChatSettingsGet,
#[serde(rename = "chatSettings.update")]
#[deserr(rename = "chatSettings.update")]
ChatSettingsUpdate,
}
impl Action {
@@ -333,6 +360,9 @@ impl Action {
SETTINGS_ALL => Some(Self::SettingsAll),
SETTINGS_GET => Some(Self::SettingsGet),
SETTINGS_UPDATE => Some(Self::SettingsUpdate),
CHAT_SETTINGS_ALL => Some(Self::ChatSettingsAll),
CHAT_SETTINGS_GET => Some(Self::ChatSettingsGet),
CHAT_SETTINGS_UPDATE => Some(Self::ChatSettingsUpdate),
STATS_ALL => Some(Self::StatsAll),
STATS_GET => Some(Self::StatsGet),
METRICS_ALL => Some(Self::MetricsAll),
@@ -349,6 +379,7 @@ impl Action {
EXPERIMENTAL_FEATURES_UPDATE => Some(Self::ExperimentalFeaturesUpdate),
NETWORK_GET => Some(Self::NetworkGet),
NETWORK_UPDATE => Some(Self::NetworkUpdate),
CHAT => Some(Self::Chat),
_otherwise => None,
}
}
@@ -397,4 +428,9 @@ pub mod actions {
pub const NETWORK_GET: u8 = NetworkGet.repr();
pub const NETWORK_UPDATE: u8 = NetworkUpdate.repr();
pub const CHAT: u8 = Chat.repr();
pub const CHAT_SETTINGS_ALL: u8 = ChatSettingsAll.repr();
pub const CHAT_SETTINGS_GET: u8 = ChatSettingsGet.repr();
pub const CHAT_SETTINGS_UPDATE: u8 = ChatSettingsUpdate.repr();
}

View File

@@ -11,11 +11,13 @@ use fst::IntoStreamer;
use milli::disabled_typos_terms::DisabledTyposTerms;
use milli::index::{IndexEmbeddingConfig, PrefixSearch};
use milli::proximity::ProximityPrecision;
pub use milli::update::ChatSettings;
use milli::update::Setting;
use milli::{Criterion, CriterionError, FilterableAttributesRule, Index, DEFAULT_VALUES_PER_FACET};
use serde::{Deserialize, Serialize, Serializer};
use utoipa::ToSchema;
use super::{Checked, Unchecked};
use crate::deserr::DeserrJsonError;
use crate::error::deserr_codes::*;
use crate::facet_values_sort::FacetValuesSort;
@@ -199,72 +201,86 @@ pub struct Settings<T> {
#[deserr(default, error = DeserrJsonError<InvalidSettingsDisplayedAttributes>)]
#[schema(value_type = Option<Vec<String>>, example = json!(["id", "title", "description", "url"]))]
pub displayed_attributes: WildcardSetting,
/// Fields in which to search for matching query words sorted by order of importance.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsSearchableAttributes>)]
#[schema(value_type = Option<Vec<String>>, example = json!(["title", "description"]))]
pub searchable_attributes: WildcardSetting,
/// Attributes to use for faceting and filtering. See [Filtering and Faceted Search](https://www.meilisearch.com/docs/learn/filtering_and_sorting/search_with_facet_filters).
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsFilterableAttributes>)]
#[schema(value_type = Option<Vec<FilterableAttributesRule>>, example = json!(["release_date", "genre"]))]
pub filterable_attributes: Setting<Vec<FilterableAttributesRule>>,
/// Attributes to use when sorting search results.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsSortableAttributes>)]
#[schema(value_type = Option<Vec<String>>, example = json!(["release_date"]))]
pub sortable_attributes: Setting<BTreeSet<String>>,
/// List of ranking rules sorted by order of importance. The order is customizable.
/// [A list of ordered built-in ranking rules](https://www.meilisearch.com/docs/learn/relevancy/relevancy).
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsRankingRules>)]
#[schema(value_type = Option<Vec<String>>, example = json!([RankingRuleView::Words, RankingRuleView::Typo, RankingRuleView::Proximity, RankingRuleView::Attribute, RankingRuleView::Exactness]))]
pub ranking_rules: Setting<Vec<RankingRuleView>>,
/// List of words ignored when present in search queries.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsStopWords>)]
#[schema(value_type = Option<Vec<String>>, example = json!(["the", "a", "them", "their"]))]
pub stop_words: Setting<BTreeSet<String>>,
/// List of characters not delimiting where one term begins and ends.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsNonSeparatorTokens>)]
#[schema(value_type = Option<Vec<String>>, example = json!([" ", "\n"]))]
pub non_separator_tokens: Setting<BTreeSet<String>>,
/// List of characters delimiting where one term begins and ends.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsSeparatorTokens>)]
#[schema(value_type = Option<Vec<String>>, example = json!(["S"]))]
pub separator_tokens: Setting<BTreeSet<String>>,
/// List of strings Meilisearch should parse as a single term.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsDictionary>)]
#[schema(value_type = Option<Vec<String>>, example = json!(["iPhone pro"]))]
pub dictionary: Setting<BTreeSet<String>>,
/// List of associated words treated similarly. A word associated to an array of word as synonyms.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsSynonyms>)]
#[schema(value_type = Option<BTreeMap<String, Vec<String>>>, example = json!({ "he": ["she", "they", "them"], "phone": ["iPhone", "android"]}))]
pub synonyms: Setting<BTreeMap<String, Vec<String>>>,
/// Search returns documents with distinct (different) values of the given field.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsDistinctAttribute>)]
#[schema(value_type = Option<String>, example = json!("sku"))]
pub distinct_attribute: Setting<String>,
/// Precision level when calculating the proximity ranking rule.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsProximityPrecision>)]
#[schema(value_type = Option<String>, example = json!(ProximityPrecisionView::ByAttribute))]
pub proximity_precision: Setting<ProximityPrecisionView>,
/// Customize typo tolerance feature.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsTypoTolerance>)]
#[schema(value_type = Option<TypoSettings>, example = json!({ "enabled": true, "disableOnAttributes": ["title"]}))]
pub typo_tolerance: Setting<TypoSettings>,
/// Faceting settings.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsFaceting>)]
#[schema(value_type = Option<FacetingSettings>, example = json!({ "maxValuesPerFacet": 10, "sortFacetValuesBy": { "genre": FacetValuesSort::Count }}))]
pub faceting: Setting<FacetingSettings>,
/// Pagination settings.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsPagination>)]
@@ -276,24 +292,34 @@ pub struct Settings<T> {
#[deserr(default, error = DeserrJsonError<InvalidSettingsEmbedders>)]
#[schema(value_type = Option<BTreeMap<String, SettingEmbeddingSettings>>)]
pub embedders: Setting<BTreeMap<String, SettingEmbeddingSettings>>,
/// Maximum duration of a search query.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsSearchCutoffMs>)]
#[schema(value_type = Option<u64>, example = json!(50))]
pub search_cutoff_ms: Setting<u64>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsLocalizedAttributes>)]
#[schema(value_type = Option<Vec<LocalizedAttributesRuleView>>, example = json!(50))]
pub localized_attributes: Setting<Vec<LocalizedAttributesRuleView>>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsFacetSearch>)]
#[schema(value_type = Option<bool>, example = json!(true))]
pub facet_search: Setting<bool>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsPrefixSearch>)]
#[schema(value_type = Option<PrefixSearchSettings>, example = json!("Hemlo"))]
pub prefix_search: Setting<PrefixSearchSettings>,
/// Customize the chat prompting.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default, error = DeserrJsonError<InvalidSettingsIndexChat>)]
#[schema(value_type = Option<ChatSettings>)]
pub chat: Setting<ChatSettings>,
#[serde(skip)]
#[deserr(skip)]
pub _kind: PhantomData<T>,
@@ -359,6 +385,7 @@ impl Settings<Checked> {
localized_attributes: Setting::Reset,
facet_search: Setting::Reset,
prefix_search: Setting::Reset,
chat: Setting::Reset,
_kind: PhantomData,
}
}
@@ -385,6 +412,7 @@ impl Settings<Checked> {
localized_attributes: localized_attributes_rules,
facet_search,
prefix_search,
chat,
_kind,
} = self;
@@ -409,6 +437,7 @@ impl Settings<Checked> {
localized_attributes: localized_attributes_rules,
facet_search,
prefix_search,
chat,
_kind: PhantomData,
}
}
@@ -459,6 +488,7 @@ impl Settings<Unchecked> {
localized_attributes: self.localized_attributes,
facet_search: self.facet_search,
prefix_search: self.prefix_search,
chat: self.chat,
_kind: PhantomData,
}
}
@@ -533,8 +563,9 @@ impl Settings<Unchecked> {
Setting::Set(this)
}
},
- prefix_search: other.prefix_search.or(self.prefix_search),
facet_search: other.facet_search.or(self.facet_search),
+ prefix_search: other.prefix_search.or(self.prefix_search),
+ chat: other.chat.clone().or(self.chat.clone()),
_kind: PhantomData,
}
}
@@ -573,6 +604,7 @@ pub fn apply_settings_to_builder(
localized_attributes: localized_attributes_rules,
facet_search,
prefix_search,
chat,
_kind,
} = settings;
@@ -783,6 +815,12 @@ pub fn apply_settings_to_builder(
Setting::Reset => builder.reset_facet_search(),
Setting::NotSet => (),
}
match chat {
Setting::Set(chat) => builder.set_chat(chat.clone()),
Setting::Reset => builder.reset_chat(),
Setting::NotSet => (),
}
}
pub enum SecretPolicy {
@@ -880,14 +918,11 @@ pub fn settings(
})
.collect();
let embedders = Setting::Set(embedders);
let search_cutoff_ms = index.search_cutoff(rtxn)?;
let localized_attributes_rules = index.localized_attributes_rules(rtxn)?;
let prefix_search = index.prefix_search(rtxn)?.map(PrefixSearchSettings::from);
let facet_search = index.facet_search(rtxn)?;
let chat = index.chat_config(rtxn).map(ChatSettings::from)?;
let mut settings = Settings {
displayed_attributes: match displayed_attributes {
@@ -925,8 +960,9 @@ pub fn settings(
Some(rules) => Setting::Set(rules.into_iter().map(|r| r.into()).collect()),
None => Setting::Reset,
},
- prefix_search: Setting::Set(prefix_search.unwrap_or_default()),
facet_search: Setting::Set(facet_search),
+ prefix_search: Setting::Set(prefix_search.unwrap_or_default()),
+ chat: Setting::Set(chat),
_kind: PhantomData,
};
@@ -1154,6 +1190,7 @@ pub(crate) mod test {
search_cutoff_ms: Setting::NotSet,
facet_search: Setting::NotSet,
prefix_search: Setting::NotSet,
chat: Setting::NotSet,
_kind: PhantomData::<Unchecked>,
};
@@ -1185,6 +1222,8 @@ pub(crate) mod test {
search_cutoff_ms: Setting::NotSet,
facet_search: Setting::NotSet,
prefix_search: Setting::NotSet,
chat: Setting::NotSet,
_kind: PhantomData::<Unchecked>,
};
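
The new `chat` field follows the same three-state pattern as the other settings: `Set` writes a value, `Reset` restores the default, and `NotSet` leaves the stored value alone. The sketch below is a simplified stand-in for that pattern, not milli's actual `Setting<T>`. The semantics of `or` (keep the receiver unless it is `NotSet`) are an assumption about the real method, and `ChatStore` and `apply` are hypothetical names used only here.

```rust
/// Simplified stand-in for a three-state setting; the real type has more methods.
#[derive(Debug, Clone, PartialEq)]
enum Setting<T> {
    Set(T),
    Reset,
    NotSet,
}

impl<T> Setting<T> {
    /// Keeps `self` unless it is `NotSet`, in which case `other` wins.
    fn or(self, other: Self) -> Self {
        match self {
            Setting::NotSet => other,
            chosen => chosen,
        }
    }
}

/// Hypothetical store holding one optional prompt.
#[derive(Debug, Default, PartialEq)]
struct ChatStore {
    prompt: Option<String>,
}

/// Applies a setting the way the `match chat { ... }` arm above does:
/// only `Set` and `Reset` touch the stored value.
fn apply(store: &mut ChatStore, chat: Setting<String>) {
    match chat {
        Setting::Set(prompt) => store.prompt = Some(prompt),
        Setting::Reset => store.prompt = None,
        Setting::NotSet => (),
    }
}
```

Under this reading, `other.chat.clone().or(self.chat.clone())` lets a newer update override an older one while an absent field in the newer update falls back to the older value.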

View File

@@ -32,6 +32,7 @@ async-trait = "0.1.85"
bstr = "1.11.3"
byte-unit = { version = "5.1.6", features = ["serde"] }
bytes = "1.9.0"
bumpalo = "3.16.0"
clap = { version = "4.5.24", features = ["derive", "env"] }
crossbeam-channel = "0.5.15"
deserr = { version = "0.6.3", features = ["actix-web"] }
@@ -48,9 +49,9 @@ is-terminal = "0.4.13"
itertools = "0.14.0"
jsonwebtoken = "9.3.0"
lazy_static = "1.5.0"
+ liquid = "0.26.9"
meilisearch-auth = { path = "../meilisearch-auth" }
meilisearch-types = { path = "../meilisearch-types" }
- meilisearch-mcp = { path = "../meilisearch-mcp", optional = true }
mimalloc = { version = "0.1.43", default-features = false }
mime = "0.3.17"
num_cpus = "1.16.0"
@@ -112,6 +113,8 @@ utoipa = { version = "5.3.1", features = [
"openapi_extensions",
] }
utoipa-scalar = { version = "0.3.0", optional = true, features = ["actix-web"] }
async-openai = { git = "https://github.com/meilisearch/async-openai", branch = "optional-type-function" }
actix-web-lab = { version = "0.24.1", default-features = false }
[dev-dependencies]
actix-rt = "2.10.0"
@@ -143,7 +146,6 @@ zip = { version = "2.3.0", optional = true }
default = ["meilisearch-types/all-tokenizations", "mini-dashboard"]
swagger = ["utoipa-scalar"]
test-ollama = []
- mcp = ["meilisearch-mcp"]
mini-dashboard = [
"static-files",
"anyhow",

View File

@@ -4,6 +4,7 @@ use std::marker::PhantomData;
use std::ops::Deref;
use std::pin::Pin;
use actix_web::http::header::AUTHORIZATION;
use actix_web::web::Data;
use actix_web::FromRequest;
pub use error::AuthenticationError;
@@ -94,36 +95,44 @@ impl<P: Policy + 'static, D: 'static + Clone> FromRequest for GuardedData<P, D>
_payload: &mut actix_web::dev::Payload,
) -> Self::Future {
match req.app_data::<Data<AuthController>>().cloned() {
- Some(auth) => match req
- .headers()
- .get("Authorization")
- .map(|type_token| type_token.to_str().unwrap_or_default().splitn(2, ' '))
- {
- Some(mut type_token) => match type_token.next() {
- Some("Bearer") => {
- // TODO: find a less hardcoded way?
- let index = req.match_info().get("index_uid");
- match type_token.next() {
- Some(token) => Box::pin(Self::auth_bearer(
- auth,
- token.to_string(),
- index.map(String::from),
- req.app_data::<D>().cloned(),
- )),
- None => Box::pin(err(AuthenticationError::InvalidToken.into())),
- }
- }
- _otherwise => {
- Box::pin(err(AuthenticationError::MissingAuthorizationHeader.into()))
- }
- },
- None => Box::pin(Self::auth_token(auth, req.app_data::<D>().cloned())),
+ Some(auth) => match extract_token_from_request(req) {
+ Ok(Some(token)) => {
+ // TODO: find a less hardcoded way?
+ let index = req.match_info().get("index_uid");
+ Box::pin(Self::auth_bearer(
+ auth,
+ token.to_string(),
+ index.map(String::from),
+ req.app_data::<D>().cloned(),
+ ))
+ }
+ Ok(None) => Box::pin(Self::auth_token(auth, req.app_data::<D>().cloned())),
+ Err(e) => Box::pin(err(e.into())),
},
None => Box::pin(err(AuthenticationError::IrretrievableState.into())),
}
}
}
pub fn extract_token_from_request(
req: &actix_web::HttpRequest,
) -> Result<Option<&str>, AuthenticationError> {
match req
.headers()
.get(AUTHORIZATION)
.map(|type_token| type_token.to_str().unwrap_or_default().splitn(2, ' '))
{
Some(mut type_token) => match type_token.next() {
Some("Bearer") => match type_token.next() {
Some(token) => Ok(Some(token)),
None => Err(AuthenticationError::InvalidToken),
},
_otherwise => Err(AuthenticationError::MissingAuthorizationHeader),
},
None => Ok(None),
}
}
pub trait Policy {
fn authenticate(
auth: Data<AuthController>,
@@ -299,8 +308,8 @@ pub mod policies {
auth: &AuthController,
token: &str,
) -> Result<TenantTokenOutcome, AuthError> {
- // Only search action can be accessed by a tenant token.
- if A != actions::SEARCH {
+ // Only search and chat actions can be accessed by a tenant token.
+ if A != actions::SEARCH && A != actions::CHAT {
return Ok(TenantTokenOutcome::NotATenantToken);
}
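
The extracted `extract_token_from_request` function in this file distinguishes three outcomes: no `Authorization` header (`Ok(None)`), a well-formed `Bearer` header (`Ok(Some(token))`), and a malformed header (an error). The sketch below reproduces that decision table on a raw header value so that it runs without actix-web. `AuthError` and `extract_bearer` are hypothetical names and only mirror the shape of the real function.

```rust
/// Hypothetical error type mirroring the two variants used in the diff.
#[derive(Debug, PartialEq)]
enum AuthError {
    MissingAuthorizationHeader,
    InvalidToken,
}

/// Hypothetical helper: parses the value of an `Authorization` header.
/// `None` means that the header was absent from the request.
fn extract_bearer(header: Option<&str>) -> Result<Option<&str>, AuthError> {
    let value = match header {
        Some(value) => value,
        // No header at all is not an error: the caller may fall back
        // to unauthenticated access.
        None => return Ok(None),
    };
    // Split into at most two parts: the scheme and everything after it.
    let mut parts = value.splitn(2, ' ');
    match parts.next() {
        Some("Bearer") => match parts.next() {
            Some(token) => Ok(Some(token)),
            None => Err(AuthError::InvalidToken),
        },
        _ => Err(AuthError::MissingAuthorizationHeader),
    }
}
```

One consequence of `splitn(2, ' ')` worth noting: a header of `Bearer ` followed by nothing yields an empty token rather than an error, so rejecting empty tokens is left to whatever validates the token afterwards.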

View File

@@ -630,7 +630,6 @@ pub fn configure_data(
.app_data(
web::QueryConfig::default().error_handler(|err, _req| PayloadError::from(err).into()),
);
}
#[cfg(feature = "mini-dashboard")]


@@ -0,0 +1,560 @@
use std::cell::RefCell;
use std::collections::HashMap;
use std::mem;
use std::sync::RwLock;
use std::time::Duration;
use actix_web::web::{self, Data};
use actix_web::{Either, HttpRequest, HttpResponse, Responder};
use actix_web_lab::sse::{self, Event, Sse};
use async_openai::config::OpenAIConfig;
use async_openai::types::{
ChatCompletionMessageToolCall, ChatCompletionMessageToolCallChunk,
ChatCompletionRequestAssistantMessageArgs, ChatCompletionRequestMessage,
ChatCompletionRequestSystemMessage, ChatCompletionRequestSystemMessageContent,
ChatCompletionRequestToolMessage, ChatCompletionRequestToolMessageContent,
ChatCompletionStreamResponseDelta, ChatCompletionToolArgs, ChatCompletionToolType,
CreateChatCompletionRequest, FinishReason, FunctionCall, FunctionCallStream,
FunctionObjectArgs,
};
use async_openai::Client;
use bumpalo::Bump;
use futures::StreamExt;
use index_scheduler::IndexScheduler;
use meilisearch_auth::AuthController;
use meilisearch_types::error::ResponseError;
use meilisearch_types::heed::RoTxn;
use meilisearch_types::keys::actions;
use meilisearch_types::milli::index::ChatConfig;
use meilisearch_types::milli::prompt::{Prompt, PromptData};
use meilisearch_types::milli::update::new::document::DocumentFromDb;
use meilisearch_types::milli::{
DocumentId, FieldIdMapWithMetadata, GlobalFieldsIdsMap, MetadataBuilder, TimeBudget,
};
use meilisearch_types::Index;
use serde::Deserialize;
use serde_json::json;
use tokio::runtime::Handle;
use tokio::sync::mpsc::error::SendError;
use super::settings::chat::{ChatPrompts, GlobalChatSettings};
use crate::error::MeilisearchHttpError;
use crate::extractors::authentication::policies::ActionPolicy;
use crate::extractors::authentication::{extract_token_from_request, GuardedData, Policy as _};
use crate::metrics::MEILISEARCH_DEGRADED_SEARCH_REQUESTS;
use crate::routes::indexes::search::search_kind;
use crate::search::{
add_search_rules, prepare_search, search_from_kind, HybridQuery, MatchingStrategy, SearchQuery,
SemanticRatio,
};
use crate::search_queue::SearchQueue;
const EMBEDDER_NAME: &str = "openai";
const SEARCH_IN_INDEX_FUNCTION_NAME: &str = "_meiliSearchInIndex";
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(web::resource("/completions").route(web::post().to(chat)));
}
/// Get a chat completion
async fn chat(
index_scheduler: GuardedData<ActionPolicy<{ actions::CHAT }>, Data<IndexScheduler>>,
auth_ctrl: web::Data<AuthController>,
req: HttpRequest,
search_queue: web::Data<SearchQueue>,
web::Json(chat_completion): web::Json<CreateChatCompletionRequest>,
) -> impl Responder {
// To enable later on, when the feature will be experimental
// index_scheduler.features().check_chat("Using the /chat route")?;
assert_eq!(
chat_completion.n.unwrap_or(1),
1,
"Meilisearch /chat only support one completion at a time (n = 1, n = null)"
);
if chat_completion.stream.unwrap_or(false) {
Either::Right(
streamed_chat(index_scheduler, auth_ctrl, req, search_queue, chat_completion).await,
)
} else {
Either::Left(
non_streamed_chat(index_scheduler, auth_ctrl, req, search_queue, chat_completion).await,
)
}
}
/// Setup search tool in chat completion request
fn setup_search_tool(
index_scheduler: &Data<IndexScheduler>,
filters: &meilisearch_auth::AuthFilter,
chat_completion: &mut CreateChatCompletionRequest,
prompts: &ChatPrompts,
) -> Result<(), ResponseError> {
let tools = chat_completion.tools.get_or_insert_default();
if tools.iter().find(|t| t.function.name == SEARCH_IN_INDEX_FUNCTION_NAME).is_some() {
panic!("{SEARCH_IN_INDEX_FUNCTION_NAME} function already set");
}
let index_uids: Vec<_> = index_scheduler
.index_names()?
.into_iter()
.filter(|index_uid| filters.is_index_authorized(&index_uid))
.collect();
let tool = ChatCompletionToolArgs::default()
.r#type(ChatCompletionToolType::Function)
.function(
FunctionObjectArgs::default()
.name(SEARCH_IN_INDEX_FUNCTION_NAME)
.description(&prompts.search_description)
.parameters(json!({
"type": "object",
"properties": {
"index_uid": {
"type": "string",
"enum": index_uids,
"description": prompts.search_index_uid_param,
},
"q": {
// Unfortunately, Mistral does not support an array of types, here.
// "type": ["string", "null"],
"type": "string",
"description": prompts.search_q_param,
}
},
"required": ["index_uid", "q"],
"additionalProperties": false,
}))
.strict(true)
.build()
.unwrap(),
)
.build()
.unwrap();
tools.push(tool);
chat_completion.messages.insert(
0,
ChatCompletionRequestMessage::System(ChatCompletionRequestSystemMessage {
content: ChatCompletionRequestSystemMessageContent::Text(prompts.system.clone()),
name: None,
}),
);
Ok(())
}
/// Process search request and return formatted results
async fn process_search_request(
index_scheduler: &GuardedData<ActionPolicy<{ actions::CHAT }>, Data<IndexScheduler>>,
auth_ctrl: web::Data<AuthController>,
search_queue: &web::Data<SearchQueue>,
auth_token: &str,
index_uid: String,
q: Option<String>,
) -> Result<(Index, String), ResponseError> {
let mut query = SearchQuery {
q,
hybrid: Some(HybridQuery {
semantic_ratio: SemanticRatio::default(),
embedder: EMBEDDER_NAME.to_string(),
}),
limit: 20,
matching_strategy: MatchingStrategy::Frequency,
..Default::default()
};
let auth_filter = ActionPolicy::<{ actions::SEARCH }>::authenticate(
auth_ctrl,
auth_token,
Some(index_uid.as_str()),
)?;
// Tenant token search_rules.
if let Some(search_rules) = auth_filter.get_index_search_rules(&index_uid) {
add_search_rules(&mut query.filter, search_rules);
}
// TBD
// let mut aggregate = SearchAggregator::<SearchPOST>::from_query(&query);
let index = index_scheduler.index(&index_uid)?;
let search_kind =
search_kind(&query, index_scheduler.get_ref(), index_uid.to_string(), &index)?;
let permit = search_queue.try_get_search_permit().await?;
let features = index_scheduler.features();
let index_cloned = index.clone();
let search_result = tokio::task::spawn_blocking(move || -> Result<_, ResponseError> {
let rtxn = index_cloned.read_txn()?;
let time_budget = match index_cloned
.search_cutoff(&rtxn)
.map_err(|e| MeilisearchHttpError::from_milli(e, Some(index_uid.clone())))?
{
Some(cutoff) => TimeBudget::new(Duration::from_millis(cutoff)),
None => TimeBudget::default(),
};
let (search, _is_finite_pagination, _max_total_hits, _offset) =
prepare_search(&index_cloned, &rtxn, &query, &search_kind, time_budget, features)?;
search_from_kind(index_uid, search_kind, search)
.map(|(search_results, _)| search_results)
.map_err(ResponseError::from)
})
.await;
permit.drop().await;
let search_result = search_result?;
if let Ok(ref search_result) = search_result {
// aggregate.succeed(search_result);
if search_result.degraded {
MEILISEARCH_DEGRADED_SEARCH_REQUESTS.inc();
}
}
// analytics.publish(aggregate, &req);
let search_result = search_result?;
let rtxn = index.read_txn()?;
let render_alloc = Bump::new();
let formatted = format_documents(&rtxn, &index, &render_alloc, search_result.documents_ids)?;
let text = formatted.join("\n");
drop(rtxn);
Ok((index, text))
}
async fn non_streamed_chat(
index_scheduler: GuardedData<ActionPolicy<{ actions::CHAT }>, Data<IndexScheduler>>,
auth_ctrl: web::Data<AuthController>,
req: HttpRequest,
search_queue: web::Data<SearchQueue>,
mut chat_completion: CreateChatCompletionRequest,
) -> Result<HttpResponse, ResponseError> {
let filters = index_scheduler.filters();
let chat_settings = match index_scheduler.chat_settings().unwrap() {
Some(value) => serde_json::from_value(value).unwrap(),
None => GlobalChatSettings::default(),
};
let mut config = OpenAIConfig::default();
if let Some(api_key) = chat_settings.api_key.as_ref() {
config = config.with_api_key(api_key);
}
if let Some(base_api) = chat_settings.base_api.as_ref() {
config = config.with_api_base(base_api);
}
let client = Client::with_config(config);
let auth_token = extract_token_from_request(&req)?.unwrap();
setup_search_tool(&index_scheduler, filters, &mut chat_completion, &chat_settings.prompts)?;
let mut response;
loop {
response = client.chat().create(chat_completion.clone()).await.unwrap();
let choice = &mut response.choices[0];
match choice.finish_reason {
Some(FinishReason::ToolCalls) => {
let tool_calls = mem::take(&mut choice.message.tool_calls).unwrap_or_default();
let (meili_calls, other_calls): (Vec<_>, Vec<_>) = tool_calls
.into_iter()
.partition(|call| call.function.name == SEARCH_IN_INDEX_FUNCTION_NAME);
chat_completion.messages.push(
ChatCompletionRequestAssistantMessageArgs::default()
.tool_calls(meili_calls.clone())
.build()
.unwrap()
.into(),
);
for call in meili_calls {
let result = match serde_json::from_str(&call.function.arguments) {
Ok(SearchInIndexParameters { index_uid, q }) => process_search_request(
&index_scheduler,
auth_ctrl.clone(),
&search_queue,
&auth_token,
index_uid,
q,
)
.await
.map_err(|e| e.to_string()),
Err(err) => Err(err.to_string()),
};
let text = match result {
Ok((_, text)) => text,
Err(err) => err,
};
chat_completion.messages.push(ChatCompletionRequestMessage::Tool(
ChatCompletionRequestToolMessage {
tool_call_id: call.id.clone(),
content: ChatCompletionRequestToolMessageContent::Text(format!(
"{}\n\n{text}",
chat_settings.prompts.pre_query
)),
},
));
}
// Let the client call other tools by themselves
if !other_calls.is_empty() {
response.choices[0].message.tool_calls = Some(other_calls);
break;
}
}
_ => break,
}
}
Ok(HttpResponse::Ok().json(response))
}
async fn streamed_chat(
index_scheduler: GuardedData<ActionPolicy<{ actions::CHAT }>, Data<IndexScheduler>>,
auth_ctrl: web::Data<AuthController>,
req: HttpRequest,
search_queue: web::Data<SearchQueue>,
mut chat_completion: CreateChatCompletionRequest,
) -> Result<impl Responder, ResponseError> {
let filters = index_scheduler.filters();
let chat_settings = match index_scheduler.chat_settings().unwrap() {
Some(value) => serde_json::from_value(value).unwrap(),
None => GlobalChatSettings::default(),
};
let mut config = OpenAIConfig::default();
if let Some(api_key) = chat_settings.api_key.as_ref() {
config = config.with_api_key(api_key);
}
if let Some(base_api) = chat_settings.base_api.as_ref() {
config = config.with_api_base(base_api);
}
let auth_token = extract_token_from_request(&req)?.unwrap().to_string();
setup_search_tool(&index_scheduler, filters, &mut chat_completion, &chat_settings.prompts)?;
let (tx, rx) = tokio::sync::mpsc::channel(10);
let _join_handle = Handle::current().spawn(async move {
let client = Client::with_config(config.clone());
let mut global_tool_calls = HashMap::<u32, Call>::new();
let mut finish_reason = None;
// Limit the number of internal calls to satisfy the search requests of the LLM
'main: for _ in 0..20 {
let mut response = client.chat().create_stream(chat_completion.clone()).await.unwrap();
while let Some(result) = response.next().await {
match result {
Ok(resp) => {
let choice = &resp.choices[0];
finish_reason = choice.finish_reason;
#[allow(deprecated)]
let ChatCompletionStreamResponseDelta {
content,
// Using deprecated field but keeping for compatibility
function_call: _,
ref tool_calls,
role: _,
refusal: _,
} = &choice.delta;
if content.is_some() {
if let Err(SendError(_)) = tx.send(Event::Data(sse::Data::new_json(&resp).unwrap())).await {
return;
}
}
match tool_calls {
Some(tool_calls) => {
for chunk in tool_calls {
let ChatCompletionMessageToolCallChunk {
index,
id,
r#type: _,
function,
} = chunk;
let FunctionCallStream { name, arguments } =
function.as_ref().unwrap();
global_tool_calls
.entry(*index)
.and_modify(|call| call.append(arguments.as_ref().unwrap()))
.or_insert_with(|| Call {
id: id.as_ref().unwrap().clone(),
function_name: name.as_ref().unwrap().clone(),
arguments: arguments.as_ref().unwrap().clone(),
});
}
}
None if !global_tool_calls.is_empty() => {
let (meili_calls, _other_calls): (Vec<_>, Vec<_>) =
mem::take(&mut global_tool_calls)
.into_values()
.map(|call| ChatCompletionMessageToolCall {
id: call.id,
r#type: Some(ChatCompletionToolType::Function),
function: FunctionCall {
name: call.function_name,
arguments: call.arguments,
},
})
.partition(|call| call.function.name == SEARCH_IN_INDEX_FUNCTION_NAME);
chat_completion.messages.push(
ChatCompletionRequestAssistantMessageArgs::default()
.tool_calls(meili_calls.clone())
.build()
.unwrap()
.into(),
);
for call in meili_calls {
if let Err(SendError(_)) = tx.send(Event::Data(
sse::Data::new_json(json!({
"object": "chat.completion.tool.call",
"tool": call,
}))
.unwrap(),
))
.await {
return;
}
let result = match serde_json::from_str(&call.function.arguments) {
Ok(SearchInIndexParameters { index_uid, q }) => process_search_request(
&index_scheduler,
auth_ctrl.clone(),
&search_queue,
&auth_token,
index_uid,
q,
).await.map_err(|e| e.to_string()),
Err(err) => Err(err.to_string()),
};
let is_error = result.is_err();
let text = match result {
Ok((_, text)) => text,
Err(err) => err,
};
let tool = ChatCompletionRequestToolMessage {
tool_call_id: call.id.clone(),
content: ChatCompletionRequestToolMessageContent::Text(
format!("{}\n\n{text}", chat_settings.prompts.pre_query),
),
};
if let Err(SendError(_)) = tx.send(Event::Data(
sse::Data::new_json(json!({
"object": if is_error {
"chat.completion.tool.error"
} else {
"chat.completion.tool.output"
},
"tool": ChatCompletionRequestToolMessage {
tool_call_id: call.id,
content: ChatCompletionRequestToolMessageContent::Text(
text,
),
},
}))
.unwrap(),
))
.await {
return;
}
chat_completion.messages.push(ChatCompletionRequestMessage::Tool(tool));
}
}
None => (),
}
}
Err(err) => {
tracing::error!("{err:?}");
if let Err(SendError(_)) = tx.send(Event::Data(sse::Data::new_json(&json!({
"object": "chat.completion.error",
"tool": err.to_string(),
})).unwrap())).await {
return;
}
break 'main;
}
}
}
// We must stop if the finish reason is not something we can solve with Meilisearch
if finish_reason.map_or(true, |fr| fr != FinishReason::ToolCalls) {
break;
}
}
let _ = tx.send(Event::Data(sse::Data::new("[DONE]")));
});
Ok(Sse::from_infallible_receiver(rx).with_retry_duration(Duration::from_secs(10)))
}
/// The structure used to aggregate the function calls to make.
#[derive(Debug)]
struct Call {
id: String,
function_name: String,
arguments: String,
}
impl Call {
fn append(&mut self, arguments: &str) {
self.arguments.push_str(arguments);
}
}
#[derive(Deserialize)]
struct SearchInIndexParameters {
/// The index uid to search in.
index_uid: String,
/// The query parameter to use.
q: Option<String>,
}
fn format_documents<'t, 'doc>(
rtxn: &RoTxn<'t>,
index: &Index,
doc_alloc: &'doc Bump,
internal_docids: Vec<DocumentId>,
) -> Result<Vec<&'doc str>, ResponseError> {
let ChatConfig { prompt: PromptData { template, max_bytes }, .. } = index.chat_config(rtxn)?;
let prompt = Prompt::new(template, max_bytes).unwrap();
let fid_map = index.fields_ids_map(rtxn)?;
let metadata_builder = MetadataBuilder::from_index(index, rtxn)?;
let fid_map_with_meta = FieldIdMapWithMetadata::new(fid_map.clone(), metadata_builder);
let global = RwLock::new(fid_map_with_meta);
let gfid_map = RefCell::new(GlobalFieldsIdsMap::new(&global));
let external_ids: Vec<String> = index
.external_id_of(rtxn, internal_docids.iter().copied())?
.into_iter()
.collect::<Result<_, _>>()?;
let mut renders = Vec::new();
for (docid, external_docid) in internal_docids.into_iter().zip(external_ids) {
let document = match DocumentFromDb::new(docid, rtxn, index, &fid_map)? {
Some(doc) => doc,
None => continue,
};
let text = prompt.render_document(&external_docid, document, &gfid_map, doc_alloc).unwrap();
renders.push(text);
}
Ok(renders)
}
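
In the streaming path above, tool calls arrive as partial chunks that are merged per `index` into a `Call`, then split into calls to `_meiliSearchInIndex` and calls left to the client. The sketch below isolates that accumulation step using only the standard library. `Chunk`, `assemble`, and `split_by_name` are hypothetical names, the chunk fields only approximate the SDK types, and a `BTreeMap` is swapped in for the `HashMap` of the original so the output order is deterministic. Unlike the diff, which calls `unwrap()`, this version returns `None` when the first chunk of a call lacks an id or a name.

```rust
use std::collections::BTreeMap;

/// One partial tool call from a streamed response (hypothetical shape).
struct Chunk {
    index: u32,
    id: Option<String>,
    name: Option<String>,
    arguments: String,
}

/// A tool call being assembled from chunks.
#[derive(Debug, PartialEq)]
struct Call {
    id: String,
    function_name: String,
    arguments: String,
}

/// Merge chunks into complete calls, keyed by their stream index.
/// Returns `None` if the first chunk seen for an index has no id or name.
fn assemble(chunks: Vec<Chunk>) -> Option<Vec<Call>> {
    let mut calls: BTreeMap<u32, Call> = BTreeMap::new();
    for chunk in chunks {
        match calls.get_mut(&chunk.index) {
            // Later chunks only carry more of the JSON arguments string.
            Some(call) => call.arguments.push_str(&chunk.arguments),
            // The first chunk for an index carries the id and function name.
            None => {
                let call = Call {
                    id: chunk.id?,
                    function_name: chunk.name?,
                    arguments: chunk.arguments,
                };
                calls.insert(chunk.index, call);
            }
        }
    }
    Some(calls.into_values().collect())
}

/// Split assembled calls into those handled locally and all others.
fn split_by_name(calls: Vec<Call>, local_name: &str) -> (Vec<Call>, Vec<Call>) {
    calls.into_iter().partition(|call| call.function_name == local_name)
}
```

Only once every chunk of a call has been appended is the `arguments` string complete JSON, which is why the diff waits for a delta without `tool_calls` before deserializing it into `SearchInIndexParameters`.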


@@ -6,7 +6,7 @@ use meilisearch_types::deserr::DeserrJsonError;
use meilisearch_types::error::ResponseError;
use meilisearch_types::index_uid::IndexUid;
use meilisearch_types::settings::{
- settings, SecretPolicy, SettingEmbeddingSettings, Settings, Unchecked,
+ settings, ChatSettings, SecretPolicy, SettingEmbeddingSettings, Settings, Unchecked,
};
use meilisearch_types::tasks::KindWithContent;
use tracing::debug;
@@ -508,6 +508,17 @@ make_setting_routes!(
camelcase_attr: "prefixSearch",
analytics: PrefixSearchAnalytics
},
+ {
+ route: "/chat",
+ update_verb: put,
+ value_type: ChatSettings,
+ err_type: meilisearch_types::deserr::DeserrJsonError<
+ meilisearch_types::error::deserr_codes::InvalidSettingsIndexChat,
+ >,
+ attr: chat,
+ camelcase_attr: "chat",
+ analytics: ChatAnalytics
+ },
);
#[utoipa::path(
@@ -597,6 +608,7 @@ pub async fn update_all(
),
facet_search: FacetSearchAnalytics::new(new_settings.facet_search.as_ref().set()),
prefix_search: PrefixSearchAnalytics::new(new_settings.prefix_search.as_ref().set()),
+ chat: ChatAnalytics::new(new_settings.chat.as_ref().set()),
},
&req,
);


@@ -10,8 +10,8 @@ use meilisearch_types::locales::{Locale, LocalizedAttributesRuleView};
use meilisearch_types::milli::update::Setting;
use meilisearch_types::milli::FilterableAttributesRule;
use meilisearch_types::settings::{
- FacetingSettings, PaginationSettings, PrefixSearchSettings, ProximityPrecisionView,
- RankingRuleView, SettingEmbeddingSettings, TypoSettings,
+ ChatSettings, FacetingSettings, PaginationSettings, PrefixSearchSettings,
+ ProximityPrecisionView, RankingRuleView, SettingEmbeddingSettings, TypoSettings,
};
use serde::Serialize;
@@ -39,6 +39,7 @@ pub struct SettingsAnalytics {
pub non_separator_tokens: NonSeparatorTokensAnalytics,
pub facet_search: FacetSearchAnalytics,
pub prefix_search: PrefixSearchAnalytics,
+ pub chat: ChatAnalytics,
}
impl Aggregate for SettingsAnalytics {
@@ -198,6 +199,7 @@ impl Aggregate for SettingsAnalytics {
set: new.prefix_search.set | self.prefix_search.set,
value: new.prefix_search.value.or(self.prefix_search.value),
},
+ chat: ChatAnalytics { set: new.chat.set | self.chat.set },
})
}
@@ -674,3 +676,18 @@ impl PrefixSearchAnalytics {
SettingsAnalytics { prefix_search: self, ..Default::default() }
}
}
+ #[derive(Serialize, Default)]
+ pub struct ChatAnalytics {
+ pub set: bool,
+ }
+ impl ChatAnalytics {
+ pub fn new(settings: Option<&ChatSettings>) -> Self {
+ Self { set: settings.is_some() }
+ }
+ pub fn into_settings(self) -> SettingsAnalytics {
+ SettingsAnalytics { chat: self, ..Default::default() }
+ }
+ }


@@ -52,6 +52,7 @@ const PAGINATION_DEFAULT_LIMIT_FN: fn() -> usize = || 20;
mod api_key;
pub mod batches;
+ pub mod chat;
mod dump;
pub mod features;
pub mod indexes;
@@ -61,6 +62,7 @@ mod multi_search;
mod multi_search_analytics;
pub mod network;
mod open_api_utils;
+ pub mod settings;
mod snapshot;
mod swap_indexes;
pub mod tasks;
@@ -113,14 +115,9 @@ pub fn configure(cfg: &mut web::ServiceConfig) {
.service(web::scope("/swap-indexes").configure(swap_indexes::configure))
.service(web::scope("/metrics").configure(metrics::configure))
.service(web::scope("/experimental-features").configure(features::configure))
- .service(web::scope("/network").configure(network::configure));
- #[cfg(feature = "mcp")]
- {
- use meilisearch_mcp::integration::configure_mcp_route;
- let openapi = MeilisearchApi::openapi();
- configure_mcp_route(cfg, openapi);
- }
+ .service(web::scope("/network").configure(network::configure))
+ .service(web::scope("/chat").configure(chat::configure))
+ .service(web::scope("/settings/chat").configure(settings::chat::configure));
#[cfg(feature = "swagger")]
{


@@ -0,0 +1,107 @@
use actix_web::web::{self, Data};
use actix_web::HttpResponse;
use index_scheduler::IndexScheduler;
use meilisearch_types::error::ResponseError;
use meilisearch_types::keys::actions;
use serde::{Deserialize, Serialize};
use crate::extractors::authentication::policies::ActionPolicy;
use crate::extractors::authentication::GuardedData;
use crate::extractors::sequential_extractor::SeqHandler;
pub fn configure(cfg: &mut web::ServiceConfig) {
cfg.service(
web::resource("")
.route(web::get().to(get_settings))
.route(web::patch().to(SeqHandler(patch_settings))),
);
}
async fn get_settings(
index_scheduler: GuardedData<
ActionPolicy<{ actions::CHAT_SETTINGS_GET }>,
Data<IndexScheduler>,
>,
) -> Result<HttpResponse, ResponseError> {
let settings = match index_scheduler.chat_settings()? {
Some(value) => serde_json::from_value(value).unwrap(),
None => GlobalChatSettings::default(),
};
Ok(HttpResponse::Ok().json(settings))
}
async fn patch_settings(
index_scheduler: GuardedData<
ActionPolicy<{ actions::CHAT_SETTINGS_UPDATE }>,
Data<IndexScheduler>,
>,
web::Json(chat_settings): web::Json<GlobalChatSettings>,
) -> Result<HttpResponse, ResponseError> {
let chat_settings = serde_json::to_value(chat_settings).unwrap();
index_scheduler.put_chat_settings(&chat_settings)?;
Ok(HttpResponse::Ok().finish())
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
pub struct GlobalChatSettings {
pub source: String,
pub base_api: Option<String>,
pub api_key: Option<String>,
pub prompts: ChatPrompts,
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
pub struct ChatPrompts {
pub system: String,
pub search_description: String,
pub search_q_param: String,
pub search_index_uid_param: String,
pub pre_query: String,
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
pub struct ChatIndexSettings {
pub description: String,
pub document_template: String,
}
const DEFAULT_SYSTEM_MESSAGE: &str = "You are a highly capable research assistant with access to powerful search tools. IMPORTANT INSTRUCTIONS:\
1. When answering questions, you MUST make multiple tool calls (at least 2-3) to gather comprehensive information.\
2. Use different search queries for each tool call - vary keywords, rephrase questions, and explore different semantic angles to ensure broad coverage.\
3. Always explicitly announce BEFORE making each tool call by saying: \"I'll search for [specific information] now.\"\
4. Combine information from ALL tool calls to provide complete, nuanced answers rather than relying on a single source.\
5. For complex topics, break down your research into multiple targeted queries rather than using a single generic search.";
/// The default description of the searchInIndex tool provided to OpenAI.
const DEFAULT_SEARCH_IN_INDEX_TOOL_DESCRIPTION: &str =
"Search the database for relevant JSON documents using an optional query.";
/// The default description of the searchInIndex `q` parameter tool provided to OpenAI.
const DEFAULT_SEARCH_IN_INDEX_Q_PARAMETER_TOOL_DESCRIPTION: &str =
"The search query string used to find relevant documents in the index. \
This should contain keywords or phrases that best represent what the user is looking for. \
More specific queries will yield more precise results.";
/// The default description of the searchInIndex `index` parameter tool provided to OpenAI.
const DEFAULT_SEARCH_IN_INDEX_INDEX_PARAMETER_TOOL_DESCRIPTION: &str =
"The name of the index to search within. An index is a collection of documents organized for search. \
Selecting the right index ensures the most relevant results for the user query";
impl Default for GlobalChatSettings {
fn default() -> Self {
GlobalChatSettings {
source: "openAi".to_string(),
base_api: None,
api_key: None,
prompts: ChatPrompts {
system: DEFAULT_SYSTEM_MESSAGE.to_string(),
search_description: DEFAULT_SEARCH_IN_INDEX_TOOL_DESCRIPTION.to_string(),
search_q_param: DEFAULT_SEARCH_IN_INDEX_Q_PARAMETER_TOOL_DESCRIPTION.to_string(),
search_index_uid_param: DEFAULT_SEARCH_IN_INDEX_INDEX_PARAMETER_TOOL_DESCRIPTION
.to_string(),
pre_query: "".to_string(),
},
}
}
}


@@ -0,0 +1 @@
pub mod chat;


@@ -882,7 +882,7 @@ pub fn add_search_rules(filter: &mut Option<Value>, rules: IndexSearchRules) {
}
}
- fn prepare_search<'t>(
+ pub fn prepare_search<'t>(
index: &'t Index,
rtxn: &'t RoTxn,
query: &'t SearchQuery,


@@ -820,6 +820,22 @@ async fn list_api_keys() {
"createdAt": "[ignored]",
"updatedAt": "[ignored]"
},
+ {
+ "name": "Default Chat API Key",
+ "description": "Use it to chat and search from the frontend",
+ "key": "[ignored]",
+ "uid": "[ignored]",
+ "actions": [
+ "search",
+ "chat.get"
+ ],
+ "indexes": [
+ "*"
+ ],
+ "expiresAt": null,
+ "createdAt": "[ignored]",
+ "updatedAt": "[ignored]"
+ },
{
"name": "Default Search API Key",
"description": "Use it to search from the frontend",


@@ -28,7 +28,6 @@ async fn error_delete_unexisting_index() {
let (task, code) = index.delete_index_fail().await;
assert_eq!(code, 202);
index.wait_task(task.uid()).await.failed();
let expected_response = json!({
"message": "Index `DOES_NOT_EXISTS` not found.",
@@ -58,7 +57,7 @@ async fn loop_delete_add_documents() {
}
for task in tasks {
- let response = index.wait_task(task).await.succeeded();
+ let response = index.wait_task(task).await;
assert_eq!(response["status"], "succeeded", "{}", response);
}
}


@@ -52,28 +52,19 @@ async fn no_index_return_empty_list() {
#[actix_rt::test]
async fn list_multiple_indexes() {
- let server = Server::new_shared();
+ let server = Server::new().await;
+ server.index("test").create(None).await;
+ let (task, _status_code) = server.index("test1").create(Some("key")).await;
- let index_without_key = server.unique_index();
- let (response_without_key, _status_code) = index_without_key.create(None).await;
+ server.index("test").wait_task(task.uid()).await.succeeded();
- let index_with_key = server.unique_index();
- let (response_with_key, _status_code) = index_with_key.create(Some("key")).await;
- index_without_key.wait_task(response_without_key.uid()).await.succeeded();
- index_with_key.wait_task(response_with_key.uid()).await.succeeded();
- let (response, code) = server.list_indexes(None, Some(1000)).await;
+ let (response, code) = server.list_indexes(None, None).await;
assert_eq!(code, 200);
assert!(response["results"].is_array());
let arr = response["results"].as_array().unwrap();
- assert!(arr.len() >= 2, "Expected at least 2 indexes.");
- assert!(arr
- .iter()
- .any(|entry| entry["uid"] == index_without_key.uid && entry["primaryKey"] == Value::Null));
- assert!(arr
- .iter()
- .any(|entry| entry["uid"] == index_with_key.uid && entry["primaryKey"] == "key"));
+ assert_eq!(arr.len(), 2);
+ assert!(arr.iter().any(|entry| entry["uid"] == "test" && entry["primaryKey"] == Value::Null));
+ assert!(arr.iter().any(|entry| entry["uid"] == "test1" && entry["primaryKey"] == "key"));
}
#[actix_rt::test]


@@ -1,11 +1,10 @@
- use crate::common::{shared_does_not_exists_index, Server};
+ use crate::common::Server;
use crate::json;
#[actix_rt::test]
async fn stats() {
- let server = Server::new_shared();
- let index = server.unique_index();
+ let server = Server::new().await;
+ let index = server.index("test");
let (task, code) = index.create(Some("id")).await;
assert_eq!(code, 202);
@@ -16,7 +15,7 @@ async fn stats() {
assert_eq!(code, 200);
assert_eq!(response["numberOfDocuments"], 0);
- assert_eq!(response["isIndexing"], false);
+ assert!(response["isIndexing"] == false);
assert!(response["fieldDistribution"].as_object().unwrap().is_empty());
let documents = json!([
@@ -32,6 +31,7 @@ async fn stats() {
let (response, code) = index.add_documents(documents, None).await;
assert_eq!(code, 202);
+ assert_eq!(response["taskUid"], 1);
index.wait_task(response.uid()).await.succeeded();
@@ -39,7 +39,7 @@ async fn stats() {
assert_eq!(code, 200);
assert_eq!(response["numberOfDocuments"], 2);
- assert_eq!(response["isIndexing"], false);
+ assert!(response["isIndexing"] == false);
assert_eq!(response["fieldDistribution"]["id"], 2);
assert_eq!(response["fieldDistribution"]["name"], 1);
assert_eq!(response["fieldDistribution"]["age"], 1);
@@ -47,11 +47,11 @@ async fn stats() {
#[actix_rt::test]
async fn error_get_stats_unexisting_index() {
- let index = shared_does_not_exists_index().await;
- let (response, code) = index.stats().await;
+ let server = Server::new().await;
+ let (response, code) = server.index("test").stats().await;
let expected_response = json!({
- "message": format!("Index `{}` not found.", index.uid),
+ "message": "Index `test` not found.",
"code": "index_not_found",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#index_not_found"


@@ -112,26 +112,6 @@ async fn simple_search() {
.await;
}
/// See <https://github.com/meilisearch/meilisearch/issues/5547>
#[actix_rt::test]
async fn bug_5547() {
let server = Server::new().await;
let index = server.index("big_fst");
let (response, _code) = index.create(None).await;
index.wait_task(response.uid()).await.succeeded();
let mut documents = Vec::new();
for i in 0..65_535 {
documents.push(json!({"id": i, "title": format!("title{i}")}));
}
let (response, _code) = index.add_documents(json!(documents), Some("id")).await;
index.wait_task(response.uid()).await.succeeded();
let (response, code) = index.search_post(json!({"q": "title"})).await;
assert_eq!(code, 200);
snapshot!(response["hits"], @r###"[{"id":0,"title":"title0"},{"id":1,"title":"title1"},{"id":10,"title":"title10"},{"id":100,"title":"title100"},{"id":101,"title":"title101"},{"id":102,"title":"title102"},{"id":103,"title":"title103"},{"id":104,"title":"title104"},{"id":105,"title":"title105"},{"id":106,"title":"title106"},{"id":107,"title":"title107"},{"id":108,"title":"title108"},{"id":1000,"title":"title1000"},{"id":1001,"title":"title1001"},{"id":1002,"title":"title1002"},{"id":1003,"title":"title1003"},{"id":1004,"title":"title1004"},{"id":1005,"title":"title1005"},{"id":1006,"title":"title1006"},{"id":1007,"title":"title1007"}]"###);
}
#[actix_rt::test]
async fn search_with_stop_word() {
// related to https://github.com/meilisearch/meilisearch/issues/4984


@@ -416,381 +416,3 @@ async fn phrase_search_on_title() {
)
.await;
}
static NESTED_SEARCH_DOCUMENTS: Lazy<Value> = Lazy::new(|| {
json!([
{
"details": {
"title": "Shazam!",
"desc": "a Captain Marvel ersatz",
"weaknesses": ["magic", "requires transformation"],
"outfit": {
"has_cape": true,
"colors": {
"primary": "red",
"secondary": "gold"
}
}
},
"id": "1",
},
{
"details": {
"title": "Captain Planet",
"desc": "He's not part of the Marvel Cinematic Universe",
"blue_skin": true,
"outfit": {
"has_cape": false
}
},
"id": "2",
},
{
"details": {
"title": "Captain Marvel",
"desc": "a Shazam ersatz",
"weaknesses": ["magic", "power instability"],
"outfit": {
"has_cape": false
}
},
"id": "3",
}])
});
#[actix_rt::test]
async fn nested_search_on_title_with_prefix_wildcard() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
// Wildcard should match to 'details.' attribute
index
.search(
json!({"q": "Captain Marvel", "attributesToSearchOn": ["*.title"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "3"
},
{
"id": "2"
}
]"###);
},
)
.await;
}
#[actix_rt::test]
async fn nested_search_with_suffix_wildcard() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
// Wildcard should match to any attribute inside 'details.'
// It's worth noting the difference between 'details.*' and '*.title'
index
.search(
json!({"q": "Captain Marvel", "attributesToSearchOn": ["details.*"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "3"
},
{
"id": "1"
},
{
"id": "2"
}
]"###);
},
)
.await;
// Should return 1 document (ids: 1)
index
.search(
json!({"q": "gold", "attributesToSearchOn": ["details.*"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "1"
}
]"###);
},
)
.await;
// Should return 2 documents (ids: 1 and 2)
index
.search(
json!({"q": "true", "attributesToSearchOn": ["details.*"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "1"
},
{
"id": "2"
}
]"###);
},
)
.await;
}
#[actix_rt::test]
async fn nested_search_on_title_restricted_set_with_suffix_wildcard() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
let (task, _status_code) =
index.update_settings_searchable_attributes(json!(["details.title"])).await;
index.wait_task(task.uid()).await.succeeded();
index
.search(
json!({"q": "Captain Marvel", "attributesToSearchOn": ["details.*"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "3"
},
{
"id": "2"
}
]"###);
},
)
.await;
}
#[actix_rt::test]
async fn nested_search_no_searchable_attribute_set_with_any_wildcard() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
index
.search(
json!({"q": "Captain Marvel", "attributesToSearchOn": ["unknown.*", "*.unknown"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(response["hits"].as_array().unwrap().len(), @"0");
},
)
.await;
let (task, _status_code) = index.update_settings_searchable_attributes(json!(["*"])).await;
index.wait_task(task.uid()).await.succeeded();
index
.search(
json!({"q": "Captain Marvel", "attributesToSearchOn": ["unknown.*", "*.unknown"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(response["hits"].as_array().unwrap().len(), @"0");
},
)
.await;
let (task, _status_code) = index.update_settings_searchable_attributes(json!(["*"])).await;
index.wait_task(task.uid()).await.succeeded();
index
.search(
json!({"q": "Captain Marvel", "attributesToSearchOn": ["unknown.*", "*.unknown", "*.title"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "3"
},
{
"id": "2"
}
]"###);
},
)
.await;
}
#[actix_rt::test]
async fn nested_prefix_search_on_title_with_prefix_wildcard() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
// Nested prefix search with prefix wildcard should return 2 documents (ids: 2 and 3).
index
.search(
json!({"q": "Captain Mar", "attributesToSearchOn": ["*.title"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "3"
},
{
"id": "2"
}
]"###);
},
)
.await;
}
#[actix_rt::test]
async fn nested_prefix_search_on_details_with_suffix_wildcard() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
index
.search(
json!({"q": "Captain Mar", "attributesToSearchOn": ["details.*"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "3"
},
{
"id": "1"
},
{
"id": "2"
}
]"###);
},
)
.await;
}
#[actix_rt::test]
async fn nested_prefix_search_on_weaknesses_with_suffix_wildcard() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
// Wildcard search on nested weaknesses should return 2 documents (ids: 1 and 3)
index
.search(
json!({"q": "mag", "attributesToSearchOn": ["details.*"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "1"
},
{
"id": "3"
}
]"###);
},
)
.await;
}
#[actix_rt::test]
async fn nested_search_on_title_matching_strategy_all() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
// Nested search matching strategy all should only return 1 document (ids: 3)
index
.search(
json!({"q": "Captain Marvel", "attributesToSearchOn": ["*.title"], "matchingStrategy": "all", "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "3"
}
]"###);
},
)
.await;
}
#[actix_rt::test]
async fn nested_attributes_ranking_rule_order_with_prefix_wildcard() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
// Document 3 should appear before documents 1 and 2
index
.search(
json!({"q": "Captain Marvel", "attributesToSearchOn": ["*.desc", "*.title"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "3"
},
{
"id": "1"
},
{
"id": "2"
}
]
"###
);
},
)
.await;
}
#[actix_rt::test]
async fn nested_attributes_ranking_rule_order_with_suffix_wildcard() {
let server = Server::new().await;
let index = index_with_documents(&server, &NESTED_SEARCH_DOCUMENTS).await;
// Document 3 should appear before documents 1 and 2
index
.search(
json!({"q": "Captain Marvel", "attributesToSearchOn": ["details.*"], "attributesToRetrieve": ["id"]}),
|response, code| {
snapshot!(code, @"200 OK");
snapshot!(json_string!(response["hits"]),
@r###"
[
{
"id": "3"
},
{
"id": "1"
},
{
"id": "2"
}
]
"###
);
},
)
.await;
}
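
The tests above exercise two wildcard shapes in `attributesToSearchOn`: a prefix wildcard (`*.title`) and a suffix wildcard (`details.*`). As a rough illustration of the matching they rely on, here is a minimal sketch. `wildcard_matches` is a hypothetical helper written for this note, not milli's `match_pattern`, and it assumes a single leading or trailing `*` only.

```rust
/// Hypothetical, simplified matcher (an assumption for illustration, not the
/// crate's real `match_pattern`). Supports `*`, `*suffix`, and `prefix*`.
fn wildcard_matches(pattern: &str, field: &str) -> bool {
    // The universal wildcard matches every field name.
    if pattern == "*" {
        return true;
    }
    // Prefix wildcard: `*.title` matches any field ending in `.title`.
    if let Some(suffix) = pattern.strip_prefix('*') {
        return field.ends_with(suffix);
    }
    // Suffix wildcard: `details.*` matches any field starting with `details.`.
    if let Some(prefix) = pattern.strip_suffix('*') {
        return field.starts_with(prefix);
    }
    // No wildcard: fall back to an exact comparison.
    pattern == field
}
```

Under these assumptions, `unknown.*` and `*.unknown` match none of the nested `details.*` fields, which is consistent with the zero-hit expectations in `nested_search_no_searchable_attribute_set_with_any_wildcard`.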

View File

@@ -50,7 +50,7 @@ impl AttributePatterns {
///
/// * `pattern` - The pattern to match against.
/// * `str` - The string to match against the pattern.
pub fn match_pattern(pattern: &str, str: &str) -> PatternMatch {
fn match_pattern(pattern: &str, str: &str) -> PatternMatch {
// If the pattern is a wildcard, return Match
if pattern == "*" {
return PatternMatch::Match;

View File

@@ -32,13 +32,13 @@ impl ExternalDocumentsIds {
&self,
rtxn: &RoTxn<'_>,
external_id: A,
) -> heed::Result<Option<u32>> {
) -> heed::Result<Option<DocumentId>> {
self.0.get(rtxn, external_id.as_ref())
}
/// A helper function to debug this type; returns a `HashMap` of both the
/// soft and hard fst maps, combined.
pub fn to_hash_map(&self, rtxn: &RoTxn<'_>) -> heed::Result<HashMap<String, u32>> {
pub fn to_hash_map(&self, rtxn: &RoTxn<'_>) -> heed::Result<HashMap<String, DocumentId>> {
let mut map = HashMap::default();
for result in self.0.iter(rtxn)? {
let (external, internal) = result?;

View File

@@ -7,6 +7,7 @@ use crate::FieldId;
mod global;
pub mod metadata;
pub use global::GlobalFieldsIdsMap;
pub use metadata::{FieldIdMapWithMetadata, MetadataBuilder};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FieldsIdsMap {

View File

@@ -23,6 +23,7 @@ use crate::heed_codec::facet::{
use crate::heed_codec::version::VersionCodec;
use crate::heed_codec::{BEU16StrCodec, FstSetCodec, StrBEU16Codec, StrRefCodec};
use crate::order_by_map::OrderByMap;
use crate::prompt::PromptData;
use crate::proximity::ProximityPrecision;
use crate::vector::{ArroyStats, ArroyWrapper, Embedding, EmbeddingConfig};
use crate::{
@@ -79,6 +80,7 @@ pub mod main_key {
pub const PREFIX_SEARCH: &str = "prefix_search";
pub const DOCUMENTS_STATS: &str = "documents_stats";
pub const DISABLED_TYPOS_TERMS: &str = "disabled_typos_terms";
pub const CHAT: &str = "chat";
}
pub mod db_name {
@@ -1691,6 +1693,25 @@ impl Index {
self.main.remap_key_type::<Str>().delete(txn, main_key::FACET_SEARCH)
}
pub fn chat_config(&self, txn: &RoTxn<'_>) -> heed::Result<ChatConfig> {
self.main
.remap_types::<Str, SerdeBincode<_>>()
.get(txn, main_key::CHAT)
.map(|o| o.unwrap_or_default())
}
pub(crate) fn put_chat_config(
&self,
txn: &mut RwTxn<'_>,
val: &ChatConfig,
) -> heed::Result<()> {
self.main.remap_types::<Str, SerdeBincode<_>>().put(txn, main_key::CHAT, &val)
}
pub(crate) fn delete_chat_config(&self, txn: &mut RwTxn<'_>) -> heed::Result<bool> {
self.main.remap_key_type::<Str>().delete(txn, main_key::CHAT)
}
pub fn localized_attributes_rules(
&self,
rtxn: &RoTxn<'_>,
@@ -1917,6 +1938,13 @@ pub struct IndexEmbeddingConfig {
pub user_provided: RoaringBitmap,
}
#[derive(Debug, Default, Deserialize, Serialize)]
pub struct ChatConfig {
pub description: String,
/// Contains the document template and max template length.
pub prompt: PromptData,
}
#[derive(Debug, Deserialize, Serialize)]
pub struct PrefixSettings {
pub prefix_count_threshold: usize,

View File

@@ -52,18 +52,19 @@ pub use search::new::{
};
use serde_json::Value;
pub use thread_pool_no_abort::{PanicCatched, ThreadPoolNoAbort, ThreadPoolNoAbortBuilder};
pub use {charabia as tokenizer, heed, rhai};
pub use {arroy, charabia as tokenizer, heed, rhai};
pub use self::asc_desc::{AscDesc, AscDescError, Member, SortError};
pub use self::attribute_patterns::AttributePatterns;
pub use self::attribute_patterns::PatternMatch;
pub use self::attribute_patterns::{AttributePatterns, PatternMatch};
pub use self::criterion::{default_criteria, Criterion, CriterionError};
pub use self::error::{
Error, FieldIdMapMissingEntry, InternalError, SerializationError, UserError,
};
pub use self::external_documents_ids::ExternalDocumentsIds;
pub use self::fieldids_weights_map::FieldidsWeightsMap;
pub use self::fields_ids_map::{FieldsIdsMap, GlobalFieldsIdsMap};
pub use self::fields_ids_map::{
FieldIdMapWithMetadata, FieldsIdsMap, GlobalFieldsIdsMap, MetadataBuilder,
};
pub use self::filterable_attributes_rules::{
FilterFeatures, FilterableAttributesFeatures, FilterableAttributesPatterns,
FilterableAttributesRule,
@@ -84,8 +85,6 @@ pub use self::search::{
};
pub use self::update::ChannelCongestion;
pub use arroy;
pub type Result<T> = std::result::Result<T, error::Error>;
pub type Attribute = u32;

View File

@@ -105,10 +105,10 @@ impl Prompt {
max_bytes,
};
// render template with special object that's OK with `doc.*` and `fields.*`
this.template
.render(&template_checker::TemplateChecker)
.map_err(NewPromptError::invalid_fields_in_template)?;
// // render template with special object that's OK with `doc.*` and `fields.*`
// this.template
// .render(&template_checker::TemplateChecker)
// .map_err(NewPromptError::invalid_fields_in_template)?;
Ok(this)
}

View File

@@ -52,7 +52,6 @@ pub use self::geo_sort::Strategy as GeoSortStrategy;
use self::graph_based_ranking_rule::Words;
use self::interner::Interned;
use self::vector_sort::VectorSort;
use crate::attribute_patterns::{match_pattern, PatternMatch};
use crate::constants::RESERVED_GEO_FIELD_NAME;
use crate::index::PrefixSearch;
use crate::localized_attributes_rules::LocalizedFieldIds;
@@ -121,37 +120,17 @@ impl<'ctx> SearchContext<'ctx> {
let searchable_fields_weights = self.index.searchable_fields_and_weights(self.txn)?;
let exact_attributes_ids = self.index.exact_attributes_ids(self.txn)?;
let mut universal_wildcard = false;
let mut wildcard = false;
let mut restricted_fids = RestrictedFids::default();
for field_name in attributes_to_search_on {
if field_name == "*" {
universal_wildcard = true;
wildcard = true;
// we cannot exit early as we want to return an error in case of unknown fields
continue;
}
let searchable_weight =
searchable_fields_weights.iter().find(|(name, _, _)| name == field_name);
// The field is not searchable but may contain a wildcard pattern
if searchable_weight.is_none() && field_name.contains("*") {
let matching_searchable_weights: Vec<_> = searchable_fields_weights
.iter()
.filter(|(name, _, _)| match_pattern(field_name, name) == PatternMatch::Match)
.collect();
if !matching_searchable_weights.is_empty() {
for (_name, fid, weight) in matching_searchable_weights {
if exact_attributes_ids.contains(fid) {
restricted_fids.exact.push((*fid, *weight));
} else {
restricted_fids.tolerant.push((*fid, *weight));
}
}
continue;
}
}
let (fid, weight) = match searchable_weight {
// The Field id exist and the field is searchable
Some((_name, fid, weight)) => (*fid, *weight),
@@ -181,7 +160,7 @@ impl<'ctx> SearchContext<'ctx> {
};
}
if universal_wildcard {
if wildcard {
self.restricted_fids = None;
} else {
self.restricted_fids = Some(restricted_fids);

View File

@@ -92,12 +92,12 @@ fn find_one_typo_derivations(
let mut stream = fst.search_with_state(Intersection(starts, &dfa)).into_stream();
while let Some((derived_word, state)) = stream.next() {
let derived_word = std::str::from_utf8(derived_word)?;
let derived_word = ctx.word_interner.insert(derived_word.to_owned());
let d = dfa.distance(state.1);
match d.to_u8() {
0 => (),
1 => {
let derived_word = std::str::from_utf8(derived_word)?;
let derived_word = ctx.word_interner.insert(derived_word.to_owned());
let cf = visit(derived_word)?;
if cf.is_break() {
break;

View File

@@ -0,0 +1,45 @@
use deserr::Deserr;
use serde::{Deserialize, Serialize};
use utoipa::ToSchema;
use crate::index::ChatConfig;
use crate::prompt::{default_max_bytes, PromptData};
use crate::update::Setting;
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq, Eq, Deserr, ToSchema)]
#[serde(deny_unknown_fields, rename_all = "camelCase")]
#[deserr(deny_unknown_fields, rename_all = camelCase)]
pub struct ChatSettings {
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
pub description: Setting<String>,
/// A liquid template used to render documents to a text that can be embedded.
///
/// Meilisearch interpolates the template for each document and sends the resulting text to the embedder.
/// The embedder then generates document vectors based on this text.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
pub document_template: Setting<String>,
/// Rendered texts are truncated to this size. Defaults to 400.
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<usize>)]
pub document_template_max_bytes: Setting<usize>,
}
impl From<ChatConfig> for ChatSettings {
fn from(config: ChatConfig) -> Self {
let ChatConfig { description, prompt: PromptData { template, max_bytes } } = config;
ChatSettings {
description: Setting::Set(description),
document_template: Setting::Set(template),
document_template_max_bytes: Setting::Set(
max_bytes.unwrap_or(default_max_bytes()).get(),
),
}
}
}
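
The conversion above reads `max_bytes` back with `unwrap_or(default_max_bytes())`, while the settings update stores it through `NonZeroUsize::new`. A small sketch of that round trip follows. The `default_max_bytes` stand-in is assumed to return 400 based on the doc comment ("Defaults to 400"), and `store` and `read` are hypothetical names used only here.

```rust
use std::num::NonZeroUsize;

/// Stand-in for the crate's `default_max_bytes`; 400 is taken from the doc
/// comment and is an assumption about the real function.
fn default_max_bytes() -> NonZeroUsize {
    NonZeroUsize::new(400).unwrap()
}

/// Hypothetical: mirrors how a user-provided size becomes the stored option.
/// `NonZeroUsize::new(0)` is `None`, so a zero is not stored as zero.
fn store(user_value: usize) -> Option<NonZeroUsize> {
    NonZeroUsize::new(user_value)
}

/// Hypothetical: mirrors how the stored option is exposed as a plain `usize`.
fn read(stored: Option<NonZeroUsize>) -> usize {
    stored.unwrap_or(default_max_bytes()).get()
}
```

In this sketch a submitted value of `0` reads back as the default rather than as `0`; whether the real code path behaves the same way depends on the actual `default_max_bytes` and on any validation that runs before the update.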

View File

@@ -1,4 +1,5 @@
pub use self::available_ids::AvailableIds;
pub use self::chat::ChatSettings;
pub use self::clear_documents::ClearDocuments;
pub use self::concurrent_available_ids::ConcurrentAvailableIds;
pub use self::facet::bulk::FacetsUpdateBulk;
@@ -13,6 +14,7 @@ pub use self::words_prefix_integer_docids::WordPrefixIntegerDocids;
pub use self::words_prefixes_fst::WordsPrefixesFst;
mod available_ids;
mod chat;
mod clear_documents;
mod concurrent_available_ids;
pub(crate) mod del_add;

View File

@@ -13,7 +13,7 @@ use time::OffsetDateTime;
use super::del_add::{DelAdd, DelAddOperation};
use super::index_documents::{IndexDocumentsConfig, Transform};
use super::IndexerConfig;
use super::{ChatSettings, IndexerConfig};
use crate::attribute_patterns::PatternMatch;
use crate::constants::RESERVED_GEO_FIELD_NAME;
use crate::criterion::Criterion;
@@ -22,11 +22,11 @@ use crate::error::UserError;
use crate::fields_ids_map::metadata::{FieldIdMapWithMetadata, MetadataBuilder};
use crate::filterable_attributes_rules::match_faceted_field;
use crate::index::{
IndexEmbeddingConfig, PrefixSearch, DEFAULT_MIN_WORD_LEN_ONE_TYPO,
ChatConfig, IndexEmbeddingConfig, PrefixSearch, DEFAULT_MIN_WORD_LEN_ONE_TYPO,
DEFAULT_MIN_WORD_LEN_TWO_TYPOS,
};
use crate::order_by_map::OrderByMap;
use crate::prompt::default_max_bytes;
use crate::prompt::{default_max_bytes, PromptData};
use crate::proximity::ProximityPrecision;
use crate::update::index_documents::IndexDocumentsMethod;
use crate::update::{IndexDocuments, UpdateIndexingStep};
@@ -185,6 +185,7 @@ pub struct Settings<'a, 't, 'i> {
localized_attributes_rules: Setting<Vec<LocalizedAttributesRule>>,
prefix_search: Setting<PrefixSearch>,
facet_search: Setting<bool>,
chat: Setting<ChatSettings>,
}
impl<'a, 't, 'i> Settings<'a, 't, 'i> {
@@ -223,6 +224,7 @@ impl<'a, 't, 'i> Settings<'a, 't, 'i> {
localized_attributes_rules: Setting::NotSet,
prefix_search: Setting::NotSet,
facet_search: Setting::NotSet,
chat: Setting::NotSet,
indexer_config,
}
}
@@ -453,6 +455,14 @@ impl<'a, 't, 'i> Settings<'a, 't, 'i> {
self.facet_search = Setting::Reset;
}
pub fn set_chat(&mut self, value: ChatSettings) {
self.chat = Setting::Set(value);
}
pub fn reset_chat(&mut self) {
self.chat = Setting::Reset;
}
#[tracing::instrument(
level = "trace"
skip(self, progress_callback, should_abort, settings_diff),
@@ -1239,6 +1249,45 @@ impl<'a, 't, 'i> Settings<'a, 't, 'i> {
Ok(())
}
fn update_chat_config(&mut self) -> heed::Result<bool> {
match &mut self.chat {
Setting::Set(ChatSettings {
description: new_description,
document_template: new_document_template,
document_template_max_bytes: new_document_template_max_bytes,
}) => {
let mut old = self.index.chat_config(self.wtxn)?;
let ChatConfig {
ref mut description,
prompt: PromptData { ref mut template, ref mut max_bytes },
} = old;
match new_description {
Setting::Set(d) => *description = d.clone(),
Setting::Reset => *description = Default::default(),
Setting::NotSet => (),
}
match new_document_template {
Setting::Set(dt) => *template = dt.clone(),
Setting::Reset => *template = Default::default(),
Setting::NotSet => (),
}
match new_document_template_max_bytes {
Setting::Set(m) => *max_bytes = NonZeroUsize::new(*m),
Setting::Reset => *max_bytes = Some(default_max_bytes()),
Setting::NotSet => (),
}
self.index.put_chat_config(self.wtxn, &old)?;
Ok(true)
}
Setting::Reset => self.index.delete_chat_config(self.wtxn),
Setting::NotSet => Ok(false),
}
}
pub fn execute<FP, FA>(mut self, progress_callback: FP, should_abort: FA) -> Result<()>
where
FP: Fn(UpdateIndexingStep) + Sync,
@@ -1276,6 +1325,7 @@ impl<'a, 't, 'i> Settings<'a, 't, 'i> {
self.update_facet_search()?;
self.update_localized_attributes_rules()?;
self.update_disabled_typos_terms()?;
self.update_chat_config()?;
let embedding_config_updates = self.update_embedding_configs()?;
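
`update_chat_config` merges each incoming field into the stored configuration using a three-state `Setting`: `Set` overwrites, `Reset` restores a default, and `NotSet` leaves the stored value alone. The sketch below restates that rule in isolation. The `Setting` enum here is a local stand-in for the crate's type, and `apply` is a hypothetical helper that does not exist in the diff.

```rust
/// Local stand-in for the crate's `Setting` type (an assumption: only the
/// three variants used in the diff are modelled).
#[derive(Debug, Clone, PartialEq)]
enum Setting<T> {
    Set(T),
    Reset,
    NotSet,
}

/// Hypothetical helper: applies one partial update to one stored field.
fn apply<T: Clone>(current: &mut T, update: &Setting<T>, default: T) {
    match update {
        // An explicit value replaces whatever was stored.
        Setting::Set(value) => *current = value.clone(),
        // A reset falls back to the supplied default.
        Setting::Reset => *current = default,
        // An absent field keeps the stored value untouched.
        Setting::NotSet => (),
    }
}
```

Because every field is merged independently, a request that sets only `description` leaves the stored template and size limit unchanged.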

View File

@@ -33,6 +33,7 @@ pub struct EmbeddingSettings {
///
/// - Defaults to `openAi`
pub source: Setting<EmbedderSource>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
@@ -55,6 +56,7 @@ pub struct EmbeddingSettings {
/// - For source `openAi`, defaults to `text-embedding-3-small`
/// - For source `huggingFace`, defaults to `BAAI/bge-base-en-v1.5`
pub model: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
@@ -75,6 +77,7 @@ pub struct EmbeddingSettings {
/// - When `model` is set to default, defaults to `617ca489d9e86b49b8167676d8220688b99db36e`
/// - Otherwise, defaults to `null`
pub revision: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<OverridePooling>)]
@@ -96,6 +99,7 @@ pub struct EmbeddingSettings {
///
/// - Embedders created before this parameter was available default to `forceMean` to preserve the existing behavior.
pub pooling: Setting<OverridePooling>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
@@ -118,6 +122,7 @@ pub struct EmbeddingSettings {
///
/// - This setting is partially hidden when returned by the settings
pub api_key: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
@@ -141,6 +146,7 @@ pub struct EmbeddingSettings {
/// - For source `openAi`, the dimensions is the maximum allowed by the model.
/// - For sources `ollama` and `rest`, the dimensions are inferred by embedding a sample text.
pub dimensions: Setting<usize>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<bool>)]
@@ -167,6 +173,7 @@ pub struct EmbeddingSettings {
/// first enabling it. If you are unsure of whether the performance-relevancy tradeoff is right for you,
/// we recommend to use this parameter on a test index first.
pub binary_quantized: Setting<bool>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<bool>)]
@@ -183,6 +190,7 @@ pub struct EmbeddingSettings {
///
/// - 🏗️ When modified, embeddings are regenerated for documents whose rendering through the template produces a different text.
pub document_template: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<usize>)]
@@ -201,6 +209,7 @@ pub struct EmbeddingSettings {
///
/// - Defaults to 400
pub document_template_max_bytes: Setting<usize>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<String>)]
@@ -219,6 +228,7 @@ pub struct EmbeddingSettings {
/// - 🌱 When modified for source `openAi`, embeddings are never regenerated
/// - 🏗️ When modified for sources `ollama` and `rest`, embeddings are always regenerated
pub url: Setting<String>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<serde_json::Value>)]
@@ -236,6 +246,7 @@ pub struct EmbeddingSettings {
///
/// - 🏗️ Changing the value of this parameter always regenerates embeddings
pub request: Setting<serde_json::Value>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<serde_json::Value>)]
@@ -253,6 +264,7 @@ pub struct EmbeddingSettings {
///
/// - 🏗️ Changing the value of this parameter always regenerates embeddings
pub response: Setting<serde_json::Value>,
#[serde(default, skip_serializing_if = "Setting::is_not_set")]
#[deserr(default)]
#[schema(value_type = Option<BTreeMap<String, String>>)]